“Marble: High-throughput Phenotyping from Electronic Health Records via Sparse Nonnegative Tensor Factorization” — Joyce Ho et al. 2014

Paper: link

Slide: link


  • sparse tensor
  • sparse factor matrices
  • Poisson regression, based on CP-APR
  • For count data
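
To make the "Poisson regression for count data" point concrete, here is a minimal NumPy sketch of the objective a Poisson-based CP model is fit against. It is not the CP-APR algorithm itself, only the loss it minimizes (the negative Poisson log-likelihood up to a constant, equivalent to KL divergence). The function names, toy dimensions, and the `eps` guard are assumptions made for illustration.

```python
import numpy as np

def cp_reconstruct(weights, factors):
    """Rebuild a dense 3-way tensor from CP weights and three factor matrices."""
    A, B, C = factors
    # Sum over the R components of weights[r] * (a_r outer b_r outer c_r).
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

def poisson_loss(X, M, eps=1e-10):
    """Negative Poisson log-likelihood up to a constant: sum(M - X * log(M))."""
    # eps (an assumption of this sketch) keeps log() finite where M is zero.
    return float(np.sum(M - X * np.log(M + eps)))

rng = np.random.default_rng(0)
R = 2  # number of components (hypothetical)
factors = [rng.random((n, R)) for n in (4, 3, 5)]
weights = np.array([5.0, 3.0])

M_true = cp_reconstruct(weights, factors)  # Poisson rate for every cell
X = rng.poisson(M_true)                    # synthetic nonnegative counts

print("loss at the generating model:", poisson_loss(X, M_true))
print("loss at a perfect fit (M = X):", poisson_loss(X, X.astype(float)))
```

Unlike a least-squares loss, this objective never pushes the model toward negative values and treats zero counts naturally, which is why it suits sparse count tensors.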

Phenotyping application background:

  • A major limitation of existing phenotyping efforts is the need for human annotation of case and control samples, which requires substantial time, effort, and expert knowledge to develop.
  • phenotyping can be viewed as a form of dimensionality reduction, where each phenotype forms a latent space
    • citation: G. Hripcsak and D. J. Albers. Next-generation phenotyping of electronic health records. JAMIA, 20(1):117–121, Dec. 2012.

Tensor factorization vs matrix factorization:

  • Matrix factorization, a common dimensionality reduction approach, is insufficient as it cannot concisely capture structured EHR source interactions, such as multiple medications prescribed to treat a single disease
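
A toy illustration of why the matrix view loses information: two hypothetical record sets below produce identical patient × diagnosis and patient × medication matrices, yet differ in which medication was prescribed for which diagnosis. Only the 3-way tensor tells them apart. The record layout `(patient, diagnosis, medication)` and all indices are assumptions made for this sketch.

```python
import numpy as np

def build_tensor(records, shape):
    """Count (patient, diagnosis, medication) co-occurrences into a 3-way tensor."""
    X = np.zeros(shape, dtype=int)
    for patient, diagnosis, medication in records:
        X[patient, diagnosis, medication] += 1
    return X

shape = (1, 2, 2)  # 1 patient, 2 diagnoses, 2 medications (toy sizes)

# Case A: diagnosis 0 treated with medication 0, diagnosis 1 with medication 1.
records_a = [(0, 0, 0), (0, 1, 1)]
# Case B: the pairing is swapped.
records_b = [(0, 0, 1), (0, 1, 0)]

Xa = build_tensor(records_a, shape)
Xb = build_tensor(records_b, shape)

# Matrix views: sum out one mode to get patient x diagnosis or patient x medication.
same_diag_matrix = np.array_equal(Xa.sum(axis=2), Xb.sum(axis=2))
same_med_matrix = np.array_equal(Xa.sum(axis=1), Xb.sum(axis=1))
same_tensor = np.array_equal(Xa, Xb)

print(same_diag_matrix, same_med_matrix, same_tensor)
```

Both matrix comparisons come out equal while the tensor comparison does not, so any method that only sees the two matrices cannot recover the diagnosis-medication pairing.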


  • Constraints on the factor matrices to minimize the number of non-zero elements
  • augmentation of the tensor approximation
    • Marble decomposes an observed tensor into two terms, a bias (or offset) tensor and an interaction (or signal) tensor (similar to CP-APR factorized tensor).
    • The bias tensor represents the baseline characteristics common amongst the overall population and also provides computational stability. The interaction term is composed of concise, intuitive, and interpretable phenotypes in the data.
  • Marble achieves at least a 42.8% reduction in the number of non-zero elements compared to CP-APR without sacrificing the quality of the tensor decomposition.
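
The bias-plus-interaction structure described above can be sketched as follows. This is a hedged illustration, not the paper's fitting procedure: the rank-one form of the bias, the column normalization, the hard-threshold value `gamma`, and all sizes are assumptions chosen for the example. It builds the approximation as a strictly positive rank-one bias plus a sparse multi-component interaction tensor, then reports the reduction in non-zero factor entries.

```python
import numpy as np

def rank_one(scale, vectors):
    """Rank-one 3-way tensor: scale * (u1 outer u2 outer u3)."""
    u1, u2, u3 = vectors
    return scale * np.einsum("i,j,k->ijk", u1, u2, u3)

def cp_tensor(weights, factors):
    """Sum of R weighted rank-one tensors built from three factor matrices."""
    A, B, C = factors
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

def threshold(factor, gamma):
    """Illustrative sparsifier: zero entries below gamma, renormalize columns."""
    out = np.where(factor >= gamma, factor, 0.0)
    col_sums = out.sum(axis=0, keepdims=True)
    return out / np.where(col_sums > 0, col_sums, 1.0)

rng = np.random.default_rng(0)
dims, R = (6, 5, 4), 3  # toy sizes, hypothetical

# Bias term: strictly positive everywhere, shared across the population.
bias = rank_one(0.5, [np.full(n, 1.0 / n) for n in dims])

# Interaction term: nonnegative, column-normalized factors, then sparsified.
dense = [rng.random((n, R)) for n in dims]
dense = [A / A.sum(axis=0, keepdims=True) for A in dense]
sparse = [threshold(A, gamma=0.15) for A in dense]
weights = np.array([10.0, 6.0, 3.0])

Z = bias + cp_tensor(weights, sparse)  # approximation of the observed tensor

nnz_dense = sum(int(np.count_nonzero(A)) for A in dense)
nnz_sparse = sum(int(np.count_nonzero(A)) for A in sparse)
reduction = 1.0 - nnz_sparse / nnz_dense
print(f"non-zero factor entries: {nnz_dense} -> {nnz_sparse} ({reduction:.1%} fewer)")
```

One plausible reading of the "computational stability" remark is visible here: because the bias is positive in every cell, `Z` is never zero, so the logarithm in a Poisson objective stays finite even where the sparse interaction term contributes nothing.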

CP Applications:

  • concept discovery: U. Kang, E. Papalexakis, A. Harpale, and C. Faloutsos. GigaTensor: Scaling tensor analysis up by 100 times - algorithms and discoveries. In KDD 2012, pages 316–324, 2012.
  • network analysis of fMRI data: I. Davidson, S. Gilpin, O. Carmichael, and P. Walker. Network discovery via constrained tensor analysis of fMRI data. In KDD 2013, Aug. 2013.
  • community discovery: Y.-R. Lin, J. Sun, H. Sundaram, A. Kelliher, P. Castro, and R. Konuru. Community discovery via metagraph factorization. ACM Transactions on Knowledge Discovery from Data, 5(3), Aug. 2011

Sparsity constrained factor matrices for sparse tensor decomposition:

  • Traditional sparsity-inducing penalties such as ℓ1 and ℓ2 regularization only deal with the standard least-squares minimization.
    • D. Wang and S. Kong. Feature selection from high-order tensorial data via sparse decomposition. Pattern Recognition Letters, 33(13):1695–1702, 2012.
  • Non-parametric Bayesian approaches to sparse Tucker decomposition
    • Z. Xu, F. Yan, and Y. Qi. Infinite Tucker Decomposition: Nonparametric Bayesian Models for Multiway Data Analysis. In ICML 2012, pages 1023–1030, 2012.
  • A multi-layer nonnegative tensor factorization (NTF) has been proposed to achieve sparse representations for various cost functions, including KL divergence, using a nonlinearly transformed gradient descent approach
    • Proposed sHOPCA, based on HOOI algorithm
    • A. Cichocki, R. Zdunek, S. Choi, R. Plemmons, and S.-I. Amari. Novel multi-layer non-negative tensor factorization with sparsity constraints. In ICANNGA 2007, pages 271–280. Springer, 2007.

Useful references:

  • CP-APR: E. C. Chi and T. G. Kolda. On tensors, sparsity, and nonnegative factorizations. SIAM Journal on Matrix Analysis and Applications, 33(4):1272–1299, 2012.
  • Survey: A. Cichocki, R. Zdunek, A. H. Phan, and S.-I. Amari. Nonnegative matrix and tensor factorizations: Applications to exploratory multi-way data analysis and blind source separation. Wiley, 2009.


  • CMS data: the Centers for Medicare and Medicaid Services (CMS) provides the CMS Linkable 2008-2010 Medicare Data Entrepreneurs’ Synthetic Public Use File (DE-SynPUF), a publicly available dataset.
    • 10,000 patients, 129 diagnoses, 115 procedures