[jira] [Created] (SYSTEMML-510) Generalize All Existing wdivmm Patterns

Mike Dusenberry (JIRA) Tue, 09 Feb 2016 10:23:39 -0800

Mike Dusenberry created SYSTEMML-510:
----------------------------------------


             Summary: Generalize All Existing wdivmm Patterns
                 Key: SYSTEMML-510
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-510
             Project: SystemML
          Issue Type: Sub-task
            Reporter: Mike Dusenberry


If we look at the inner loop of Poisson nonnegative matrix factorization (PNMF) 
in general, we update the factors as 
{code}
H = (H * (t(W) %*% (V/(W%*%H + 1e-17))))/t(colSums(W)
W = (W * ((V/(W%*%H + 1e-17)) %*% t(H)))/t(rowSums(H))
{code}.
Notice the addition of the "1e-17" epsilon term in the denominators.  
Mathematically, we need this in case any cell of W%*%H evaluates to zero so 
that we can avoid dividing by zero.  R needs this, but SystemML technically 
does not due to a fused operator, "wdivmm", that takes care of these situations 
(or this may be done in the general case?).  This fused operator is currently 
applied to the pattern {{t(W) %*% (V / %* (W %*% H))}}, amongst other similar 
patterns.  Ideally, this would easily apply to {{t(W) %*% (V/(W%*%H + 1e-17)}}, 
regardless of the unneeded epsilon term.  Currently, the addition of the 
epsilon term causes the algorithm to run in non-linear time (quad or 
exponential).  Initially, the behavior pointed towards the possibility of the 
optimizer avoiding the rewrite to the fused operator, resulting in naive 
computation, and non-linear growth in training time.  Further exploration seems 
to show that the rewrite is indeed still being applied, but there seems to also 
be a recursive nesting of the same rewrite over various regions of the above 
statements that is not found when the epsilon term is removed.

The following is the full PNMF DML script used:

{code}
V = read($X)
max_iteration = $maxiter
rank = $rank

n = nrow(V)
m = ncol(V)

range = 0.01
W = Rand(rows=n, cols=rank, min=0, max=range, pdf="uniform")
H = Rand(rows=rank, cols=m, min=0, max=range, pdf="uniform")

loglik0 = sum(V*log(W%*%H)) - as.scalar(colSums(W)%*%rowSums(H))

i=0
while(i < max_iteration) {
  # Addition of epsilon (1e-17) term causes script to run in non-linear time:
        H = (H * (t(W) %*% (V/(W%*%H + 1e-17))))/t(colSums(W))
        W = (W * ((V/(W%*%H + 1e-17)) %*% t(H)))/t(rowSums(H))
  # Removal of epsilon works correctly:
  #H = (H * (t(W) %*% (V/(W%*%H))))/t(colSums(W))
  #W = (W * ((V/(W%*%H)) %*% t(H)))/t(rowSums(H))
        i = i + 1;

        print(i + "")
}

loglik = sum(V*log(W%*%H+1e-17)) - as.scalar(colSums(W)%*%rowSums(H))
print("pnmf: " + loglik0 + " -> " + loglik)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (SYSTEMML-510) Generalize All Existing wdivmm Patterns

Reply via email to