Re: Documents of SystemML Algorithms Reference

2017-05-01 Thread Ethan Xu
https://apache.github.io/ > incubator-systemml/beginners-guide-python.html#invoke-systemmls-algorithms > > Thanks, > > Niketan Pansare > IBM Almaden Research Center > E-mail: npansar At us.ibm.com > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

Link broken?

2017-04-19 Thread Ethan Xu
The download link from the SystemML Beginner Tutorial Step 4 (https://systemml.apache.org/get-started.html) seems outdated and broken: http://www.apache.org/dyn/closer.cgi/pub/apache/incubator/systemml/0.12.0-incubating/systemml-0.12.0-incubating.zip The 0.13.0 version download link from the

Re: parfor fails

2016-04-16 Thread Ethan Xu
> Boehm---04/14/2016 07:53:43 PM---Hi Ethan, thanks for catching this issue. > The parfor script itself is perfectly fine > > From: Matthias Boehm/Almaden/IBM@IBMUS > To: dev@systemml.incubator.apache.org > Cc: "Ethan Xu" <ethan.

Re: 'sample.dml' replaces rows with 0's

2016-04-15 Thread Ethan Xu
Hi Ethan, > > I just tried the script on a toy dataset and I could reproduce this erroneous > behavior when run in Hadoop mode -- both local and Spark modes are good. I > will look into it. > > BTW, you forgot to attach the scripts. > > Shirish > > On Thu, Apr 14, 2016 at 5:02 PM,

Re: 'sample.dml' replaces rows with 0's

2016-04-14 Thread Ethan Xu
. There were no errors in either trial. Ethan On Thu, Apr 14, 2016 at 4:37 PM, Ethan Xu <ethan.yifa...@gmail.com> wrote: > Hello, > > I encountered an unexpected behavior from 'sample.dml' on a dataset on > Hadoop. Instead of splitting the data, it replaced rows of original dat

'sample.dml' replaces rows with 0's

2016-04-14 Thread Ethan Xu
Hello, I encountered unexpected behavior from 'sample.dml' on a dataset on Hadoop. Instead of splitting the data, it replaced rows of the original data with 0's. Here are the details: I called sample.dml in an attempt to split a 35 million by 2396 numeric matrix into two subsets of 80% and 20%. The

Re: Logical indexing?

2016-04-04 Thread Ethan Xu
running an algorithm over a sample or fold). > > Regards, > Matthias > > > Ethan Xu > ---03/31/2016 11:31:32 AM---Ah I missed the 'removeEmpty

Re: Logical indexing?

2016-03-31 Thread Ethan Xu
formations. > Here are some examples: > > # option 1: via permutation (aka selection) matrices > P = removeEmpty(target=diag(X[,1]>10), margin="rows"); > Y = P %*% X; > > # option 2: via removeEmpty > Ind = diag(X[,1]>10); > Y = removeEmpty(target=X, s
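For readers unfamiliar with the selection-matrix idea quoted in "option 1" above, here is a minimal pure-Python sketch of the same technique. It is an illustration only, not SystemML code: the function name `select_rows` and the sample data are hypothetical, and the dense n-by-n indicator matrix is built purely for clarity (it would not be practical for tens of millions of rows).

```python
def select_rows(X, keep):
    """Return the rows of X for which keep(row) is true, using a selection matrix.

    Mirrors the quoted DML idea: build a diagonal 0/1 indicator matrix,
    drop its all-zero rows to get P, then compute Y = P * X.
    """
    n = len(X)
    n_cols = len(X[0])
    # Diagonal indicator matrix: D[i][i] is 1 when row i satisfies the predicate.
    D = [[1 if (i == j and keep(X[i])) else 0 for j in range(n)] for i in range(n)]
    # Remove all-zero rows (the role removeEmpty plays with margin="rows").
    P = [row for row in D if any(row)]
    # Plain matrix product Y = P * X; each row of P picks out one row of X.
    return [[sum(p[k] * X[k][c] for k in range(n)) for c in range(n_cols)] for p in P]


# Hypothetical data: select rows whose first column exceeds 10.
X = [[5, 1.0], [12, 2.0], [30, 3.0], [7, 4.0]]
Y = select_rows(X, lambda row: row[0] > 10)
print(Y)
```

Each row of P contains a single 1, so the product copies the matching rows of X in their original order.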

Logical indexing?

2016-03-31 Thread Ethan Xu
Does SystemML support logical indexing? For example, suppose X is a numerical matrix with 2 columns and n rows (in my case n ~ 35 million). I'd like to split the matrix row-wise according to the values of the first column. This is useful when I need to find distributions of subgroups of a population. In R I

Re: Compatibility with MR1 Cloudera cdh4.2.1

2016-02-05 Thread Ethan Xu
ame :" + name + "doesn't contain 'r' or 'm'"); } Ethan On Fri, Feb 5, 2016 at 4:37 PM, Ethan Xu <ethan.yifa...@gmail.com> wrote: > Thanks tried that and moved a bit further. Now a new exception (still in > reduce phase of 'CSV-Reblock-MR')

Compatibility with MR1 Cloudera cdh4.2.1

2016-02-04 Thread Ethan Xu
Hello, I got an error when running the systemML/scripts/Univar-Stats.dml script on a Hadoop cluster (Cloudera CDH4.2.1) on a 6GB data set. The error message is at the bottom of the email. The same script ran fine on a smaller sample (several MB) of the same data set, when MR was not invoked. The

Fixed hadoop configuration to run dml on large dataset

2016-02-04 Thread Ethan Xu
Thanks to help from the team, we fixed a Hadoop classpath configuration so that DML successfully invokes MapReduce jobs. I'm carrying the discussion here in case other people run into the same problem. Problem description: I was running a simple DML script to carry out data transformation on a

Re: User friendly output of univariate statistics

2016-02-03 Thread Ethan Xu
t data into columns, such as in your table format example. It might be very nice to add a C-style "printf" statement, which would allow results to be written to the console in a more columnar format. Does anyone else have any thoughts? Deron On Tue, Feb 2, 2016 at 8:32 AM, Ethan Xu <etha..
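To illustrate what "columnar" console output from printf-style formatting could look like, here is a small Python sketch. It does not show any SystemML feature: the helper name `format_table`, the column headers, and the numbers are all hypothetical, chosen only to demonstrate fixed-width formatting.

```python
def format_table(headers, rows, width=12):
    """Format headers and rows as right-aligned, fixed-width text columns."""
    lines = ["".join(f"{h:>{width}}" for h in headers)]
    for row in rows:
        cells = []
        for value in row:
            if isinstance(value, float):
                # Floats get three decimals, like printf's %12.3f.
                cells.append(f"{value:>{width}.3f}")
            else:
                cells.append(f"{value:>{width}}")
        lines.append("".join(cells))
    return "\n".join(lines)


# Hypothetical per-column statistics, for illustration only.
table = format_table(["column", "mean", "std"], [[1, 3.5, 1.25], [2, 10.0, 0.5]])
print(table)
```

Every line has the same length, so the values line up under their headers regardless of magnitude, as long as each value fits within the chosen width.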