Thanks Matthias,

The script works when provided with the absolute path.

Regards,
Krishna

On Sat, Sep 16, 2017 at 12:33 PM, Matthias Boehm <[email protected]> wrote:

> ok great - I also did some more debugging: (1) when the filename is
> specified inside the dml script, the file is indeed written to a directory
> named ~ in HDFS, but (2) when passed from the command line the ~ is of
> course resolved before it's even passed into SystemML.
>
> So let's do the following: (1) use a small scenario of say 10K x 1K, (2)
> run it with an absolute file name, and see what happens. If this does not
> work, I would suspect some permission issues next - maybe the bridge from
> python to the jvm hides some error output. If this is also not the case,
> please provide the -explain output and I will have a closer look.
>
> Regards,
> Matthias
>
> On Fri, Sep 15, 2017 at 11:52 PM, Krishna Kalyan <[email protected]> wrote:
>
> > Thank you so much for trying this, Matthias. I will try this again with
> > an absolute path.
> >
> > Regards,
> > Krishna
> >
> > On Sat, Sep 16, 2017 at 12:09 PM, Matthias Boehm <[email protected]> wrote:
> >
> > > ok, I just tried it with multiple different memory configurations (for
> > > 6GB driver mem I got the same number of spark instructions as you
> > > reported) and it ran just fine and produced the outputs. So please,
> > > give it a try without the ~ (i.e., use an absolute or relative path).
> > >
> > > Also, even with 2GB mem, this data generation for 1M x 1K ran in about
> > > 60s (including spark context creation) in my environment. Since your
> > > log shows a runtime of 5000s, you might want to reduce the data size a
> > > bit.
> > >
> > > Regards,
> > > Matthias
> > >
> > > On Fri, Sep 15, 2017 at 11:08 PM, Krishna Kalyan <[email protected]> wrote:
> > >
> > > > Thanks for the reply,
> > > > I have tested with systemml-standalone.py too. I am still faced with
> > > > the same problem. Currently my spark is configured to work on the
> > > > local fs instead of HDFS, hence I did not expect a problem with the ~.
> > > >
> > > > Regards,
> > > > Krishna
> > > >
> > > > On Sat, Sep 16, 2017 at 7:24 AM, Matthias Boehm <[email protected]> wrote:
> > > >
> > > > > well, I don't think any HDFS fs implementation resolves '~' - so it
> > > > > has probably created a directory called
> > > > > '~/open-source/scripts/PCA_data' in your user path in HDFS or in
> > > > > the current directory in the local FS.
> > > > >
> > > > > Regards,
> > > > > Matthias
> > > > >
> > > > > On Fri, Sep 15, 2017 at 5:47 PM, Krishna Kalyan <[email protected]> wrote:
> > > > >
> > > > > > Hello,
> > > > > > I am using the PCA data generation script
> > > > > > <https://github.com/apache/systemml/blob/master/scripts/datagen/genRandData4PCA.dml>
> > > > > > to generate data. Unfortunately it does not produce any output
> > > > > > in the specified target directory.
> > > > > >
> > > > > > Command used:
> > > > > >
> > > > > > systemml/bin/systemml-spark-submit.py -f genRandData4PCA.dml -nvargs R=1000000 C=1000 OUT=~/open-source/scripts/PCA_data
> > > > > >
> > > > > > Logs:
> > > > > > https://gist.github.com/krishnakalyan3/70796b13735743886e41d3da6b75d7d5
> > > > > >
> > > > > > This job also does not throw any errors during execution.
> > > > > >
> > > > > > Thank you so much,
> > > > > > Krishna
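
For anyone hitting the same issue, here is a minimal sketch of the workaround discussed in this thread, assuming Python 3. It resolves a leading ~ locally and builds the argument list with an absolute OUT path, using the smaller 10K x 1K scenario suggested above. The helper name build_datagen_command is hypothetical; the script path and the -f, -nvargs, R, C and OUT arguments are copied from the command in the original message.

```python
import os


def build_datagen_command(out_dir, rows=10000, cols=1000):
    """Build the argument list for the data generation run (hypothetical helper)."""
    # Resolve a leading '~' against the local home directory and make the
    # path absolute before it is handed to SystemML.
    abs_out = os.path.abspath(os.path.expanduser(out_dir))
    return [
        "systemml/bin/systemml-spark-submit.py",
        "-f", "genRandData4PCA.dml",
        "-nvargs",
        "R={}".format(rows),
        "C={}".format(cols),
        "OUT={}".format(abs_out),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the resolved path can be checked first.
    print(" ".join(build_datagen_command("~/open-source/scripts/PCA_data")))
```

The list can be passed to subprocess.run once the printed path looks right. Note that this only resolves the path on the local machine; whether that absolute path is then interpreted against the local FS or HDFS still depends on how Spark is configured.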
