Re: How do I save the dataframe data as a pdf file?

2017-12-12 Thread Anthony Thomas
esents a report and I want to create pdf files. > I am using Scala so hoping to find an easier solution in Scala, if not I will try out your suggestion. > > > On Tue, Dec 12, 2017 at 11:29 AM, Anthony Thomas <ahtho...@eng.ucsd.edu> wrote: > >> Are

Re: How do I save the dataframe data as a pdf file?

2017-12-12 Thread Anthony Thomas
Are you trying to produce a formatted table in a pdf file where the numbers in the table come from a dataframe? I.e. to present summary statistics or other aggregates? If so I would guess your best bet would be to collect the dataframe as a Pandas dataframe and use the to_latex method. You can

Crash in Unit Tests

2017-09-29 Thread Anthony Thomas
Hi Spark Users, I recently compiled spark 2.2.0 from source on an EC2 m4.2xlarge instance (8 cores, 32G RAM) running Ubuntu 14.04. I'm using Oracle Java 1.8. I compiled using the command: export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m" ./build/mvn -DskipTests -Pnetlib-lgpl clean

Re: [MLLib]: Executor OutOfMemory in BlockMatrix Multiplication

2017-06-14 Thread Anthony Thomas
you end up manifesting 3n + logn matrix blocks in memory at once, which is why it sucks so much. > > Sent from my iPhone > > On Jun 14, 2017, at 7:07 PM, Anthony Thomas <ahtho...@eng.ucsd.edu> wrote: > > I've been experimenting with MLlib's BlockMatrix for distribute

[MLLib]: Executor OutOfMemory in BlockMatrix Multiplication

2017-06-14 Thread Anthony Thomas
I've been experimenting with MLlib's BlockMatrix for distributed matrix multiplication but consistently run into problems with executors being killed due to memory constraints. The linked gist (here) has a short example of