Re: Process large JSON file without causing OOM

2017-11-21 Thread Alec Swan
Pinging back to see if anybody could provide me with some pointers on how to stream/batch JSON-to-ORC conversion in Spark SQL, or why I get an OOM dump with such a small memory footprint? Thanks, Alec On Wed, Nov 15, 2017 at 11:03 AM, Alec Swan <alecs...@gmail.com> wrote: > Thanks Steve

Re: Process large JSON file without causing OOM

2017-11-15 Thread Alec Swan
> > I'd suggest trying to run with `local[2]` and checking what's the memory > usage of the jvm process. > > On Mon, Nov 13, 2017 at 7:22 PM, Alec Swan <alecs...@gmail.com> wrote: > >> Hello, >> >> I am using the Spark library to convert JSON/Snappy fil
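A sketch of what that suggestion might look like in the poster's embedded setup; everything beyond the local[2] master is an assumption:

    import org.apache.spark.sql.SparkSession;

    // Two local threads instead of local[*]; the memory use of this JVM can
    // then be observed externally (e.g. with jstat or VisualVM).
    SparkSession spark = SparkSession.builder()
            .appName("json-to-orc")
            .master("local[2]")
            .getOrCreate();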

Re: Process large JSON file without causing OOM

2017-11-14 Thread Alec Swan
; > > <http://in.linkedin.com/in/sonalgoyal> > > > > On Tue, Nov 14, 2017 at 9:37 AM, Alec Swan <alecs...@gmail.com> wrote: > >> Hi Joel, >> >> Here are the relevant snippets of my code and an OOM error thrown >> in frameWriter.save(..). Surpri

Re: Process large JSON file without causing OOM

2017-11-13 Thread Alec Swan
ad.run(Thread.java:745) Thanks, Alec On Mon, Nov 13, 2017 at 8:30 PM, Joel D <games2013@gmail.com> wrote: > Have you tried increasing driver, exec mem (gc overhead too if required)? > > your code snippet and stack trace will be helpful. > > On Mon, Nov 13, 2017 at 7:
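A sketch of what "increasing driver/exec mem" typically looks like; the values are illustrative. Note that in an embedded local-mode cluster like the one described in the original post below, the driver and executors share the service's JVM, whose heap is fixed by -Xmx at launch, so these settings have no effect there and would matter only when submitting to a real cluster:

    import org.apache.spark.sql.SparkSession;

    SparkSession spark = SparkSession.builder()
            .appName("json-to-orc")
            .master("local[*]")
            // Effective on a real cluster; ignored in embedded local mode,
            // where the heap is set by the host JVM's -Xmx at launch.
            .config("spark.driver.memory", "8g")
            .config("spark.executor.memory", "8g")
            .getOrCreate();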

Process large JSON file without causing OOM

2017-11-13 Thread Alec Swan
Hello, I am using the Spark library to convert JSON/Snappy files to ORC/ZLIB format. Effectively, my Java service starts up an embedded Spark cluster (master=local[*]) and uses Spark SQL to convert JSON to ORC. However, I keep getting OOM errors with large (~1GB) files. I've tried different ways
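For reference, a minimal sketch of the kind of conversion described above: an embedded local-mode SparkSession reading Snappy-compressed JSON and writing ZLIB-compressed ORC. The paths and application name are assumptions, not the poster's actual code:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class JsonToOrc {
        public static void main(String[] args) {
            // Embedded local-mode cluster, as the post describes (master=local[*]).
            SparkSession spark = SparkSession.builder()
                    .appName("json-to-orc")
                    .master("local[*]")
                    .getOrCreate();

            // Read Snappy-compressed JSON (codec detected from the file extension).
            Dataset<Row> df = spark.read().json("/path/to/input.json.snappy");

            // Write ORC compressed with ZLIB.
            df.write()
                    .format("orc")
                    .option("compression", "zlib")
                    .save("/path/to/output");

            spark.stop();
        }
    }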