ok solved. Looks like breathing the spark-summit SFO air for 3 days helped
a lot !
Piping the 7 million records to local disk still runs out of memory. So I piped
the results into another Hive table. I can live with that :-)
/opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql -e use aers; create
best regards
sanjay
From: Josh Rosen rosenvi...@gmail.com
To: Sanjay Subramanian sanjaysubraman...@yahoo.com
Cc: user@spark.apache.org user@spark.apache.org
Sent: Friday, June 12, 2015 7:15 AM
Subject: Re: spark-sql from CLI ---EXCEPTION: java.lang.OutOfMemoryError:
Java heap
:-) to my questions on all CDH groups, Spark, Hive
best regards
sanjay
Sent from my phone
On Jun 11, 2015, at 8:43 AM, Sanjay Subramanian
sanjaysubraman...@yahoo.com.INVALID wrote:
hey guys
Using Hive and Impala daily intensively.
Want to transition to spark-sql in CLI mode
Currently in my sandbox I am using the Spark (standalone mode) in the CDH
It sounds like this might be caused by a memory configuration problem. In
addition to looking at the executor memory, I'd also bump up the driver memory,
since it appears that your shell is running out of memory when collecting a
large query result.
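
A rough sketch of what bumping both settings could look like from the CLI. The binary path is the one from the command earlier in this thread; the 4g/2g sizes, the run_spark_sql wrapper and the DRY_RUN switch are assumptions made up for illustration, not tested values for this sandbox. --driver-memory and --executor-memory are the standard spark-submit options, which the spark-sql script passes through.

```shell
# Path taken from the command earlier in this thread.
SPARK_SQL="/opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql"
# Assumed sizes, not recommendations; tune to the RAM the sandbox actually has.
DRIVER_MEM="${DRIVER_MEM:-4g}"
EXECUTOR_MEM="${EXECUTOR_MEM:-2g}"

# Hypothetical helper: $1 is the SQL text to run.
# With DRY_RUN=1 the command line is only printed, not executed.
run_spark_sql() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$SPARK_SQL --driver-memory $DRIVER_MEM --executor-memory $EXECUTOR_MEM -e \"$1\""
  else
    "$SPARK_SQL" --driver-memory "$DRIVER_MEM" --executor-memory "$EXECUTOR_MEM" -e "$1"
  fi
}

# Usage (hypothetical query and table name):
#   run_spark_sql "use aers; select count(*) from my_table"
```

The driver setting is the one that matters when the shell itself collects a large result; the executor setting only helps the tasks that compute it.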
Sent from my phone
On Jun 11, 2015, at