It sounds like this might be caused by a memory configuration problem. In 
addition to raising the executor memory, I'd also bump up the driver memory, 
since it appears that your shell is running out of memory when collecting a 
large query result.
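
For example, a launch along these lines (the sizes here are only a starting 
point for your 32GB nodes, not a tuned recommendation):

  /opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql \
    --driver-memory 4G \
    --executor-memory 4G \
    --total-executor-cores 12 \
    -e "select distinct isr,event_dt,age,age_cod,sex,year,quarter from aers.aers_demo_view"

spark-sql hands these options to spark-submit, so if the flags still don't 
take effect you can try the equivalent properties in conf/spark-defaults.conf, 
which is read at launch:

  spark.driver.memory    4g
  spark.executor.memory  4g
  spark.cores.max        12

Note that driver memory has to be set before the driver JVM starts (via the 
flag or spark-defaults.conf); setting spark.driver.memory from inside an 
already-running shell has no effect.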

Sent from my phone

> On Jun 11, 2015, at 8:43 AM, Sanjay Subramanian 
> <sanjaysubraman...@yahoo.com.INVALID> wrote:
> 
> hey guys
> 
> I am using Hive and Impala intensively every day.
> I want to transition to spark-sql in CLI mode.
> 
> Currently in my sandbox I am using Spark (standalone mode) from the CDH 
> distribution ("starving developer" version 5.3.3):
> 3 datanode hadoop cluster
> 32GB RAM per node
> 8 cores per node
> 
> Spark: 1.2.0+cdh5.3.3+371
> 
> 
> I am testing some queries on one view and getting memory errors.
> A possible reason is that the default memory per executor shown on the :18080 
> UI is 512M.
> 
> These options do not seem to have any effect when used to start the 
> spark-sql CLI (the full launch command is shown below):
> --total-executor-cores 12 --executor-memory 4G
> 
> 
> 
> /opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql -e  "select distinct 
> isr,event_dt,age,age_cod,sex,year,quarter from aers.aers_demo_view"
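> 
> (With the flags included, the same launch looks like this:)
> 
> /opt/cloudera/parcels/CDH/lib/spark/bin/spark-sql --total-executor-cores 12 \
>   --executor-memory 4G -e "select distinct isr,event_dt,age,age_cod,sex,year,quarter from aers.aers_demo_view"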
> 
> aers.aers_demo_view (7 million+ records)
> ========================================
> isr       bigint   case id
> event_dt  bigint   event date
> age       double   age of patient
> age_cod   string   days, months, or years
> sex       string   M or F
> year      int
> quarter   int
> 
> 
> VIEW DEFINITION
> ================
> CREATE VIEW `aers.aers_demo_view` AS
> SELECT `isr` AS `isr`, `event_dt` AS `event_dt`, `age` AS `age`,
>        `age_cod` AS `age_cod`, `gndr_cod` AS `sex`,
>        `year` AS `year`, `quarter` AS `quarter`
> FROM (SELECT
>    `aers_demo_v1`.`isr`,
>    `aers_demo_v1`.`event_dt`,
>    `aers_demo_v1`.`age`,
>    `aers_demo_v1`.`age_cod`,
>    `aers_demo_v1`.`gndr_cod`,
>    `aers_demo_v1`.`year`,
>    `aers_demo_v1`.`quarter`
> FROM
>   `aers`.`aers_demo_v1`
> UNION ALL
> SELECT
>    `aers_demo_v2`.`isr`,
>    `aers_demo_v2`.`event_dt`,
>    `aers_demo_v2`.`age`,
>    `aers_demo_v2`.`age_cod`,
>    `aers_demo_v2`.`gndr_cod`,
>    `aers_demo_v2`.`year`,
>    `aers_demo_v2`.`quarter`
> FROM
>   `aers`.`aers_demo_v2`
> UNION ALL
> SELECT
>    `aers_demo_v3`.`isr`,
>    `aers_demo_v3`.`event_dt`,
>    `aers_demo_v3`.`age`,
>    `aers_demo_v3`.`age_cod`,
>    `aers_demo_v3`.`gndr_cod`,
>    `aers_demo_v3`.`year`,
>    `aers_demo_v3`.`quarter`
> FROM
>   `aers`.`aers_demo_v3`
> UNION ALL
> SELECT
>    `aers_demo_v4`.`isr`,
>    `aers_demo_v4`.`event_dt`,
>    `aers_demo_v4`.`age`,
>    `aers_demo_v4`.`age_cod`,
>    `aers_demo_v4`.`gndr_cod`,
>    `aers_demo_v4`.`year`,
>    `aers_demo_v4`.`quarter`
> FROM
>   `aers`.`aers_demo_v4`
> UNION ALL
> SELECT
>    `aers_demo_v5`.`primaryid` AS `ISR`,
>    `aers_demo_v5`.`event_dt`,
>    `aers_demo_v5`.`age`,
>    `aers_demo_v5`.`age_cod`,
>    `aers_demo_v5`.`gndr_cod`,
>    `aers_demo_v5`.`year`,
>    `aers_demo_v5`.`quarter`
> FROM
>   `aers`.`aers_demo_v5`
> UNION ALL
> SELECT
>    `aers_demo_v6`.`primaryid` AS `ISR`,
>    `aers_demo_v6`.`event_dt`,
>    `aers_demo_v6`.`age`,
>    `aers_demo_v6`.`age_cod`,
>    `aers_demo_v6`.`sex` AS `GNDR_COD`,
>    `aers_demo_v6`.`year`,
>    `aers_demo_v6`.`quarter`
> FROM
>   `aers`.`aers_demo_v6`) `aers_demo_view`
> 
> Here is the error output from the run:
> 
> 15/06/11 08:36:36 WARN DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0x01b99855, /10.0.0.19:58117 => /10.0.0.19:52016] EXCEPTION: java.lang.OutOfMemoryError: Java heap space)
> java.lang.OutOfMemoryError: Java heap space
>         at org.jboss.netty.buffer.HeapChannelBuffer.<init>(HeapChannelBuffer.java:42)
>         at org.jboss.netty.buffer.BigEndianHeapChannelBuffer.<init>(BigEndianHeapChannelBuffer.java:34)
>         at org.jboss.netty.buffer.ChannelBuffers.buffer(ChannelBuffers.java:134)
>         at org.jboss.netty.buffer.HeapChannelBufferFactory.getBuffer(HeapChannelBufferFactory.java:68)
>         at org.jboss.netty.buffer.AbstractChannelBufferFactory.getBuffer(AbstractChannelBufferFactory.java:48)
>         at org.jboss.netty.handler.codec.frame.FrameDecoder.newCumulationBuffer(FrameDecoder.java:507)
>         at org.jboss.netty.handler.codec.frame.FrameDecoder.updateCumulation(FrameDecoder.java:345)
>         at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:312)
>         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 15/06/11 08:36:40 ERROR Utils: Uncaught exception in thread task-result-getter-0
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.lang.Long.valueOf(Long.java:577)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:113)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:103)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>         at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>         at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>         at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>         at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:171)
>         at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:79)
>         at org.apache.spark.scheduler.TaskSetManager.handleSuccessfulTask(TaskSetManager.scala:558)
>         at org.apache.spark.scheduler.TaskSchedulerImpl.handleSuccessfulTask(TaskSchedulerImpl.scala:352)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:80)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1468)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:48)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 15/06/11 08:36:38 ERROR ActorSystemImpl: exception on LARS’ timer thread
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:19)
>         at akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:431)
>         at akka.actor.LightArrayRevolverScheduler$$anon$12.nextTick(Scheduler.scala:397)
>         at akka.actor.LightArrayRevolverScheduler$$anon$12.run(Scheduler.scala:363)
>         at java.lang.Thread.run(Thread.java:745)
> Exception in thread "task-result-getter-0" java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.lang.Long.valueOf(Long.java:577)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:113)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:103)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>         at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>         at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>         at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>         at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>         at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:171)
>         at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:79)
>         at org.apache.spark.scheduler.TaskSetManager.handleSuccessfulTask(TaskSetManager.scala:558)
>         at org.apache.spark.scheduler.TaskSchedulerImpl.handleSuccessfulTask(TaskSchedulerImpl.scala:352)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:80)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1468)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:48)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 15/06/11 08:36:41 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-scheduler-1] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:19)
>         at akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:431)
>         at akka.actor.LightArrayRevolverScheduler$$anon$12.nextTick(Scheduler.scala:397)
>         at akka.actor.LightArrayRevolverScheduler$$anon$12.run(Scheduler.scala:363)
>         at java.lang.Thread.run(Thread.java:745)
> 15/06/11 08:36:46 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-4] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/06/11 08:36:46 ERROR SparkSQLDriver: Failed in [select distinct isr,event_dt,age,age_cod,sex,year,quarter from aers.aers_demo_view]
> org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:702)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:701)
>         at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
>         at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:701)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1428)
>         at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:201)
>         at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:163)
>         at akka.actor.ActorCell.terminate(ActorCell.scala:338)
>         at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:431)
>         at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
>         at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:218)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15/06/11 08:36:51 WARN DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0x79935a9b, /10.0.0.35:54028 => /10.0.0.19:52016] EXCEPTION: java.lang.OutOfMemoryError: Java heap space)
> java.lang.OutOfMemoryError: Java heap space
> 15/06/11 08:36:52 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-5] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: Java heap space
> 15/06/11 08:36:53 WARN DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0xcb8c4b5d, /10.0.0.18:46744 => /10.0.0.19:52016] EXCEPTION: java.lang.OutOfMemoryError: Java heap space)
> java.lang.OutOfMemoryError: Java heap space
> 15/06/11 08:36:56 WARN NioEventLoop: Unexpected exception in the selector loop.
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/06/11 08:36:57 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-18] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/06/11 08:36:58 ERROR Utils: Uncaught exception in thread task-result-getter-3
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> Exception in thread "task-result-getter-3" java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/06/11 08:37:01 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-4] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: Java heap space
> Time taken: 70.982 seconds
> 15/06/11 08:37:06 WARN QueuedThreadPool: 4 threads could not be stopped
> 15/06/11 08:37:11 ERROR MapOutputTrackerMaster: Error communicating with MapOutputTracker
> akka.pattern.AskTimeoutException: Recipient[Actor[akka://sparkDriver/user/MapOutputTracker#-2109395547]] had already been terminated.
>         at akka.pattern.AskableActorRef$.ask$extension(AskSupport.scala:134)
>         at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:111)
>         at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:122)
>         at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:330)
>         at org.apache.spark.SparkEnv.stop(SparkEnv.scala:83)
>         at org.apache.spark.SparkContext.stop(SparkContext.scala:1210)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)
> Exception in thread "Thread-3" org.apache.spark.SparkException: Error communicating with MapOutputTracker
>         at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:116)
>         at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:122)
>         at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:330)
>         at org.apache.spark.SparkEnv.stop(SparkEnv.scala:83)
>         at org.apache.spark.SparkContext.stop(SparkContext.scala:1210)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)
> Caused by: akka.pattern.AskTimeoutException: Recipient[Actor[akka://sparkDriver/user/MapOutputTracker#-2109395547]] had already been terminated.
>         at akka.pattern.AskableActorRef$.ask$extension(AskSupport.scala:134)
>         at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:111)
> 
> 
