[ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542153#comment-14542153 ]

Ihor Bobak commented on SPARK-7603:
-----------------------------------

I've just downloaded 1.2.2 and configured everything exactly the same way: the 
problem is NOT reproducible there.

Therefore, most probably some change made after 1.2.2 (an optimization, 
perhaps) introduced this.

If you need more files from me (e.g. the Hive tables) - feel free to ask, and 
I will send you everything.
If you want, I can even give you a backup of the VM I am working with.
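
In the meantime, a possible workaround for anyone stuck on 1.3.1 (untested 
here, so treat it as a sketch only): the thrift server collects the full 
result set on the driver, so giving the driver more memory than the 1024m 
used in the report below may avoid the OOM. The memory values are 
illustrative:

    # restart the thrift server with a larger driver heap (illustrative values)
    ./sbin/stop-thriftserver.sh
    ./sbin/start-thriftserver.sh \
      --conf spark.executor.memory=2048m \
      --conf spark.driver.memory=4g

Some Spark versions also have an undocumented 
spark.sql.thriftServer.incrementalCollect setting that fetches results 
partition by partition instead of collecting everything at once; whether 
1.3.1 supports it would need checking.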

> Crash of thrift server when doing SQL without "limit"
> -----------------------------------------------------
>
>                 Key: SPARK-7603
>                 URL: https://issues.apache.org/jira/browse/SPARK-7603
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.1
>         Environment: Hortonworks Sandbox 2.1  with Spark 1.3.1
>            Reporter: Ihor Bobak
>
> I have 2 tables in Hive: one with 120 thousand records, the other one 5 
> times smaller.
> I'm running a standalone cluster on a single VM, and I start the thrift 
> server with:
>     ./start-thriftserver.sh --conf spark.executor.memory=2048m --conf spark.driver.memory=1024m
> My spark-defaults.conf contains:
>     spark.master            spark://sandbox.hortonworks.com:7077
>     spark.eventLog.enabled  true
>     spark.eventLog.dir      hdfs://sandbox.hortonworks.com:8020/user/pdi/spark/logs
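> (For reference: queries can be submitted to the thrift server through 
> beeline, the JDBC client that ships with Spark; 10000 is the thrift 
> server's default port. The exact client used is not recorded here, so this 
> session is only illustrative:
>
>     beeline -u jdbc:hive2://sandbox.hortonworks.com:10000
> )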
> So, when I run the SQL
> select <some fields from header>, <some fields from details>
> from
>       vw_salesorderdetail as d
>       left join vw_salesorderheader as h on h.SalesOrderID = d.SalesOrderID
> limit 2000000000;
> everything is fine, even though the limit is unrealistically large (again: 
> the result set returned is just 120000 records).
> But if I run the same query without the limit clause, execution hangs - see 
> here: http://postimg.org/image/fujdjd16f/42945a78/ - and the thrift server 
> log fills with exceptions:
> 15/05/13 17:59:27 INFO TaskSetManager: Starting task 158.0 in stage 48.0 (TID 953, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> 15/05/13 18:00:01 INFO TaskSetManager: Finished task 150.0 in stage 48.0 (TID 945) in 36166 ms on sandbox.hortonworks.com (152/200)
> 15/05/13 18:00:02 ERROR Utils: Uncaught exception in thread Spark Context Cleaner
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:147)
>       at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>       at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
>       at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:143)
>       at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
> Exception in thread "Spark Context Cleaner" 15/05/13 18:00:02 ERROR Utils: Uncaught exception in thread task-result-getter-1
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at java.lang.String.<init>(String.java:315)
>       at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:562)
>       at com.esotericsoftware.kryo.io.Input.readString(Input.java:436)
>       at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:157)
>       at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:146)
>       at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:706)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>       at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>       at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>       at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>       at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>       at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>       at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:173)
>       at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:79)
>       at org.apache.spark.scheduler.TaskSetManager.handleSuccessfulTask(TaskSetManager.scala:621)
>       at org.apache.spark.scheduler.TaskSchedulerImpl.handleSuccessfulTask(TaskSchedulerImpl.scala:379)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:82)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:50)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> Exception in thread "task-result-getter-1" 15/05/13 18:00:04 INFO TaskSetManager: Starting task 159.0 in stage 48.0 (TID 954, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:147)
>       at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>       at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
>       at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:143)
>       at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at java.lang.String.<init>(String.java:315)
>       at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:562)
>       at com.esotericsoftware.kryo.io.Input.readString(Input.java:436)
>       at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:157)
>       at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:146)
>       at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:706)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>       at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>       at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>       at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>       at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>       at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>       at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>       at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>       at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:173)
>       at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:79)
>       at org.apache.spark.scheduler.TaskSetManager.handleSuccessfulTask(TaskSetManager.scala:621)
>       at org.apache.spark.scheduler.TaskSchedulerImpl.handleSuccessfulTask(TaskSchedulerImpl.scala:379)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:82)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:50)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 15/05/13 18:00:05 INFO TaskSetManager: Finished task 154.0 in stage 48.0 (TID 949) in 40665 ms on sandbox.hortonworks.com (153/200)
> 15/05/13 18:00:20 ERROR Utils: Uncaught exception in thread task-result-getter-3
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> Exception in thread "task-result-getter-3" java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/05/13 18:00:28 ERROR Utils: Uncaught exception in thread task-result-getter-2
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> Exception in thread "task-result-getter-2" java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/05/13 18:00:29 INFO TaskSetManager: Starting task 160.0 in stage 48.0 (TID 955, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> 15/05/13 18:00:31 ERROR ActorSystemImpl: exception on LARS’ timer thread
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
>       at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
>       at java.lang.Thread.run(Thread.java:744)
> 15/05/13 18:00:31 INFO ActorSystemImpl: starting new LARS thread
> 15/05/13 18:00:31 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-6] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at java.lang.Class.getDeclaredMethods0(Native Method)
>       at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
>       at java.lang.Class.getDeclaredMethod(Class.java:2002)
>       at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1431)
>       at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
>       at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:494)
>       at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>       at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>       at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
>       at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
>       at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>       at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
>       at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>       at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
>       at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
> 15/05/13 18:00:31 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-scheduler-1] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
>       at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
>       at java.lang.Thread.run(Thread.java:744)
> 15/05/13 18:00:31 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-5] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>       at java.lang.Class.getDeclaredMethods0(Native Method)
>       at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
>       at java.lang.Class.getDeclaredMethod(Class.java:2002)
>       at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1431)
>       at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
>       at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:494)
>       at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>       at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>       at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
>       at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
>       at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>       at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
>       at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
> Feel free to contact me - I will send you the full logs.


