[jira] [Updated] (SPARK-7603) Crash of thrift server when doing SQL without "limit"
[ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-7603:
---------------------------
    Component/s:     (was: Web UI)
                     SQL

> Crash of thrift server when doing SQL without "limit"
> -----------------------------------------------------
>
>                 Key: SPARK-7603
>                 URL: https://issues.apache.org/jira/browse/SPARK-7603
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.1
>         Environment: Hortonworks Sandbox 2.1 with Spark 1.3.1
>            Reporter: Ihor Bobak
>
> I have 2 tables in Hive: one with 120 thousand records, the other about 5 times smaller.
> I'm running a standalone cluster on a single VM, and I start the thrift server with:
>
>   ./start-thriftserver.sh --conf spark.executor.memory=2048m --conf spark.driver.memory=1024m
>
> My spark-defaults.conf contains:
>
>   spark.master            spark://sandbox.hortonworks.com:7077
>   spark.eventLog.enabled  true
>   spark.eventLog.dir      hdfs://sandbox.hortonworks.com:8020/user/pdi/spark/logs
>
> When I run the SQL
>
>   select <some fields from header, some fields from details>
>   from vw_salesorderdetail as d
>     left join vw_salesorderheader as h on h.SalesOrderID = d.SalesOrderID
>   limit 20;
>
> everything is fine, even though the limit is never reached (the result set returned is just 12 records).
> But if I run the same query without the limit clause, execution hangs (screenshot:
> http://postimg.org/image/fujdjd16f/42945a78/) and the thrift server log fills with exceptions:
>
> 15/05/13 17:59:27 INFO TaskSetManager: Starting task 158.0 in stage 48.0 (TID 953, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> 15/05/13 18:00:01 INFO TaskSetManager: Finished task 150.0 in stage 48.0 (TID 945) in 36166 ms on sandbox.hortonworks.com (152/200)
> 15/05/13 18:00:02 ERROR Utils: Uncaught exception in thread Spark Context Cleaner
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:147)
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
>     at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:143)
>     at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
> Exception in thread "Spark Context Cleaner" 15/05/13 18:00:02 ERROR Utils: Uncaught exception in thread task-result-getter-1
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.lang.String.<init>(String.java:315)
>     at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:562)
>     at com.esotericsoftware.kryo.io.Input.readString(Input.java:436)
>     at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:157)
>     at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:146)
>     at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:706)
>     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
>     at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>     at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>     at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>     at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>     at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>     at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>     at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>     at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>     at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>     at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>     at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>     at
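The second OutOfMemoryError above is thrown in a task-result-getter thread while Kryo deserializes fetched task results on the driver, which is consistent with the driver buffering the entire unlimited result set. A minimal mitigation sketch, assuming the reporter's setup; the 4g/2g values below are illustrative assumptions, not settings confirmed on this ticket. spark.driver.maxResultSize (available since Spark 1.2) caps the total size of collected results and aborts the job with an explicit error instead of letting the driver die with GC overhead, and a larger driver heap gives the collect more room:

    # Sketch only: illustrative values, not verified against this report.
    # spark.driver.maxResultSize aborts any job whose collected results
    # exceed the cap, turning the GC-overhead crash into a clear error.
    ./start-thriftserver.sh \
      --conf spark.executor.memory=2048m \
      --conf spark.driver.memory=4g \
      --conf spark.driver.maxResultSize=2g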
[jira] [Updated] (SPARK-7603) Crash of thrift server when doing SQL without "limit"
[ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-7603:
---------------------------
    Component/s:     (was: SQL)
                     Web UI
[jira] [Updated] (SPARK-7603) Crash of thrift server when doing SQL without "limit"
[ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-7603:
-----------------------------
    Component/s:     (was: Spark Core)
                     SQL
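For anyone reproducing this, a sketch via beeline, assuming the default HiveServer2 port 10000 (the port is not stated in the report); the report elides its select list, so * stands in for it:

    # Reproduction sketch; port 10000 is an assumption (HiveServer2 default),
    # and * stands in for the elided column list from the report.
    beeline -u jdbc:hive2://sandbox.hortonworks.com:10000 -e "
    select *
    from vw_salesorderdetail as d
      left join vw_salesorderheader as h on h.SalesOrderID = d.SalesOrderID;"

With a trailing "limit 20" the same statement returns quickly, matching the behavior described in the report.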