[jira] [Updated] (SPARK-7603) Crash of thrift server when doing SQL without "limit"

2016-10-07 Thread Xiao Li (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-7603:
---------------------------
Component/s: (was: Web UI)
 SQL

> Crash of thrift server when doing SQL without "limit"
> -----------------------------------------------------
>
> Key: SPARK-7603
> URL: https://issues.apache.org/jira/browse/SPARK-7603
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.3.1
> Environment: Hortonworks Sandbox 2.1  with Spark 1.3.1
>Reporter: Ihor Bobak
>
> I have 2 tables in Hive: one with 120 thousand records, the other one about 
> 5 times smaller.
> I'm running a standalone cluster on a single VM, and I start the thrift 
> server with the command:
> ./start-thriftserver.sh --conf spark.executor.memory=2048m --conf 
> spark.driver.memory=1024m
> My spark-defaults.conf contains:
> spark.master             spark://sandbox.hortonworks.com:7077
> spark.eventLog.enabled   true
> spark.eventLog.dir       hdfs://sandbox.hortonworks.com:8020/user/pdi/spark/logs
> So, when I run the SQL
> select <some fields from header>, <some fields from details>
> from
>   vw_salesorderdetail as d
>   left join vw_salesorderheader as h on h.SalesOrderID = d.SalesOrderID
> limit 20;
> everything is fine, even though the limit is never actually reached (again: 
> the resultset returned is just 12 records).
> But if I run the same query without the limit clause, execution hangs - see 
> here: http://postimg.org/image/fujdjd16f/42945a78/
> and the thrift server log fills with exceptions - here they are:
> 15/05/13 17:59:27 INFO TaskSetManager: Starting task 158.0 in stage 48.0 (TID 
> 953, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> 15/05/13 18:00:01 INFO TaskSetManager: Finished task 150.0 in stage 48.0 (TID 
> 945) in 36166 ms on sandbox.hortonworks.com (152/200)
> 15/05/13 18:00:02 ERROR Utils: Uncaught exception in thread Spark Context 
> Cleaner
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at 
> org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:147)
>   at 
> org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>   at 
> org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
>   at 
> org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:143)
>   at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
> Exception in thread "Spark Context Cleaner" 15/05/13 18:00:02 ERROR Utils: 
> Uncaught exception in thread task-result-getter-1
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at java.lang.String.<init>(String.java:315)
>   at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:562)
>   at com.esotericsoftware.kryo.io.Input.readString(Input.java:436)
>   at 
> com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:157)
>   at 
> com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:146)
>   at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:706)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>   at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>   at 
> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>   at 
> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>   at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>   at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>   at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
>   at 
> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
>   at 
> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
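Both "GC overhead limit exceeded" traces above come from driver-side threads (the ContextCleaner and a task-result-getter deserializing Kryo-encoded task results), which is consistent with the thrift server collecting the whole un-limited join result into the 1024m driver heap. A minimal reproduction sketch via beeline, which ships with Spark; the hostname and view names are taken from the report, port 10000 is the HiveServer2 default (assumed unchanged), and the selected columns are stand-ins for the reporter's elided field list:

  $ beeline -u jdbc:hive2://sandbox.hortonworks.com:10000

  -- completes: the driver has to buffer at most 20 rows
  SELECT d.SalesOrderID, h.OrderDate
  FROM vw_salesorderdetail AS d
  LEFT JOIN vw_salesorderheader AS h ON h.SalesOrderID = d.SalesOrderID
  LIMIT 20;

  -- hangs, then OOMs the driver: without the limit, the full join result
  -- (likely on the order of the 120-thousand-row detail table, since a
  -- left join keeps every detail row) is collected before any row is
  -- returned to the client
  SELECT d.SalesOrderID, h.OrderDate
  FROM vw_salesorderdetail AS d
  LEFT JOIN vw_salesorderheader AS h ON h.SalesOrderID = d.SalesOrderID;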

[jira] [Updated] (SPARK-7603) Crash of thrift server when doing SQL without "limit"

2016-10-07 Thread Xiao Li (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-7603:
---------------------------
Component/s: (was: SQL)
 Web UI


[jira] [Updated] (SPARK-7603) Crash of thrift server when doing SQL without "limit"

2015-05-15 Thread Sean Owen (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-7603:
-----------------------------
Component/s: (was: Spark Core)
 SQL

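Given the driver-side OOM pattern quoted in the report, a hedged mitigation sketch rather than a confirmed fix: raise the driver heap beyond the reported 1024m and, where the option is available, let the thrift server stream result partitions back incrementally instead of collecting everything at once. The memory settings are taken from the report; spark.sql.thriftServer.incrementalCollect exists in later Spark releases, and its availability on 1.3.1 is an assumption to verify:

  # raise the driver heap (was 1024m in the report)
  # incrementalCollect: assumed available here; verify against the docs
  # for the Spark version actually deployed
  ./start-thriftserver.sh \
    --conf spark.executor.memory=2048m \
    --conf spark.driver.memory=4096m \
    --conf spark.sql.thriftServer.incrementalCollect=true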