[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-12 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084689#comment-16084689
 ] 

Hyukjin Kwon commented on SPARK-21393:
--

Would you mind sharing your code? I want to reproduce this issue, but it looks 
like I can't, given the information here.

> spark (pyspark) crashes unpredictably when using show() or toPandas()
> -
>
> Key: SPARK-21393
> URL: https://issues.apache.org/jira/browse/SPARK-21393
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.1.1
> Environment: Windows 10
> python 2.7
>Reporter: Zahra
>
> Unpredictably run into this error when using either 
> `pyspark.sql.DataFrame.show()` or `pyspark.sql.DataFrame.toPandas()`.
> The error log starts with (truncated):
> {noformat}
> 17/07/12 16:03:09 ERROR CodeGenerator: failed to compile: 
> org.codehaus.janino.JaninoRuntimeException: Code of method 
> "apply_47$(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass$SpecificUnsafeProjection;Lorg/apache/spark/sql/catalyst/InternalRow;)V"
>  of class 
> "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection"
>  grows beyond 64 KB
> /* 001 */ public java.lang.Object generate(Object[] references) {
> /* 002 */   return new SpecificUnsafeProjection(references);
> /* 003 */ }
> /* 004 */
> /* 005 */ class SpecificUnsafeProjection extends 
> org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
> /* 006 */
> /* 007 */   private Object[] references;
> /* 008 */   private scala.collection.immutable.Set set;
> /* 009 */   private scala.collection.immutable.Set set1;
> /* 010 */   private scala.collection.immutable.Set set2;
> /* 011 */   private scala.collection.immutable.Set set3;
> /* 012 */   private UTF8String.IntWrapper wrapper;
> /* 013 */   private UTF8String.IntWrapper wrapper1;
> /* 014 */   private scala.collection.immutable.Set set4;
> /* 015 */   private UTF8String.IntWrapper wrapper2;
> /* 016 */   private UTF8String.IntWrapper wrapper3;
> /* 017 */   private scala.collection.immutable.Set set5;
> /* 018 */   private scala.collection.immutable.Set set6;
> /* 019 */   private scala.collection.immutable.Set set7;
> /* 020 */   private UTF8String.IntWrapper wrapper4;
> /* 021 */   private UTF8String.IntWrapper wrapper5;
> /* 022 */   private scala.collection.immutable.Set set8;
> /* 023 */   private UTF8String.IntWrapper wrapper6;
> /* 024 */   private UTF8String.IntWrapper wrapper7;
> /* 025 */   private scala.collection.immutable.Set set9;
> /* 026 */   private scala.collection.immutable.Set set10;
> /* 027 */   private scala.collection.immutable.Set set11;
> /* 028 */   private UTF8String.IntWrapper wrapper8;
> /* 029 */   private UTF8String.IntWrapper wrapper9;
> /* 030 */   private scala.collection.immutable.Set set12;
> /* 031 */   private UTF8String.IntWrapper wrapper10;
> /* 032 */   private UTF8String.IntWrapper wrapper11;
> /* 033 */   private scala.collection.immutable.Set set13;
> /* 034 */   private scala.collection.immutable.Set set14;
> /* 035 */   private scala.collection.immutable.Set set15;
> /* 036 */   private UTF8String.IntWrapper wrapper12;
> /* 037 */   private UTF8String.IntWrapper wrapper13;
> /* 038 */   private scala.collection.immutable.Set set16;
> /* 039 */   private UTF8String.IntWrapper wrapper14;
> /* 040 */   private UTF8String.IntWrapper wrapper15;
> /* 041 */   private scala.collection.immutable.Set set17;
> /* 042 */   private scala.collection.immutable.Set set18;
> /* 043 */   private scala.collection.immutable.Set set19;
> /* 044 */   private UTF8String.IntWrapper wrapper16;
> /* 045 */   private UTF8String.IntWrapper wrapper17;
> /* 046 */   private scala.collection.immutable.Set set20;
> /* 047 */   private UTF8String.IntWrapper wrapper18;
> /* 048 */   private UTF8String.IntWrapper wrapper19;
> /* 049 */   private scala.collection.immutable.Set set21;
> /* 050 */   private scala.collection.immutable.Set set22;
> /* 051 */   private scala.collection.immutable.Set set23;
> /* 052 */   private UTF8String.IntWrapper wrapper20;
> /* 053 */   private UTF8String.IntWrapper wrapper21;
> /* 054 */   private scala.collection.immutable.Set set24;
> /* 055 */   private UTF8String.IntWrapper wrapper22;
> /* 056 */   private UTF8String.IntWrapper wrapper23;
> /* 057 */   private scala.collection.immutable.Set set25;
> /* 058 */   private scala.collection.immutable.Set set26;
> /* 059 */   private scala.collection.immutable.Set set27;
> /* 060 */   private UTF8String.IntWrapper wrapper24;
> /* 061 */   private UTF8String.IntWrapper wrapper25;
> /* 062 */   private scala.collection.immutable.Set set28;
> /* 063 */   private UTF8String.IntWrapper wrapper26;
> /* 064 */   private UTF8String.IntWrapper wrapper27;
> /* 065 */   private scala.collect

[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-13 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085726#comment-16085726
 ] 

Kazuaki Ishizaki commented on SPARK-21393:
--

This program seems to require 7 CSV files to run. Could you please attach 
these CSV files?

> spark (pyspark) crashes unpredictably when using show() or toPandas()
> -
>
> Key: SPARK-21393
> URL: https://issues.apache.org/jira/browse/SPARK-21393
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.1.1
> Environment: Windows 10
> python 2.7
>Reporter: Zahra
> Attachments: working_ST_pyspark.py
>
>

[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-13 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086189#comment-16086189
 ] 

Kazuaki Ishizaki commented on SPARK-21393:
--

Thank you for uploading the files. When I insert {df_new.show()} at appropriate 
places, I can reproduce this problem on both Spark 2.1.1 and Spark 2.2.
I am now reducing the program.
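
A minimal sketch of the "insert an action at appropriate places" approach, written in plain Python so it runs without Spark. The helper name `first_failing_step` and the stand-in data are hypothetical and do not come from the attached script. With PySpark, each step would be something like `lambda df: df.withColumn(...)` and the action `lambda df: df.show()`.

```python
def first_failing_step(initial, steps, action):
    """Apply steps in order, running `action` after each one.

    Returns the index of the first step after which `action` raises,
    or None if every intermediate result passes.
    """
    current = initial
    for index, step in enumerate(steps):
        current = step(current)
        try:
            action(current)
        except Exception:
            return index
    return None


if __name__ == "__main__":
    # Stand-in data: each "transformation" appends one item to a list,
    # and the "action" fails once the list holds more than three items.
    steps = [lambda xs, i=i: xs + [i] for i in range(6)]

    def action(xs):
        if len(xs) > 3:
            raise RuntimeError("too large")

    print(first_failing_step([], steps, action))
```

Running the action after every step costs one evaluation per step, which is acceptable for a short pipeline and points directly at the transformation after which the failure first shows up.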

> spark (pyspark) crashes unpredictably when using show() or toPandas()
> -
>
> Key: SPARK-21393
> URL: https://issues.apache.org/jira/browse/SPARK-21393
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.1.1, 2.2.0
> Environment: Windows 10
> python 2.7
>Reporter: Zahra
> Attachments: Data.zip, working_ST_pyspark.py
>
>

[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-13 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086254#comment-16086254
 ] 

Kazuaki Ishizaki commented on SPARK-21393:
--

This reduced program reproduces the same exception:

{code}
from __future__ import absolute_import, division, print_function

import findspark
findspark.init()

from pyspark import SparkContext
from pyspark.sql import SQLContext
import pyspark.sql.functions as sf

sc = SparkContext()
sqlContext = SQLContext(sc)

# Load the claims data.
df = sqlContext.read.load('./Data/claims.csv',
                          format='com.databricks.spark.csv', header=True)

# Each step overwrites service_type_col and refers to its previous value.
df_new = df.withColumn(
    'service_type_col',
    sf.when((sf.col('RevenueCategory') == "Emergency Room")
            | (sf.col('CPT_Name') == "EMERGENCY DEPT VISIT"),
            'EMERGENCY_CARE').otherwise(0))
df_new = df_new.withColumn(
    'service_type_col',
    sf.when((sf.col('ProcedureCategory').isin(["Laboratory, General"]))
            & (sf.col('service_type_col') == 0),
            'LAB_AND_PATHOLOGY').otherwise(df_new.service_type_col))
df_new = df_new.withColumn(
    'service_type_col',
    sf.when((sf.col('service_type_col') == 0),
            'ROUTINE_RADIOLOGY').otherwise(df_new.service_type_col))
df_new = df_new.withColumn(
    'service_type_col',
    sf.when((sf.col('CPT_Code').isin(["70336"]))
            & (sf.col('service_type_col') == 0),
            'ADVANCED_IMAGING').otherwise(df_new.service_type_col))
df_new = df_new.withColumn(
    'service_type_col',
    sf.when((sf.col('service_type_col') == 0),
            'DURABLE_MEDICAL_EQUIPMENT').otherwise(df_new.service_type_col))
df_new = df_new.withColumn(
    'service_type_col',
    sf.when((sf.col('CPT_Name').isin(['CHIROPRACTIC MANIPULATION']))
            & (sf.col('service_type_col') == 0),
            'CHIROPRACTIC').otherwise(df_new.service_type_col))
df_new = df_new.withColumn(
    'service_type_col',
    sf.when((sf.col('service_type_col') == 0),
            'AMBULANCE').otherwise(df_new.service_type_col))
df_new = df_new.withColumn(
    'service_type_col',
    sf.when((sf.col('service_type_col') == 0),
            'RX_MAIL').otherwise(df_new.service_type_col))

# show() is the action at which the exception appears.
df_new.show()
{code}
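
One possible reading of why this reproducer produces such a large generated method, offered as an assumption rather than something the thread confirms: every update after the first refers to the previous `service_type_col` twice, once in the `== 0` condition and once in `otherwise(...)`. If the previous definition is inlined at both places, the combined expression roughly doubles with each step. The sketch below models that growth in plain Python with nested tuples. The helpers `when_otherwise`, `node_count`, and `chained_expression` are hypothetical and are not Spark APIs.

```python
def when_otherwise(cond, value, otherwise):
    """Model of when(cond, value).otherwise(otherwise) as a tuple node."""
    return ("case_when", cond, value, otherwise)


def node_count(expr):
    """Count nodes in an expression tree: a tuple is an operator, anything else a leaf."""
    if not isinstance(expr, tuple):
        return 1
    return 1 + sum(node_count(child) for child in expr[1:])


def chained_expression(steps):
    """Build the expression for `steps` follow-up updates of one column.

    Assumption being modelled: each update inlines the previous definition
    twice, in the condition (col == 0) and in the otherwise() branch.
    """
    expr = when_otherwise(("eq", "RevenueCategory", "Emergency Room"),
                          "EMERGENCY_CARE", 0)
    for i in range(steps):
        expr = when_otherwise(("eq", expr, 0), "LABEL_%d" % i, expr)
    return expr


if __name__ == "__main__":
    # The reproducer above has seven follow-up updates after the first one.
    for steps in (0, 1, 2, 4, 7):
        print(steps, node_count(chained_expression(steps)))
```

In this model the node count follows `next = 2 * previous + 4`, so a handful of extra steps is enough to multiply the expression size many times over. Whether Spark's generated code grows in the same proportion depends on the optimizer and code generator, which this sketch does not model.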


[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-13 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086757#comment-16086757
 ] 

Hyukjin Kwon commented on SPARK-21393:
--

Thanks [~kiszk]. I was able to narrow it down further based on your reproducer:

{code}
from pyspark.sql.functions import *


df = spark.createDataFrame([[1, 2]], "fieldA string, fieldB string")

df = df.withColumn('fieldA', when((df.fieldA == None) | (col('fieldB') == ""), 
None).otherwise(0))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA)) 
# repeat the same line 8 times below
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))
df = df.withColumn('fieldA', when((df.fieldA == 0), None).otherwise(df.fieldA))

df.show()
{code}


[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-13 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086795#comment-16086795
 ] 

Hyukjin Kwon commented on SPARK-21393:
--

Scala code:

{code}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.apache.spark.sql.Row

val schema = StructType(StructField("fieldA", IntegerType) :: 
StructField("fieldB", IntegerType) :: Nil)
var df = spark.createDataFrame(spark.sparkContext.parallelize(Seq(Row(1, 2))), 
schema)
df = df.withColumn("fieldA", when(($"fieldA" === null).or($"fieldB" === 0), 
null).otherwise(0))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df.show()
{code}

I could not reproduce this with a local relation:

{code}
import org.apache.spark.sql.functions._

var df = Seq(Tuple2(1, 2)).toDF("fieldA", "fieldB")
df = df.withColumn("fieldA", when(($"fieldA" === null).or($"fieldB" === 0), 
null).otherwise(0))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df = df.withColumn("fieldA", when($"fieldA" === 0, null).otherwise($"fieldA"))
df.show()
{code}




[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-13 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086816#comment-16086816
 ] 

Hyukjin Kwon commented on SPARK-21393:
--

Ah, sorry, this appears to be a different issue given the generated plans. 
I will open another JIRA and remove the comments above to prevent confusion.

> spark (pyspark) crashes unpredictably when using show() or toPandas()
> -
>
> Key: SPARK-21393
> URL: https://issues.apache.org/jira/browse/SPARK-21393
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.1.1, 2.2.0
> Environment: Windows 10
> python 2.7
>Reporter: Zahra
> Attachments: Data.zip, working_ST_pyspark.py
>
>
> unpredictbly run into this error either when using 
> `pyspark.sql.DataFrame.show()` or `pyspark.sql.DataFrame.toPandas()`
> error log starts with  (truncated) :
> {noformat}
> 17/07/12 16:03:09 ERROR CodeGenerator: failed to compile: 
> org.codehaus.janino.JaninoRuntimeException: Code of method 
> "apply_47$(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass$SpecificUnsafeProjection;Lorg/apache/spark/sql/catalyst/InternalRow;)V"
>  of class 
> "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection"
>  grows beyond 64 KB
> /* 001 */ public java.lang.Object generate(Object[] references) {
> /* 002 */   return new SpecificUnsafeProjection(references);
> /* 003 */ }
> /* 004 */
> /* 005 */ class SpecificUnsafeProjection extends 
> org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
> /* 006 */
> /* 007 */   private Object[] references;
> /* 008 */   private scala.collection.immutable.Set set;
> /* 009 */   private scala.collection.immutable.Set set1;
> /* 010 */   private scala.collection.immutable.Set set2;
> /* 011 */   private scala.collection.immutable.Set set3;
> /* 012 */   private UTF8String.IntWrapper wrapper;
> /* 013 */   private UTF8String.IntWrapper wrapper1;
> /* 014 */   private scala.collection.immutable.Set set4;
> /* 015 */   private UTF8String.IntWrapper wrapper2;
> /* 016 */   private UTF8String.IntWrapper wrapper3;
> /* 017 */   private scala.collection.immutable.Set set5;
> /* 018 */   private scala.collection.immutable.Set set6;
> /* 019 */   private scala.collection.immutable.Set set7;
> /* 020 */   private UTF8String.IntWrapper wrapper4;
> /* 021 */   private UTF8String.IntWrapper wrapper5;
> /* 022 */   private scala.collection.immutable.Set set8;
> /* 023 */   private UTF8String.IntWrapper wrapper6;
> /* 024 */   private UTF8String.IntWrapper wrapper7;
> /* 025 */   private scala.collection.immutable.Set set9;
> /* 026 */   private scala.collection.immutable.Set set10;
> /* 027 */   private scala.collection.immutable.Set set11;
> /* 028 */   private UTF8String.IntWrapper wrapper8;
> /* 029 */   private UTF8String.IntWrapper wrapper9;
> /* 030 */   private scala.collection.immutable.Set set12;
> /* 031 */   private UTF8String.IntWrapper wrapper10;
> /* 032 */   private UTF8String.IntWrapper wrapper11;
> /* 033 */   private scala.collection.immutable.Set set13;
> /* 034 */   private scala.collection.immutable.Set set14;
> /* 035 */   private scala.collection.immutable.Set set15;
> /* 036 */   private UTF8String.IntWrapper wrapper12;
> /* 037 */   private UTF8String.IntWrapper wrapper13;
> /* 038 */   private scala.collection.immutable.Set set16;
> /* 039 */   private UTF8String.IntWrapper wrapper14;
> /* 040 */   private UTF8String.IntWrapper wrapper15;
> /* 041 */   private scala.collection.immutable.Set set17;
> /* 042 */   private scala.collection.immutable.Set set18;
> /* 043 */   private scala.collection.immutable.Set set19;
> /* 044 */   private UTF8String.IntWrapper wrapper16;
> /* 045 */   private UTF8String.IntWrapper wrapper17;
> /* 046 */   private scala.collection.immutable.Set set20;
> /* 047 */   private UTF8String.IntWrapper wrapper18;
> /* 048 */   private UTF8String.IntWrapper wrapper19;
> /* 049 */   private scala.collection.immutable.Set set21;
> /* 050 */   private scala.collection.immutable.Set set22;
> /* 051 */   private scala.collection.immutable.Set set23;
> /* 052 */   private UTF8String.IntWrapper wrapper20;
> /* 053 */   private UTF8String.IntWrapper wrapper21;
> /* 054 */   private scala.collection.immutable.Set set24;
> /* 055 */   private UTF8String.IntWrapper wrapper22;
> /* 056 */   private UTF8String.IntWrapper wrapper23;
> /* 057 */   private scala.collection.immutable.Set set25;
> /* 058 */   private scala.collection.immutable.Set set26;
> /* 059 */   private scala.collection.immutable.Set set27;
> /* 060 */   private UTF8String.IntWrapper wrapper24;
> /* 061 */   private UTF8String.IntWrapper wrapper25;
> /* 062 */   private scala.collection.immutable.Set set28;
> /* 063 */   private UTF8String.IntWra

[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-13 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086854#comment-16086854
 ] 

Hyukjin Kwon commented on SPARK-21393:
--

Ah, sorry for going back and forth. They are the same issue.


[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-14 Thread Zahra (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087362#comment-16087362
 ] 

Zahra commented on SPARK-21393:
---

Hi,

Were you able to reproduce the error?

Thanks,
Zahra


[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-14 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087379#comment-16087379
 ] 

Hyukjin Kwon commented on SPARK-21393:
--

Yes, I was able to. Based on my investigation, it looks like a duplicate of 
the issue I opened, so I resolved this one for the reason described above.


[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-14 Thread Zahra (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087386#comment-16087386
 ] 

Zahra commented on SPARK-21393:
---

So, now I should be able to run my code without any trouble?



[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-14 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087391#comment-16087391
 ] 

Kazuaki Ishizaki commented on SPARK-21393:
--

Not yet. However, I have created a patch that prevents the failure for the 
program in SPARK-21413.
I will submit a pull request once I have written a test suite for this patch. 
I expect that it will then be merged into master.


[jira] [Commented] (SPARK-21393) spark (pyspark) crashes unpredictably when using show() or toPandas()

2017-07-15 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088660#comment-16088660
 ] 

Kazuaki Ishizaki commented on SPARK-21393:
--

I confirmed that this Python program works after applying the PR for 
SPARK-21413.
