[jira] [Updated] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2019-01-24 Thread Dongjoon Hyun (JIRA)


 [ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-25767:
--
Fix Version/s: 2.3.3

> Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library
> --
>
> Key: SPARK-25767
> URL: https://issues.apache.org/jira/browse/SPARK-25767
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0, 2.3.2
>Reporter: Thomas Brugiere
>Assignee: Peter Toth
>Priority: Major
> Fix For: 2.3.3, 2.4.1, 3.0.0
>
> Attachments: fileA.csv, fileB.csv, fileC.csv
>
>
> Hi,
> Here is a bug I found using spark-sql_2.11:2.2.0. Note that the same case was also tested with spark-sql_2.11:2.3.2 and the bug is present there as well.
> This issue is a duplicate of SPARK-25582, which I had to close after an accidental manipulation by another developer (it was linked to a wrong PR).
> Attached are three small sample CSV files with the minimal content needed to reproduce the bug.
> Find below the reproducer code:
> {code:java}
> import org.apache.spark.SparkConf;
> import org.apache.spark.sql.Dataset;
> import org.apache.spark.sql.Row;
> import org.apache.spark.sql.SparkSession;
> import scala.collection.JavaConverters;
> import scala.collection.Seq;
>
> import java.util.Arrays;
>
> public class SparkBug {
>     private static <T> Seq<T> arrayToSeq(T[] input) {
>         return JavaConverters.asScalaIteratorConverter(Arrays.asList(input).iterator()).asScala().toSeq();
>     }
>
>     public static void main(String[] args) throws Exception {
>         SparkConf conf = new SparkConf().setAppName("SparkBug").setMaster("local");
>         SparkSession sparkSession = SparkSession.builder().config(conf).getOrCreate();
>
>         Dataset<Row> df_a = sparkSession.read().option("header", true).csv("local/fileA.csv").dropDuplicates();
>         Dataset<Row> df_b = sparkSession.read().option("header", true).csv("local/fileB.csv").dropDuplicates();
>         Dataset<Row> df_c = sparkSession.read().option("header", true).csv("local/fileC.csv").dropDuplicates();
>
>         String[] key_join_1 = new String[]{"colA", "colB", "colC", "colD", "colE", "colF"};
>         String[] key_join_2 = new String[]{"colA", "colB", "colC", "colD", "colE"};
>
>         Dataset<Row> df_inventory_1 = df_a.join(df_b, arrayToSeq(key_join_1), "left");
>         Dataset<Row> df_inventory_2 = df_inventory_1.join(df_c, arrayToSeq(key_join_2), "left");
>
>         df_inventory_2.show();
>     }
> }
> {code}
> When running this code, I can see the exception below:
> {code:java}
> 18/10/18 09:25:49 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 202, Column 18: Expression "agg_isNull_28" is not an rvalue
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 202, Column 18: Expression "agg_isNull_28" is not an rvalue
>     at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)
>     at org.codehaus.janino.UnitCompiler.toRvalueOrCompileException(UnitCompiler.java:7170)
>     at org.codehaus.janino.UnitCompiler.getConstantValue2(UnitCompiler.java:5332)
>     at org.codehaus.janino.UnitCompiler.access$9400(UnitCompiler.java:212)
>     at org.codehaus.janino.UnitCompiler$13$1.visitAmbiguousName(UnitCompiler.java:5287)
>     at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:4053)
>     at org.codehaus.janino.UnitCompiler$13.visitLvalue(UnitCompiler.java:5284)
>     at org.codehaus.janino.Java$Lvalue.accept(Java.java:3977)
>     at org.codehaus.janino.UnitCompiler.getConstantValue(UnitCompiler.java:5280)
>     at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2391)
>     at org.codehaus.janino.UnitCompiler.access$1900(UnitCompiler.java:212)
>     at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1474)
>     at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1466)
>     at org.codehaus.janino.Java$IfStatement.accept(Java.java:2926)
>     at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1466)
>     at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1546)
>     at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3075)
>     at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1336)
>     at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1309)
>     at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:799)
>     at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:958)
>     at org.codehaus.janino.UnitCompiler.access$700(UnitCompiler.java:212)
>     at 
> 
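A commonly suggested mitigation for this class of whole-stage-codegen compile failure (an editorial note, not part of the report, and it is an assumption that it applies to this specific query) is to disable whole-stage code generation via the `spark.sql.codegen.wholeStage` setting, so the query runs on the interpreted path and never reaches the failing Janino compilation. A minimal sketch:

```java
import org.apache.spark.sql.SparkSession;

public class CodegenWorkaroundSketch {
    public static void main(String[] args) {
        // Sketch only: turning off whole-stage codegen trades some performance
        // for avoiding the generated-code compile step that errors above.
        // This is a workaround, not a fix for the underlying codegen bug.
        SparkSession spark = SparkSession.builder()
                .appName("SparkBug")
                .master("local")
                .config("spark.sql.codegen.wholeStage", "false")
                .getOrCreate();

        // ... run the same reads and left joins as in the reproducer ...

        spark.stop();
    }
}
```

The same setting can also be passed on the command line with `--conf spark.sql.codegen.wholeStage=false`. Note that this only silences the failing compilation; the actual fix landed in the versions listed under "Fix For".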

[jira] [Updated] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2018-10-30 Thread Xiao Li (JIRA)


 [ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-25767:

Component/s: SQL (was: Java API)


[jira] [Updated] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2018-10-18 Thread Thomas Brugiere (JIRA)


 [ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Brugiere updated SPARK-25767:

Attachment: fileA.csv, fileB.csv, fileC.csv
