Kris Mok created SPARK-21041:
--------------------------------

             Summary: With whole-stage codegen, SparkSession.range()'s behavior is inconsistent with SparkContext.range()
                 Key: SPARK-21041
                 URL: https://issues.apache.org/jira/browse/SPARK-21041
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Kris Mok


When whole-stage codegen is enabled and the range arithmetic overflows a Long, SparkSession.range() behaves differently than it does with codegen turned off; the codegen-off behavior is consistent with SparkContext.range().

The following Spark Shell session shows the inconsistency:
{code:scala}
scala> sc.range
   def range(start: Long,end: Long,step: Long,numSlices: Int): org.apache.spark.rdd.RDD[Long]

scala> spark.range
def range(start: Long,end: Long,step: Long,numPartitions: Int): org.apache.spark.sql.Dataset[Long]
def range(start: Long,end: Long,step: Long): org.apache.spark.sql.Dataset[Long]
def range(start: Long,end: Long): org.apache.spark.sql.Dataset[Long]
def range(end: Long): org.apache.spark.sql.Dataset[Long]

scala> sc.range(java.lang.Long.MAX_VALUE - 3, java.lang.Long.MIN_VALUE + 2, 1).collect
res1: Array[Long] = Array()

scala> spark.range(java.lang.Long.MAX_VALUE - 3, java.lang.Long.MIN_VALUE + 2, 1).collect
res2: Array[Long] = Array(9223372036854775804, 9223372036854775805, 9223372036854775806)

scala> spark.conf.set("spark.sql.codegen.wholeStage", false)

scala> spark.range(java.lang.Long.MAX_VALUE - 3, java.lang.Long.MIN_VALUE + 2, 1).collect
res5: Array[Long] = Array()
{code}
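
For reference, a minimal standalone sketch of the class of bug involved (this is illustrative Scala, not Spark's actual generated code; RangeOverflowDemo and both counts are hypothetical): if an element count is derived from the signed Long difference {{end - start}}, the subtraction wraps around in 64-bit two's-complement arithmetic and turns what should be an empty range into a non-empty one, whereas promoting to BigInt first gives the correct answer.

{code:scala}
object RangeOverflowDemo extends App {
  val start = java.lang.Long.MAX_VALUE - 3
  val end   = java.lang.Long.MIN_VALUE + 2
  val step  = 1L

  // Naive count: (end - start) wraps around in 64-bit arithmetic,
  // yielding a small positive number (6) instead of a huge negative
  // one, so a loop driven by this count would emit rows.
  val naiveCount = (end - start) / step
  println(s"naive count: $naiveCount")   // prints: naive count: 6

  // Overflow-safe count: promote to BigInt before subtracting and
  // clamp negative results to zero, so the range is seen as empty.
  val safeCount = ((BigInt(end) - BigInt(start)) / step).max(0)
  println(s"safe count: $safeCount")     // prints: safe count: 0
}
{code}

The sketch only shows how wrapped Long arithmetic can turn an empty range into a non-empty one; the exact three rows produced under codegen in the session above presumably also depend on how the range is split across partitions.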


