Jason Moore created SPARK-8892:
----------------------------------

             Summary: Column.cast(LongType) does not work for large values
                 Key: SPARK-8892
                 URL: https://issues.apache.org/jira/browse/SPARK-8892
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.4.0
            Reporter: Jason Moore

Casting a column from String to Long seems to go through an intermediate cast to Double (it hits Cast.scala line 328, in castToDecimal). As a result, the wrong value is returned for large values.

This test reveals the bug:

{code:java}
import org.apache.spark.sql.types._
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.FlatSpec

import scala.util.Random

class DataFrameCastBug extends FlatSpec {

  "DataFrame" should "cast StringType to LongType correctly" in {
    val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("app"))
    val qc = new SQLContext(sc)

    // Random longs cover the full 64-bit range, so most are large in magnitude.
    val values = Seq.fill(100000)(Random.nextLong)
    val source = qc.createDataFrame(
      sc.parallelize(values.map(v => Row(v))),
      StructType(Seq(StructField("value", LongType))))

    // Round-trip each value: Long -> String -> Long.
    val result = source.select(
      source("value"),
      source("value").cast(StringType).cast(LongType).as("castValue"))

    // Every round-tripped value should equal the original.
    assert(result.where(result("value") !== result("castValue")).count() === 0)
  }
}
{code}
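
The precision loss described in the report can be illustrated outside Spark. The sketch below is not Spark's Cast.scala code; `parseViaDouble` is a hypothetical helper that only mimics the suspected intermediate step, assuming the string is first parsed as a Double and then narrowed to a Long. A Double has a 53-bit significand, so integers above 2^53 in magnitude cannot all be represented exactly.

```scala
object LongViaDoubleDemo {
  // Hypothetical helper (an assumption, not Spark's implementation):
  // parse a decimal string by way of Double, then narrow to Long.
  def parseViaDouble(s: String): Long = s.toDouble.toLong

  // Direct parse, which preserves all 64 bits.
  def parseDirect(s: String): Long = s.toLong

  def main(args: Array[String]): Unit = {
    val exact = (1L << 53).toString        // 2^53, representable as a Double
    val inexact = ((1L << 53) + 1).toString // 2^53 + 1, not representable as a Double

    // The two parses agree at 2^53.
    println(parseViaDouble(exact) == parseDirect(exact))     // true
    // They disagree one step later, because the Double rounds to 2^53.
    println(parseViaDouble(inexact) == parseDirect(inexact)) // false
  }
}
```

If the cast does behave this way, it would explain why the test above fails: most values produced by Random.nextLong exceed 2^53 in magnitude.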