Re: SparkR DataFrame fails to return data of Decimal type
Thanks for the catch. Could you send a PR with this diff?

On Fri, Aug 14, 2015 at 10:30 AM, Shkurenko, Alex <ashkure...@enova.com> wrote:
> [quoted original message trimmed; the full message appears below]
SparkR DataFrame fails to return data of Decimal type
Got an issue similar to https://issues.apache.org/jira/browse/SPARK-8897, but with the Decimal datatype coming from a Postgres DB:

# Set up SparkR
Sys.setenv(SPARK_HOME = "/Users/ashkurenko/work/git_repos/spark")
Sys.setenv(SPARKR_SUBMIT_ARGS = "--driver-class-path ~/Downloads/postgresql-9.4-1201.jdbc4.jar sparkr-shell")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master = "local")

# Connect to a Postgres DB via JDBC
sqlContext <- sparkRSQL.init(sc)
sql(sqlContext, "CREATE TEMPORARY TABLE mytable
                 USING org.apache.spark.sql.jdbc
                 OPTIONS (url 'jdbc:postgresql://servername:5432/dbname',
                          dbtable 'mydbtable')")

# Try pulling a Decimal column from the table
myDataFrame <- sql(sqlContext, "select a_decimal_column from mytable")

# The schema shows up fine:
show(myDataFrame)
# DataFrame[a_decimal_column:decimal(10,0)]
schema(myDataFrame)
# StructType
# |-name = "a_decimal_column", type = "DecimalType(10,0)", nullable = TRUE

# ... but pulling the data fails:
localDF <- collect(myDataFrame)
# Error in as.data.frame.default(x[[i]], optional = TRUE) :
#   cannot coerce class "jobj" to a data.frame

---

Proposed fix:

diff --git a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
index d5b4260..b77ae2a 100644
--- a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
+++ b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
@@ -219,6 +219,9 @@ private[spark] object SerDe {
       case "float" | "java.lang.Float" =>
         writeType(dos, "double")
         writeDouble(dos, value.asInstanceOf[Float].toDouble)
+      case "decimal" | "java.math.BigDecimal" =>
+        writeType(dos, "double")
+        writeDouble(dos, scala.math.BigDecimal(value.asInstanceOf[java.math.BigDecimal]).toDouble)
       case "double" | "java.lang.Double" =>
         writeType(dos, "double")
         writeDouble(dos, value.asInstanceOf[Double])

Thanks, Alex
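For context on why collect() fails here: SerDe.writeObject dispatches on each value's runtime type name, and a DecimalType column arrives as java.math.BigDecimal, which has no matching case, so the value falls through to the default branch and is shipped to R as an opaque "jobj" reference that as.data.frame cannot coerce. Below is a minimal standalone sketch of that dispatch with the proposed case added; the tag bytes and overall structure are simplifying assumptions for illustration, not Spark's actual SerDe wire format.

import java.io.{ByteArrayOutputStream, DataOutputStream}

// Simplified, standalone sketch of the SerDe-style dispatch (illustrative
// only). The point: without a BigDecimal case, decimals fall through to the
// opaque-reference branch, which R cannot coerce into a data.frame column.
object SerDeSketch {
  def writeObject(dos: DataOutputStream, value: Any): Unit = value match {
    case f: java.lang.Float =>
      dos.writeByte('d'); dos.writeDouble(f.toDouble)
    case d: java.lang.Double =>
      dos.writeByte('d'); dos.writeDouble(d)
    // The case the proposed patch adds: convert via scala.math.BigDecimal
    // and ship the value to R as a plain double.
    case bd: java.math.BigDecimal =>
      dos.writeByte('d'); dos.writeDouble(scala.math.BigDecimal(bd).toDouble)
    case other =>
      // In the real SerDe this branch writes a "jobj" reference handle,
      // which is what surfaced as the as.data.frame error above.
      dos.writeByte('j')
  }

  def main(args: Array[String]): Unit = {
    val buf = new ByteArrayOutputStream()
    val dos = new DataOutputStream(buf)
    writeObject(dos, new java.math.BigDecimal("123.45"))
    println(buf.size()) // 9 bytes: 1 tag byte + 8-byte double payload
  }
}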
Re: SparkR DataFrame fails to return data of Decimal type
Created https://issues.apache.org/jira/browse/SPARK-9982; working on the PR.

On Fri, Aug 14, 2015 at 12:43 PM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
> Thanks for the catch. Could you send a PR with this diff?
> [remainder of quoted thread trimmed]
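One caveat worth noting about the proposed mapping, beyond what the thread covers: a double carries only about 15-17 significant decimal digits, so the DecimalType(10,0) column above survives intact, but wider decimal columns would be silently approximated on their way into R. A quick standalone check of the conversion the patch uses:

// Standalone check of the BigDecimal -> Double conversion in the patch.
// Doubles hold ~15-17 significant decimal digits, so wider decimals are
// silently approximated when mapped to R's numeric type.
object DecimalPrecisionCheck {
  def main(args: Array[String]): Unit = {
    val small = new java.math.BigDecimal("123.45")
    val wide  = new java.math.BigDecimal("12345678901234567890.123")
    println(scala.math.BigDecimal(small).toDouble) // 123.45 (exact enough)
    println(scala.math.BigDecimal(wide).toDouble)  // ~1.2345678901234567E19 (trailing digits lost)
  }
}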