Re: SparkR DataFrame fails to return data of Decimal type

2015-08-14 Thread Shivaram Venkataraman
Thanks for the catch. Could you send a PR with this diff?

On Fri, Aug 14, 2015 at 10:30 AM, Shkurenko, Alex ashkure...@enova.com wrote:
> Got an issue similar to https://issues.apache.org/jira/browse/SPARK-8897,
> but with the Decimal datatype coming from a Postgres DB: [...]




SparkR DataFrame fails to return data of Decimal type

2015-08-14 Thread Shkurenko, Alex
Got an issue similar to https://issues.apache.org/jira/browse/SPARK-8897,
but with the Decimal datatype coming from a Postgres DB:

# Set up SparkR

Sys.setenv(SPARK_HOME = "/Users/ashkurenko/work/git_repos/spark")
Sys.setenv(SPARKR_SUBMIT_ARGS = "--driver-class-path ~/Downloads/postgresql-9.4-1201.jdbc4.jar sparkr-shell")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master = "local")

# Connect to a Postgres DB via JDBC
sqlContext <- sparkRSQL.init(sc)
sql(sqlContext, "
CREATE TEMPORARY TABLE mytable
USING org.apache.spark.sql.jdbc
OPTIONS (url 'jdbc:postgresql://servername:5432/dbname',
         dbtable 'mydbtable')
")
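
For reference, the equivalent registration from a Scala spark-shell would look roughly like this (untested sketch; same placeholder connection details as above):

// Scala sketch of the same JDBC registration (Spark 1.4.x DataFrameReader API).
// "servername", "dbname", and "mydbtable" are placeholders, as in the R repro.
val jdbcDF = sqlContext.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://servername:5432/dbname")
  .option("dbtable", "mydbtable")
  .load()
jdbcDF.registerTempTable("mytable")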

# Try pulling a Decimal column from the table
myDataFrame <- sql(sqlContext, "select a_decimal_column from mytable")

# The schema shows up fine

show(myDataFrame)

DataFrame[a_decimal_column:decimal(10,0)]

schema(myDataFrame)

StructType
|-name = "a_decimal_column", type = "DecimalType(10,0)", nullable = TRUE

# ... but pulling data fails:

localDF <- collect(myDataFrame)

Error in as.data.frame.default(x[[i]], optional = TRUE) :
  cannot coerce class "jobj" to a data.frame
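
What appears to be happening: SerDe dispatches on the value's type name, and java.math.BigDecimal has no case of its own, so it falls through to the catch-all branch and is shipped to R as an opaque "jobj" handle, which collect() cannot coerce into a data.frame column. A minimal standalone sketch of that dispatch pattern (not the actual Spark source):

// Minimal sketch (not Spark source) of a type-name dispatch with a catch-all:
// known types get a primitive tag, everything else becomes an opaque "jobj".
object SerDeSketch {
  def tagFor(value: Any): String = value.getClass.getName match {
    case "java.lang.Double"  => "double"   // has a dedicated case
    case "java.lang.Integer" => "integer"  // has a dedicated case
    case _                   => "jobj"     // java.math.BigDecimal lands here
  }

  def main(args: Array[String]): Unit = {
    println(tagFor(Double.box(1.5)))                 // prints "double"
    println(tagFor(new java.math.BigDecimal("10")))  // prints "jobj" -- the bug
  }
}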


---
Proposed fix:

diff --git a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
index d5b4260..b77ae2a 100644
--- a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
+++ b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
@@ -219,6 +219,9 @@ private[spark] object SerDe {
       case "float" | "java.lang.Float" =>
         writeType(dos, "double")
         writeDouble(dos, value.asInstanceOf[Float].toDouble)
+      case "decimal" | "java.math.BigDecimal" =>
+        writeType(dos, "double")
+        writeDouble(dos, scala.math.BigDecimal(value.asInstanceOf[java.math.BigDecimal]).toDouble)
       case "double" | "java.lang.Double" =>
         writeType(dos, "double")
         writeDouble(dos, value.asInstanceOf[Double])
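
One caveat worth flagging in the PR: routing decimals through Double loses precision for values needing more than about 15-16 significant digits. A standalone check of the exact conversion the patch uses (the input value is made up for illustration):

// Standalone check of the conversion expression from the patch above.
// The BigDecimal literal is an arbitrary illustration value.
object DecimalPrecision {
  def main(args: Array[String]): Unit = {
    val exact = new java.math.BigDecimal("12345678901234567890.123456789")
    val viaDouble = scala.math.BigDecimal(exact).toDouble  // same call as the patch
    println(exact)      // 12345678901234567890.123456789
    println(viaDouble)  // ~1.2345678901234567E19 -- the fractional part is gone
  }
}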

Thanks,
Alex


Re: SparkR DataFrame fails to return data of Decimal type

2015-08-14 Thread Shkurenko, Alex
Created https://issues.apache.org/jira/browse/SPARK-9982; working on the PR.

On Fri, Aug 14, 2015 at 12:43 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote:

> Thanks for the catch. Could you send a PR with this diff?
>
> On Fri, Aug 14, 2015 at 10:30 AM, Shkurenko, Alex ashkure...@enova.com wrote:
>> Got an issue similar to https://issues.apache.org/jira/browse/SPARK-8897,
>> but with the Decimal datatype coming from a Postgres DB: [...]