Re: [spark-sql] JsonRDD

2015-02-03 Thread Daniil Osipov
Thanks Reynold,

Case sensitivity issues are definitely orthogonal. I'll submit a bug or PR.

Is there a way to rename the object to eliminate the confusion? Not sure
how locked down the API is at this time, but it seems like a potential
confusion point for developers.

On Mon, Feb 2, 2015 at 4:30 PM, Reynold Xin r...@databricks.com wrote:

 It's bad naming - JsonRDD is actually not an RDD. It is just a set of util
 methods.

 The case sensitivity issues seem orthogonal, and it would be great to be
 able to control that with a flag.


 On Mon, Feb 2, 2015 at 4:16 PM, Daniil Osipov daniil.osi...@shazam.com
 wrote:

 Hey Spark developers,

 Is there a good reason for JsonRDD being a Scala object as opposed to a
 class? It seems most other RDDs are classes and can be extended.

 The reason I'm asking is that there is a problem with Hive interoperability
 with JSON DataFrames: jsonFile generates a case-sensitive schema, while
 Hive expects a case-insensitive one and fails with an exception during
 saveAsTable if two columns have the same name in different case.

 I'm trying to resolve the problem, but that requires me to extend JsonRDD,
 which I can't do. Other RDDs are subclass friendly, so why is JsonRDD
 different?

 Dan





Re: [spark-sql] JsonRDD

2015-02-02 Thread Reynold Xin
It's bad naming - JsonRDD is actually not an RDD. It is just a set of util
methods.

The case sensitivity issues seem orthogonal, and it would be great to be
able to control that with a flag.
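
For illustration only, the collision described in the quoted message, and
the kind of flag suggested above, could be sketched roughly as follows. This
is plain Python with no Spark dependency. The names find_case_collisions,
normalize_field_names and case_sensitive are hypothetical and are not part of
any Spark or Hive API. Lowercasing names in case-insensitive mode is an
assumption made for the sketch, not a statement about how Spark or Hive
resolves names internally.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Return groups of field names that differ only by letter case.

    Hypothetical helper: keys are the lowercased names, values are the
    original spellings that collapse onto that key.
    """
    groups = defaultdict(list)
    for name in field_names:
        groups[name.lower()].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


def normalize_field_names(field_names, case_sensitive=True):
    """Apply a hypothetical case-sensitivity flag to a list of field names.

    With case_sensitive=True the names are returned unchanged.
    With case_sensitive=False the names are lowercased (an assumption),
    and a ValueError is raised up front if two names would collide,
    instead of failing later when the table is saved.
    """
    if case_sensitive:
        return list(field_names)
    collisions = find_case_collisions(field_names)
    if collisions:
        raise ValueError(
            "field names collide when case is ignored: %r" % (collisions,)
        )
    return [name.lower() for name in field_names]
```

Running a check like this on the inferred field names before calling
saveAsTable would surface the conflicting columns by name, rather than
leaving them to show up as an exception from Hive.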


On Mon, Feb 2, 2015 at 4:16 PM, Daniil Osipov daniil.osi...@shazam.com
wrote:

 Hey Spark developers,

 Is there a good reason for JsonRDD being a Scala object as opposed to a
 class? It seems most other RDDs are classes and can be extended.

 The reason I'm asking is that there is a problem with Hive interoperability
 with JSON DataFrames: jsonFile generates a case-sensitive schema, while
 Hive expects a case-insensitive one and fails with an exception during
 saveAsTable if two columns have the same name in different case.

 I'm trying to resolve the problem, but that requires me to extend JsonRDD,
 which I can't do. Other RDDs are subclass friendly, so why is JsonRDD
 different?

 Dan