Re: Representing a recursive data type in Spark SQL

2015-05-28 Thread Jeremy Lucas
Hey Reynold, Thanks for the suggestion. Maybe a better definition of what I mean by a recursive data structure is rather what might resemble (in Scala) the type Map[String, Any]. With a type like this, the keys are well-defined as strings (as this is JSON) but the values can be basically any

Re: Representing a recursive data type in Spark SQL

2015-05-28 Thread Matei Zaharia
Your best bet might be to use a map<string,string> in SQL and make the keys be longer paths (e.g. params_param1 and params_param2). I don't think you can have a map in some of them but not in others. Matei On May 28, 2015, at 3:48 PM, Jeremy Lucas jeremyalu...@gmail.com wrote: Hey Reynold,
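
A minimal sketch of the path-key idea suggested above, in plain Scala with no Spark dependency. The object name FlattenJson, the method flatten, and the underscore separator are assumptions made for illustration; they are not part of Spark or of the original message.

```scala
object FlattenJson {
  // Hypothetical helper: flattens nested maps into a single Map[String, String]
  // whose keys are underscore-joined paths such as "params_param1".
  def flatten(value: Map[String, Any], prefix: String = ""): Map[String, String] =
    value.flatMap {
      case (key, nested: Map[_, _]) =>
        // Nested object: recurse, extending the path with this key.
        flatten(nested.asInstanceOf[Map[String, Any]], prefix + key + "_")
      case (key, leaf) =>
        // Leaf value: store it as a string (a null leaf becomes "null").
        Map(s"$prefix$key" -> s"$leaf")
    }
}
```

For example, flattening Map("user" -> "alice", "params" -> Map("param1" -> 1, "param2" -> "x")) yields the keys user, params_param1 and params_param2, all with string values. Note that a key which itself contains an underscore could collide with a generated path, so a different separator may be safer for real data.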

Re: Representing a recursive data type in Spark SQL

2015-05-28 Thread Reynold Xin
I think it is fairly hard to support recursive data types. What I've seen in one other proprietary system in the past is to let the user define the depth of the nested data types, and then just expand the struct/map/list definition to the maximum level of depth. Would this solve your problem?
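
The depth-bounded expansion described above could be sketched roughly as follows. This assumes Spark SQL (1.3 or later) is on the classpath; the object name BoundedSchema, the method nodeSchema, and the field names value and child are hypothetical and chosen only to illustrate a self-referential record unrolled to a fixed depth.

```scala
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object BoundedSchema {
  // Hypothetical recursive shape: a node holds a string value and an
  // optional child node. The recursion is unrolled to `depth` levels,
  // so the innermost node has no child field at all.
  def nodeSchema(depth: Int): StructType = {
    require(depth >= 1, "depth must be at least 1")
    val base = Seq(StructField("value", StringType, nullable = true))
    if (depth == 1) StructType(base)
    else StructType(base :+ StructField("child", nodeSchema(depth - 1), nullable = true))
  }
}
```

With this sketch, BoundedSchema.nodeSchema(3) describes three nested levels; data nested more deeply than the chosen depth would not fit the schema, which is the trade-off of this approach.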

Re: Representing a recursive data type in Spark SQL

2015-05-20 Thread Rakesh Chalasani
Hi Jeremy: Row is a collection of 'Any', so it can be used as a recursive data type. Is this what you were looking for? Example: val x = sc.parallelize(Array.range(0,10)).map(x => Row(Row(x), Row(x.toString))) Rakesh On Wed, May 20, 2015 at 7:23 PM Jeremy Lucas jeremyalu...@gmail.com wrote:

Re: Representing a recursive data type in Spark SQL

2015-05-20 Thread Jeremy Lucas
Hey Rakesh, To clarify, what I was referring to is when doing something like this: sqlContext.applySchema(rdd, mySchema). Here mySchema must be a well-defined StructType, which presently does not allow for a recursive type. On Wed, May 20, 2015 at 5:39 PM Rakesh Chalasani vnit.rak...@gmail.com

Representing a recursive data type in Spark SQL

2015-05-20 Thread Jeremy Lucas
Spark SQL has proven to be quite useful in applying a partial schema to large JSON logs and being able to write plain SQL to perform a wide variety of operations over this data. However, one small thing that keeps coming back to haunt me is the lack of support for recursive data types, whereby a