Re: Repeated data item search with Spark SQL(1.0.1)

2014-07-16 Thread Michael Armbrust
Mostly true. The execution of two equivalent logical plans will be exactly the same, independent of the dialect. Resolution can be slightly different as SQLContext defaults to case sensitive and HiveContext defaults to case insensitive. One other very technical detail: The actual planning done
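The resolution difference can be pictured with a small Spark-free sketch. `ResolutionSketch` and `resolve` below are hypothetical names made up for illustration, not part of Spark's analyzer; the sketch only shows how the same column reference can resolve under case-insensitive matching and fail under case-sensitive matching.

```scala
object ResolutionSketch {
  // Hypothetical column list, taken from the field names in the original post.
  val columns: Seq[String] = Seq("name", "schools")

  // Hypothetical resolver (an assumption, not Spark's implementation):
  // find the column a query refers to, with or without case sensitivity.
  def resolve(name: String, caseSensitive: Boolean): Option[String] =
    if (caseSensitive) columns.find(_ == name)
    else columns.find(_.equalsIgnoreCase(name))

  def main(args: Array[String]): Unit = {
    // Case-sensitive lookup, the SQLContext default described above.
    println(resolve("Name", caseSensitive = true))
    // Case-insensitive lookup, the HiveContext default described above.
    println(resolve("Name", caseSensitive = false))
  }
}
```

Under these assumptions, a query that writes `Name` would resolve only in the case-insensitive setting, even though both dialects would execute the resulting plan identically once resolution succeeds.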

Re: Repeated data item search with Spark SQL(1.0.1)

2014-07-15 Thread anyweil
Thank you so much for the reply. Here is my code:
    val conf = new SparkConf().setAppName("Simple Application")
    conf.setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD
    val path1 =

Re: Repeated data item search with Spark SQL(1.0.1)

2014-07-15 Thread Michael Armbrust
Sorry for the trouble. There are two issues here:
- Parsing of repeated nested fields (e.g. something[0].field) is not supported in the plain SQL parser. SPARK-2096 https://issues.apache.org/jira/browse/SPARK-2096
- Resolution is broken in the HiveQL parser. SPARK-2483

Re: Repeated data item search with Spark SQL(1.0.1)

2014-07-15 Thread anyweil
Thank you so much for the information. I have now merged the fix from #1411, and HiveQL seems to work with: SELECT name FROM people WHERE schools[0].time2. One more question: is it possible, or planned, to support the schools.time format to filter the record that there is an element inside

Re: Repeated data item search with Spark SQL(1.0.1)

2014-07-15 Thread Jerry Lam
Hi guys, Sorry, I'm also interested in this nested JSON structure. I have a similar SQL query in which I need to query a nested field in a JSON document. Does the above query work if it is used with sql(sqlText), assuming the data is coming directly from HDFS via sqlContext.jsonFile? The SPARK-2483

Re: Repeated data item search with Spark SQL(1.0.1)

2014-07-15 Thread Michael Armbrust
hql and sql are just two different dialects for interacting with data. After parsing is complete and the logical plan is constructed, the execution is exactly the same. On Tue, Jul 15, 2014 at 2:50 PM, Jerry Lam chiling...@gmail.com wrote: Hi Michael, I don't understand the difference

Re: Repeated data item search with Spark SQL(1.0.1)

2014-07-14 Thread Michael Armbrust
Handling of complex types is somewhat limited in SQL at the moment. It'll be more complete if you use HiveQL. That said, the problem here is you are calling .name on an array. You need to pick an item from the array (using [..]) or use something like a lateral view explode. On Sat, Jul 12,
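The two options mentioned here (indexing into the array, or flattening it in the style of a lateral view explode) can be illustrated without Spark. The sketch below models the rows from the original post with plain Scala case classes; `School`, `Person`, and `NestedFieldSketch` are hypothetical names, not Spark SQL types, and the code mirrors the semantics only, not how Spark SQL executes such a query.

```scala
object NestedFieldSketch {
  // Hypothetical model of the JSON rows from the original post.
  case class School(name: String, time: Int)
  case class Person(name: String, schools: Seq[School] = Seq.empty)

  val people: Seq[Person] = Seq(
    Person("Michael", Seq(School("ABC", 1994), School("EFG", 2000))),
    Person("Andy"),
    Person("Justin")
  )

  // In the spirit of schools[0].name: pick one element first, then a field.
  // headOption skips rows that have no schools instead of failing on them.
  val firstSchoolNames: Seq[String] =
    people.flatMap(p => p.schools.headOption.map(_.name))

  // In the spirit of a lateral view explode over schools: one output row per
  // (person, school) pair, so every array element can be filtered on.
  val exploded: Seq[(String, School)] =
    for (p <- people; s <- p.schools) yield (p.name, s)

  // People with at least one school whose time is after 1995
  // (the threshold is an arbitrary value chosen for illustration).
  val attendedAfter1995: Seq[String] =
    exploded.collect { case (person, s) if s.time > 1995 => person }.distinct

  def main(args: Array[String]): Unit = {
    println(firstSchoolNames)
    println(exploded.size)
    println(attendedAfter1995)
  }
}
```

Calling `.name` directly on `schools` has no counterpart in this model either: `Seq[School]` has no `name` member, which is the same mismatch the query runs into.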

Repeated data item search with Spark SQL(1.0.1)

2014-07-13 Thread anyweil
Hi All: I am using Spark SQL 1.0.1 for a simple test. The loaded data (JSON format), which is registered as the table people, is:
    {"name":"Michael", "schools":[{"name":"ABC","time":1994},{"name":"EFG","time":2000}]}
    {"name":"Andy", "age":30,"scores":{"eng":98,"phy":89}}
    {"name":"Justin", "age":19}
the schools has repeated value