Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-11-06 Thread Corey Nolet
Michael, Thanks for the explanation. I was able to get this running. On Wed, Oct 29, 2014 at 3:07 PM, Michael Armbrust mich...@databricks.com wrote: We are working on more helpful error messages, but in the meantime let me explain how to read this output.

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-29 Thread Michael Armbrust
We are working on more helpful error messages, but in the meantime let me explain how to read this output. org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 'p.name,'p.age, tree: Project ['p.name,'p.age] Filter ('location.number = 2300) Join Inner,

Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Brett Antonides
Hello, Given the following example customers.json file: { "name": "Sherlock Holmes", "customerNumber": 12345, "address": { "street": "221b Baker Street", "city": "London", "zipcode": "NW1 6XE", "country": "United Kingdom" } }, { "name": "Big Bird", "customerNumber": 10001, "address": { "street": "123 Sesame Street", "city":

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Michael Armbrust
Try: "address.city".attr On Tue, Oct 28, 2014 at 8:30 AM, Brett Antonides banto...@gmail.com wrote: Hello, Given the following example customers.json file: { "name": "Sherlock Holmes", "customerNumber": 12345, "address": { "street": "221b Baker Street", "city": "London", "zipcode": "NW1 6XE", "country":

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Michael Armbrust
On Tue, Oct 28, 2014 at 2:19 PM, Corey Nolet cjno...@gmail.com wrote: Is it possible to select if, say, there was an addresses field that had a json array? You can get the Nth item by address.getItem(0). If you want to walk through the whole array look at LATERAL VIEW EXPLODE in HiveQL

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Corey Nolet
So it wouldn't be possible to have a json string like this: { "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]} And query all people who have a location with number = 2300? On Tue, Oct 28, 2014 at 5:30 PM, Michael Armbrust mich...@databricks.com wrote: On Tue, Oct 28, 2014

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Michael Armbrust
You can do this: $ sbt/sbt hive/console scala> jsonRDD(sparkContext.parallelize("""{ "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}""" :: Nil)).registerTempTable("people") scala> sql("SELECT name FROM people LATERAL VIEW explode(locations) l AS location WHERE location.number =
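
The LATERAL VIEW explode query above can be easier to follow next to a plain-Scala model of what it computes. The sketch below is only an illustration and does not use Spark: the names ExplodeSketch, Person, Location and namesWithNumber are hypothetical, and the filter value 2300 is taken from the question earlier in the thread because the query in this archive view is cut off. It treats explode(locations) as producing one row per (person, location) pair, then filters on location.number and selects name.

```scala
// Plain-Scala model of LATERAL VIEW explode; no Spark dependency.
// All names here are hypothetical and chosen for illustration only.
object ExplodeSketch {
  case class Location(street: String, number: Int)
  case class Person(name: String, age: Int, locations: List[Location])

  // The single record from the thread's example JSON.
  val people: List[Person] = List(
    Person("John", 53, List(Location("Rodeo Dr", 2300)))
  )

  // explode(locations): one output row per (person, location) pair.
  val exploded: List[(Person, Location)] =
    for {
      p <- people
      location <- p.locations
    } yield (p, location)

  // WHERE location.number = n, followed by SELECT name.
  def namesWithNumber(n: Int): List[String] =
    exploded.collect { case (p, location) if location.number == n => p.name }

  def main(args: Array[String]): Unit =
    println(namesWithNumber(2300))
}
```

A person with several matching locations would appear once per match in this model, which mirrors how a lateral view multiplies rows.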

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Corey Nolet
Michael, Awesome, this is what I was looking for. So it's possible to use the Hive dialect in a regular SQL context? This is what was confusing to me: the docs kind of allude to it but don't directly point it out. On Tue, Oct 28, 2014 at 9:30 PM, Michael Armbrust mich...@databricks.com wrote: You

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Corey Nolet
Am I able to do a join on an exploded field? Like if I have another object: { "streetNumber":2300, "locationName":"The Big Building" } and I want to join with the previous JSON by the locations[].number field; is that possible? On Tue, Oct 28, 2014 at 9:31 PM, Corey Nolet cjno...@gmail.com wrote:

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Michael Armbrust
On Tue, Oct 28, 2014 at 6:56 PM, Corey Nolet cjno...@gmail.com wrote: Am I able to do a join on an exploded field? Like if I have another object: { "streetNumber":2300, "locationName":"The Big Building" } and I want to join with the previous JSON by the locations[].number field; is that possible?
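
The reply to this question is cut off in this archive view, so the following is not the answer given on the list. It is a minimal plain-Scala sketch of what joining on an exploded field means, assuming the two example records quoted in the thread. ExplodeJoinSketch, Person, Location and Building are hypothetical names, and no Spark API is used.

```scala
// Plain-Scala model of joining exploded array elements to a second
// data set; no Spark dependency. Names are hypothetical.
object ExplodeJoinSketch {
  case class Location(street: String, number: Int)
  case class Person(name: String, age: Int, locations: List[Location])
  case class Building(streetNumber: Int, locationName: String)

  val people: List[Person] = List(
    Person("John", 53, List(Location("Rodeo Dr", 2300)))
  )
  val buildings: List[Building] = List(
    Building(2300, "The Big Building")
  )

  // Explode each person's locations, then keep the pairs whose
  // location.number equals the building's streetNumber (an inner join).
  val joined: List[(String, Int, String)] =
    for {
      p <- people
      location <- p.locations
      b <- buildings
      if location.number == b.streetNumber
    } yield (p.name, p.age, b.locationName)

  def main(args: Array[String]): Unit =
    joined.foreach(println)
}
```

The nested loop is quadratic and only meant to show the semantics; a query engine would pick its own join strategy.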

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Michael Armbrust
Can you println the .queryExecution of the SchemaRDD? On Tue, Oct 28, 2014 at 7:43 PM, Corey Nolet cjno...@gmail.com wrote: So this appears to work just fine: hctx.sql("SELECT p.name, p.age FROM people p LATERAL VIEW explode(locations) l AS location JOIN location5 lo ON l.number =

Re: Selecting Based on Nested Values using Language Integrated Query Syntax

2014-10-28 Thread Corey Nolet
scala> locations.queryExecution warning: there were 1 feature warning(s); re-run with -feature for details res28: _4.sqlContext.QueryExecution forSome { val _4: org.apache.spark.sql.SchemaRDD } = == Parsed Logical Plan == SparkLogicalPlan (ExistingRdd [locationName#80,locationNumber#81],