Mostly true. The execution of two equivalent logical plans will be exactly
the same, independent of the dialect. Resolution can differ slightly,
because SQLContext defaults to case-sensitive resolution and HiveContext
defaults to case-insensitive resolution.
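
To make the resolution difference concrete, here is a minimal sketch in plain Scala, with no Spark dependency. The names ResolutionSketch, Resolver, caseSensitive, caseInsensitive and resolve are hypothetical and only illustrate the idea; this is not the analyzer code that either context actually runs.

```scala
object ResolutionSketch {
  // Hypothetical type for illustration: decides whether a column name
  // in the schema matches a name written in the query.
  type Resolver = (String, String) => Boolean

  // Assumed stand-in for the SQLContext default described above.
  val caseSensitive: Resolver = (a, b) => a == b

  // Assumed stand-in for the HiveContext default described above.
  val caseInsensitive: Resolver = (a, b) => a.equalsIgnoreCase(b)

  // Look up a query name among the known columns using the given resolver.
  def resolve(columns: Seq[String], name: String, resolver: Resolver): Option[String] =
    columns.find(c => resolver(c, name))

  def main(args: Array[String]): Unit = {
    val columns = Seq("name", "schools")
    println(resolve(columns, "NAME", caseSensitive))   // prints None
    println(resolve(columns, "NAME", caseInsensitive)) // prints Some(name)
  }
}
```

Under these assumptions, the same query text can resolve in one context and fail in the other even though the resulting plans, once resolved, execute identically.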
One other very technical detail: The actual planning done
Thank you so much for the reply. Here is my code:
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().setAppName("Simple Application")
conf.setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD
val path1 =
Sorry for the trouble. There are two issues here:
- Parsing of repeated nested fields (e.g. something[0].field) is not
supported in the plain SQL parser. SPARK-2096
https://issues.apache.org/jira/browse/SPARK-2096
- Resolution is broken in the HiveQL parser. SPARK-2483
Thank you so much for the information. I have now merged the fix from #1411,
and HiveQL seems to work with:
SELECT name FROM people WHERE schools[0].time2.
One more question:
Is it possible, or planned, to support the schools.time format to filter for
records in which there is an element inside
Hi guys,
Sorry, I'm also interested in this nested JSON structure.
I have a similar SQL query in which I need to query a nested field in a JSON
document. Does the above query work if it is used with sql(sqlText), assuming
the data is coming directly from HDFS via sqlContext.jsonFile?
The SPARK-2483
hql and sql are just two different dialects for interacting with data.
After parsing is complete and the logical plan is constructed, the
execution is exactly the same.
On Tue, Jul 15, 2014 at 2:50 PM, Jerry Lam chiling...@gmail.com wrote:
Hi Michael,
I don't understand the difference
Handling of complex types is somewhat limited in SQL at the moment. It'll
be more complete if you use HiveQL.
That said, the problem here is you are calling .name on an array. You need
to pick an item from the array (using [..]) or use something like a lateral
view explode.
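
To illustrate the two options, here is a minimal sketch in plain Scala collections, with no Spark involved. The case classes School and Person, the object NestedArrayDemo, and the 1995 cutoff are hypothetical stand-ins chosen for this example; only the name and schools fields of the sample data are modeled. Indexing corresponds to picking an item with [..], and the for-comprehension mimics roughly what a lateral view explode produces.

```scala
object NestedArrayDemo {
  // Hypothetical stand-ins for the JSON records discussed in this thread.
  case class School(name: String, time: Int)
  case class Person(name: String, schools: Seq[School])

  val people: Seq[Person] = Seq(
    Person("Michael", Seq(School("ABC", 1994), School("EFG", 2000))),
    Person("Andy", Seq.empty),
    Person("Justin", Seq.empty)
  )

  // Option 1: pick one item from the array, analogous to schools[0].name.
  // headOption skips people whose schools array is empty.
  val firstSchoolNames: Seq[String] =
    people.flatMap(p => p.schools.headOption.map(_.name))

  // Option 2: "explode" the array into one row per (person, school) pair.
  val exploded: Seq[(String, School)] =
    for (p <- people; s <- p.schools) yield (p.name, s)

  // With one row per school, filtering on a nested field is a plain filter.
  // The 1995 cutoff is an arbitrary value for illustration.
  val attendedAfter1995: Seq[String] =
    exploded.collect { case (person, s) if s.time > 1995 => person }.distinct

  def main(args: Array[String]): Unit = {
    println(firstSchoolNames)
    println(exploded)
    println(attendedAfter1995)
  }
}
```

Calling .name directly on schools has no single answer because schools is a sequence; both options above first reduce it to individual School values.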
On Sat, Jul 12,
Hi All:
I am using Spark SQL 1.0.1 for a simple test. The loaded data (JSON format),
which is registered as the table people, is:
{"name":"Michael", "schools":[{"name":"ABC","time":1994},{"name":"EFG","time":2000}]}
{"name":"Andy", "age":30, "scores":{"eng":98,"phy":89}}
{"name":"Justin", "age":19}
the schools has repeated value