Oh sorry, _1 is not a valid Hive identifier; you need to escape it with backticks:

Seq(((1, 2), 2)).toDF().registerTempTable("test")
sql("SELECT `_1`.`_1` FROM test")

On Tue, Nov 10, 2015 at 11:31 AM, pratik khadloya <tispra...@gmail.com> wrote:

> I tried the same, didn't work :(
>
> scala> hc.sql("select _1.item_id from agg_imps_df limit 10").collect()
> 15/11/10 14:30:41 INFO parse.ParseDriver: Parsing command: select _1.item_id from agg_imps_df limit 10
> org.apache.spark.sql.AnalysisException: missing \' at 'from' near '<EOF>'; line 1 pos 23
>   at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:289)
>   at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
>   at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
>   at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
>   at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>   at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>
> On Tue, Nov 10, 2015 at 11:25 AM Michael Armbrust <mich...@databricks.com> wrote:
>
>> Use a `.`:
>>
>> hc.sql("select _1.item_id from agg_imps_df limit 10").collect()
>>
>> On Tue, Nov 10, 2015 at 11:24 AM, pratik khadloya <tispra...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I just saved a PairRDD as a table, but I am not able to query it
>>> correctly. The query below and other variations do not seem to work.
>>>
>>> scala> hc.sql("select * from agg_imps_df").printSchema()
>>> |-- _1: struct (nullable = true)
>>> |    |-- item_id: long (nullable = true)
>>> |    |-- flight_id: long (nullable = true)
>>> |-- _2: struct (nullable = true)
>>> |    |-- day_hour: string (nullable = true)
>>> |    |-- imps: long (nullable = true)
>>> |    |-- revenue: double (nullable = true)
>>>
>>> scala> hc.sql("select _1:item_id from agg_imps_df limit 10").collect()
>>>
>>> Can anyone please suggest the correct way to get the list of item_ids in
>>> the query?
>>>
>>> Thanks,
>>> ~Pratik
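
For the agg_imps_df table from the original question, the same backtick escaping would presumably apply to the _1.item_id path. The sketch below builds that query string with a small helper; HiveIdent and Example are hypothetical names made up for illustration (they are not part of Spark or Hive), and the helper assumes the column names themselves contain no backticks.

```scala
// Hypothetical helper, not part of Spark or Hive: wraps each part of a
// nested column path in backticks so names such as _1 get past the parser.
object HiveIdent {
  // Quote a single identifier. Assumes the name contains no backtick.
  def quote(name: String): String = "`" + name + "`"

  // Quote every part and join with dots: path("_1", "item_id") => `_1`.`item_id`
  def path(parts: String*): String = parts.map(quote).mkString(".")
}

object Example {
  // Table and column names are taken from the printSchema() output in the thread.
  val query: String =
    s"SELECT ${HiveIdent.path("_1", "item_id")} FROM agg_imps_df LIMIT 10"
  // query == "SELECT `_1`.`item_id` FROM agg_imps_df LIMIT 10"
}
```

In the shell from the thread, the string would then be passed along as hc.sql(Example.query).collect(), where hc is the HiveContext used in the earlier messages.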