Re: SQLContext and HiveContext parse a query string differently ?
Basically, I want to run the following query:

select 'a\'b', cast(null as Array<String>)

However, neither HiveContext nor SQLContext can execute it without an exception.

I have tried

sql(select 'a\'b', cast(null as Array<String>))

and

df.selectExpr("'a\'b'", "cast(null as Array<String>)")

Neither of them works.

From the exceptions, I find that the query is parsed differently.

On Fri, May 13, 2016 at 8:00 AM, Yong Zhang <java8...@hotmail.com> wrote:

> Not sure what you mean. Do you want exactly the same query to run fine
> in both SQLContext and HiveContext? The query parsers are different, so why
> do you want this feature? Do I understand your question correctly?
>
> Yong
>
> --
> Date: Thu, 12 May 2016 13:09:34 +0200
> Subject: SQLContext and HiveContext parse a query string differently ?
> From: inv...@gmail.com
> To: user@spark.apache.org

--
Hao Ren
Data Engineer @ leboncoin
Paris, France
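One possible way to sidestep both parsers, offered as an untested sketch and not something confirmed in this thread, is to build the two expressions with the Column API (`lit` and `Column.cast`), so that no SQL string has to be parsed at all. The object name `ColumnApiSketch`, the application name and the column aliases are made up for illustration, and the sketch assumes that casting to `array<string>` through the API behaves the same under both contexts.

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.{col, lit}
import org.apache.spark.sql.types.{ArrayType, StringType}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of the original reproduction code.
object ColumnApiSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "column-api-sketch", new SparkConf)
    // Assumption: a HiveContext could be substituted here without other changes.
    val context = new SQLContext(sc)
    import context.implicits._

    val df = Seq((Seq(1, 2), 2)).toDF("a", "b")

    df.select(
      // The single quote is an ordinary character in a Scala string,
      // so no SQL-level escaping is involved.
      lit("a'b").as("quoted"),
      // A typed null: cast a null literal to array<string>.
      lit(null).cast(ArrayType(StringType)).as("null_array"),
      // Same cast as in case 1 of the reproduction, expressed without SQL text.
      col("a").cast(ArrayType(StringType)).as("a_as_strings")
    ).show()

    sc.stop()
  }
}
```

The trade-off is that the query is no longer a single string, which may not help if the SQL text comes from users or configuration.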
RE: SQLContext and HiveContext parse a query string differently ?
Not sure what you mean. Do you want exactly the same query to run fine in both SQLContext and HiveContext? The query parsers are different, so why do you want this feature? Do I understand your question correctly?

Yong
Re: SQLContext and HiveContext parse a query string differently ?
Yep, I got the same error:

root
 |-- a: array (nullable = true)
 |    |-- element: integer (containsNull = false)
 |-- b: integer (nullable = false)

NoViableAltException(35@[])
    at org.apache.hadoop.hive.ql.parse.HiveParser.primitiveType(HiveParser.java:38886)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.castExpression(HiveParser_IdentifiersParser.java:4336)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6235)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6383)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
    at org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45846)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
    at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45817)
    at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
    at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
    at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
    at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
    at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
    at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
    at org.apache.spark.sql.hive.HiveQl$.getAst(HiveQl.scala:276)
    at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:303)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at
SQLContext and HiveContext parse a query string differently ?
Hi,

I just want to figure out why the two contexts behave differently even on a simple query.
In a nutshell, I have a query that contains a string with a single quote and a cast to Array/Map.
I have tried every combination of SQL context type and query API (sql, df.select, df.selectExpr).
I can't find one that works in all cases.

Here is the code for reproducing the problem.

--------

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.{SparkConf, SparkContext}

object Test extends App {

  val sc = new SparkContext("local[2]", "test", new SparkConf)
  val hiveContext = new HiveContext(sc)
  val sqlContext = new SQLContext(sc)

  // Switch between the two contexts to compare their behavior.
  val context = hiveContext
  // val context = sqlContext

  import context.implicits._

  val df = Seq((Seq(1, 2), 2)).toDF("a", "b")
  df.registerTempTable("tbl")
  df.printSchema()

  // case 1
  context.sql("select cast(a as array<string>) from tbl").show()
  // HiveContext => org.apache.spark.sql.AnalysisException: cannot recognize input near 'array' '<' 'string' in primitive type specification; line 1 pos 17
  // SQLContext => OK

  // case 2
  context.sql("select 'a\\'b'").show()
  // HiveContext => OK
  // SQLContext => failure: ``union'' expected but ErrorToken(unclosed string literal) found

  // case 3
  df.selectExpr("cast(a as array<string>)").show() // OK with HiveContext and SQLContext

  // case 4
  df.selectExpr("'a\\'b'").show() // HiveContext, SQLContext => failure: end of input expected
}

--------

Any clarification / workaround is highly appreciated.

--
Hao Ren
Data Engineer @ leboncoin
Paris, France
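Cases 2 and 4 involve two layers of escaping: the Scala compiler processes the string literal first, and only the resulting text reaches the SQL parser. The small sketch below needs no Spark and just shows which characters each spelling produces. The object name `EscapeLayers` and its field names are made up for illustration.

```scala
// Hypothetical helper object; it only inspects Scala strings and never calls Spark.
object EscapeLayers {
  // As written in cases 2 and 4: the doubled backslash is a Scala escape for
  // one backslash, so the SQL parser receives a backslash-escaped quote.
  val sqlText: String = "select 'a\\'b'"

  // A triple-quoted string skips Scala escape processing and yields the same text.
  val rawText: String = """select 'a\'b'"""

  // With a single backslash, Scala consumes the escape itself and the
  // backslash never reaches the SQL parser.
  val scalaOnly: String = "select 'a\'b'"

  def main(args: Array[String]): Unit = {
    println(sqlText)   // prints: select 'a\'b'
    println(rawText)   // prints: select 'a\'b'
    println(scalaOnly) // prints: select 'a'b'
  }
}
```

So the failures reported for `'a\\'b'` concern how each parser treats a backslash before a quote inside a SQL string literal, not how Scala reads the source text.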