Re: SQLContext and HiveContext parse a query string differently ?

2016-05-13 Thread Hao Ren
Basically, I want to run the following query:

select 'a\'b', case(null as Array)

However, neither HiveContext nor SQLContext can execute it without an
exception.

I have tried

sql(select 'a\'b', case(null as Array))

and

df.selectExpr("'a\'b'", "case(null as Array)")

Neither of them works.

From the exceptions, I can see that the two contexts parse the query differently.
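
One thing that might sidestep the escaping difference at the SQL-text level is
to delimit the literal with double quotes instead of escaping the single quote.
This is only a sketch, assuming both parsers accept double-quoted string
literals (HiveQL does; the 1.x Catalyst lexer appears to as well) and using the
context / df from the repro further down the thread; it has not been verified
here:

context.sql(""" select "a'b" """).show()   // the embedded single quote needs no escape
df.selectExpr(""" "a'b" """).show()        // same idea through selectExpr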



On Fri, May 13, 2016 at 8:00 AM, Yong Zhang <java8...@hotmail.com> wrote:

> Not sure what you mean. You want to have exactly one query that runs fine
> in both sqlContext and HiveContext? The query parsers are different, so why
> do you want this feature? Do I understand your question correctly?
>
> Yong



-- 
Hao Ren

Data Engineer @ leboncoin

Paris, France


RE: SQLContext and HiveContext parse a query string differently ?

2016-05-12 Thread Yong Zhang
Not sure what you mean. You want to have exactly one query that runs fine in
both sqlContext and HiveContext? The query parsers are different, so why do
you want this feature? Do I understand your question correctly?
Yong


Re: SQLContext and HiveContext parse a query string differently ?

2016-05-12 Thread Mich Talebzadeh
Yep, the same error I got:

root
 |-- a: array (nullable = true)
 |    |-- element: integer (containsNull = false)
 |-- b: integer (nullable = false)
NoViableAltException(35@[])
at
org.apache.hadoop.hive.ql.parse.HiveParser.primitiveType(HiveParser.java:38886)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.castExpression(HiveParser_IdentifiersParser.java:4336)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6235)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6383)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
at
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
at
org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45846)
at
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
at
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
at
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
at
org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45817)
at
org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
at
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
at
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
at
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
at
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
at
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
at
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
at
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.spark.sql.hive.HiveQl$.getAst(HiveQl.scala:276)
at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:303)
at
org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
at
org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
at
scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at
scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
at
scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at
scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at
scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
at
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
at
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
at
scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
at
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
at
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
at
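
The telling frame here is HiveParser.primitiveType: the parser fails while
reading the target type of the CAST, and Hive's grammar (at least in this
version) only accepts primitive types in that position. That is why the
complex-type cast of case 1 can never get through HiveContext.sql, whatever
the spelling. A tiny illustration, assuming the hiveContext / tbl from the
repro below (not run here):

hiveContext.sql("select cast(a as array<string>) from tbl").show()
// => rejected in HiveParser's primitiveType rule, as in the trace above
hiveContext.sql("select cast(b as string) from tbl").show()
// => a primitive target type parses fine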

SQLContext and HiveContext parse a query string differently ?

2016-05-12 Thread Hao Ren
Hi,

I just want to figure out why the two contexts behave differently even on
a simple query.
In a nutshell, I have a query with a string literal containing a single
quote and a cast to an Array/Map type.
I have tried all the combinations of SQL context type and query API (sql,
df.select, df.selectExpr).
I can't find one form that works with all of them.

Here is the code for reproducing the problem.
-

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.{SparkConf, SparkContext}

object Test extends App {

  val sc  = new SparkContext("local[2]", "test", new SparkConf)
  val hiveContext = new HiveContext(sc)
  val sqlContext  = new SQLContext(sc)

  val context = hiveContext
  //  val context = sqlContext

  import context.implicits._

  val df = Seq((Seq(1, 2), 2)).toDF("a", "b")
  df.registerTempTable("tbl")
  df.printSchema()

  // case 1
  context.sql("select cast(a as array<string>) from tbl").show()
  // HiveContext => org.apache.spark.sql.AnalysisException: cannot recognize
  //   input near 'array' '<' 'string' in primitive type specification; line 1 pos 17
  // SQLContext  => OK

  // case 2
  context.sql("select 'a\\'b'").show()
  // HiveContext => OK
  // SQLContext  => failure: ``union'' expected but ErrorToken(unclosed string
  //   literal) found

  // case 3
  df.selectExpr("cast(a as array<string>)").show()
  // OK with both HiveContext and SQLContext

  // case 4
  df.selectExpr("'a\\'b'").show()
  // HiveContext, SQLContext => failure: end of input expected
}

-
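
One route that avoids both SQL parsers, and therefore both problems above, is
to build the expressions with the typed Column API instead of SQL text. A
minimal sketch against the df defined above (Spark 1.6-era API; the
array<int> -> array<string> cast should be double-checked, so this is a
sketch rather than a confirmed fix):

import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.{ArrayType, StringType}

// No SQL string is parsed here, so neither the single-quote escaping nor
// the complex-type cast syntax ever reaches HiveQL or the Catalyst parser.
df.select(
  lit("a'b"),                          // literal containing a single quote
  df("a").cast(ArrayType(StringType))  // array<int> column cast to array<string>
).show()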

Any clarification / workaround is highly appreciated.

-- 
Hao Ren

Data Engineer @ leboncoin

Paris, France