Re: Spark SQL weird exception after upgrading from 1.1.1 to 1.2.x

2015-03-18 Thread Cheng Lian
Would you mind providing the query? If it's confidential, could you
please help by constructing a query that reproduces this issue?


Cheng

On 3/18/15 6:03 PM, Roberto Coluccio wrote:

Hi everybody,

While trying to upgrade from Spark 1.1.1 to Spark 1.2.x (I tried both
1.2.0 and 1.2.1), I encounter a weird error that never occurred before,
and I'd kindly ask for any possible help with it.


 In particular, all my Spark SQL queries fail with the following 
exception:


java.lang.RuntimeException: [1.218] failure: identifier expected

[my query listed]
  ^
  at scala.sys.package$.error(package.scala:27)
  at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:33)
  at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
  at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
  at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:174)
  at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:173)
  at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
  at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
  at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
  at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
  ...



The unit tests I've got for testing this stuff fail both if I
build+test the project with Maven and if I run them as single
ScalaTest files or test suites/packages.


When running my app as usual on EMR in YARN-cluster mode, I get the 
following:


15/03/17 11:32:14 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: [1.218] failure: identifier expected

SELECT * FROM ... (my query)

  ^)
Exception in thread "Driver" java.lang.RuntimeException: [1.218] failure: identifier expected

SELECT * FROM ... (my query)

  ^
 at scala.sys.package$.error(package.scala:27)
 at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:33)
 at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
 at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
 at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:174)
 at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:173)
 at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
 at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
 at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
 at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
 at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
 at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
 at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
 at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
 at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
 at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
 at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
 at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:31)
 at org.apache.spark.sql.SQLContext$$anonfun$parseSql$1.apply(SQLContext.scala:83)
 at org.apache.spark.sql.SQLContext$$anonfun$parseSql$1.apply(SQLContext.scala:83)
 at scala.Option.getOrElse(Option.scala:120)
 at

Re: Spark SQL weird exception after upgrading from 1.1.1 to 1.2.x

2015-03-18 Thread Cheng Lian
I suspect that you hit this bug:
https://issues.apache.org/jira/browse/SPARK-6250. Whether you do depends
on the actual contents of your query.


Yin has opened a PR for this; although it is not merged yet, it should
be a valid fix: https://github.com/apache/spark/pull/5078


This fix will be included in 1.3.1.
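
For reference, here is a minimal sketch (not from the original report; the
case class, table name, and sample data are made up) of the kind of setup
that can trigger this parse failure on a plain SQLContext in 1.2.x, where a
field named like a SQL keyword ends up as a column name:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Hypothetical case class: the field "timestamp" collides with a SQL keyword.
    case class Event(timestamp: String, value: String)

    val sc = new SparkContext(new SparkConf().setAppName("repro").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD  // implicit RDD[case class] -> SchemaRDD in 1.2.x

    sc.parallelize(Seq(Event("2015-03-17 11:32:14", "a"))).registerTempTable("events")

    // On 1.2.x this can fail with "failure: identifier expected", since the
    // parser rejects keyword-like identifiers such as "timestamp".
    sqlContext.sql("SELECT timestamp, value FROM events").collect().foreach(println)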

Cheng

On 3/18/15 10:04 PM, Roberto Coluccio wrote:

Hi Cheng, thanks for your reply.

The query is something like:

SELECT * FROM (
  SELECT m.column1, IF (d.columnA IS NOT null, d.columnA,
m.column2), ..., m.columnN FROM tableD d RIGHT OUTER JOIN tableM m
on m.column2 = d.columnA WHERE m.column2!="None" AND d.columnA!=""
  UNION ALL
  SELECT ... [another SELECT statement with different conditions
but same tables]
  UNION ALL
  SELECT ... [another SELECT statement with different conditions
but same tables]
) a


I'm using just sqlContext, no hiveContext. Please note once again
that this worked perfectly with Spark 1.1.x.


The tables, i.e. tableD and tableM, are registered beforehand with the
RDD.registerTempTable method, where the input RDDs are actually
RDD[MyCaseClassM] and RDD[MyCaseClassD], with MyCaseClassM and MyCaseClassD
being simple case classes with only (and fewer than 22) String fields.


Hope the situation is a bit clearer. Thanks to anyone who can help me
out here.


Roberto




Re: Spark SQL weird exception after upgrading from 1.1.1 to 1.2.x

2015-03-18 Thread Roberto Coluccio
You know, I actually have one of the columns called timestamp! That may
really be causing the problem reported in the bug you linked, I guess.


Re: Spark SQL weird exception after upgrading from 1.1.1 to 1.2.x

2015-03-18 Thread Roberto Coluccio
Hi Cheng, thanks for your reply.

The query is something like:

SELECT * FROM (
   SELECT m.column1, IF (d.columnA IS NOT null, d.columnA, m.column2), ...,
 m.columnN FROM tableD d RIGHT OUTER JOIN tableM m on m.column2 = d.columnA
 WHERE m.column2!="None" AND d.columnA!=""
   UNION ALL
   SELECT ... [another SELECT statement with different conditions but same
 tables]
   UNION ALL
   SELECT ... [another SELECT statement with different conditions but same
 tables]
 ) a


I'm using just sqlContext, no hiveContext. Please note once again that
this worked perfectly with Spark 1.1.x.

The tables, i.e. tableD and tableM, are registered beforehand with the
RDD.registerTempTable method, where the input RDDs are actually
RDD[MyCaseClassM] and RDD[MyCaseClassD], with MyCaseClassM and MyCaseClassD
being simple case classes with only (and fewer than 22) String fields.
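
For context, a rough sketch of that setup on 1.2.x (the field names, sample
data, and the simplified query below are hypothetical, and it assumes an
existing SparkContext named sc):

    // Hypothetical case classes mirroring the description above: only String fields.
    case class MyCaseClassD(columnA: String)
    case class MyCaseClassM(column1: String, column2: String)

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD  // implicit RDD[case class] -> SchemaRDD in 1.2.x

    sc.parallelize(Seq(MyCaseClassD("x"))).registerTempTable("tableD")
    sc.parallelize(Seq(MyCaseClassM("a", "x"))).registerTempTable("tableM")

    val result = sqlContext.sql(
      "SELECT m.column1, IF(d.columnA IS NOT null, d.columnA, m.column2) " +
      "FROM tableD d RIGHT OUTER JOIN tableM m ON m.column2 = d.columnA")
    result.collect().foreach(println)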

Hope the situation is a bit clearer. Thanks to anyone who can help me out
here.

Roberto




Re: Spark SQL weird exception after upgrading from 1.1.1 to 1.2.x

2015-03-18 Thread Roberto Coluccio
Hey Cheng, thank you so much for your suggestion. The problem was actually
a column/field called timestamp in one of the case classes!! Once I
changed its name, everything worked out fine again. Let me say it was kinda
frustrating ...
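
In code terms, the change boils down to something like the following (the
original field and the replacement name are hypothetical):

    // Before: a field named like a SQL keyword, which the 1.2.x parser rejects
    //   case class MyCaseClassM(column1: String, timestamp: String)

    // After: field renamed, so generated queries parse again
    case class MyCaseClassM(column1: String, eventTimestamp: String)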

Roberto


Re: Spark SQL weird exception after upgrading from 1.1.1 to 1.2.x

2015-03-18 Thread Yin Huai
Hi Roberto,

For now, if the timestamp is a top-level column (not a field in a
struct), you can use backticks to quote the column name, like
`timestamp`.
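
For example, a minimal sketch of that workaround (the table and column
layout here is assumed from earlier in this thread):

    // Backticks let the 1.2.x parser accept a column whose name matches a keyword.
    sqlContext.sql("SELECT `timestamp` FROM tableM").collect()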

Thanks,

Yin
