I suspect that you hit this bug:
https://issues.apache.org/jira/browse/SPARK-6250; whether you do depends
on the actual contents of your query.
Yin has opened a PR for this; although it hasn't been merged yet, it
should be a valid fix: https://github.com/apache/spark/pull/5078
The fix will be included in 1.3.1.
Cheng
On 3/18/15 10:04 PM, Roberto Coluccio wrote:
Hi Cheng, thanks for your reply.
The query is something like:
SELECT * FROM (
  SELECT m.column1, IF (d.columnA IS NOT null, d.columnA, m.column2), ..., m.columnN
  FROM tableD d RIGHT OUTER JOIN tableM m ON m.column2 = d.columnA
  WHERE m.column2 != \"None\" AND d.columnA != \"\"
  UNION ALL
  SELECT ... [another SELECT statement with different conditions but same tables]
  UNION ALL
  SELECT ... [another SELECT statement with different conditions but same tables]
) a
I'm using just sqlContext, no hiveContext. Please note once again that
this worked perfectly with Spark 1.1.x.
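To be concrete, the query is built as a plain Scala string and passed to
sqlContext.sql, which is why the quotes above appear escaped. A minimal
sketch of that (placeholder table/column names only, not my actual code):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("repro").setMaster("local[*]"))
val sqlContext = new SQLContext(sc)  // plain SQLContext, no HiveContext

// Placeholder query mirroring the shape of the real one above.
val query =
  """SELECT * FROM (
    |  SELECT m.column1, IF (d.columnA IS NOT null, d.columnA, m.column2), m.columnN
    |  FROM tableD d RIGHT OUTER JOIN tableM m ON m.column2 = d.columnA
    |  WHERE m.column2 != "None" AND d.columnA != ""
    |) a""".stripMargin

// Parses and runs fine on 1.1.x; on 1.2.x this call throws the
// java.lang.RuntimeException quoted below.
val result = sqlContext.sql(query)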
The tables, i.e. tableD and tableM, are registered beforehand with the
RDD.registerTempTable method; the input RDDs are actually
RDD[MyCaseClassM] and RDD[MyCaseClassD], with MyCaseClassM and
MyCaseClassD being simple case classes with only (and fewer than 22)
String fields.
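Roughly like this, continuing the sketch above (placeholder classes and
dummy data, not my actual code):

// Stand-ins for MyCaseClassD / MyCaseClassM: simple case classes with
// only String fields (and fewer than 22 of them).
case class MyCaseClassD(columnA: String)
case class MyCaseClassM(column1: String, column2: String, columnN: String)

import sqlContext.createSchemaRDD  // 1.2.x implicit: RDD[case class] -> SchemaRDD

val rddD = sc.parallelize(Seq(MyCaseClassD("a1")))
val rddM = sc.parallelize(Seq(MyCaseClassM("v1", "a1", "vN")))

// Registered before sqlContext.sql(query) above is invoked.
rddD.registerTempTable("tableD")
rddM.registerTempTable("tableM")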
Hope the situation is a bit clearer now. Thanks to anyone who can help
me out here.
Roberto
On Wed, Mar 18, 2015 at 12:09 PM, Cheng Lian <lian.cs....@gmail.com> wrote:
Would you mind providing the query? If it's confidential, could you
please help by constructing a query that reproduces this issue?
Cheng
On 3/18/15 6:03 PM, Roberto Coluccio wrote:
Hi everybody,
When trying to upgrade from Spark 1.1.1 to Spark 1.2.x (I tried both
1.2.0 and 1.2.1), I encounter a weird error that never occurred before,
about which I'd kindly ask for any possible help.
In particular, all my Spark SQL queries fail with the following
exception:
java.lang.RuntimeException: [1.218] failure: identifier expected
[my query listed]
^
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:33)
at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:174)
at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:173)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
...
The unit tests I've got for testing this stuff fail both when I
build+test the project with Maven and when I run them as single
ScalaTest files or test suites/packages.
When running my app as usual on EMR in YARN-cluster mode, I get
the following:
15/03/17 11:32:14 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: [1.218] failure: identifier expected
SELECT * FROM ... (my query)
^)
Exception in thread "Driver" java.lang.RuntimeException: [1.218] failure: identifier expected
SELECT * FROM ... (my query)
^
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:33)
at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:79)
at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:174)
at org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:173)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:31)
at org.apache.spark.sql.SQLContext$$anonfun$parseSql$1.apply(SQLContext.scala:83)
at org.apache.spark.sql.SQLContext$$anonfun$parseSql$1.apply(SQLContext.scala:83)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:83)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:303)
at mycompany.mypackage.MyClassFunction.apply(MyClassFunction.scala:34)
at mycompany.mypackage.MyClass$.main(MyClass.scala:254)
at mycompany.mypackage.MyClass.main(MyClass.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:441)
15/03/17 11:32:14 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook
Any suggestions?
Thanks,
Roberto