[ https://issues.apache.org/jira/browse/SPARK-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956094#comment-14956094 ]
Xin Wu commented on SPARK-10747:
--------------------------------

I ran this query against the released Hive 1.2.1, and the NULLS LAST syntax is not supported yet:
{code}
hive> select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by c3 desc nulls last) from tolap;
FAILED: ParseException line 1:76 missing ) at 'nulls' near 'nulls'
line 1:82 missing EOF at 'last' near 'nulls'
{code}
Spark SQL uses the Hive ql parser to parse the query, so it fails with the same error:
{code}
scala> sqlContext.sql("select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by c3 desc nulls last) from tolap")
org.apache.spark.sql.AnalysisException: line 1:76 missing ) at 'nulls' near 'nulls'
line 1:82 missing EOF at 'last' near 'nulls';
        at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:298)
        at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
        at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
        at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
        at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
        at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
        at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
        at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:276)
        at org.apache.spark.sql.hive.HiveQLDialect.parse(HiveContext.scala:62)
        at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:173)
        at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:173)
        at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:115)
        at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
{code}
In HiveQl.scala, getAst(sql) throws the org.apache.hadoop.hive.ql.parse.ParseException, which createPlan translates into the AnalysisException above:
{code}
def createPlan(sql: String): LogicalPlan = {
  try {
    val tree = getAst(sql)
    if (nativeCommands contains tree.getText) {
      HiveNativeCommand(sql)
    } else {
      nodeToPlan(tree) match {
        case NativePlaceholder => HiveNativeCommand(sql)
        case other => other
      }
    }
  } catch {
    case pe: org.apache.hadoop.hive.ql.parse.ParseException =>
      pe.getMessage match {
        case errorRegEx(line, start, message) =>
          throw new AnalysisException(message, Some(line.toInt), Some(start.toInt))
        case otherMessage =>
          throw new AnalysisException(otherMessage)
      }
  }
}
{code}
The ParseException originates in org.apache.hadoop.hive.ql.parse.ParseDriver.java:
{code}
public ASTNode parse(String command) throws ParseException {
  return this.parse(command, (Context)null);
}
{code}
So I think this needs to wait for HIVE-9535 to be resolved. I am new to the Spark code base and still learning it, so I hope my understanding is correct here.

> add support for window specification to include how NULLS are ordered
> ---------------------------------------------------------------------
>
>                 Key: SPARK-10747
>                 URL: https://issues.apache.org/jira/browse/SPARK-10747
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: N Campbell
>
> You cannot express how NULLs are to be sorted in the window order
> specification and have to use a compensating expression to simulate it.
> Error: org.apache.spark.sql.AnalysisException: line 1:76 missing ) at 'nulls'
> near 'nulls'
> line 1:82 missing EOF at 'last' near 'nulls';
> SQLState: null
> Same limitation as Hive, reported in Apache JIRA HIVE-9535.
> This fails:
> select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by c3 desc
> nulls last) from tolap
> The compensating expression:
> select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by case when
> c3 is null then 1 else 0 end) from tolap

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
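The errorRegEx match in createPlan pulls the line number and character position out of the Hive ParseException message so the AnalysisException can carry them. A minimal sketch of that extraction in Python; the regex here is an assumption for illustration, not Spark's actual errorRegEx:

```python
import re

# Hypothetical pattern modeled on the "line L:P message" shape of Hive
# parse errors seen above; Spark's real errorRegEx may differ.
ERROR_RE = re.compile(r"line (\d+):(\d+) (.*)", re.DOTALL)

def split_parse_error(message):
    """Return (line, start, rest) when the error carries a position, else None."""
    m = ERROR_RE.match(message)
    if m:
        return int(m.group(1)), int(m.group(2)), m.group(3)
    return None  # mirrors the otherMessage fallback: keep the raw message

print(split_parse_error("line 1:76 missing ) at 'nulls' near 'nulls'"))
# → (1, 76, "missing ) at 'nulls' near 'nulls'")
```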
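Until the parser accepts NULLS LAST, the compensating CASE expression from the report is the usual workaround. A small Python sketch of why the trick reproduces "order by c3 desc nulls last"; the reported workaround orders on the CASE expression alone, and here it is paired with c3 desc to show the full equivalence (the data is made up for illustration):

```python
# Sample c3 values, including NULLs (None as SQL NULL).
rows = [7, None, 3, 9, None, 1]

# Native "order by c3 desc nulls last": non-NULLs descending, NULLs at the end.
desc_nulls_last = sorted(rows, key=lambda v: (v is None, -(v or 0)))

# Compensating expression: sort first on "case when c3 is null then 1 else 0 end",
# which pushes NULLs into their own trailing group, then on c3 desc within groups.
case_trick = sorted(rows, key=lambda v: ((1 if v is None else 0), -(v or 0)))

print(desc_nulls_last)  # → [9, 7, 3, 1, None, None]
print(case_trick)       # → same ordering
```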