[jira] [Commented] (SPARK-10747) add support for window specification to include how NULLS are ordered
[ https://issues.apache.org/jira/browse/SPARK-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440656#comment-15440656 ]

Xin Wu commented on SPARK-10747:
--------------------------------

This JIRA may be changed to support the NULLS FIRST|LAST feature in the ORDER BY clause.

> add support for window specification to include how NULLS are ordered
> ---------------------------------------------------------------------
>
>                 Key: SPARK-10747
>                 URL: https://issues.apache.org/jira/browse/SPARK-10747
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: N Campbell
>
> You cannot express how NULLs are to be sorted in the window order
> specification, and have to use a compensating expression to simulate it.
>
> Error: org.apache.spark.sql.AnalysisException: line 1:76 missing ) at 'nulls' near 'nulls'
> line 1:82 missing EOF at 'last' near 'nulls';
> SQLState: null
>
> (Same limitation as Hive, reported in Apache JIRA HIVE-9535)
>
> This fails:
> select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by c3 desc nulls last) from tolap
>
> This compensating expression works:
> select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by case when c3 is null then 1 else 0 end) from tolap

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
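The CASE expression in the description works around the missing syntax by mapping NULLs to a secondary sort key that orders after every non-NULL value. The same "nulls last" semantics can be sketched outside SQL; a minimal Python illustration (not Spark code; `order_desc_nulls_last` is a made-up helper name, and Python's None stands in for SQL NULL):

```python
def order_desc_nulls_last(values):
    """Emulate ORDER BY c3 DESC NULLS LAST over a plain list, the way the
    CASE WHEN c3 IS NULL THEN 1 ELSE 0 END workaround does in SQL: non-NULL
    values sort descending first, and all NULLs are appended at the end."""
    non_null = sorted((v for v in values if v is not None), reverse=True)
    nulls = [v for v in values if v is None]
    return non_null + nulls

print(order_desc_nulls_last([3, None, 10, None, 7]))
# [10, 7, 3, None, None]
```

NULLS FIRST would be the mirror image: prepend the NULL bucket instead of appending it, which in the SQL workaround corresponds to flipping the 0/1 values of the CASE expression.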
[ https://issues.apache.org/jira/browse/SPARK-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440651#comment-15440651 ]

Apache Spark commented on SPARK-10747:
--------------------------------------

User 'xwu0226' has created a pull request for this issue:
https://github.com/apache/spark/pull/14842
[ https://issues.apache.org/jira/browse/SPARK-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425547#comment-15425547 ]

Xin Wu commented on SPARK-10747:
--------------------------------

[~hvanhovell] Yes. Since we have a native parser now, we can do this within Spark SQL. I can work on this. Thanks!
[ https://issues.apache.org/jira/browse/SPARK-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425521#comment-15425521 ]

Herman van Hovell commented on SPARK-10747:
-------------------------------------------

[~xwu0226] Would you be interested in opening a PR for this one?
[ https://issues.apache.org/jira/browse/SPARK-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956094#comment-14956094 ]

Xin Wu commented on SPARK-10747:
--------------------------------

I ran this query on the released Hive 1.2.1 version, and it is not supported yet:
{code}
hive> select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by c3 desc nulls last) from tolap;
FAILED: ParseException line 1:76 missing ) at 'nulls' near 'nulls'
line 1:82 missing EOF at 'last' near 'nulls'
{code}
Spark SQL uses the Hive ql parser to parse the query, so it fails as well:
{code}
scala> sqlContext.sql("select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by c3 desc nulls last) from tolap")
org.apache.spark.sql.AnalysisException: line 1:76 missing ) at 'nulls' near 'nulls'
line 1:82 missing EOF at 'last' near 'nulls';
        at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:298)
        at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
        at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
        at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
        at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
        at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
        at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
        at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
        at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:276)
        at org.apache.spark.sql.hive.HiveQLDialect.parse(HiveContext.scala:62)
        at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:173)
        at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:173)
        at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:115)
        at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
        at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
{code}
In HiveQl.scala, you see the following, where getAst(sql) will throw the org.apache.hadoop.hive.ql.parse.ParseException:
{code}
def createPlan(sql: String): LogicalPlan = {
  try {
    val tree = getAst(sql)
    if (nativeCommands contains tree.getText) {
      HiveNativeCommand(sql)
    } else {
      nodeToPlan(tree) match {
        case NativePlaceholder => HiveNativeCommand(sql)
        case other => other
      }
    }
  } catch {
    case pe: org.apache.hadoop.hive.ql.parse.ParseException =>
      pe.getMessage match {
        case errorRegEx(line, start, message) =>
          throw new AnalysisException(message, Some(line.toInt), Some(start.toInt))
        case otherMessage =>
          throw new AnalysisException(otherMessage)
      }
  }
}
{code}
The exception is thrown by org.apache.hadoop.hive.ql.parse.ParseDriver.java:
{code}
public ASTNode parse(String command) throws ParseException {
  return this.parse(command, (Context)null);
}
{code}
So I think this needs to wait for HIVE-9535 to be resolved. I am new to the Spark code and still learning it, so I hope my understanding is correct here.
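The errorRegEx match in createPlan above splits the Hive parse error string into a line number, a character position, and the remaining message, so Spark can attach a location to the AnalysisException. The extraction pattern can be sketched like this in Python (the regex and the `split_parse_error` helper are illustrative assumptions, not Spark's actual errorRegEx):

```python
import re

# Hypothetical regex mirroring the shape of the Hive parse error text,
# e.g. "line 1:76 missing ) at 'nulls' near 'nulls'": capture the line
# number, the character position, and the rest of the message.
ERROR_RE = re.compile(r"line (\d+):(\d+) (.*)", re.DOTALL)

def split_parse_error(message):
    """Return (line, position, message) when the error carries a location,
    otherwise (None, None, original message) - analogous to the two match
    arms (errorRegEx vs. otherMessage) in createPlan."""
    m = ERROR_RE.match(message)
    if m:
        return int(m.group(1)), int(m.group(2)), m.group(3)
    return None, None, message

print(split_parse_error("line 1:76 missing ) at 'nulls' near 'nulls'"))
# (1, 76, "missing ) at 'nulls' near 'nulls'")
```

The fallback arm matters because not every ParseException message starts with a location; an error without the "line N:M" prefix still surfaces, just without line and position metadata.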