[jira] [Commented] (SPARK-12362) Create a full-fledged built-in SQL parser

2015-12-30 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074937#comment-15074937
 ] 

Apache Spark commented on SPARK-12362:
--

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/10525

> Create a full-fledged built-in SQL parser
> -
>
> Key: SPARK-12362
> URL: https://issues.apache.org/jira/browse/SPARK-12362
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Herman van Hovell
> Priority: Critical
>
> Spark currently uses two SQL parsers: a simple one based on Scala parser
> combinators, and another one based on Hive.
> Neither is a good long-term solution. The parser combinator one gives users
> poor error messages and does not warn when there are conflicts in the defined
> grammar. The Hive one depends directly on Hive itself, and as a result it is
> very difficult to introduce new grammar or fix bugs.
> The goal of this ticket is to create a single SQL query parser that is
> powerful enough to replace the existing ones. The requirements for the new
> parser are:
> 1. Can support almost all of HiveQL
> 2. Can support everything the existing SQL parser built using Scala parser
>    combinators supports
> 3. Can be used for expression parsing in addition to SQL query parsing
> 4. Can provide good error messages for incorrect syntax
> Rather than building one from scratch, we should investigate whether we can
> leverage existing open source projects such as Hive (by inlining the parser
> part) or Calcite.
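
The description's point about grammar conflicts going unreported can be illustrated with a small sketch. This is not Spark's parser or the Scala parser combinator library; `literal` and `choice` are hypothetical helpers written in Python that mimic ordered choice, where the first alternative that matches wins. The keyword pair used here is only an example of a prefix conflict.

```python
# Minimal parser-combinator sketch (hypothetical helpers, not Spark's parser).
# A parser is a function: (text, pos) -> (value, new_pos), or None on failure.

def literal(word):
    """Match `word` case-insensitively at the current position."""
    def parse(text, pos):
        end = pos + len(word)
        if text[pos:end].lower() == word.lower():
            return word.upper(), end
        return None
    return parse

def choice(*parsers):
    """Ordered choice: return the result of the first alternative that succeeds."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

# "in" is a prefix of "intersect", so these two alternatives conflict.
# Nothing reports the conflict when the grammar is defined.
shadowed = choice(literal("in"), literal("intersect"))
reordered = choice(literal("intersect"), literal("in"))

shadowed_result = shadowed("intersect", 0)    # first alternative wins
reordered_result = reordered("intersect", 0)  # longer keyword tried first
failure = shadowed("union", 0)                # bare failure, no diagnostics
```

In this sketch the only difference between the two grammars is the order of the alternatives, and a failed parse carries no position or list of expected tokens, which is roughly the kind of weak error reporting the description complains about.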



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12362) Create a full-fledged built-in SQL parser

2015-12-29 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074137#comment-15074137
 ] 

Apache Spark commented on SPARK-12362:
--

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/10509

> Create a full-fledged built-in SQL parser
> -
>
> Key: SPARK-12362
> URL: https://issues.apache.org/jira/browse/SPARK-12362
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Apache Spark



[jira] [Commented] (SPARK-12362) Create a full-fledged built-in SQL parser

2015-12-29 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074574#comment-15074574
 ] 

Reynold Xin commented on SPARK-12362:
-

[~hvanhovell] please create additional subtasks as part of this ticket. Thanks!


> Create a full-fledged built-in SQL parser
> -
>
> Key: SPARK-12362
> URL: https://issues.apache.org/jira/browse/SPARK-12362
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Apache Spark



[jira] [Commented] (SPARK-12362) Create a full-fledged built-in SQL parser

2015-12-21 Thread Nong Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066864#comment-15066864
 ] 

Nong Li commented on SPARK-12362:
-

I think it makes sense to inline the Hive QL parser into Spark SQL. This
satisfies the requirements fairly well.

It is maximally HiveQL compatible, and it is what the existing Spark SQL
integration is built on. The parser uses ANTLR and looks easy to extend going
forward. Inlining it would involve taking some of the existing code in the
hive.ql.parse package and restricting it to the code that deals with parsing,
not semantic analysis.
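
The split between parsing and semantic analysis mentioned above can be sketched as follows. This is a toy Python illustration under stated assumptions, not code from Hive or Spark: `UnresolvedSelect`, `parse`, `analyze`, and the `catalog` dictionary are all hypothetical names, and the grammar covers only `SELECT c1, c2 FROM t`.

```python
# Hypothetical sketch: parsing builds an unresolved tree from the text alone;
# semantic analysis is a separate step that needs a catalog.
import re
from dataclasses import dataclass

@dataclass
class UnresolvedSelect:
    columns: list
    table: str

def parse(sql):
    """Syntax only: accepts `SELECT c1, c2 FROM t` and knows nothing about tables."""
    m = re.fullmatch(r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)\s*", sql, re.IGNORECASE)
    if m is None:
        raise SyntaxError(f"cannot parse: {sql!r}")
    columns = [c.strip() for c in m.group(1).split(",")]
    return UnresolvedSelect(columns, m.group(2))

def analyze(plan, catalog):
    """Semantics: check that the table and its columns exist in the catalog."""
    if plan.table not in catalog:
        raise LookupError(f"unknown table: {plan.table}")
    missing = [c for c in plan.columns if c not in catalog[plan.table]]
    if missing:
        raise LookupError(f"unknown columns: {missing}")
    return plan

catalog = {"people": {"name", "age"}}

# Parsing succeeds even though `salary` does not exist: that is a semantic
# error, and only analyze() would reject it.
plan = parse("SELECT name, salary FROM people")
```

Only `parse` corresponds to the part that would be inlined under this proposal; anything resembling `analyze` would stay outside the parser.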




> Create a full-fledged built-in SQL parser
> -
>
> Key: SPARK-12362
> URL: https://issues.apache.org/jira/browse/SPARK-12362
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Reynold Xin



[jira] [Commented] (SPARK-12362) Create a full-fledged built-in SQL parser

2015-12-21 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066883#comment-15066883
 ] 

Reynold Xin commented on SPARK-12362:
-

+1

> Create a full-fledged built-in SQL parser
> -
>
> Key: SPARK-12362
> URL: https://issues.apache.org/jira/browse/SPARK-12362
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Reynold Xin