[ https://issues.apache.org/jira/browse/SPARK-21136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304378#comment-16304378 ]
Denys Zadorozhnyi commented on SPARK-21136:
-------------------------------------------

I dug into this issue in Spark 2.2.1 (ANTLR 4.7) and here are my findings:

1. The offending token is gathered in {{InputMismatchException}} from {{recognizer.getCurrentToken()}} (which is "from" in the examples above).
2. In ANTLR 4.7.1, {{InputMismatchException.offendingState}} and {{InputMismatchException.ctx}} are additionally set in these cases, which should give some clues (see [https://github.com/antlr/antlr4/pull/1969] and the issue [https://github.com/antlr/antlr4/issues/1922] for details).

However, the error message is generated in ANTLR's {{DefaultErrorStrategy.reportError()}}. I considered passing a custom error handler (a {{DefaultErrorStrategy}} subclass) to the parser and overriding {{reportError()}} to create a new {{InputMismatchException}} with the "correct" {{offendingToken}} and pass it up the chain, but it did not feel right.

> Misleading error message for typo in SQL
> ----------------------------------------
>
>                 Key: SPARK-21136
>                 URL: https://issues.apache.org/jira/browse/SPARK-21136
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Daniel Darabos
>            Priority: Minor
>
> {code}
> scala> spark.sql("select * from a left joinn b on a.id = b.id").show
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input 'from' expecting {<EOF>, 'WHERE', 'GROUP', 'ORDER',
> 'HAVING', 'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS',
> 'INTERSECT', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 1, pos 9)
> == SQL ==
> select * from a left joinn b on a.id = b.id
> ---------^^^
> {code}
> The issue is that {{^^^}} points at {{from}}, not at {{joinn}}. The text of
> the error makes no sense either. If {{*}}, {{a}}, and {{b}} are complex in
> themselves, a misleading error like this can hinder debugging substantially.
> I tried to see if maybe I could fix this. Am I correct to deduce that the
> error message originates in ANTLR4, which parses the query based on the
> syntax defined in {{SqlBase.g4}}? If so, I guess I would have to figure out
> how that syntax definition works, and why it misattributes the error.