[ 
https://issues.apache.org/jira/browse/SPARK-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722209#comment-16722209
 ] 

ASF GitHub Bot commented on SPARK-26376:
----------------------------------------

MaxGekk opened a new pull request #23325: [SPARK-26376][SQL] Skip inputs 
without tokens by JSON datasource
URL: https://github.com/apache/spark/pull/23325
 
 
   ## What changes were proposed in this pull request?
   
   Added a new flag for `JacksonParser`, `skipInputWithoutTokens`, to control the 
parser's behaviour when its input doesn't contain any valid JSON tokens. The 
flag is set to `true` for the JSON datasource, which restores the behaviour the 
datasource had in Spark 2.4 and earlier (such inputs are skipped). The flag is 
set to `false` for JSON functions like `from_json`. As a consequence, `from_json` 
produces bad records in the `PERMISSIVE` mode for strings without JSON tokens. 
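
   The intended split in behaviour can be sketched with a small toy model. This 
is not Spark code and does not use `JacksonParser`: the function names, the 
`bad_record` key, and the "whitespace-only means no tokens" test are all 
assumptions made for illustration.

```python
import json
from typing import Any, Dict, List

# Hypothetical key used by this sketch to mark a bad record.
BAD_RECORD_KEY = "bad_record"


def toy_parse(text: str, skip_input_without_tokens: bool) -> List[Dict[str, Any]]:
    """Toy PERMISSIVE-mode parser returning zero or one records.

    Assumption: an input "without tokens" is modelled as an empty or
    whitespace-only string.
    """
    if text.strip() == "":
        if skip_input_without_tokens:
            return []  # input is skipped entirely
        return [{BAD_RECORD_KEY: text}]  # input is reported as a bad record
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return [{BAD_RECORD_KEY: text}]  # malformed JSON is always a bad record
    if not isinstance(value, dict):
        return [{BAD_RECORD_KEY: text}]  # this sketch only accepts JSON objects
    return [value]


def toy_datasource(lines: List[str]) -> List[Dict[str, Any]]:
    """Models the JSON datasource: the flag is true, token-less lines vanish."""
    records: List[Dict[str, Any]] = []
    for line in lines:
        records.extend(toy_parse(line, skip_input_without_tokens=True))
    return records


def toy_from_json(text: str) -> Dict[str, Any]:
    """Models `from_json`: the flag is false, so every input yields one record."""
    return toy_parse(text, skip_input_without_tokens=False)[0]


if __name__ == "__main__":
    print(toy_datasource(["", "  ", '{"a": 1}']))
    print(toy_from_json(""))
```

   In this model the datasource drops the empty and whitespace-only lines and 
returns only the parsed object, while the `from_json` stand-in returns a bad 
record for the empty string. How a bad record actually surfaces in Spark depends 
on the schema and options in use.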
   
   ## How was this patch tested?
   
   It was tested by `JsonSuite`.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Skip inputs without tokens by JSON datasource
> ---------------------------------------------
>
>                 Key: SPARK-26376
>                 URL: https://issues.apache.org/jira/browse/SPARK-26376
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Maxim Gekk
>            Priority: Minor
>
> The changes in 
> https://github.com/apache/spark/commit/38628dd1b8298d2686e5d00de17c461c70db99a8
> can potentially break existing applications that don't expect a bad record 
> for a string without any JSON tokens in the PERMISSIVE mode. This ticket aims 
> to restore the previous behaviour of the JSON datasource and ignore such strings 
> (including empty strings). The from_json function should keep the new behaviour 
> and produce bad records for empty strings and strings without any JSON 
> tokens.



