Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23253#discussion_r240000694
  
    --- Diff: docs/sql-migration-guide-upgrade.md ---
    @@ -37,6 +37,8 @@ displayTitle: Spark SQL Upgrading Guide
     
  - In Spark version 2.4 and earlier, the CSV datasource converts a malformed 
CSV string to a row with all `null`s in the PERMISSIVE mode. Since Spark 3.0, 
the returned row can contain non-`null` fields if some of the CSV column 
values were parsed and converted to the desired types successfully.
     
    +  - In Spark version 2.4 and earlier, the JSON datasource and JSON 
functions like `from_json` convert a bad JSON record to a row with all 
`null`s in the PERMISSIVE mode when the specified schema is `StructType`. 
Since Spark 3.0, the returned row can contain non-`null` fields if some of 
the JSON column values were parsed and converted to the desired types 
successfully.
    +
    --- End diff --
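
    To make the behavior change concrete, here is a minimal, hypothetical 
sketch of the difference described in the diff above. The object name, the 
two-field schema, and the truncated record are invented for illustration, 
and the expected outputs follow the migration note rather than an actual run 
of both versions.
    
    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.from_json
    import org.apache.spark.sql.types.{IntegerType, StringType, StructType}
    
    object PermissiveModeDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]").appName("permissive-demo").getOrCreate()
        import spark.implicits._
    
        // Hypothetical two-field schema; the record is truncated after the
        // first field, so it is a "bad" JSON record.
        val schema = new StructType().add("a", IntegerType).add("b", StringType)
        val badRecord = Seq("""{"a": 1, "b":""").toDF("json")
    
        // from_json parses in the PERMISSIVE mode by default.
        badRecord.select(from_json($"json", schema).as("parsed")).show()
        // Spark 2.4 and earlier: parsed = [null, null]
        // Since Spark 3.0:       parsed = [1, null] (field `a` was converted
        // successfully before the parser hit the malformed tail).
    
        spark.stop()
      }
    }
    ```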
    
    @cloud-fan This PR proposes changes similar to those in 
https://github.com/apache/spark/pull/23120 . Could you take a look at it?
    
    > For such behavior change, shall we add a config to roll back to previous 
behavior?
    
    I don't think it makes sense to introduce a global SQL config for this 
particular case. The risk of breaking user apps is low because app logic 
cannot be based solely on the presence of all nulls in a row: an all-null 
row does not distinguish a bad JSON record from a valid one (see the sketch 
below). From my point of view, a note in the migration guide is enough.
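
    As a minimal, hypothetical sketch of why all-null rows are ambiguous in 
Spark 2.4's PERMISSIVE mode (the schema and both records are invented for 
illustration):
    
    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.from_json
    import org.apache.spark.sql.types.{IntegerType, StringType, StructType}
    
    object AllNullsAmbiguityDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]").appName("all-nulls-demo").getOrCreate()
        import spark.implicits._
    
        val schema = new StructType().add("a", IntegerType).add("b", StringType)
    
        // A well-formed record whose values happen to be null ...
        val allNullValues = Seq("""{"a": null, "b": null}""").toDF("json")
        // ... and a record that is not JSON at all.
        val malformed = Seq("not json").toDF("json")
    
        // In Spark 2.4's PERMISSIVE mode both produce the same all-null row,
        // so "all fields are null" cannot be used to detect bad records.
        allNullValues.select(from_json($"json", schema)).show() // [null, null]
        malformed.select(from_json($"json", schema)).show()     // [null, null]
    
        spark.stop()
      }
    }
    ```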

