Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22814#discussion_r228162464
  
    --- Diff: docs/sql-migration-guide-upgrade.md ---
    @@ -11,6 +11,10 @@ displayTitle: Spark SQL Upgrading Guide
     
       - In PySpark, when creating a `SparkSession` with 
`SparkSession.builder.getOrCreate()`, if there is an existing `SparkContext`, 
the builder tried to update the `SparkConf` of the existing `SparkContext` with 
the configurations specified to the builder. However, the `SparkContext` is 
shared by all `SparkSession`s, so those configurations should not be changed. 
Since 3.0, the builder no longer updates them. This is the same behavior as the 
Java/Scala API in 2.3 and above. If you want to update the configurations, do 
so before creating a `SparkSession`.
     
    +  - In Avro data source, the function `from_avro` supports the following 
parse modes:
    +    * `PERMISSIVE`: Corrupt records are processed as null results. To 
implement this, the data schema is forced to be fully nullable, which might be 
different from the one the user provided. This is the default mode.
    +    * `FAILFAST`: Throws an exception when processing a corrupted record.
    --- End diff ---
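
    For context, here is a minimal PySpark sketch of the builder behavior described in the quoted PySpark item above; the config key is only an illustrative example, not anything prescribed by the guide.

```python
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

# Set the desired configuration before any SparkContext exists.
conf = SparkConf().set("spark.executor.memory", "2g")
sc = SparkContext(conf=conf)

# Under the 3.0 behavior described above, the builder leaves the existing
# SparkContext's conf untouched, so this config() call has no effect on
# spark.executor.memory.
spark = SparkSession.builder.config("spark.executor.memory", "4g").getOrCreate()

print(spark.sparkContext.getConf().get("spark.executor.memory"))  # still "2g"
```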
    
    We don't change existing APIs but add a new `from_avro` method that takes 
an extra `option` parameter. Users won't hit any problems when upgrading Spark, 
and ideally they should read the release notes and use this new feature if they 
need it. I don't think we need to put it in the migration guide.
    
    Let's not abuse the migration guide.
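
    For reference, a rough PySpark sketch of what using the new overload could look like. It assumes the PySpark wrapper in `pyspark.sql.avro.functions`, the external spark-avro package being on the classpath, and the parse mode being passed under a `mode` option key; these are illustrative assumptions, not part of this diff.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import struct
from pyspark.sql.avro.functions import from_avro, to_avro

spark = SparkSession.builder.getOrCreate()

# A one-field Avro schema and a tiny DataFrame round-tripped through Avro binary.
schema = '{"type": "record", "name": "r", "fields": [{"name": "id", "type": "long"}]}'
binary_df = spark.range(3).select(to_avro(struct("id")).alias("value"))

# Default PERMISSIVE mode: corrupt records would come back as nulls.
binary_df.select(from_avro("value", schema).alias("parsed")).show()

# FAILFAST mode requested through the extra options argument.
binary_df.select(from_avro("value", schema, {"mode": "FAILFAST"}).alias("parsed")).show()
```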

