Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22951#discussion_r231832597

    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -446,6 +450,9 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
                 If None is set, it uses the default value, ``1.0``.
             :param emptyValue: sets the string representation of an empty value. If None is set,
                 it uses the default value, empty string.
    +        :param locale: sets a locale as language tag in IETF BCP 47 format. If None is set,
    +            it uses the default value, ``en-US``. For instance, ``locale`` is used while
    +            parsing dates and timestamps.
    --- End diff --

It seems parsing decimals with `locale` will be slightly tricky in the JSON case, because we delegate that to Jackson by calling its `getCurrentToken` and `getDecimalValue` methods, and I haven't found a way to pass a locale to them. Probably we will need a custom deserializer? In the CSV case it should be easier, since we convert the strings ourselves. I will try to do that for CSV first, once this PR is merged.
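As a rough illustration of the CSV direction mentioned above (converting strings ourselves rather than relying on Jackson), locale-aware decimal parsing on the JVM can be done with `java.text.DecimalFormat`. This is only a hypothetical sketch, not Spark's actual implementation; the class and helper names are made up for the example.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class LocaleDecimalDemo {
    // Hypothetical helper: parse a decimal string according to the
    // separators of the given locale, returning an exact BigDecimal.
    static BigDecimal parseDecimal(String s, Locale locale) {
        DecimalFormat df = (DecimalFormat) DecimalFormat.getInstance(locale);
        df.setParseBigDecimal(true); // return BigDecimal instead of Double
        ParsePosition pos = new ParsePosition(0);
        BigDecimal result = (BigDecimal) df.parse(s, pos);
        // Reject partial matches such as "1,23abc".
        if (result == null || pos.getIndex() != s.length()) {
            throw new NumberFormatException(
                "Cannot parse '" + s + "' as a decimal in locale " + locale);
        }
        return result;
    }

    public static void main(String[] args) {
        // German locale: ',' is the decimal separator, '.' groups thousands.
        System.out.println(parseDecimal("1.234,56", Locale.GERMANY)); // 1234.56
        // US locale: the separators are swapped.
        System.out.println(parseDecimal("1,234.56", Locale.US));      // 1234.56
    }
}
```

For JSON the same trick would only work if the value arrives as a string token; for a bare JSON number token, Jackson has already tokenized it with '.' as the decimal point, which is why a custom deserializer might be needed there.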