[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

mengxr Thu, 07 Jun 2018 08:06:58 -0700

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21501#discussion_r193781216
  
    --- Diff: python/pyspark/ml/feature.py ---
    @@ -2582,25 +2582,27 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive is false",
    +                   typeConverter=TypeConverters.toString)
     
         @keyword_only
    -    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
    +    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False, locale="en"):
    --- End diff --
    
    We should keep the default consistent with the JVM side. I'm not sure whether Python's
    [`locale.getdefaultlocale`](https://docs.python.org/2/library/locale.html#locale.getdefaultlocale)
    is compatible with Java's. But we can always make the default `None` in Python and leave it
    to the JVM to set the default.
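
A minimal sketch of the `None`-default pattern suggested above, in plain Python so it runs without Spark. The class `StopWordsRemoverSketch` and its method `params_to_send` are hypothetical stand-ins, not the pyspark API; the only point is that the wrapper forwards `locale` only when the caller sets it, so an unset value leaves the choice to the JVM.

```python
class StopWordsRemoverSketch(object):
    """Hypothetical stand-in for the Python wrapper; not the pyspark class."""

    def __init__(self, inputCol=None, outputCol=None, stopWords=None,
                 caseSensitive=False, locale=None):
        # caseSensitive keeps its explicit Python-side default, as in the diff.
        self._params = {"caseSensitive": caseSensitive}
        candidates = {
            "inputCol": inputCol,
            "outputCol": outputCol,
            "stopWords": stopWords,
            "locale": locale,
        }
        # Record only the values the caller supplied. A locale of None is
        # never stored, so the Python side does not pick a default itself.
        for name, value in candidates.items():
            if value is not None:
                self._params[name] = value

    def params_to_send(self):
        """Return the params that would be forwarded to the JVM object."""
        return dict(self._params)


default_remover = StopWordsRemoverSketch(inputCol="raw", outputCol="filtered")
explicit_remover = StopWordsRemoverSketch(inputCol="raw", locale="tr")
```

Here `default_remover` forwards no `locale` at all, while `explicit_remover` forwards `"tr"`. This avoids having to translate Python's locale naming into Java's, which is the compatibility concern raised in the comment.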



