Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21501#discussion_r193781216

    --- Diff: python/pyspark/ml/feature.py ---
    @@ -2582,25 +2582,27 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive is false",
    +                   typeConverter=TypeConverters.toString)

         @keyword_only
    -    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
    +    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False, locale="en"):
    --- End diff --

    We should keep the default consistent with the JVM side. I'm not sure whether Python's [`locale.getdefaultlocale`](https://docs.python.org/2/library/locale.html#locale.getdefaultlocale) is compatible with Java's default locale. But we can always make the default `None` in Python and leave it to the JVM to set the default.
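
    The suggestion above (default `None` in Python, let the JVM pick the locale) could look roughly like the sketch below. It is only an illustration in plain Python, not the pyspark implementation: `StopWordsRemoverSketch`, `effective_locale`, and the `jvm_default` argument are hypothetical names, and `jvm_default` stands in for whatever default the JVM side would supply.

```python
class StopWordsRemoverSketch(object):
    """Hypothetical stand-in for the Python wrapper; not the pyspark class."""

    def __init__(self, inputCol=None, outputCol=None, stopWords=None,
                 caseSensitive=False, locale=None):
        # Record only the parameters the caller actually supplied, so an
        # unset locale stays unset on the Python side.
        supplied = dict(inputCol=inputCol, outputCol=outputCol,
                        stopWords=stopWords, locale=locale)
        self._explicit = {k: v for k, v in supplied.items() if v is not None}
        self._explicit["caseSensitive"] = caseSensitive

    def effective_locale(self, jvm_default):
        # `jvm_default` is a placeholder for the default chosen by the JVM.
        # An explicit Python-side value wins; otherwise defer to the JVM.
        return self._explicit.get("locale", jvm_default)


# Usage: no locale given, so the (assumed) JVM default is used.
remover = StopWordsRemoverSketch(inputCol="raw", outputCol="filtered")
print(remover.effective_locale(jvm_default="en_US"))

# Usage: an explicit locale overrides the JVM default.
turkish = StopWordsRemoverSketch(locale="tr")
print(turkish.effective_locale(jvm_default="en_US"))
```

    With this shape, Python never has to guess a locale string that Java might not accept; it forwards a value only when the caller provides one.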