Github user smurakozi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19599#discussion_r156339373
  
    --- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
    @@ -224,8 +222,8 @@ class LinearRegression @Since("1.3.0") (@Since("1.3.0") 
override val uid: String
           elasticNetParam, fitIntercept, maxIter, regParam, standardization, 
aggregationDepth)
         instr.logNumFeatures(numFeatures)
     
    -    if (($(solver) == Auto &&
    -      numFeatures <= WeightedLeastSquares.MAX_NUM_FEATURES) || $(solver) 
== Normal) {
    +    if (($(solver).equalsIgnoreCase(Auto) && numFeatures <= 
WeightedLeastSquares.MAX_NUM_FEATURES)
    --- End diff --
    
    I think using equalsIgnoreCase all over the places is problematic: you need 
to check and potentially modify code at many places in the code. Everywhere 
when $() is used on a StringParam you may need this change.  
    It's pretty easy to miss some and that would lead to subtle bugs.
    
    A safer approach would be to work with lowercase param values everywhere.
    You could convert to lowercase in the constructor and require that possible 
param values are given as lowercase. This way all comparisons, pattern matches, 
etc. would work just fine. 
    The downside of this approach would be that everywhere where you show these 
values to the users you would present the lowercase value. It might cause 
issues e.g. if you log it and some external log parser expects the original 
(non-lowercase) version. In the very broad sense, it would break compatibility 
but I guess that would be acceptable.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to