[GitHub] spark pull request: [SPARK-10560] [PySpark] [MLlib] [Docs] Make St...

mengxr Fri, 16 Oct 2015 14:36:05 -0700

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9141#discussion_r42292808
  
    --- Diff: python/pyspark/mllib/classification.py ---
    @@ -594,19 +594,27 @@ def train(cls, data, lambda_=1.0):
     @inherit_doc
     class StreamingLogisticRegressionWithSGD(StreamingLinearAlgorithm):
         """
    -    Run LogisticRegression with SGD on a batch of data.
    -
    -    The weights obtained at the end of training a stream are used as initial
    -    weights for the next batch.
    -
    -    :param stepSize: Step size for each iteration of gradient descent.
    -    :param numIterations: Number of iterations run for each batch of data.
    -    :param miniBatchFraction: Fraction of data on which SGD is run for each
    -                              iteration.
    -    :param regParam: L2 Regularization parameter.
    -    :param convergenceTol: A condition which decides iteration termination.
    +    Train or predict a logistic regression model on streaming data. Training uses
    +    Stochastic Gradient Descent to update the model based on each new batch of
    +    incoming data from a DStream.
    +
    +    Each batch of data is assumed to be an RDD of LabeledPoints.
    +    The number of data points per batch can vary, but the number
    +    of features must be constant. An initial weight
    +    vector must be provided.
    +
    +    :param stepSize:          Step size for each iteration of gradient descent.
    --- End diff --
    
    We shouldn't vertically align the parameter descriptions. If we later add a parameter with a longer name, we would have to re-align every line. There are two options:
    
    ~~~
    :param stepSize: Step size for each iteration of gradient descent.
    ~~~
    
    or 
    
    ~~~
    :param stepSize:
      Step size for each iteration of gradient descent.
    ~~~
    
    I think the latter is better because it isn't affected by the length of the parameter name.
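
For illustration, here is a minimal sketch of the second option applied to several parameters. The function `train_sketch` and the helper `description_offsets` are hypothetical and are not part of PySpark; the parameter descriptions are taken from the docstring in the diff above. The point is only that every description sits at the same offset below its `:param` line, whatever the length of the name.

```python
def train_sketch(stepSize=0.1, numIterations=50, convergenceTol=0.001):
    """
    Hypothetical stand-in; only the docstring layout matters here.

    :param stepSize:
      Step size for each iteration of gradient descent.
    :param numIterations:
      Number of iterations run for each batch of data.
    :param convergenceTol:
      A condition which decides iteration termination.
    """
    return stepSize, numIterations, convergenceTol


def description_offsets(doc):
    """Return the set of indentation offsets between each ':param' line
    and the description line that follows it."""
    lines = doc.splitlines()
    offsets = set()
    for current, following in zip(lines, lines[1:]):
        if current.strip().startswith(":param"):
            param_indent = len(current) - len(current.lstrip())
            desc_indent = len(following) - len(following.lstrip())
            offsets.add(desc_indent - param_indent)
    return offsets


# Every description is indented by the same amount relative to its
# ':param' line, so adding a longer name later touches no other line.
offsets = description_offsets(train_sketch.__doc__)
```

With this layout, adding a parameter with a much longer name only adds two new lines to the docstring and leaves the existing ones untouched.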



