[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/760


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604864
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -309,8 +207,10 @@ object MultipleLinearRegression {
   : DataSet[LabeledVector] = {
--- End diff --

Docstring for the return type?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604529
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -87,11 +89,11 @@ import org.apache.flink.ml.pipeline.{FitOperation, 
PredictOperation, Predictor}
   *
   */
 class MultipleLinearRegression extends Predictor[MultipleLinearRegression] 
{
-
+  import org.apache.flink.ml._
--- End diff --

Line 49 typo: iteratinos - iterations


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604106
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/pipeline/Predictor.scala
 ---
@@ -35,7 +35,7 @@ import org.apache.flink.ml.common.{FlinkMLTools, 
ParameterMap, WithParameters}
   *
   * @tparam Self Type of the implementing class
   */
-trait Predictor[Self] extends Estimator[Self] with WithParameters with 
Serializable {
+trait Predictor[Self] extends Estimator[Self] with WithParameters {
--- End diff --

Why is Serializable no longer needed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on the pull request:

https://github.com/apache/flink/pull/760#issuecomment-108254611
  
Looks good, some minor comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31632977
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -87,11 +89,11 @@ import org.apache.flink.ml.pipeline.{FitOperation, 
PredictOperation, Predictor}
   *
   */
 class MultipleLinearRegression extends Predictor[MultipleLinearRegression] 
{
-
+  import org.apache.flink.ml._
--- End diff --

Good catch :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31633196
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -309,8 +207,10 @@ object MultipleLinearRegression {
   : DataSet[LabeledVector] = {
--- End diff --

Good point. Will add it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31632765
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/pipeline/Predictor.scala
 ---
@@ -35,7 +35,7 @@ import org.apache.flink.ml.common.{FlinkMLTools, 
ParameterMap, WithParameters}
   *
   * @tparam Self Type of the implementing class
   */
-trait Predictor[Self] extends Estimator[Self] with WithParameters with 
Serializable {
+trait Predictor[Self] extends Estimator[Self] with WithParameters {
--- End diff --

The `Estimator` types should never be shipped to the TaskManager. Usually 
they contain some state in the form of other `DataSets` which aren't 
serializable anyways. However, the operations have to be `Serializable`, 
because they can contain values which are referenced by the Flink operations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-02 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/760

[FLINK-1993] [ml] Replaces custom SGD in MultipleLinearRegression with 
optimizer's SGD

This PR replaces the custom SGD implementation with the optimization 
framework's SGD implementation.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink replaceSGDInMLR

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/760.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #760


commit 2df0c700237af0456c3770c198ed595fdf205408
Author: Till Rohrmann trohrm...@apache.org
Date:   2015-05-29T16:02:47Z

[FLINK-1993] [ml] Replaces custom SGD logic with optimization framework's 
SGD in MultipleLinearRegression

Fixes PipelineITSuite because of change MLR loss function




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---