[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-03-04 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-77221980
  
That's correct: element i should have the error/loss for the ensemble 
containing trees {0, 1, ..., i}.
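
To make the indexing concrete, here is a minimal sketch in plain Scala with no Spark dependency. The `Tree` type alias, the `weights` argument, and the choice of mean squared error are assumptions made for illustration; this is not the MLlib API.

```scala
object EvaluateEachIterationSketch {
  // Hypothetical stand-in for a tree: maps a feature vector to a prediction.
  type Tree = Array[Double] => Double

  /** Element i is the mean squared error of the ensemble made of trees 0..i. */
  def evaluateEachIteration(
      data: Seq[(Double, Array[Double])], // (label, features)
      trees: Seq[Tree],
      weights: Seq[Double]): Array[Double] = {
    // Running sum of weighted tree predictions, one entry per data point.
    val cumulative = Array.fill(data.length)(0.0)
    trees.zip(weights).map { case (tree, w) =>
      var sumSq = 0.0
      data.zipWithIndex.foreach { case ((label, features), j) =>
        // Add only the newest tree's contribution to the cached sum.
        cumulative(j) += w * tree(features)
        val diff = label - cumulative(j)
        sumSq += diff * diff
      }
      sumSq / data.length
    }.toArray
  }
}
```

Because the running sum is kept between iterations, each tree is evaluated once per data point rather than once per ensemble prefix.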


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-03-03 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-77105855
  
Yes, but each element of the array corresponds to the error/loss at each 
iteration, right?





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-03-03 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-77099027
  
@MechCoder No problem; sorry I didn't make the JIRAs clearer! Calling it 
`errorPerIteration` sounds OK unless we allow users to pass in evaluators, 
in which case the evaluator might be something new like Accuracy, which isn't an 
"error" metric. I'd still vote for `evaluateEachIteration` in case we allow this 
later on.
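
As a rough illustration of the naming concern, the sketch below treats the evaluator as a plain function. The `Evaluator` type, the two sample evaluators, and the pre-computed `predictionsPerIteration` input are all hypothetical and are not part of MLlib.

```scala
object EvaluatorSketch {
  // Hypothetical evaluator: (labels, predictions) => score.
  type Evaluator = (Seq[Double], Seq[Double]) => Double

  // An "error" metric: lower is better.
  val meanSquaredError: Evaluator = (labels, preds) =>
    labels.zip(preds).map { case (y, p) => (y - p) * (y - p) }.sum / labels.length

  // Not an "error" metric: higher is better.
  // Assumes labels in {0.0, 1.0} and thresholds predictions at 0.5.
  val accuracy: Evaluator = (labels, preds) =>
    labels.zip(preds).count { case (y, p) =>
      (if (p >= 0.5) 1.0 else 0.0) == y
    }.toDouble / labels.length

  /** One score per iteration, using whichever evaluator the caller supplies. */
  def evaluateEachIteration(
      labels: Seq[Double],
      predictionsPerIteration: Seq[Seq[Double]],
      evaluator: Evaluator = meanSquaredError): Array[Double] =
    predictionsPerIteration.map(preds => evaluator(labels, preds)).toArray
}
```

With `accuracy` passed in, the returned values grow as the model improves, so a name containing "error" would be misleading for that case.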





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-03-03 Thread MechCoder
Github user MechCoder closed the pull request at:

https://github.com/apache/spark/pull/4819





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-03-03 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-77003845
  
@jkbradley Just one quick clarification, please.

When you say that `evaluateEachIteration` should return an Array of Doubles, do 
you mean that each element corresponds to the cumulative error per iteration 
(i.e. tree)? In that case, how does the name `errorPerIteration` sound?







[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-03-03 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-76990366
  
Ouch. I just realised what you meant. Scratch my previous couple of 
comments. :/





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-03-02 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-76779548
  
@MechCoder  I had intended to use this internally and to expose a public 
method.  (The "evaluateEachIteration" method was the public one, but feel free 
to think of a better name.)  Yes, the evaluator was the loss metric, which 
should probably be an optional parameter (defaulting to the training metric).

* [https://issues.apache.org/jira/browse/SPARK-6025]: This is the JIRA for 
the public method.
* [https://issues.apache.org/jira/browse/SPARK-5972]: This is the JIRA for 
the internal optimization.

I'm OK with combining the two JIRAs in one PR since they are closely related. 
For the internal optimization, the "residual" to store is not really the 
residual but rather the cumulative prediction of the ensemble; that in turn can 
be used to compute both the gradient and the error. (Note it will be important 
to use the cached residual for computing the gradient, not just the objective.) 
That may require adding some internal API to ensembles to permit prediction 
from a pre-computed sum of trees' predictions.
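
A minimal sketch of that idea, in plain Scala rather than on an RDD: the cached value per point is the ensemble's cumulative prediction, and both the gradient and the loss are derived from it. The `Point` and `Tree` types, the `step` helper, and the use of squared-error loss are assumptions for illustration only.

```scala
object CachedPredictionSketch {
  final case class Point(label: Double, features: Array[Double])
  // Hypothetical stand-in for a tree: maps a feature vector to a prediction.
  type Tree = Array[Double] => Double

  /**
   * One boosting step over cached state.
   * Returns (updated cumulative predictions, gradient per point, mean loss),
   * using squared-error loss L = (F - y)^2 with gradient dL/dF = 2 (F - y).
   */
  def step(
      points: Seq[Point],
      cached: Seq[Double],
      newTree: Tree,
      weight: Double): (Seq[Double], Seq[Double], Double) = {
    // Only the newest tree is evaluated; earlier trees live in `cached`.
    val updated = points.zip(cached).map { case (p, f) =>
      f + weight * newTree(p.features)
    }
    val gradient = points.zip(updated).map { case (p, f) => 2.0 * (f - p.label) }
    val loss = points.zip(updated).map { case (p, f) =>
      (f - p.label) * (f - p.label)
    }.sum / points.length
    (updated, gradient, loss)
  }
}
```

Feeding `updated` back in as `cached` on the next step keeps the per-iteration cost independent of the number of trees already in the ensemble.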





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-02-28 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-76527936
  
Also, the present code is unoptimized since it makes two passes over the 
data RDD: one to update the residual, and the other to calculate the error. But 
that can be taken care of after we discuss the design.
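
For reference, the two passes could in principle be fused into one traversal. The sketch below does this on local collections, not on an RDD; `updateAndEvaluate`, its parameters, and the squared-error metric are hypothetical names and choices used only to show the shape of the change.

```scala
object SinglePassSketch {
  /**
   * Updates the cached value for every point and accumulates the squared
   * error in the same traversal.
   * Returns (updated cached values, mean squared error).
   */
  def updateAndEvaluate(
      labels: IndexedSeq[Double],
      cached: IndexedSeq[Double],
      treePreds: IndexedSeq[Double], // newest tree's prediction per point
      weight: Double): (Vector[Double], Double) = {
    val (updated, sumSq) =
      labels.indices.foldLeft((Vector.empty[Double], 0.0)) {
        case ((acc, err), i) =>
          val f = cached(i) + weight * treePreds(i)
          val d = labels(i) - f
          (acc :+ f, err + d * d)
      }
    (updated, sumSq / labels.length)
  }
}
```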





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-02-28 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-76527811
  
@jkbradley I am assuming that this is what you intended. It works but I'm 
not sure about the present design, which differs from the design that you had 
posted in the JIRA.

`def evaluateEachIteration(data: RDD[LabeledPoint], evaluator): Array[Double]`

I am not sure how this would work if the existing residual is not passed in. 
Could you also say what the returned Array[Double] is supposed to contain?





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-02-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-76511909
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28110/





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-02-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-76511907
  
[Test build #28110 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28110/consoleFull) for PR 4819 at commit [`7d4ed48`](https://github.com/apache/spark/commit/7d4ed483e0a0c58669ab00421d00eecda832cfba).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-6025] [MLlib] Add helper method to effi...

2015-02-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4819#issuecomment-76509618
  
[Test build #28110 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28110/consoleFull) for PR 4819 at commit [`7d4ed48`](https://github.com/apache/spark/commit/7d4ed483e0a0c58669ab00421d00eecda832cfba).
 * This patch merges cleanly.

