[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-25 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-56876698
  
Hi, I addressed the recent review comments and merged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-56876846
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20819/consoleFull)
 for   PR 2347 at commit 
[`bd49701`](https://github.com/apache/spark/commit/bd49701e2bf4e4a04c85f9786d9319d56e8a44e8).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-56886274
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20819/





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-56886264
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20819/consoleFull)
 for   PR 2347 at commit 
[`bd49701`](https://github.com/apache/spark/commit/bd49701e2bf4e4a04c85f9786d9319d56e8a44e8).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2347





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-25 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-56898211
  
LGTM. Merged into master. What's your username on JIRA? I'll assign the 
JIRA to you. Thanks!





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-25 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-56903523
  
Great, thanks. My username is 'staple'; it looks like you already assigned it 
to me.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55701824
  
this is ok to test





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55702131
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20372/consoleFull)
 for   PR 2347 at commit 
[`03d0e2f`](https://github.com/apache/spark/commit/03d0e2fb2cf38053cfb2344dc668b442db79f28f).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55708415
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20372/consoleFull)
 for   PR 2347 at commit 
[`03d0e2f`](https://github.com/apache/spark/commit/03d0e2fb2cf38053cfb2344dc668b442db79f28f).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55747592
  
Hi, per the discussion in https://github.com/apache/spark/pull/2362 the 
plan is to continue caching before deserialization from python rather than 
after, in order to minimize the cached rdd memory footprint.

This means that, without further work, warning messages will be logged for 
every python mllib regression and kmeans run. I added a patch that suppresses 
these warning messages during python runs in a way that I think is fairly 
unobtrusive. Please let me know what you think.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55747882
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20389/consoleFull)
 for   PR 2347 at commit 
[`9bed1fd`](https://github.com/apache/spark/commit/9bed1fda7888c692063de0ea33e739242229d4a1).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55758803
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20389/consoleFull)
 for   PR 2347 at commit 
[`9bed1fd`](https://github.com/apache/spark/commit/9bed1fda7888c692063de0ea33e739242229d4a1).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-15 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55685703
  
test this please





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-12 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55374976
  
@davies It is hard to tell whether we already have fast access to the input 
RDD. Forcing caching may cause problems, e.g.,

1. evicting other RDDs that are already cached,
2. using too much memory if the input data is large but could be 
generated from a small RDD.
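
A minimal sketch of the first point, assuming a fixed-capacity store that evicts least-recently-used entries. `ToyBlockStore`, the block names, and the sizes are hypothetical and only illustrate the concern; Spark's actual block management is more involved and is not modeled here.

```python
from collections import OrderedDict


class ToyBlockStore:
    """Hypothetical fixed-capacity store that evicts the oldest entries first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block name -> size, oldest first

    def used(self):
        return sum(self.blocks.values())

    def put(self, name, size):
        """Store a block, evicting older blocks until it fits.

        Returns the names of the blocks that were evicted.
        """
        evicted = []
        if size > self.capacity:
            return evicted  # too large to store at all; nothing is evicted
        while self.used() + size > self.capacity:
            old_name, _ = self.blocks.popitem(last=False)
            evicted.append(old_name)
        self.blocks[name] = size
        return evicted


store = ToyBlockStore(capacity=100)
store.put("user_features", 60)                 # cached deliberately by the user
evicted = store.put("training_input", 70)      # force-cached by the library
```

In this toy run, force-caching `training_input` pushes out `user_features`, which the user had cached on purpose. That is the kind of side effect a warning avoids and automatic caching would not.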





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-11 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55308192
  
Is it possible to cache the RDD automatically instead of showing a 
warning, if caching is always helpful?





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-11 Thread staple
Github user staple commented on a diff in the pull request:

https://github.com/apache/spark/pull/2347#discussion_r17430431
  
--- Diff: docs/mllib-linear-methods.md ---
@@ -470,7 +471,7 @@ public class LinearRegression {
 }
   }
 );
-JavaRDD<Object> MSE = new JavaDoubleRDD(valuesAndPreds.map(
+double MSE = new JavaDoubleRDD(valuesAndPreds.map(
--- End diff --

:) Thanks





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-11 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55288257
  
Hi, I made the requested comment changes. I also filed a separate PR for 
the caching changes: #2362





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread staple
GitHub user staple opened a pull request:

https://github.com/apache/spark/pull/2347

[SPARK-1484][MLLIB] Warn when running an iterative algorithm on uncached 
data.

Add warnings to KMeans, GeneralizedLinearAlgorithm, and computeSVD when 
called with input data that is not cached. KMeans is implemented iteratively, 
and I believe that GeneralizedLinearAlgorithm’s current optimizers are 
iterative and its future optimizers are also likely to be iterative. 
RowMatrix’s computeSVD is iterative against an RDD when run in DistARPACK 
mode. ALS and DecisionTree are iterative as well, but they implement RDD 
caching internally so do not require a warning.
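
A minimal, Spark-free sketch of the check described above, in Python for illustration only. `ToyRDD`, `run_iterative`, the storage-level strings, and the warning text are hypothetical stand-ins; the patch itself compares `data.getStorageLevel` with `StorageLevel.NONE` in Scala, as the review diffs in this thread show.

```python
import logging

logger = logging.getLogger("toy_mllib")

# Hypothetical stand-ins for Spark storage levels.
NONE = "NONE"
MEMORY_ONLY = "MEMORY_ONLY"


class ToyRDD:
    """Hypothetical stand-in for an RDD that only tracks its storage level."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.storage_level = NONE

    def cache(self):
        self.storage_level = MEMORY_ONLY
        return self


def run_iterative(data, iterations=3):
    """Sum the rows `iterations` times, warning first if the input is uncached."""
    if data.storage_level == NONE:
        # Illustrative wording, not the message used in the patch.
        logger.warning("The input data is not cached, which may hurt performance.")
    total = 0
    for _ in range(iterations):
        total += sum(data.rows)  # each pass re-reads the input
    return total
```

Calling `run_iterative(ToyRDD([1, 2, 3]))` logs one warning, while `run_iterative(ToyRDD([1, 2, 3]).cache())` logs none; the result is the same either way, since the check only affects logging.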

I added a warning to GeneralizedLinearAlgorithm rather than inside its 
optimizers themselves, where the iteration actually occurs, because internally 
GeneralizedLinearAlgorithm maps its input data to an uncached RDD before 
passing it to an optimizer. (In other words, the warning would be printed for 
every GeneralizedLinearAlgorithm run, regardless of whether its input is 
cached, if the warning were in GradientDescent or other optimizer.) I assume 
that use of an uncached RDD by GeneralizedLinearAlgorithm is intentional, and 
that the mapping there (adding label, intercepts and scaling) is a lightweight 
operation. Arguably a user calling an optimizer such as GradientDescent will be 
knowledgeable enough to cache their data without needing a log warning, so lack 
of a warning in the optimizers may be ok.

This patch causes all calls to GeneralizedLinearAlgorithm from Python to 
print a warning, because the implementation in 
PythonMLLibAPI.trainRegressionModel deserializes the data from python using 
map(SerDe.deserializeLabeledPoint) to create a deserialized RDD without caching 
this new RDD. This means that deserialization must occur on every training 
iteration for RDDs originating in Python. Perhaps the python cache() call from 
_regression_train_wrapper / _get_unmangled_labeled_point_rdd should be moved to 
be after deserialization instead of before serialization. There is a similar 
issue in KMeans.
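
The cost described in the paragraph above can be sketched with a toy lazy dataset, written in plain Python with no Spark dependency. `LazyRows`, `deserialize`, `train`, and `count_deserializations` are hypothetical names for illustration; note that this toy `cache()` materializes eagerly, unlike Spark's lazy caching.

```python
class LazyRows:
    """Hypothetical lazy dataset: re-applies `fn` on every pass unless cached."""

    def __init__(self, source, fn=None):
        self.source = source  # a list, or another LazyRows
        self.fn = fn
        self._cached = None

    def map(self, fn):
        return LazyRows(self, fn)

    def cache(self):
        # Eager for simplicity; Spark would materialize on first use.
        self._cached = list(self._compute())
        return self

    def _compute(self):
        for row in self.source:
            yield row if self.fn is None else self.fn(row)

    def __iter__(self):
        return iter(self._cached) if self._cached is not None else self._compute()


calls = {"n": 0}


def deserialize(raw):
    calls["n"] += 1  # count how often a row is deserialized
    return float(raw)


def train(rows, iterations):
    """Stand-in for an iterative algorithm: one full pass per iteration."""
    return sum(sum(rows) for _ in range(iterations))


def count_deserializations(cache_after, iterations=5):
    calls["n"] = 0
    raw = LazyRows(["1", "2", "3"]).cache()  # cached in serialized form
    rows = raw.map(deserialize)              # mapped result is uncached
    if cache_after:
        rows = rows.cache()                  # cache the deserialized rows instead
    train(rows, iterations)
    return calls["n"]
```

With three rows and five iterations, caching only before the map deserializes every row on every pass, while caching after the map deserializes each row once. The trade-off is memory: the deserialized form may be larger, which is the footprint concern raised elsewhere in this thread.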

Some of the documentation examples making use of these iterative algorithms 
did not cache their training RDDs (while others did). I updated the examples to 
always cache. I also fixed some (unrelated) minor errors in the documentation 
examples.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/staple/spark SPARK-1484

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2347.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2347


commit 7b31102b3ad68e821a21a31ab3e49fe069c98e9e
Author: Aaron Staple aaron.sta...@gmail.com
Date:   2014-09-10T14:18:17Z

Minor doc example fixes.

commit bc90b68094c32678aa41fd65756105f9d3dd414b
Author: Aaron Staple aaron.sta...@gmail.com
Date:   2014-09-10T14:19:58Z

[SPARK-1484][MLLIB] Warn when running an iterative algorithm on uncached 
data.







[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55138304
  
See above where I describe how, for python RDDs, the input data is 
automatically cached and then deserialized via a map to an uncached RDD, 
requiring deserialization of every row for every training iteration. Would it 
make sense to change this to cache after deserializing instead of before? If so 
I can file a new ticket and PR.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55145287
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55162937
  
Sure, I changed the warning message text as you suggested.

Do you think the deserialization mapping in the python RDDs I described is 
ok (a lightweight operation)? If so, I imagine it would be a problem for the 
warning message to always be printed when Python is used.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/2347#discussion_r17388045
  
--- Diff: docs/mllib-linear-methods.md ---
@@ -470,7 +471,7 @@ public class LinearRegression {
 }
   }
 );
-JavaRDD<Object> MSE = new JavaDoubleRDD(valuesAndPreds.map(
+double MSE = new JavaDoubleRDD(valuesAndPreds.map(
--- End diff --

Nice catch!





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/2347#discussion_r17388049
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -117,6 +118,13 @@ class KMeans private (
* performance, because this is an iterative algorithm.
*/
   def run(data: RDD[Vector]): KMeansModel = {
+
+if (data.getStorageLevel == StorageLevel.NONE) {
+  // Warn when running an iterative algorithm on uncached data. SPARK-1484
--- End diff --

It should be okay if we remove this comment.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/2347#discussion_r17388072
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala ---
@@ -256,6 +262,11 @@ class RowMatrix(
   logWarning(s"Requested $k singular values but only found $sk nonzeros.")
 }
 
+if (computeMode == SVDMode.DistARPACK && rows.getStorageLevel == StorageLevel.NONE) {
--- End diff --

ditto: add a comment





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/2347#discussion_r17388066
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -125,6 +133,11 @@ class KMeans private (
 }
 val model = runBreeze(breezeData)
 norms.unpersist()
+
+if (data.getStorageLevel == StorageLevel.NONE) {
--- End diff --

Please add a comment explaining why we want to output this warning message 
twice.





[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-10 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/2347#issuecomment-55181535
  
@staple For Python, I think caching on the JVM side is good. The only thing 
we need to take care of is that NaiveBayes and DecisionTree don't need 
caching.

