[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-27 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14663
  
Merged to master


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14663
  
I'll go for this tomorrow if there are no other comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14663
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64288/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14663
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14663
  
**[Test build #64288 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64288/consoleFull)**
 for PR 14663 at commit 
[`e263084`](https://github.com/apache/spark/commit/e263084e794ab508bfa4effa84c39de8acbb6512).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14663
  
**[Test build #64288 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64288/consoleFull)**
 for PR 14663 at commit 
[`e263084`](https://github.com/apache/spark/commit/e263084e794ab508bfa4effa84c39de8acbb6512).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-23 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14663
  
If I understood you correctly @MLnick you favored just adding warnings in 
the doc? I added to three more places that needed it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-22 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/14663
  
Ah right, good point. Actually I realised that the doc in 
`ml.feature.StandardScaler` needs updating for `withMean`:

```
  /**
   * Whether to center the data with mean before scaling.
   * It will build a dense output, so this does not work on sparse input
   * and will raise an exception.
   * Default: false
   * @group param
   */
  val withMean: BooleanParam = new BooleanParam(this, "withMean",
"Whether to center data with mean")
```

... as good a place as any to add a warning about using with sparse input 
(this can then show up in `explainParam` doc as well as ScalaDoc. Python side 
will probably need it too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-22 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14663
  
Warning seems reasonable. I think you'd have to put in a flag to remember 
if the user has been warned in order to avoid spewing millions of them. Worth 
it, you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-22 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/14663
  
As mentioned on the JIRA discussion, I'm neutral on this, though I tend to 
lean towards allowing the user to do what they want even if it might be 
"dangerous". I guess +0?

Though perhaps we may want to explicitly log a `WARN` message if sparse 
input is encountered and `withMean=true`, just so there is a very clear root 
cause message in the logs if things blow up with OOM (it's in the user guide 
and doc string, but still may be missed by many users).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-22 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14663
  
going once, going twice. This would simply let an operation proceed where 
it errored before, at the cost of giving a user a little more rope to hang 
him/herself. I think it unblocks a legitimate and common set of use cases, so I 
think it's worth changing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-19 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14663
  
Another one where I'd welcome comments from ... @holdenk @MLnick @davies et 
al


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14663
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63840/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14663
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14663
  
**[Test build #63840 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63840/consoleFull)**
 for PR 14663 at commit 
[`496a8df`](https://github.com/apache/spark/commit/496a8df562ea755560cdae78a52d079b5cc5ebe8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14663
  
**[Test build #63840 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63840/consoleFull)**
 for PR 14663 at commit 
[`496a8df`](https://github.com/apache/spark/commit/496a8df562ea755560cdae78a52d079b5cc5ebe8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org