[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/13228#discussion_r64345475
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -1250,6 +1257,20 @@ def getNumBuckets(self):
 """
 return self.getOrDefault(self.numBuckets)
 
+@since("2.0.0")
+def setRelativeError(self, value):
+"""
+Sets the value of :py:attr:`relativeError`.
+"""
+return self._set(relativeError=value)
+
+@since("2.0.0")
+def getRelativeError(self):
+"""
+Gets the value of relativeError or its default value.
--- End diff --

yeah - though it is a large, but actually trivial, change. We could 
probably do it in one pass right at the end.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13228


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13228#discussion_r64344703
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -1250,6 +1257,20 @@ def getNumBuckets(self):
 """
 return self.getOrDefault(self.numBuckets)
 
+@since("2.0.0")
+def setRelativeError(self, value):
+"""
+Sets the value of :py:attr:`relativeError`.
+"""
+return self._set(relativeError=value)
+
+@since("2.0.0")
+def getRelativeError(self):
+"""
+Gets the value of relativeError or its default value.
--- End diff --

I think we should probably document the setters at some point, but perhaps 
2.1 timeframe.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221192896
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221192898
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59188/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221192784
  
**[Test build #59188 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59188/consoleFull)**
 for PR 13228 at commit 
[`6417384`](https://github.com/apache/spark/commit/6417384f3b8a29dadea1089fdccb150a00cf26ad).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221185187
  
**[Test build #59188 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59188/consoleFull)**
 for PR 13228 at commit 
[`6417384`](https://github.com/apache/spark/commit/6417384f3b8a29dadea1089fdccb150a00cf26ad).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-24 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221184857
  
jenkins retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-23 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/13228#discussion_r64270058
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -1250,6 +1257,20 @@ def getNumBuckets(self):
 """
 return self.getOrDefault(self.numBuckets)
 
+@since("2.0.0")
+def setRelativeError(self, value):
+"""
+Sets the value of :py:attr:`relativeError`.
+"""
+return self._set(relativeError=value)
+
+@since("2.0.0")
+def getRelativeError(self):
+"""
+Gets the value of relativeError or its default value.
--- End diff --

@BryanCutler it's a good point - but I've seen that all other docs for the 
getters through py-ml omit the `:py:attr:`. I'm not certain if this is intended 
or some oversight that has become "copy-pasted" as de facto standard :) I for 
one would support updating all of them as it makes sense to me that the doc 
links match the setters. But it then needs to be done for all Python param 
setter methods to be consistent.

cc @jkbradley @holdenk @yanboliang 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-23 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221062894
  
@jkbradley fair enough - I misinterpreted things thinking we were trying to 
nail down further Python parity. I do agree the focus should be on docs / QA. 
As it happens this was trivial to do while working on the doc update :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-23 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221037153
  
Just reviewed, and this LGTM  : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-23 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-221035663
  
I do want us to push for pyspark parity as much as possible, but I made 
those QA JIRAs for Python parity checks to do an audit, not to achieve parity 
for 2.0.  I definitely find it tempting to push as much API parity stuff in as 
possible, but I've gradually become convinced it's better to just focus on QA & 
docs to get the release out the door.  As soon as the release is done, we can 
focus on the newly created Python JIRAs and other new APIs, etc.

But it's your call on this one.  If this is ready, then maybe it's better 
to just go ahead and merge it to avoid wasted work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-220759647
  
@jkbradley this is a missing param on the python side - I noticed it while
reviewing the PR for doc for count vectorizer, hashing tf and quantile
discretizer.

It also contains doc fixes as quantile discretizer doc is incorrect now
that it uses approx quantile.

I thought we were still trying to get as much of pyspark parity for 2.0 as
we can?

Otherwise I can just make the doc changes
On Sat, 21 May 2016 at 00:35, jkbradley  wrote:

> Is this targeted at 2.0? I'd prefer we focus on finishing documentation &
> QA checks before working on new features and APIs.
>
> —
> You are receiving this because you authored the thread.
> Reply to this email directly or view it on GitHub
> 
>



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-220734278
  
Is this targeted at 2.0?  I'd prefer we focus on finishing documentation & 
QA checks before working on new features and APIs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread BryanCutler
Github user BryanCutler commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-220729967
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request:

https://github.com/apache/spark/pull/13228#discussion_r64113992
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -1250,6 +1257,20 @@ def getNumBuckets(self):
 """
 return self.getOrDefault(self.numBuckets)
 
+@since("2.0.0")
+def setRelativeError(self, value):
+"""
+Sets the value of :py:attr:`relativeError`.
+"""
+return self._set(relativeError=value)
+
+@since("2.0.0")
+def getRelativeError(self):
+"""
+Gets the value of relativeError or its default value.
--- End diff --

no biggie, but the setter has a link for the param :py:attr:`relativeError`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-220714826
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-220714827
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59021/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-220714687
  
**[Test build #59021 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59021/consoleFull)**
 for PR 13228 at commit 
[`6417384`](https://github.com/apache/spark/commit/6417384f3b8a29dadea1089fdccb150a00cf26ad).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13228#issuecomment-220705009
  
**[Test build #59021 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59021/consoleFull)**
 for PR 13228 at commit 
[`6417384`](https://github.com/apache/spark/commit/6417384f3b8a29dadea1089fdccb150a00cf26ad).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15442][ML][PYSPARK] Add 'relativeError'...

2016-05-20 Thread MLnick
GitHub user MLnick opened a pull request:

https://github.com/apache/spark/pull/13228

[SPARK-15442][ML][PYSPARK] Add 'relativeError' param to PySpark 
QuantileDiscretizer

This PR adds the `relativeError` param to PySpark's `QuantileDiscretizer` 
to match Scala.

Also cleaned up a duplication of `numBuckets` where the param is both a 
class and instance attribute (I removed the instance attr to match the style of 
params throughout `ml`).

Finally, cleaned up the docs for `QuantileDiscretizer` to reflect that it 
now uses `approxQuantile`.

## How was this patch tested?

A little doctest and built API docs locally to check HTML doc generation.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/MLnick/spark SPARK-15442-py-relerror-param

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13228.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13228


commit 907123b8c13b6bd180821c8752e1e9d282a2a9fb
Author: Nick Pentreath 
Date:   2016-05-20T18:23:34Z

initial work

commit 60339055b11a57eeb25e93ba1429ca6022916609
Author: Nick Pentreath 
Date:   2016-05-20T19:47:26Z

clean up doc

commit 6417384f3b8a29dadea1089fdccb150a00cf26ad
Author: Nick Pentreath 
Date:   2016-05-20T19:57:51Z

amend doctest param value




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org