[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-23 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-221126837
  
cc @jkbradley ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-19 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-220426013
  
Personally I'm in favor of #2 or #3 since they would both give us nice 
looking documentation on par with that of the Scala side.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-18 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-220219840
  
@jkbradley @yanboliang @holdenk @sethah let's discuss the issue of defaults 
in param doc (refer 
https://github.com/apache/spark/pull/13148#discussion_r63600571) on this PR 
since it is pertinent.

Here, Holden raises 2 issues:
1. The Scaladoc contains default values for many params (sometimes in 
shared traits). In addition the Scala `Param` itself has the self-contained 
`doc` field (typically not containing defaults, since the built-in doc shows 
current and default in `explainParam`).
2. The PyDoc only contains the `Param` `doc` field.

(By the way, (1) implies that in cases where the default param value in the 
trait is overridden, the Scaladoc is incorrect, but that is another issue).

The result of (2) is that the HTML API doc doesn't look great, e.g.

https://cloud.githubusercontent.com/assets/1036807/15381231/0a937dde-1d7e-11e6-885c-b120679f84ee.png;>

Also, nowhere in the PyDoc are the defaults listed, while in the Scaladoc 
they are.

I agree that it would be nice to have the defaults listed in the PyDoc in 
some way.
1. One solution is the original approach here, where defaults are put in 
the Param doc in a standard way, but stripped out during `explainParams`. This 
works but IMO is more prone to breaking in future if people forget to do things 
in exactly the correct format. It also doesn't directly solve the problem of 
the API doc looking ugly; 
2. Another solution is the current approach here, where the attributes are 
turned into properties with a docstring (possibly including the default) - this 
does solve the problem of nice display in the API doc. The downside here is the 
potentially fairly large change to make everything a property, and the code 
duplication introduced (though kept to a minimum) and extra boilerplate when 
adding new params that could be more error-prone;
3. A third solution is what I've done 
[here](https://github.com/mlnick/spark/tree/sphinx-doc-params) as a PoC, which 
basically adds the built-in doc as the instance docstring for each Python 
`Param`. Then we override the `AttributeDocumenter` in Sphinx to handle it. The 
result displays nicely in the API doc (the same as the property approach, but 
no defaults are added). The other thing that changes is the `__init__` 
docstring is brought back (for some reason the current docs are not showing 
that), which means that the defaults are essentially documented there for each 
class. In a way this seems more "Pythonic" to me (i.e. Python users are 
accustomed to seeing the default arg values in constructer doc, e.g. 
sciki-learn).
4. Another option is to do nothing (for now at least), except bring back 
the `__init__` docstring. This keeps the ugly-looking `Param` doc, but at least 
shows the default args for each class, and is the current behavior. We can do 
something like (1) or (3) later (but maybe not (2) during Spark 2.x as it may 
be too large a change).
5. A final option is to perhaps document defaults elsewhere (such as the 
setter for the param which is usually implemented in the class or a model trait 
in Scala).

Let's decide on an approach and make it consistent across the board.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-18 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-220117199
  
@yanboliang thanks for pointing that out. I'll continue some discussion 
over there (tl;dr - imho PyDoc is also a thing that matters).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-18 Thread yanboliang
Github user yanboliang commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-220070510
  
Refer @jkbradley 's comments for #13148 which is related to default value 
statement in param build-in doc: 
```
No need to state default in built-in doc. Scala doc is OK. Built-in doc is 
only displayed by explainParams, which reads from the defaultParamMap already.
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-11 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218547004
  
@MLnick what do you think of this approach?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218234260
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218234267
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58244/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218234130
  
**[Test build #58244 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58244/consoleFull)**
 for PR 12914 at commit 
[`27d351d`](https://github.com/apache/spark/commit/27d351d501a62eb7dc9077d5aa3237d3b4c546bc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218230292
  
**[Test build #58244 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58244/consoleFull)**
 for PR 12914 at commit 
[`27d351d`](https://github.com/apache/spark/commit/27d351d501a62eb7dc9077d5aa3237d3b4c546bc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-10 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218229038
  
I gave it some more thought over coffee - I can remove much of the 
duplication if we want.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-10 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218187728
  
Darn - it would have been a really simple solution.

I'm not against having them as properties - the issue is the duplication 
you see, e.g.
```
_maxIter = Param(Params._dummy(), "maxIter", "max number of iterations (>= 
0).", typeConverter=TypeConverters.toInt)
maxIter = property(lambda x: x._maxIter, doc="Param maxIter, max number of 
iterations (>= 0).")
```

Not sure if @jkbradley / @davies has other thoughts or ideas on the 
property issue? It just seems like a large-ish change for perhaps not that much 
benefit.

The other option is just to restore the previous doc behavior we had of 
including the `_init_` doc string in the class doc. I'm not sure if a version 
change of sphinx changed default behavior, but it appears we can restore it by 
adding this line to `conf.py`: `autoclass_content = 'both'`.

It's not ideal but it takes us back to what we had before, and at least the 
defaults for each param are fairly clearly documented in that way.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-10 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-218189988
  
Or we could go back to the first solution I proposed in this PR which
duplicates the values when someone calls explainParam but is otherwise a
small change without any duplication.

On Tuesday, May 10, 2016, Nick Pentreath  wrote:

> Darn - it would have been a really simple solution.
>
> I'm not against having them as properties - the issue is the duplication
> you see, e.g.
>
> _maxIter = Param(Params._dummy(), "maxIter", "max number of iterations 
(>= 0).", typeConverter=TypeConverters.toInt)
> maxIter = property(lambda x: x._maxIter, doc="Param maxIter, max number 
of iterations (>= 0).")
>
> Not sure if @jkbradley  / @davies
>  has other thoughts or ideas on the property
> issue? It just seems like a large-ish change for perhaps not that much
> benefit.
>
> The other option is just to restore the previous doc behavior we had of
> including the _init_ doc string in the class doc. I'm not sure if a
> version change of sphinx changed default behavior, but it appears we can
> restore it by adding this line to conf.py: autoclass_content = 'both'.
>
> It's not ideal but it takes us back to what we had before, and at least
> the defaults for each param are fairly clearly documented in that way.
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly or view it on GitHub
> 
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217951794
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217951795
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58154/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217951605
  
**[Test build #58154 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58154/consoleFull)**
 for PR 12914 at commit 
[`5e89a85`](https://github.com/apache/spark/commit/5e89a853ace5c1a0efff714c64c12213aca195a7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217947835
  
**[Test build #58154 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58154/consoleFull)**
 for PR 12914 at commit 
[`5e89a85`](https://github.com/apache/spark/commit/5e89a853ace5c1a0efff714c64c12213aca195a7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217938714
  
We could change these to be properties so they have proper docstrings but 
technically that would be an API breaking change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217933755
  
@MLnick Turns out thats a sphinx extension since they don't have proper 
docstrings so it isn't inherited and we can't really do much about that so I 
don't think that solution is going to work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217919508
  
seems like that doesn't work with inheritance out of the box, I'll do some 
poking.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217913130
  
@MLnick - That seems like a good solution without needing any regex. It 
does look a bit odd in the generated docs but all of the current params look a 
bit odd in the generated docs anyways.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-09 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217875812
  
@holdenk I'd prefer for us to find a way to document the default value in 
the API docs without resorting to stripping it out with a regex.

This works for example:
```
#: alpha for implicit preference. Default: 10
alpha = Param(Params._dummy(), "alpha", "alpha for implicit preference",
  typeConverter=TypeConverters.toFloat)
```
However I agree the result doesn't look ideal:
https://cloud.githubusercontent.com/assets/1036807/15114579/e34fb858-15fa-11e6-810b-d77504006a4b.png;>

But surely there must be a way to get it to look better using something 
like this approach?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-08 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217755456
  
Any more ideas on if this is something we want (cc @davies ?)? This one 
only does shared params so I'd like to follow it up for the non-shared params 
as well. I think having the default values in the API docs is pretty useful (we 
include them in the Scaladoc for a reason).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217348813
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57961/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217348812
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217348761
  
**[Test build #57961 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57961/consoleFull)**
 for PR 12914 at commit 
[`602c5ca`](https://github.com/apache/spark/commit/602c5ca591aca6b049e5729bf9a718ddcb31fbeb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217348026
  
**[Test build #57961 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57961/consoleFull)**
 for PR 12914 at commit 
[`602c5ca`](https://github.com/apache/spark/commit/602c5ca591aca6b049e5729bf9a718ddcb31fbeb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217344576
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217344567
  
**[Test build #57958 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57958/consoleFull)**
 for PR 12914 at commit 
[`4fffa65`](https://github.com/apache/spark/commit/4fffa65d3b1616293d5cff27cc6c951b9db53ef5).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217344578
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57958/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217343853
  
**[Test build #57958 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57958/consoleFull)**
 for PR 12914 at commit 
[`4fffa65`](https://github.com/apache/spark/commit/4fffa65d3b1616293d5cff27cc6c951b9db53ef5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/12914#discussion_r62286043
  
--- Diff: python/pyspark/ml/param/_shared_params_code_gen.py ---
@@ -58,6 +61,9 @@ def __init__(self):
 if defaultValueStr is not None:
 template += '''
 self._setDefault($name=$defaultValueStr)'''
+doc += '''(Default $friendlyDefault)'''
+if friendlyDefault is None:
+friendlyDefault = defaultValueStr
--- End diff --

I'm ok with getting rid of the friendly default string.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/12914#discussion_r62285930
  
--- Diff: python/pyspark/ml/param/_shared_params_code_gen.py ---
@@ -58,6 +61,9 @@ def __init__(self):
 if defaultValueStr is not None:
 template += '''
 self._setDefault($name=$defaultValueStr)'''
+doc += '''(Default $friendlyDefault)'''
+if friendlyDefault is None:
+friendlyDefault = defaultValueStr
--- End diff --

Except for ```seed```, other params have the same ```defaultValueStr``` and 
```friendlyDefault```. I don't think it's necessary to use two variables 
because ```defaultValueStr``` is already string type and can be used in the doc.
For param ```seed```, I think ```hash(type(self).__name__)``` is more clear 
than ```hash of type name``` for users to understand the default value of this 
param.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/12914#discussion_r62285684
  
--- Diff: python/pyspark/ml/param/__init__.py ---
@@ -280,7 +281,9 @@ def explainParam(self, param):
 else:
 values.append("undefined")
 valueStr = "(" + ", ".join(values) + ")"
-return "%s: %s %s" % (param.name, param.doc, valueStr)
+# Remove the default value from the doc string
+docStr = re.sub(r'\(Default [\w\-\s]+?\)\Z', '', param.doc)
+return "%s: %s %s" % (param.name, docStr, valueStr)
--- End diff --

So this filters the auto generated ones that get added. For other strings I 
think its up to the creator of the param to follow the format if they want the 
standard behaviour. I don't think we need to catch all of them this is just for 
nicer display during explain.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/12914#discussion_r62285526
  
--- Diff: python/pyspark/ml/param/__init__.py ---
@@ -280,7 +281,9 @@ def explainParam(self, param):
 else:
 values.append("undefined")
 valueStr = "(" + ", ".join(values) + ")"
-return "%s: %s %s" % (param.name, param.doc, valueStr)
+# Remove the default value from the doc string
+docStr = re.sub(r'\(Default [\w\-\s]+?\)\Z', '', param.doc)
+return "%s: %s %s" % (param.name, docStr, valueStr)
--- End diff --

Further more, I don't think using regex is an elegant way to remove the 
default value in users specified docs. Users can use various words to express 
```default``` that we can not filter them completely.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/12914#discussion_r62285048
  
--- Diff: python/pyspark/ml/param/__init__.py ---
@@ -280,7 +281,9 @@ def explainParam(self, param):
 else:
 values.append("undefined")
 valueStr = "(" + ", ".join(values) + ")"
-return "%s: %s %s" % (param.name, param.doc, valueStr)
+# Remove the default value from the doc string
+docStr = re.sub(r'\(Default [\w\-\s]+?\)\Z', '', param.doc)
+return "%s: %s %s" % (param.name, docStr, valueStr)
--- End diff --

The regex does not take effect on my local test:
```
>>> import re
>>> doc = "StorageLevel for ALS model factors. Default: 'MEMORY_AND_DISK'"
>>> re.sub(r'\(Default [\w\-\s]+?\)\Z', '', doc)
"StorageLevel for ALS model factors. Default: 'MEMORY_AND_DISK'"
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217320267
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57938/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217320265
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217320135
  
**[Test build #57938 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57938/consoleFull)**
 for PR 12914 at commit 
[`6960997`](https://github.com/apache/spark/commit/6960997c496725b928bfea5b577431837f62de55).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class GetExternalRowField(`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217318620
  
**[Test build #57938 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57938/consoleFull)**
 for PR 12914 at commit 
[`6960997`](https://github.com/apache/spark/commit/6960997c496725b928bfea5b577431837f62de55).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217295137
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217295140
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57926/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217295109
  
**[Test build #57926 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57926/consoleFull)**
 for PR 12914 at commit 
[`3b81799`](https://github.com/apache/spark/commit/3b8179978e7ad802d93cd2fd856339fefa9c5b5a).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217292797
  
**[Test build #57926 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57926/consoleFull)**
 for PR 12914 at commit 
[`3b81799`](https://github.com/apache/spark/commit/3b8179978e7ad802d93cd2fd856339fefa9c5b5a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217291600
  
MiMa failure is clearly unrelated, jenkins retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread sethah
Github user sethah commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217285101
  
I'm not sure what changes went into python docs, but it seems that from 1.6 
to now we made a change that does NOT show the constructor for Python classes, 
and hence does not show the default value anywhere. I like @holdenk's 
suggestion to remove it from explain param. I don't really see another way.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217249358
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217249307
  
**[Test build #57910 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57910/consoleFull)**
 for PR 12914 at commit 
[`3b81799`](https://github.com/apache/spark/commit/3b8179978e7ad802d93cd2fd856339fefa9c5b5a).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217249363
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57910/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217245973
  
**[Test build #57910 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57910/consoleFull)**
 for PR 12914 at commit 
[`3b81799`](https://github.com/apache/spark/commit/3b8179978e7ad802d93cd2fd856339fefa9c5b5a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217204458
  
@MLnick I could change the `explainParam` call to strip out the existing 
default so it shows up in both the HTML docs and looks reasonable in 
`explainParam`/`explainParams`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217134578
  
@holdenk one potential issue that I've found with including the default 
value in the param "help" is that it then gets duplicated if users call 
`explainParam`/`explainParams`, e.g.

```
In [44]: als = ALS().setAlpha(10)
In [45]: print als.explainParams()
alpha: alpha for implicit preference (default: 1.0, current: 10.0)
checkpointInterval: set checkpoint interval (>= 1) or disable checkpoint 
(-1). E.g. 10 means that the cache will get checkpointed every 10 iterations. 
(default: 10)
finalStorageLevel: StorageLevel for ALS model factors. Default: 
'MEMORY_AND_DISK'. (default: MEMORY_AND_DISK)
```

I've used ALS as an example here where I put the defaults for storage level 
params in the param doc, and you can see it gets duplicated and IMO doesn't 
look great. Though without something like this, the defaults don't go into the 
HTML doc for the param (well they go into the constructor doc so they are there 
somewhere, but I agree it would be better if they lived with the param doc).

Not sure if there is an alternative solution.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-05 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217093446
  
cc @yanboliang 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217037170
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57815/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217037167
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217037091
  
**[Test build #57815 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57815/consoleFull)**
 for PR 12914 at commit 
[`c523674`](https://github.com/apache/spark/commit/c523674477972e7551543e0df283b1af3aa0bf40).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217033932
  
**[Test build #57815 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57815/consoleFull)**
 for PR 12914 at commit 
[`c523674`](https://github.com/apache/spark/commit/c523674477972e7551543e0df283b1af3aa0bf40).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-04 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-217032893
  
cc @sethah & @keypointt 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-04 Thread holdenk
GitHub user holdenk opened a pull request:

https://github.com/apache/spark/pull/12914

[SPARK-15130][PySpark][ML][DOCS] pyspark expose default params in ml 
pipeline

## What changes were proposed in this pull request?

Add default values to shared params in ML pipeline.

## How was this patch tested?

Existing unit tests + built docs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/holdenk/spark 
SPARK-15130-pyspark-expose-default-params-in-ml-pipeline

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12914.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12914


commit 33c8eae6cd969b410550ccebe4b6d465c30ee54b
Author: Holden Karau 
Date:   2016-05-04T22:58:20Z

First pass at including default values in the PyDocs for params in ML 
pipelines API

commit ede6af01405792a2b7e618f40ee63065386c5d85
Author: Holden Karau 
Date:   2016-05-04T23:00:47Z

Python style (long lines)

commit f066f5c4c8db8c8b4f25859601fce564c98d826a
Author: Holden Karau 
Date:   2016-05-04T23:07:22Z

Merge branch 'master' into 
SPARK-15130-pyspark-expose-default-params-in-ml-pipeline




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org