[GitHub] spark pull request #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibilit...

2017-05-22 Thread mpjlu
Github user mpjlu commented on a diff in the pull request:

https://github.com/apache/spark/pull/18068#discussion_r117912054
  
--- Diff: python/pyspark/ml/tests.py ---
@@ -1075,7 +1076,8 @@ def test_linear_regression_summary(self):
 pValues = s.pValues
 self.assertTrue(isinstance(pValues, list) and isinstance(pValues[0], float))
 # test evaluation (with training dataset) produces a summary with same values
-# one check is enough to verify a summary is returned, Scala version runs full test
+# one check is enough to verify a summary is returned
+# The child class LinearRegressionTrainingSummary runs full test
--- End diff --

I don't think the reason is that the Scala version runs the full test. Even if
the Scala version runs the full test, we still need the function call test.
If a child class has already done the function call test, we don't need to test
the parent class again.
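
The reasoning above can be sketched with plain Python classes. The names `Summary` and `TrainingSummary` are hypothetical stand-ins, not the actual PySpark classes; the sketch only shows that calling an inherited accessor on a child instance runs the parent's code, which is why one call test may be enough.

```python
import unittest


class Summary:
    """Hypothetical parent summary class exposing a shared accessor."""

    def __init__(self, num_instances):
        self._num_instances = num_instances

    @property
    def numInstances(self):
        return self._num_instances


class TrainingSummary(Summary):
    """Hypothetical child class; inherits numInstances unchanged."""

    def __init__(self, num_instances, total_iterations):
        super().__init__(num_instances)
        self.totalIterations = total_iterations


class SummaryTest(unittest.TestCase):
    def test_child_exercises_parent_accessor(self):
        s = TrainingSummary(num_instances=2, total_iterations=1)
        # Calling the accessor on the child runs the parent's code path,
        # so a separate call test on Summary would repeat this check.
        self.assertEqual(s.numInstances, 2)
        # The child does not override the property; it is the parent's.
        self.assertIs(type(s).numInstances, Summary.numInstances)
```

If the child class overrode the accessor, this argument would no longer hold and the parent would need its own call test.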



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibilit...

2017-05-22 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/18068#discussion_r117912043
  
--- Diff: python/pyspark/ml/tests.py ---
@@ -1075,7 +1076,8 @@ def test_linear_regression_summary(self):
 pValues = s.pValues
 self.assertTrue(isinstance(pValues, list) and isinstance(pValues[0], float))
 # test evaluation (with training dataset) produces a summary with same values
-# one check is enough to verify a summary is returned, Scala version runs full test
+# one check is enough to verify a summary is returned
+# The child class LinearRegressionTrainingSummary runs full test
--- End diff --

I'm not sure what this comment means?





[GitHub] spark issue #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discr...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18068
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77229/
Test PASSed.





[GitHub] spark issue #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discr...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18068
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discr...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18068
  
**[Test build #77229 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77229/testReport)** for PR 18068 at commit [`7bbfe3a`](https://github.com/apache/spark/commit/7bbfe3a860964d166f67c3b099b00c8b11a73f9d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  class DecisionTreeClassifierWrapperWriter(instance: DecisionTreeClassifierWrapper)`
  * `  class DecisionTreeClassifierWrapperReader extends MLReader[DecisionTreeClassifierWrapper] `
  * `  class DecisionTreeRegressorWrapperWriter(instance: DecisionTreeRegressorWrapper)`
  * `  class DecisionTreeRegressorWrapperReader extends MLReader[DecisionTreeRegressorWrapper] `





[GitHub] spark issue #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discr...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18068
  
**[Test build #77231 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77231/testReport)** for PR 18068 at commit [`013adc4`](https://github.com/apache/spark/commit/013adc4460c588e1e06a66d23ce66d864803554e).





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18058
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18058
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77228/
Test PASSed.





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18058
  
**[Test build #77228 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77228/testReport)** for PR 18058 at commit [`85882ae`](https://github.com/apache/spark/commit/85882aeda99e9407fed82fe7fef79adcb886).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class HasNumPartitions(Params):`





[GitHub] spark issue #18025: [WIP][SparkR] Update doc and examples for sql functions

2017-05-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/18025
  
Great point. 

- For a method that is defined in one class and belongs in a group, like 
`cov`, we can document it in its own Rd and add a link to it in the `SeeAlso` 
section of the group doc. In this case, the `\alias{cov}` will be in `cov.Rd`. 
- For a method that is defined for multiple classes but whose meanings are 
drastically different: I think we can still document them in one Rd and add a 
`details` section to describe the method for each class. 






[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18067
  
**[Test build #77230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77230/testReport)** for PR 18067 at commit [`65cf494`](https://github.com/apache/spark/commit/65cf494a0f432c23ea83bc532942bb9c84febaaa).





[GitHub] spark issue #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discr...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18068
  
**[Test build #77229 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77229/testReport)** for PR 18068 at commit [`7bbfe3a`](https://github.com/apache/spark/commit/7bbfe3a860964d166f67c3b099b00c8b11a73f9d).





[GitHub] spark issue #18059: [SPARK-20834][SQL]TypeCoercion:loss of precision when wi...

2017-05-22 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/18059
  
If the user wants a precise result, why not use decimal? float and double are 
both imprecise.
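
To illustrate the point, here is a small Python sketch. Python's `float` is a double-precision binary float, and `decimal.Decimal` stands in for a SQL decimal type, which is an assumption made only for this illustration.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum is not exactly the double closest to 0.3.
float_sum = 0.1 + 0.2
float_is_exact = (float_sum == 0.3)

# Decimal keeps base-10 digits exactly when built from strings.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_is_exact = (decimal_sum == Decimal("0.3"))

print(float_is_exact, decimal_is_exact)  # False True
```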





[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...

2017-05-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/18035
  
Let's ignore the intermittent AppVeyor error, since it passed before the simple 
typo changes.





[GitHub] spark issue #18025: [WIP][SparkR] Update doc and examples for sql functions

2017-05-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/18025
  
Thanks for summarizing. I think they make sense. To be clear though, we 
should also talk about:
- what if a method is defined in one class and belongs in a group, but is also 
defined for another class (e.g. sql function: `cov`)
- what if it is defined for multiple classes but the meanings are drastically 
different (e.g. coalesce(DF) and coalesce(col) in my example above)






[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...

2017-05-22 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request:

https://github.com/apache/spark/pull/17967#discussion_r117909338
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala ---
@@ -37,6 +37,42 @@ import org.apache.spark.sql.types._
  */
 private[feature] trait RFormulaBase extends HasFeaturesCol with HasLabelCol {
 
+  /**
+   * Param for how to order categories of a string FEATURE column used by `StringIndexer`.
+   * The last category after ordering is dropped when encoding strings.
+   * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 'alphabetAsc'.
+   * The default value is 'frequencyDesc'. When the ordering is set to 'alphabetDesc', `RFormula`
+   * drops the same category as R when encoding strings.
+   *
+   * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 'b'`:
+   * {{{
+   * 
+-+---+--+
--- End diff --

@felixcheung @HyukjinKwon The scaladoc compiled, but the javadoc failed...  
Not sure if there is additional config for Java? 


![image](https://cloud.githubusercontent.com/assets/11082368/26341144/048b8d6e-3f47-11e7-8600-c111643a0295.png)
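
For illustration, the four ordering options described in the scaladoc above can be sketched in plain Python on the same example. This is not Spark's `StringIndexer`; the helper name `order_categories` is hypothetical, and breaking frequency ties alphabetically is an assumption.

```python
from collections import Counter


def order_categories(values, order_type):
    """Order distinct categories by one of the four documented options.

    A plain-Python illustration only; tie-breaking is an assumption.
    """
    counts = Counter(values)
    if order_type == "frequencyDesc":
        return sorted(counts, key=lambda c: (-counts[c], c))
    if order_type == "frequencyAsc":
        return sorted(counts, key=lambda c: (counts[c], c))
    if order_type == "alphabetDesc":
        return sorted(counts, reverse=True)
    if order_type == "alphabetAsc":
        return sorted(counts)
    raise ValueError("unsupported order type: " + order_type)


values = ["b", "a", "b", "a", "c", "b"]
for order_type in ("frequencyDesc", "frequencyAsc",
                   "alphabetDesc", "alphabetAsc"):
    ordered = order_categories(values, order_type)
    # The last category after ordering is the one dropped when encoding.
    print(order_type, ordered, "dropped:", ordered[-1])
```

On this example `alphabetDesc` orders the categories as `c, b, a`, so `a` is the last category and the one that is dropped.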






[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18058
  
**[Test build #77228 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77228/testReport)** for PR 18058 at commit [`85882ae`](https://github.com/apache/spark/commit/85882aeda99e9407fed82fe7fef79adcb886).





[GitHub] spark pull request #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibilit...

2017-05-22 Thread mpjlu
GitHub user mpjlu reopened a pull request:

https://github.com/apache/spark/pull/18068

 [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discrepancy with 
numInstances and degreesOfFreedom in LR and GLR - Python version

## What changes were proposed in this pull request?
Add test cases for PR-18062

## How was this patch tested?
The existing UT


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mpjlu/spark moreTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18068.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18068


commit 6b31ec7dda73c155fc94c5ccf53709099f8033dd
Author: Peng 
Date:   2017-05-22T11:37:50Z

fix visibility of numInstances and degreesOfFreedom in LR and GLR - Python 
version

commit a8b407f877269f235611e5dc5bb338c421206a57
Author: Peng 
Date:   2017-05-23T05:52:29Z

follow up of SPARK-20764

commit 7bbfe3a860964d166f67c3b099b00c8b11a73f9d
Author: Peng 
Date:   2017-05-23T05:58:52Z

Merge remote-tracking branch 'origin/master' into moreTest







[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18067
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18067
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77226/
Test FAILed.





[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18067
  
**[Test build #77226 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77226/testReport)** for PR 18067 at commit [`f43ebe0`](https://github.com/apache/spark/commit/f43ebe03115b0b22ed01b76925312dfbc7a2c8c0).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibilit...

2017-05-22 Thread mpjlu
Github user mpjlu closed the pull request at:

https://github.com/apache/spark/pull/18068





[GitHub] spark pull request #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibilit...

2017-05-22 Thread mpjlu
GitHub user mpjlu opened a pull request:

https://github.com/apache/spark/pull/18068

 [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discrepancy with 
numInstances and degreesOfFreedom in LR and GLR - Python version

## What changes were proposed in this pull request?
Add test cases for PR-18062

## How was this patch tested?
The existing UT


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mpjlu/spark moreTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18068.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18068


commit 6b31ec7dda73c155fc94c5ccf53709099f8033dd
Author: Peng 
Date:   2017-05-22T11:37:50Z

fix visibility of numInstances and degreesOfFreedom in LR and GLR - Python 
version

commit a8b407f877269f235611e5dc5bb338c421206a57
Author: Peng 
Date:   2017-05-23T05:52:29Z

follow up of SPARK-20764







[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/18058
  
Jenkins, ok to test





[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18046
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18046
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77221/
Test PASSed.





[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18046
  
**[Test build #77221 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77221/testReport)** for PR 18046 at commit [`e9acb63`](https://github.com/apache/spark/commit/e9acb63e1e695ddab4d80ed74844f2244c3f0e05).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread facaiy
Github user facaiy commented on the issue:

https://github.com/apache/spark/pull/18058
  
There seems to be something wrong with CI. I have seen the same non-response/delay 
from CI again, as happened last month.





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-22 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r117907961
  
--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -510,6 +510,69 @@ public UTF8String trim() {
 }
   }
 
+  /**
+   * Removes the given trim string from both ends of a string
+   * @param trimString the trim character string
+   */
+  public UTF8String trim(UTF8String trimString) {
+// This method searches for each character in the source string, removes the character if it is found
+// in the trim string, stops at the first not found. It starts from left end, then right end.
+// It returns a new string in which both ends trim characters have been removed.
+int s = 0; // the searching byte position of the input string
+int i = 0; // the first beginning byte position of a non-matching character
+int e = 0; // the last byte position
+int numChars = 0; // number of characters from the input string
+int[] stringCharLen = new int[numBytes]; // array of character length for the input string
+int[] stringCharPos = new int[numBytes]; // array of the first byte position for each character in the input string
+int searchCharBytes;
+
+while (s < this.numBytes) {
+  UTF8String searchChar = copyUTF8String(s, s + numBytesForFirstByte(this.getByte(s)) - 1);
+  searchCharBytes = searchChar.numBytes;
+  // try to find the matching for the searchChar in the trimString set
+  if (trimString.find(searchChar, 0) >= 0) {
+i += searchCharBytes;
+  } else {
+// no matching, exit the search
+break;
+  }
+  s += searchCharBytes;
+}
+
+if (i >= this.numBytes) {
+  // empty string
+  return UTF8String.EMPTY_UTF8;
+} else {
+  //build the position and length array
+  s = 0;
+  while (s < numBytes) {
+stringCharPos[numChars] = s;
+stringCharLen[numChars]= numBytesForFirstByte(getByte(s));
--- End diff --

> I was thinking that these two arrays are only used by trimRight; in the 
case where trimLeft trims the whole source string, we don't need to do the 
trimRight, so it will save some time.

Yeah, I agree with you. My only concern was that `numBytesForFirstByte` is called 
twice for the matched leading chars. But it seems easier to extract methods based 
on the current implementation. Let's keep `stringCharPos` and `stringCharLen` only 
in the "trimRight" part.
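
The two-pass structure discussed here can be sketched in Python. This is a simplification, not the PR's implementation: it works on Python code points instead of UTF-8 byte offsets, so it needs no equivalent of `stringCharPos` or `stringCharLen`, and the function name `trim_chars` is hypothetical.

```python
def trim_chars(source, trim_string):
    """Remove characters found in trim_string from both ends of source."""
    trim_set = set(trim_string)

    # Left pass: skip leading characters that are in the trim set.
    start = 0
    while start < len(source) and source[start] in trim_set:
        start += 1
    if start == len(source):
        # The left pass consumed everything, so the right pass
        # (and any bookkeeping it needs) is skipped entirely.
        return ""

    # Right pass: skip trailing characters that are in the trim set.
    end = len(source)
    while end > start and source[end - 1] in trim_set:
        end -= 1
    return source[start:end]


print(trim_chars("xxhixx", "x"))  # hi
```

For Python strings this behaves like the built-in `str.strip(chars)` with a non-empty `chars` argument; the explicit loops are there only to show where the early return avoids the right-hand pass.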





[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17308
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77227/
Test PASSed.





[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17308
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17308
  
**[Test build #77227 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77227/testReport)**
 for PR 17308 at commit 
[`15dfc80`](https://github.com/apache/spark/commit/15dfc80a8a35208f5f9df150de7c4bd9a015e2d8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/18058
  
@srowen @MLnick Could you help add @facaiy to the whitelist? It seems we 
can't trigger this job currently. Thanks. 





[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...

2017-05-22 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17967#discussion_r117906884
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala 
---
@@ -37,6 +37,42 @@ import org.apache.spark.sql.types._
  */
 private[feature] trait RFormulaBase extends HasFeaturesCol with 
HasLabelCol {
 
+  /**
+   * Param for how to order categories of a string FEATURE column used by 
`StringIndexer`.
+   * The last category after ordering is dropped when encoding strings.
+   * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 
'alphabetAsc'.
+   * The default value is 'frequencyDesc'. When the ordering is set to 
'alphabetDesc', `RFormula`
+   * drops the same category as R when encoding strings.
+   *
+   * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 
'b'`:
+   * {{{
+   * 
+-+---+--+
--- End diff --

according to this, table is https://wiki.scala-lang.org/display/SW/Syntax
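
As a rough illustration of the four ordering options described in the quoted scaladoc, here is a small Python sketch applied to the same example values. It is not the `StringIndexer` implementation; `order_categories` and `dropped_category` are hypothetical names, and the handling of ties (left to Python's stable sort) is an assumption, since the example has no ties.

```python
from collections import Counter


def order_categories(values, order_type="frequencyDesc"):
    """Order the distinct categories of values according to order_type."""
    counts = Counter(values)
    if order_type == "frequencyDesc":
        return sorted(counts, key=counts.get, reverse=True)
    if order_type == "frequencyAsc":
        return sorted(counts, key=counts.get)
    if order_type == "alphabetDesc":
        return sorted(counts, reverse=True)
    if order_type == "alphabetAsc":
        return sorted(counts)
    raise ValueError(f"unsupported order type: {order_type}")


def dropped_category(values, order_type="frequencyDesc"):
    """The last category after ordering, which is the one dropped when encoding."""
    return order_categories(values, order_type)[-1]


example = ["b", "a", "b", "a", "c", "b"]
for option in ("frequencyDesc", "frequencyAsc", "alphabetDesc", "alphabetAsc"):
    print(option, order_categories(example, option), dropped_category(example, option))
```

With `alphabetDesc` the dropped category is `a`, the alphabetically first level, which is the category R treats as the reference level.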






[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...

2017-05-22 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17967#discussion_r117906723
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala 
---
@@ -37,6 +37,42 @@ import org.apache.spark.sql.types._
  */
 private[feature] trait RFormulaBase extends HasFeaturesCol with 
HasLabelCol {
 
+  /**
+   * Param for how to order categories of a string FEATURE column used by 
`StringIndexer`.
+   * The last category after ordering is dropped when encoding strings.
+   * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 
'alphabetAsc'.
+   * The default value is 'frequencyDesc'. When the ordering is set to 
'alphabetDesc', `RFormula`
+   * drops the same category as R when encoding strings.
+   *
+   * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 
'b'`:
+   * {{{
+   * 
+-+---+--+
--- End diff --

Is it supposed to work with raw HTML tags? I'm not sure why `` works but 
`` doesn't...





[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

2017-05-22 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18067#discussion_r117906523
  
--- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
@@ -776,6 +778,19 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))
 head(predict(isoregModel, newDF))
 ```
 
+ Decision Tree
+
+`spark.decisionTree` fits a [decision 
tree](https://en.wikipedia.org/wiki/Decision_tree_learning) classification or 
regression model on a `SparkDataFrame`.
+Users can call `summary` to get a summary of the fitted model, `predict` 
to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
+
+We use the `longley` dataset to train a decision tree and make predictions:
+
+```{r, warning=FALSE}
+df <- createDataFrame(longley)
--- End diff --

I'd suggest using a data set without `.` in its column names if you can.
It would probably be confusing if the examples cause warnings when users 
run them.





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-22 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r117906408
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -510,6 +510,69 @@ public UTF8String trim() {
 }
   }
 
+  /**
+   * Removes the given trim string from both ends of a string
+   * @param trimString the trim character string
+   */
+  public UTF8String trim(UTF8String trimString) {
+// This method searches for each character in the source string, 
removes the character if it is found
+// in the trim string, stops at the first not found. It starts from 
left end, then right end.
+// It returns a new string in which both ends trim characters have 
been removed.
+int s = 0; // the searching byte position of the input string
+int i = 0; // the first beginning byte position of a non-matching 
character
+int e = 0; // the last byte position
+int numChars = 0; // number of characters from the input string
+int[] stringCharLen = new int[numBytes]; // array of character length 
for the input string
+int[] stringCharPos = new int[numBytes]; // array of the first byte 
position for each character in the input string
+int searchCharBytes;
+
+while (s < this.numBytes) {
+  UTF8String searchChar = copyUTF8String(s, s + 
numBytesForFirstByte(this.getByte(s)) - 1);
+  searchCharBytes = searchChar.numBytes;
+  // try to find the matching for the searchChar in the trimString set
+  if (trimString.find(searchChar, 0) >= 0) {
+i += searchCharBytes;
+  } else {
+// no matching, exit the search
+break;
+  }
+  s += searchCharBytes;
+}
+
+if (i >= this.numBytes) {
+  // empty string
+  return UTF8String.EMPTY_UTF8;
+} else {
+  //build the position and length array
+  s = 0;
+  while (s < numBytes) {
+stringCharPos[numChars] = s;
+stringCharLen[numChars]= numBytesForFirstByte(getByte(s));
+s += stringCharLen[numChars];
--- End diff --

um, I'm also thinking about the performance difference. Let's keep it 
unchanged for now.
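
To make the performance point concrete, here is a hedged Python sketch that follows the three phases of the quoted diff (left scan, building the position/length arrays from byte 0, right scan) and counts how often the lead-byte helper is called. It is an instrumented illustration, not the PR's Java code; `trim_counting_calls` is a hypothetical name and the set lookup replaces the substring search as a simplifying assumption.

```python
def num_bytes_for_first_byte(b: int) -> int:
    """Byte length of a UTF-8 character, judged from its first byte."""
    if b < 0x80:
        return 1
    if 0xC0 <= b < 0xE0:
        return 2
    if 0xE0 <= b < 0xF0:
        return 3
    if 0xF0 <= b < 0xF8:
        return 4
    raise ValueError(f"not a UTF-8 lead byte: {b:#x}")


def trim_counting_calls(data: bytes, trim_chars: bytes):
    """Trim both ends; return (trimmed bytes, helper calls made on data)."""
    calls = 0

    def counted(b: int) -> int:
        nonlocal calls
        calls += 1
        return num_bytes_for_first_byte(b)

    # Split the trim string into characters (not counted).
    trim_set = set()
    p = 0
    while p < len(trim_chars):
        n = num_bytes_for_first_byte(trim_chars[p])
        trim_set.add(trim_chars[p:p + n])
        p += n

    # Phase 1: scan from the left until a character is not in the trim set.
    i = 0
    while i < len(data):
        n = counted(data[i])
        if data[i:i + n] not in trim_set:
            break
        i += n
    if i >= len(data):
        return b"", calls

    # Phase 2: build position/length arrays starting again from byte 0.
    char_pos, char_len = [], []
    s = 0
    while s < len(data):
        n = counted(data[s])
        char_pos.append(s)
        char_len.append(n)
        s += n

    # Phase 3: scan from the right; the character at byte i stops the loop.
    k = len(char_pos) - 1
    while k >= 0 and data[char_pos[k]:char_pos[k] + char_len[k]] in trim_set:
        k -= 1
    return data[i:char_pos[k] + char_len[k]], calls
```

For `b"xxhixx"` with trim string `b"x"`, the helper runs 9 times for 6 characters: the two leading `x` characters and the first non-matching character are each decoded in both phase 1 and phase 2.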





[GitHub] spark pull request #17966: [SPARK-20727] Skip tests that use Hadoop utils on...

2017-05-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17966





[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/17966
  
merged to master/2.2

I think we should still check win-builder. Also, it's a bit hard to tell whether 
the tests meant to be skipped are actually skipped; we might want to follow up with a trace.





[GitHub] spark issue #17864: [SPARK-20604][ML] Allow imputer to handle numeric types

2017-05-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/17864
  
Ping folks for comments/review. Many thanks. 
@viirya @MLnick @jkbradley @hhbyyh @yanboliang @BenFradet 





[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18048
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18048
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77218/
Test PASSed.





[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18048
  
**[Test build #77218 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77218/testReport)**
 for PR 18048 at commit 
[`9af9caf`](https://github.com/apache/spark/commit/9af9caf20f46674eabee2c0ece5ae828d2426a5d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17308
  
**[Test build #77227 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77227/testReport)**
 for PR 17308 at commit 
[`15dfc80`](https://github.com/apache/spark/commit/15dfc80a8a35208f5f9df150de7c4bd9a015e2d8).





[GitHub] spark issue #18025: [WIP][SparkR] Update doc and examples for sql functions

2017-05-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/18025
  
@felixcheung I think we may want to distinguish a few cases:
1. For methods that are mainly defined by only one class, e.g., most 
function methods for Column, it makes sense to group and document them 
together. For example, most aggregate functions of Column go into one single 
Rd, since they are not defined for other classes. In this case, `avg` will go 
to this doc since it is not used by other classes. 
2. For methods that are defined by multiple classes, e.g., the `show` 
method defined for SparkDataFrame, GroupedData, Column and StreamingQuery, we 
can still document them in `show.Rd`. In this case, `show` will go to this doc 
and show the help for all classes that have defined a `show` method. 
3. When it makes sense, we can also combine 1 & 2 above. For example, 
`gapply` and `gapplyCollect` are defined for both SparkDataFrame and 
GroupedData. But we can still document them together and create shared 
examples. 

Let me know if this makes sense. 






[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18067
  
**[Test build #77226 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77226/testReport)**
 for PR 18067 at commit 
[`f43ebe0`](https://github.com/apache/spark/commit/f43ebe03115b0b22ed01b76925312dfbc7a2c8c0).





[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

2017-05-22 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request:

https://github.com/apache/spark/pull/18067

[SPARK-20849][DOC][SPARKR]  Document R DecisionTree

## What changes were proposed in this pull request?
1. Add an example for SparkR `decisionTree`
2. Document it in the user guide

## How was this patch tested?
local submit


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhengruifeng/spark dt_example

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18067.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18067


commit 3d8172f98f0994fec9ff359dfca4e6fcddd85863
Author: Zheng RuiFeng 
Date:   2017-05-23T03:56:20Z

create pr

commit def3ef4635094955c20c7e9511ce681378794d34
Author: Zheng RuiFeng 
Date:   2017-05-23T04:33:33Z

update vignettes

commit f43ebe03115b0b22ed01b76925312dfbc7a2c8c0
Author: Zheng RuiFeng 
Date:   2017-05-23T05:44:44Z

update sparkr.md







[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17308
  
**[Test build #77225 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77225/testReport)**
 for PR 17308 at commit 
[`ef2d6cd`](https://github.com/apache/spark/commit/ef2d6cd4275d93518ec27d4b08916575a3e597d7).





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18064
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18064
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77220/
Test FAILed.





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18064
  
**[Test build #77220 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77220/testReport)**
 for PR 18064 at commit 
[`b355c6d`](https://github.com/apache/spark/commit/b355c6d034c6aefcf8f74757353afce870e9bf1d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait Command extends LogicalPlan `
  * `case class ExecutedCommandExec(cmd: RunnableCommand, children: 
Seq[SparkPlan]) extends SparkPlan `





[GitHub] spark issue #16989: [WIP][SPARK-19659] Fetch big blocks to disk when shuffle...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16989
  
**[Test build #77224 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77224/testReport)**
 for PR 16989 at commit 
[`e022b6d`](https://github.com/apache/spark/commit/e022b6d4ccab0f7fc7b47a468b23046a11576311).





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/17698
  
@10110346 Hi, you can use the command @gatorsmile mentioned above to 
generate the result file.





[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18066
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77222/
Test FAILed.





[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18066
  
**[Test build #77222 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77222/testReport)**
 for PR 18066 at commit 
[`6ed3d3f`](https://github.com/apache/spark/commit/6ed3d3fa51cd9b09e2f137bda87dcb16e5a9fb1a).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class GenerateColumnAccessor(useColumnarBatch: Boolean)`
  * `class GenerateColumnarBatch(
`
  * `  class GeneratedColumnarBatchIterator extends $`





[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18066
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18064
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77219/
Test FAILed.





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18064
  
**[Test build #77219 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77219/testReport)**
 for PR 18064 at commit 
[`9507f19`](https://github.com/apache/spark/commit/9507f1938f894b2884b024c8472084a3a531e20d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait Command extends LogicalPlan `
  * `case class ExecutedCommandExec(cmd: RunnableCommand, children: 
Seq[SparkPlan]) extends SparkPlan `





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18064
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16989: [WIP][SPARK-19659] Fetch big blocks to disk when shuffle...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16989
  
**[Test build #77223 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77223/testReport)**
 for PR 16989 at commit 
[`9b733ec`](https://github.com/apache/spark/commit/9b733ec0fbc4bad8fc7f2413af1be5c6f718d9c1).





[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18066
  
**[Test build #77222 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77222/testReport)**
 for PR 18066 at commit 
[`6ed3d3f`](https://github.com/apache/spark/commit/6ed3d3fa51cd9b09e2f137bda87dcb16e5a9fb1a).





[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...

2017-05-22 Thread Gauravshah
Github user Gauravshah commented on the issue:

https://github.com/apache/spark/pull/14957
  
@saulshanabrook looks like #16578 is a superset, trying to invest in that 
pull request. 





[GitHub] spark pull request #18066: [SPARK-20822][SQL] Generate code to build table c...

2017-05-22 Thread kiszk
GitHub user kiszk opened a pull request:

https://github.com/apache/spark/pull/18066

[SPARK-20822][SQL] Generate code to build table cache using ColumnarBatch 
and to get value from ColumnVector

## What changes were proposed in this pull request?

This PR generates Java code that does the following:
1. Builds each in-memory table cache using `ColumnarBatch` with 
`ColumnVector` instead of using `CachedBatch` with `Array[Byte]`.
2. Gets a value for a column in `ColumnVector` without using an iterator.

As a first step, for ease of review, this PR supports only the integer and 
double data types with whole-stage codegen. Another PR will address the 
execution path without whole-stage codegen.

This PR implements the following:
1. Keep an in-memory table cache using `ColumnarBatch` with `ColumnVector`. 
To support both the new and the conventional cache data structures, this PR 
declares `CachedBatch` as a trait, and declares `CachedColumnarBatch` and 
`CachedBatchBytes` as its concrete implementations.
2. Generate Java code to build an in-memory table cache.
3. Generate Java code to get a value directly from `ColumnVector`.

This PR improves runtime performance by
1. eliminating many virtual calls and a complicated data path when building 
the in-memory table cache.
2. eliminating the data copy from column-oriented storage to `InternalRow` in a 
`SpecificColumnarIterator` iterator.


**Options**
A `ColumnVector` for any primitive data type in a `ColumnarBatch` can be 
compressed. Currently, there are two ways to enable compression:

1. Set the property `spark.sql.inMemoryColumnarStorage.compressed` to true 
(the default is true), or
2. Call `DataFrame.persist(st)`, where `st` is `MEMORY_ONLY_SER`, 
`MEMORY_ONLY_SER_2`, `MEMORY_AND_DISK_SER`, or `MEMORY_AND_DISK_SER_2`.


**An example program**
```scala
val df = sparkContext.parallelize((1 to 10), 1).map(i => (i, i.toDouble)).toDF("i", "d").cache
df.filter("i < 8 and 4.0 < d").show
```
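
To picture what avoiding the per-row copy could mean for the filter in the example program, here is a toy sketch in plain Python. The names `ToyColumnVector`, `filter_via_rows`, and `filter_via_columns` are hypothetical and exist only for illustration; they are not Spark classes, and the sketch says nothing about how the generated Java code is actually structured or how fast it runs.

```python
from array import array


class ToyColumnVector:
    """Hypothetical stand-in for a typed column; not Spark's ColumnVector."""

    def __init__(self, typecode, values):
        # Values of one column are stored contiguously in a typed array.
        self.data = array(typecode, values)

    def get(self, row_idx):
        return self.data[row_idx]


def filter_via_rows(int_col, double_col):
    # Row-oriented path: build a row tuple for every record, then test it.
    out = []
    for i in range(len(int_col.data)):
        row = (int_col.get(i), double_col.get(i))  # per-row copy
        if row[0] < 8 and 4.0 < row[1]:
            out.append(row)
    return out


def filter_via_columns(int_col, double_col):
    # Column-oriented path: read each value directly from its column by index
    # and only build a tuple for records that pass the predicate.
    out = []
    for i in range(len(int_col.data)):
        iv = int_col.data[i]
        dv = double_col.data[i]
        if iv < 8 and 4.0 < dv:
            out.append((iv, dv))
    return out


# Same data as the example program: i in 1..10 and d = i as a double.
ints = ToyColumnVector("i", range(1, 11))
doubles = ToyColumnVector("d", (float(i) for i in range(1, 11)))

print(filter_via_rows(ints, doubles))
print(filter_via_columns(ints, doubles))
```

Both functions return the same rows; the only difference is whether an intermediate row object is created for every record or only for the records that satisfy `i < 8 and 4.0 < d`.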

**Generated code for building an in-memory table cache**
```java
import scala.collection.Iterator;
import org.apache.spark.sql.types.DataType;
import org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder;
import org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter;
import org.apache.spark.sql.execution.columnar.MutableUnsafeRow;
import org.apache.spark.sql.execution.vectorized.ColumnVector;

public SpecificColumnarIterator generate(Object[] references) {
  return new SpecificColumnarIterator(references);
}

class SpecificColumnarIterator extends org.apache.spark.sql.execution.columnar.ColumnarIterator {
  private ColumnVector[] colInstances;
  private UnsafeRow unsafeRow = new UnsafeRow(0);
  private BufferHolder bufferHolder = new BufferHolder(unsafeRow);
  private UnsafeRowWriter rowWriter = new UnsafeRowWriter(bufferHolder, 0);
  private MutableUnsafeRow mutableRow = null;

  private int rowIdx = 0;
  private int numRowsInBatch = 0;

  private scala.collection.Iterator input = null;
  private DataType[] columnTypes = null;
  private int[] columnIndexes = null;

  public SpecificColumnarIterator(Object[] references) {
    this.mutableRow = new MutableUnsafeRow(rowWriter);
  }

  public void initialize(Iterator input, DataType[] columnTypes, int[] columnIndexes) {
    this.input = input;
    this.columnTypes = columnTypes;
    this.columnIndexes = columnIndexes;
  }

  public boolean hasNext() {
    if (rowIdx < numRowsInBatch) {
      return true;
    }
    if (!input.hasNext()) {
      return false;
    }

    org.apache.spark.sql.execution.columnar.CachedColumnarBatch cachedBatch =
      (org.apache.spark.sql.execution.columnar.CachedColumnarBatch) input.next();
    org.apache.spark.sql.execution.vectorized.ColumnarBatch batch = cachedBatch.columnarBatch();
    rowIdx = 0;
    numRowsInBatch = cachedBatch.getNumRows();
    colInstances = new ColumnVector[columnIndexes.length];
    for (int i = 0; i < columnIndexes.length; i ++) {
      colInstances[i] = batch.column(columnIndexes[i]);
    }

    return hasNext();
  }

  public InternalRo
```

[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/18058
  
Jenkins, test this please.





[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/18051
  
Maybe I'm missing something completely, but I still don't get the point why 
we are removing the `xx-method` link since we are defining methods as S4 using 
`setMethod`. Lots of packages have these entries in the index. Below is a 
snapshot from the `sp` package. You can find a lot more there. 


![image](https://cloud.githubusercontent.com/assets/11082368/26338918/e8bdd65e-3f38-11e7-83ef-c3293bc267a0.png)


Even for S3 methods, they tend to repeat as well. Below is a snapshot of 
the `gamm4` package. 


![image](https://cloud.githubusercontent.com/assets/11082368/26338937/10432bac-3f39-11e7-9b91-5774e33ff7f8.png)








[GitHub] spark issue #18033: [SPARK-20807][SQL] Add compression/decompression of colu...

2017-05-22 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18033
  
@hvanhovell would it be possible to review this or let us know the 
appropriate persons for this review?
cc @sameeragarwal





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17698
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77217/
Test FAILed.





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17698
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17698
  
**[Test build #77217 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77217/testReport)**
 for PR 17698 at commit 
[`10be7eb`](https://github.com/apache/spark/commit/10be7eb586dcf992af2982ba94aa446408ad1e25).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #18040: [SPARK-20815] [SPARKR] NullPointerException in RP...

2017-05-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18040





[GitHub] spark issue #18040: [SPARK-20815] [SPARKR] NullPointerException in RPackageU...

2017-05-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/18040
  
merged to master/2.2, thanks!





[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/18051
  
@actuaryzhang - we were just talking about this in the other PR. What do you 
think?
@zero323 - right, I do agree `?abs-method` is kind of a big problem...





[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18046
  
**[Test build #77221 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77221/testReport)**
 for PR 18046 at commit 
[`e9acb63`](https://github.com/apache/spark/commit/e9acb63e1e695ddab4d80ed74844f2244c3f0e05).





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18064
  
**[Test build #77220 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77220/testReport)**
 for PR 18064 at commit 
[`b355c6d`](https://github.com/apache/spark/commit/b355c6d034c6aefcf8f74757353afce870e9bf1d).





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18064
  
**[Test build #77219 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77219/testReport)**
 for PR 18064 at commit 
[`9507f19`](https://github.com/apache/spark/commit/9507f1938f894b2884b024c8472084a3a531e20d).





[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...

2017-05-22 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request:

https://github.com/apache/spark/pull/17967#discussion_r117892629
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala 
---
@@ -37,6 +37,42 @@ import org.apache.spark.sql.types._
  */
 private[feature] trait RFormulaBase extends HasFeaturesCol with 
HasLabelCol {
 
+  /**
+   * Param for how to order categories of a string FEATURE column used by 
`StringIndexer`.
+   * The last category after ordering is dropped when encoding strings.
+   * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 
'alphabetAsc'.
+   * The default value is 'frequencyDesc'. When the ordering is set to 
'alphabetDesc', `RFormula`
+   * drops the same category as R when encoding strings.
+   *
+   * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 
'b'`:
+   * {{{
+   * 
+-+---+--+
--- End diff --

@HyukjinKwon Thanks for the clarification. I don't think `list` paints a 
clear picture here. Would rather keep the table structure. 





[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/17967
  
@yanboliang I updated the example in the param doc. I hope it is clear now 
that it is `alphabetDesc` that drops the same category as R. That is, RFormula 
with `alphabetDesc` drops the first alphabetic category in string encoding. 
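
As a rough illustration of the rule that the last category after ordering is the one dropped, here is a small plain-Python sketch using the sample values `'b', 'a', 'b', 'a', 'c', 'b'` from the param doc. The function names `order_categories` and `dropped_category` are hypothetical, and breaking frequency ties alphabetically is an assumption of this sketch (the sample has no ties); this is not the `StringIndexer` implementation.

```python
from collections import Counter


def order_categories(values, order_type="frequencyDesc"):
    """Return the distinct categories in the order named by order_type."""
    counts = Counter(values)
    if order_type == "frequencyDesc":
        # Assumption: ties in frequency are broken alphabetically.
        return sorted(counts, key=lambda c: (-counts[c], c))
    if order_type == "frequencyAsc":
        return sorted(counts, key=lambda c: (counts[c], c))
    if order_type == "alphabetDesc":
        return sorted(counts, reverse=True)
    if order_type == "alphabetAsc":
        return sorted(counts)
    raise ValueError("unsupported order type: " + order_type)


def dropped_category(values, order_type="frequencyDesc"):
    # The last category after ordering is the one left out of the encoding.
    return order_categories(values, order_type)[-1]


sample = ["b", "a", "b", "a", "c", "b"]
for option in ("frequencyDesc", "frequencyAsc", "alphabetDesc", "alphabetAsc"):
    print(option, order_categories(sample, option), dropped_category(sample, option))
```

With `alphabetDesc` the ordering is `c, b, a`, so `a`, the first category alphabetically, is the one dropped.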






[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18048
  
**[Test build #77218 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77218/testReport)**
 for PR 18048 at commit 
[`9af9caf`](https://github.com/apache/spark/commit/9af9caf20f46674eabee2c0ece5ae828d2426a5d).





[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...

2017-05-22 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/18048
  
retest this please





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18064
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77215/
Test FAILed.





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18064
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18064
  
**[Test build #77215 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77215/testReport)**
 for PR 18064 at commit 
[`5486950`](https://github.com/apache/spark/commit/5486950edada8ae87d2586f3f6d1e2d82027b015).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait Command extends LogicalPlan `
  * `case class ExecutedCommandExec(cmd: RunnableCommand, children: 
Seq[SparkPlan]) extends SparkPlan `





[GitHub] spark pull request #17762: [SPARK-9103][WIP] Track Netty memory usage - take...

2017-05-22 Thread jsoltren
Github user jsoltren closed the pull request at:

https://github.com/apache/spark/pull/17762





[GitHub] spark issue #17762: [SPARK-9103][WIP] Track Netty memory usage - take two

2017-05-22 Thread jsoltren
Github user jsoltren commented on the issue:

https://github.com/apache/spark/pull/17762
  
To close the loop here: I'm going to rework these ideas into a new JIRA 
that I'll file, to track *total* memory usage in the UI.





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17698
  
**[Test build #77217 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77217/testReport)**
 for PR 17698 at commit 
[`10be7eb`](https://github.com/apache/spark/commit/10be7eb586dcf992af2982ba94aa446408ad1e25).





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/18058
  
ok to test





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/18058
  
add to whitelist





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17698
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17698
  
**[Test build #77214 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77214/testReport)**
 for PR 17698 at commit 
[`7edfed5`](https://github.com/apache/spark/commit/7edfed5577e8610b4ba42f64979c4168fce829d5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17698
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77214/
Test FAILed.





[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18035
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18035
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77216/
Test PASSed.





[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18035
  
**[Test build #77216 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77216/testReport)**
 for PR 18035 at commit 
[`5d9afe0`](https://github.com/apache/spark/commit/5d9afe06b665464b06705d618a18a8032255fe1d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/18058
  
Jenkins, test this please.





[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-22 Thread facaiy
Github user facaiy commented on the issue:

https://github.com/apache/spark/pull/18058
  
Thanks, @yanboliang. 
Do you have any suggestion about testing the parameter?





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17698
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17698
  
**[Test build #77213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77213/testReport)** for PR 17698 at commit [`6ce4220`](https://github.com/apache/spark/commit/6ce4220bf861f4a64f3126f1f14043dcb666a056).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17698
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77213/
Test FAILed.





[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...

2017-05-22 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18048
  
ping @cloud-fan 





[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...

2017-05-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18035
  
**[Test build #77216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77216/testReport)** for PR 18035 at commit [`5d9afe0`](https://github.com/apache/spark/commit/5d9afe06b665464b06705d618a18a8032255fe1d).




