[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14517
  
**[Test build #63739 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63739/consoleFull)** for PR 14517 at commit [`68cf597`](https://github.com/apache/spark/commit/68cf597c15384d052118340cba6a928e3d45e76f).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14517
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63739/
Test FAILed.





[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14517
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14517
  
**[Test build #63739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63739/consoleFull)** for PR 14517 at commit [`68cf597`](https://github.com/apache/spark/commit/68cf597c15384d052118340cba6a928e3d45e76f).





[GitHub] spark issue #14623: [SPARK-17044][SQL] Make test files for window functions ...

2016-08-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14623
  
Hi, @rxin .
If you think the direction of this PR does not match your original 
intention, please let me know.





[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14517
  
**[Test build #63738 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63738/consoleFull)** for PR 14517 at commit [`ce9f9c0`](https://github.com/apache/spark/commit/ce9f9c02cad0e5e5a556e33fa66d3faef16fc22a).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14517
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14517
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63738/
Test FAILed.





[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14517
  
**[Test build #63738 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63738/consoleFull)** for PR 14517 at commit [`ce9f9c0`](https://github.com/apache/spark/commit/ce9f9c02cad0e5e5a556e33fa66d3faef16fc22a).





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14625
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63737/
Test PASSed.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14625
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14625
  
**[Test build #63737 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63737/consoleFull)** for PR 14625 at commit [`cdea1a3`](https://github.com/apache/spark/commit/cdea1a3ab334fc264be16f81b3f94102111c0489).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...

2016-08-13 Thread vectorijk
Github user vectorijk commented on the issue:

https://github.com/apache/spark/pull/13690
  
Yes, sure. But I'm on vacation this week. I will keep working on this and
update as soon as possible when I get back next week.

On Thu, Aug 11, 2016, 19:46 Felix Cheung wrote:

> Hi @vectorijk would you be interested in
> continuing this work?






[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74694116
  
--- Diff: R/pkg/R/functions.R ---
@@ -1143,7 +1139,7 @@ setMethod("minute",
 #' @export
 #' @examples \dontrun{select(df, monotonically_increasing_id())}
 setMethod("monotonically_increasing_id",
-  signature(x = "missing"),
+  signature(),
--- End diff --

Hmm, I looked into this a bit

Below, "a" is what we have originally:
```
setGeneric("a", function(x) { standardGeneric("a") } )
setMethod("a", signature(x = "missing"), function() { 1 })
> a()
[1] 1
> a(1)
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function ‘a’ for signature ‘"numeric"’
> showMethods(a)
Function: a (package .GlobalEnv)
x="missing"

setGeneric("b", function(x = "missing") { standardGeneric("b") } )
setMethod("b", signature("missing"), function() { 1 })
> b()
[1] 1
> b(1)
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function ‘b’ for signature ‘"numeric"’
> showMethods(b)
Function: b (package .GlobalEnv)
x="missing"

setGeneric("tt", function(...) { standardGeneric("tt") } )
setMethod("tt", signature(), function() { 1 })
> tt(1)
Error in .local(...) : unused argument (1)
> tt("emme")
Error in .local(...) : unused argument ("emme")
> showMethods(tt)
Function: tt (package .GlobalEnv)
...="ANY"
...="character"
(inherited from: ...="ANY")
...="numeric"
(inherited from: ...="ANY")
```

I think the issues with `setMethod("foo", signature(), function() { 1 })` 
are twofold:
1. The error message when calling the function with a parameter is less 
clear than "unable to find function for signature ‘"numeric"’"
2. An S4 method is automatically generated for each parameter type that it is 
called with (see "numeric" or "character" from showMethods)

So perhaps "b" is a better approach?
Unfortunately we still have to "document" a parameter or `...` in either 
case as `setGeneric` refuses to take it otherwise.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14625
  
Sure. The scope is a little bit large, but let me try to go over the 
existing join-related test cases in the test suites. We might not be able to 
cover all of them in a single ticket. Will try my best.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14625
  
**[Test build #63737 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63737/consoleFull)** for PR 14625 at commit [`cdea1a3`](https://github.com/apache/spark/commit/cdea1a3ab334fc264be16f81b3f94102111c0489).





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693897
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -181,7 +181,7 @@ getDefaultSqlSource <- function() {
 #' @method createDataFrame default
 #' @note createDataFrame since 1.4.0
 # TODO(davies): support sampling and infer type from NA
-createDataFrame.default <- function(data, schema = NULL, samplingRatio = 1.0) {
+createDataFrame.default <- function(data, schema = NULL) {
--- End diff --

we discussed we shouldn't remove `samplingRatio = 1.0` from the signature?





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693887
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2461,8 +2473,9 @@ setMethod("unionAll",
 #' Union two or more SparkDataFrames. This is equivalent to `UNION ALL` in SQL.
 #' Note that this does not remove duplicate rows across the two SparkDataFrames.
 #'
-#' @param x A SparkDataFrame
-#' @param ... Additional SparkDataFrame
+#' @param x a SparkDataFrame.
+#' @param ... additional SparkDataFrame(s).
+#' @param deparse.level dummy variable, currently not used.
--- End diff --

It's not a dummy variable per se - it was there to match the signature of 
the base implementation





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693856
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1146,7 +1147,7 @@ setMethod("head",
 
 #' Return the first row of a SparkDataFrame
 #'
-#' @param x A SparkDataFrame
--- End diff --

Right - they are not - RDD functions are not exported from the package 
(not public) and we don't want an Rd file generated for them. Please see 
PR #14626 - we want a separate `setGeneric` for non-RDD functions, and then 
this line documenting both the DataFrame and Column parameters can go to 
generics.R





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693823
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -510,9 +510,7 @@ setMethod("registerTempTable",
 #'
 #' Insert the contents of a SparkDataFrame into a table registered in the current SparkSession.
 #'
-#' @param x A SparkDataFrame
-#' @param tableName A character vector containing the name of the table
--- End diff --

why is tableName moved?





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14625
  
Can we repurpose this ticket to just create test cases for joins in general?






[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14625
  
Below is the output of Hive for the same queries. They are the same.


[outputHive.txt](https://github.com/apache/spark/files/416810/outputHive.txt)






[GitHub] spark issue #14556: [SPARK-16966][Core] Make App Name to the valid name inst...

2016-08-13 Thread Sherry302
Github user Sherry302 commented on the issue:

https://github.com/apache/spark/pull/14556
  
@srowen Thanks for the new PR and the review.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14631
  
Hi, @davies .
Could you review this PR when you have some time?





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693204
  
--- Diff: R/pkg/R/functions.R ---
@@ -1497,7 +1493,7 @@ setMethod("soundex",
 #' \dontrun{select(df, spark_partition_id())}
 #' @note spark_partition_id since 2.0.0
 setMethod("spark_partition_id",
-  signature(x = "missing"),
+  signature(),
--- End diff --

what I mean is you could put it as `signature(missing)`





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693197
  
--- Diff: R/pkg/R/generics.R ---
@@ -1251,10 +1311,57 @@ setGeneric("year", function(x) { standardGeneric("year") })
 #' @export
 setGeneric("spark.glm", function(data, formula, ...) { standardGeneric("spark.glm") })
 
+#' @param formula a symbolic description of the model to be fitted. If \code{data} is a
+#'SparkDataFrame, currently only a few formula operators are supported,
+#'including '~', '.', ':', '+', and '-'.
+#' @param data a SparkDataFrame or (R glm) data.frame, list or environment for training.
+#' @param family a description of the error distribution and link function to be used in the model.
+#'   This can be a character string naming a family function, a family function or
+#'   the result of a call to a family function. Refer R family at
+#'   \url{https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html}.
+#' @param epsilon positive convergence tolerance of iterations.
+#' @param maxit integer giving the maximal number of IRLS iterations.
+#' @param weights an optional vector of 'prior weights' to be used in the fitting process.
+#'Should be NULL or a numeric vector.
+#' @param subset an optional vector specifying a subset of observations to be used in the
+#'   fitting process.
+#' @param na.action a function which indicates what should happen when the data contain NAs.
+#'  The default is set by the na.action setting of options, and is na.fail
+#'  if that is unset. The 'factory-fresh' default is na.omit. Another possible
+#'  value is NULL, no action. Value na.exclude can be useful.
+#' @param start starting values for the parameters in the linear predictor.
+#' @param etastart starting values for the linear predictor.
+#' @param mustart starting values for the vector of means.
+#' @param offset this can be used to specify an a priori known component to be included in
+#'   the linear predictor during fitting. This should be NULL or
+#'   a numeric vector of length equal to the number of cases. One or more offset
+#'   terms can be included in the formula instead or as well, and if more than
+#'   one is specified their sum is used. See model.offset.
+#' @param control a list of parameters for controlling the fitting process. For glm.fit
+#'this is passed to glm.control.
+#' @param model a logical value indicating whether model frame should be included as
+#'  a component of the returned value.
+#' @param method the method to be used in fitting the model. The default method
+#'   "glm.fit" uses iteratively reweighted least squares (IWLS): the alternative
+#'   "model.frame" returns the model frame and does no fitting.
+#'   User-supplied fitting functions can be supplied either as a function or
+#'   a character string naming a function, with a function which takes the same
+#'   arguments as glm.fit. If specified as a character string it is looked up from
+#'   within the stats namespace.
+#' @param x,y logical values indicating whether the response vector and model matrix
+#'used in the fitting process should be returned as components of the returned value.
+#' @param contrasts an optional list. See the contrasts.arg of model.matrix.default.
+#' @param ...  arguments to be used to form the default control argument if it is
+#'not supplied directly.
 #' @rdname glm
+#' @details If \code{data} is a data.frame, list or environment, \code{glm} behaves the same as
+#'  \code{glm} in the \code{stats} package. If \code{data} is a SparkDataFrame,
+#'  \code{spark.glm} is called.
 #' @export
 setGeneric("glm")
 
+#' @param object a fitted ML model object.
+#' @param ... additional argument(s) passed to the method.
--- End diff --

`Currently not used` - we don't use this in any `predict` implementation





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693174
  
--- Diff: R/pkg/R/generics.R ---
@@ -1277,8 +1384,11 @@ setGeneric("spark.naiveBayes", function(data, formula, ...) { standardGeneric("s
 
 #' @rdname spark.survreg
 #' @export
-setGeneric("spark.survreg", function(data, formula, ...) { standardGeneric("spark.survreg") })
+setGeneric("spark.survreg", function(data, formula) { standardGeneric("spark.survreg") })
 
+#' @param object a fitted ML model object.
+#' @param path the directory where the model is saved.
+#' @param ... additional argument(s) passed to the method.
 #' @rdname write.ml
 #' @export
 setGeneric("write.ml", function(object, path, ...) { standardGeneric("write.ml") })
--- End diff --

again, no reason for this generic to have `...`?





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693158
  
--- Diff: R/pkg/R/mllib.R ---
@@ -142,15 +143,6 @@ setMethod("spark.glm", signature(data = "SparkDataFrame", formula = "formula"),
 #' Generalized Linear Models (R-compliant)
 #'
 #' Fits a generalized linear model, similarly to R's glm().
-#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
--- End diff --

why are we moving this?





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693142
  
--- Diff: R/pkg/R/mllib.R ---
@@ -298,14 +304,15 @@ setMethod("summary", signature(object = "NaiveBayesModel"),
 #' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
 #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
 #'
-#' @param data SparkDataFrame for training
-#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
+#' @param data a SparkDataFrame for training.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
 #'operators are supported, including '~', '.', ':', '+', and '-'.
 #'Note that the response variable of formula is empty in spark.kmeans.
-#' @param k Number of centers
-#' @param maxIter Maximum iteration number
-#' @param initMode The initialization algorithm choosen to fit the model
-#' @return \code{spark.kmeans} returns a fitted k-means model
+#' @param ... additional argument(s) passed to the method.
--- End diff --

For all the `spark.something` functions, if we are not actually using `...` 
for optional parameters, let's remove it. There is no reason for a function to 
have unused unnamed parameters.





[GitHub] spark pull request #14630: [SPARK-16966] [SQL] [CORE] App Name is a randomUU...

2016-08-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14630





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693106
  
--- Diff: R/pkg/R/mllib.R ---
@@ -346,8 +339,11 @@ setMethod("spark.kmeans", signature(data = "SparkDataFrame", formula = "formula"
 #' Get fitted result from a k-means model, similarly to R's fitted().
 #' Note: A saved-loaded model does not support this method.
 #'
-#' @param object A fitted k-means model
-#' @return \code{fitted} returns a SparkDataFrame containing fitted values
+#' @param object a fitted k-means model.
+#' @param method type of fitted results, \code{"centers"} for cluster centers
+#'or \code{"classes"} for assigned classes.
+#' @param ... additional argument(s) passed to the method.
--- End diff --

let's remove `...` in the function





[GitHub] spark issue #14630: [SPARK-16966] [SQL] [CORE] App Name is a randomUUID even...

2016-08-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14630
  
LGTM





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693097
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,11 +411,12 @@ setMethod("predict", signature(object = "KMeansModel"),
 #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
 #' Only categorical data is supported.
 #'
-#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
-#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
+#' @param data a \code{SparkDataFrame} of observations and labels for model fitting.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
 #'   operators are supported, including '~', '.', ':', '+', and '-'.
-#' @param smoothing Smoothing parameter
-#' @return \code{spark.naiveBayes} returns a fitted naive Bayes model
+#' @param ... additional argument(s) passed to the method. Currently only \code{smoothing}.
--- End diff --

this is removed, right?





[GitHub] spark issue #14630: [SPARK-16966] [SQL] [CORE] App Name is a randomUUID even...

2016-08-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14630
  
Merging in master/2.0.






[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693087
  
--- Diff: .gitignore ---
@@ -77,3 +77,8 @@ spark-warehouse/
 # For R session data
 .RData
 .RHistory
+.Rhistory
--- End diff --

I'm not sure if this is essential for this PR. I'd suggest leaving this out





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74693053
  
--- Diff: R/pkg/R/mllib.R ---
@@ -602,14 +599,14 @@ setMethod("spark.survreg", signature(data = "SparkDataFrame", formula = "formula
 # Returns a summary of the AFT survival regression model produced by spark.survreg,
 # similarly to R's summary().
 
-#' @param object A fitted AFT survival regression model
+#' @param object a fitted AFT survival regression model.
 #' @return \code{summary} returns a list containing the model's coefficients,
 #' intercept and log(scale)
 #' @rdname spark.survreg
 #' @export
 #' @note summary(AFTSurvivalRegressionModel) since 2.0.0
 setMethod("summary", signature(object = "AFTSurvivalRegressionModel"),
-  function(object, ...) {
+  function(object) {
--- End diff --

Thinking more about it, I think the reason is that R's base::summary does in 
fact have the `...` 
https://stat.ethz.ch/R-manual/R-devel/library/base/html/summary.html

Could you test whether base::summary still works? I think we would prefer 
omitting `...` for all our `summary` functions, but we need to make sure doing 
so doesn't mask it.





[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14558#discussion_r74692979
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -3184,6 +3200,7 @@ setMethod("histogram",
 #' @param x A SparkDataFrame
 #' @param url JDBC database url of the form `jdbc:subprotocol:subname`
 #' @param tableName The name of the table in the external database
+#' @param ... additional JDBC database connection propertie(s).
--- End diff --

The singular form is property and the plural is properties, so the `()` doesn't 
really work in this case. Let's just leave it as properties.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14631
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63736/
Test PASSed.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14631
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14631
  
**[Test build #63736 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63736/consoleFull)** for PR 14631 at commit [`b1c1be4`](https://github.com/apache/spark/commit/b1c1be48cf4fd31e3de1c0a5e402e86770a2158a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14625
  
FYI, I found the original JIRA that delivered the first 25 auto_join test 
cases to Hive: https://issues.apache.org/jira/browse/HIVE-1642





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14625
  
@rxin Sure, will do it.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14631
  
**[Test build #63736 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63736/consoleFull)** for PR 14631 at commit [`b1c1be4`](https://github.com/apache/spark/commit/b1c1be48cf4fd31e3de1c0a5e402e86770a2158a).





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14631
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14631
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63735/
Test FAILed.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14631
  
**[Test build #63735 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63735/consoleFull)** for PR 14631 at commit [`962a052`](https://github.com/apache/spark/commit/962a052fcb76ed364b77cb328d5bd9b2a5c83775).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should preserve mi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14631
  
**[Test build #63735 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63735/consoleFull)** for PR 14631 at commit [`962a052`](https://github.com/apache/spark/commit/962a052fcb76ed364b77cb328d5bd9b2a5c83775).





[GitHub] spark pull request #14631: [SPARK-17035][SQL][PYSPARK] Timestamp should pres...

2016-08-13 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/14631

[SPARK-17035][SQL][PYSPARK] Timestamp should preserve microseconds part

## What changes were proposed in this pull request?

**Before**
```
>>> from datetime import datetime
>>> from pyspark.sql import Row
>>> from pyspark.sql.types import StructType, StructField, TimestampType
>>> schema = StructType([StructField("dt", TimestampType(), False)])
>>> data = [{"dt": datetime.max}]
>>> sql_data = [schema.toInternal(row) for row in data]
>>> sql_data
[(2534023296,)]
```

**After**
```
>>> from datetime import datetime
>>> from pyspark.sql import Row
>>> from pyspark.sql.types import StructType, StructField, TimestampType
>>> schema = StructType([StructField("dt", TimestampType(), False)])
>>> data = [{"dt": datetime.max}]
>>> sql_data = [schema.toInternal(row) for row in data]
>>> sql_data
[(2534023295,)]
```
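
For illustration, here is a minimal sketch of one way a timestamp conversion can lose its microseconds part when it goes through floating point. It is not the code from this pull request: the helper names `to_micros_via_float` and `to_micros_via_int` are hypothetical, and the datetime is treated as UTC via `calendar.timegm` so the result does not depend on the local time zone.

```python
import calendar
from datetime import datetime


def to_micros_via_float(dt):
    # Hypothetical helper: scale whole seconds by 1e6 in floating point.
    # Near datetime.max the product exceeds 2**53, so a float cannot
    # represent every microsecond and the sum gets rounded.
    seconds = calendar.timegm(dt.utctimetuple())
    return int(seconds * 1e6 + dt.microsecond)


def to_micros_via_int(dt):
    # Hypothetical helper: keep the whole computation in integer arithmetic.
    seconds = calendar.timegm(dt.utctimetuple())
    return seconds * 1000000 + dt.microsecond


lossy = to_micros_via_float(datetime.max)
exact = to_micros_via_int(datetime.max)
# Compare the microsecond remainders of the two results.
print(lossy % 1000000, exact % 1000000)
```

The second helper combines whole seconds and microseconds as integers, so the microsecond field survives for any datetime that `calendar.timegm` accepts.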

## How was this patch tested?

Pass the Jenkins test with a new test case.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-17035

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14631.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14631


commit 962a052fcb76ed364b77cb328d5bd9b2a5c83775
Author: Dongjoon Hyun 
Date:   2016-08-13T21:35:32Z

[SPARK-17035][SQL][PYSPARK] Timestamp should not lost microseconds part







[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14580
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14580
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63733/
Test PASSed.





[GitHub] spark issue #14583: [SPARK-16994][SQL] PushDownPredicate should not ignore l...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14583
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14580
  
**[Test build #63733 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63733/consoleFull)** for PR 14580 at commit [`11f2509`](https://github.com/apache/spark/commit/11f250921c6eef0c10915abbf26515f7599abd64).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14583: [SPARK-16994][SQL] PushDownPredicate should not ignore l...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14583
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63732/
Test PASSed.





[GitHub] spark issue #14583: [SPARK-16994][SQL] PushDownPredicate should not ignore l...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14583
  
**[Test build #63732 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63732/consoleFull)** for PR 14583 at commit [`91fb344`](https://github.com/apache/spark/commit/91fb344729d2d24ff2f1f968a59a87f5ca91c355).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptron Class...

2016-08-13 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14447
  
These are the last few comments. @shivaram what do you think: with those 
fixed, are we good to merge this?






[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74692317
  
--- Diff: R/pkg/R/mllib.R ---
@@ -533,6 +628,27 @@ setMethod("write.ml", signature(object = "KMeansModel", path = "character"),
 invisible(callJMethod(writer, "save", path))
   })
 
+# Saves the Multilayer Perceptron Classification Model to the input path.
+
+#' @param path The directory where the model is saved
--- End diff --

please add a `@param` for `object`





[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74692309
  
--- Diff: R/pkg/R/mllib.R ---
@@ -533,6 +628,27 @@ setMethod("write.ml", signature(object = "KMeansModel", path = "character"),
 invisible(callJMethod(writer, "save", path))
   })
 
+# Saves the Multilayer Perceptron Classification Model to the input path.
+
+#' @param path The directory where the model is saved
+#' @param overwrite Overwrites or not if the output path already exists. Default is FALSE
+#'  which means throw exception if the output path exists.
+#'
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame-method
--- End diff --

ditto





[GitHub] spark issue #14527: [SPARK-16938][SQL] `drop/dropDuplicate` should handle th...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14527
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63734/
Test PASSed.





[GitHub] spark issue #14527: [SPARK-16938][SQL] `drop/dropDuplicate` should handle th...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14527
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14527: [SPARK-16938][SQL] `drop/dropDuplicate` should handle th...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14527
  
**[Test build #63734 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63734/consoleFull)** for PR 14527 at commit [`fbd4c2c`](https://github.com/apache/spark/commit/fbd4c2ccfdf46be82584cd1469dc3c4517db8538).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74692263
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,94 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' 
\href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for 
model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Integer vector containing the number of nodes for each 
layer
+#' @param solver Solver parameter, supported options: "gd" (minibatch 
gradient descent) or "l-bfgs"
+#' @param maxIter Maximum iteration number
+#' @param tol Convergence tolerance of iterations
+#' @param stepSize StepSize parameter
+#' @param seed Seed parameter for weights initialization
+#' @return \code{spark.mlp} returns a fitted Multilayer Perceptron 
Classification Model
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame-method
+#' @name spark.mlp
+#' @seealso \link{read.ml}
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- read.df("data/mllib/sample_multiclass_classification_data.txt", 
source = "libsvm")
+#'
+#' # fit a Multilayer Perceptron Classification Model
+#' model <- spark.mlp(df, blockSize = 128, layers = c(4, 5, 4, 3), solver 
= "l-bfgs",
+#'maxIter = 100, tol = 0.5, stepSize = 1, seed = 1)
+#'
+#' # get the summary of the model
+#' summary(model)
+#'
+#' # make predictions
+#' predictions <- predict(model, df)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mlp since 2.1.0
+setMethod("spark.mlp", signature(data = "SparkDataFrame"),
+  function(data, blockSize = 128, layers = c(3, 5, 2), solver = 
"l-bfgs", maxIter = 100,
+   tol = 0.5, stepSize = 1, seed = 1, ...) {
+jobj <- 
callJStatic("org.apache.spark.ml.r.MultilayerPerceptronClassifierWrapper",
+"fit", data@sdf, as.integer(blockSize), 
as.array(layers),
+as.character(solver), as.integer(maxIter), 
as.numeric(tol),
+as.numeric(stepSize), as.integer(seed))
+return(new("MultilayerPerceptronClassificationModel", jobj = 
jobj))
+  })
+
+# Makes predictions from a model produced by spark.mlp().
+
+#' @param newData A SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted 
labeled in a column named
+#' "prediction"
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame-method
+#' @export
+#' @note predict(MultilayerPerceptronClassificationModel) since 2.1.0
+setMethod("predict", signature(object = 
"MultilayerPerceptronClassificationModel"),
+  function(object, newData) {
+return(dataFrame(callJMethod(object@jobj, "transform", 
newData@sdf)))
+  })
+
+# Returns the summary of a Multilayer Perceptron Classification Model 
produced by \code{spark.mlp}
+
+#' @param object A Multilayer Perceptron Classification Model fitted by 
\code{spark.mlp}
+#' @return \code{summary} returns a list containing \code{layers}, the 
label distribution, and
+#' \code{tables}, conditional probabilities given the target label
+#' @rdname spark.mlp
+#' @export
+#' @aliases spark.mlp,SparkDataFrame-method
--- End diff --

same here, the alias should be
`@aliases summary,MultilayerPerceptronClassificationModel-method`





[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74692255
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,94 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", 
newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model 
against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, 
\code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to 
save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' 
\href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for 
model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Integer vector containing the number of nodes for each 
layer
+#' @param solver Solver parameter, supported options: "gd" (minibatch 
gradient descent) or "l-bfgs"
+#' @param maxIter Maximum iteration number
+#' @param tol Convergence tolerance of iterations
+#' @param stepSize StepSize parameter
+#' @param seed Seed parameter for weights initialization
+#' @return \code{spark.mlp} returns a fitted Multilayer Perceptron 
Classification Model
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame-method
+#' @name spark.mlp
+#' @seealso \link{read.ml}
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- read.df("data/mllib/sample_multiclass_classification_data.txt", 
source = "libsvm")
+#'
+#' # fit a Multilayer Perceptron Classification Model
+#' model <- spark.mlp(df, blockSize = 128, layers = c(4, 5, 4, 3), solver 
= "l-bfgs",
+#'maxIter = 100, tol = 0.5, stepSize = 1, seed = 1)
+#'
+#' # get the summary of the model
+#' summary(model)
+#'
+#' # make predictions
+#' predictions <- predict(model, df)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mlp since 2.1.0
+setMethod("spark.mlp", signature(data = "SparkDataFrame"),
+  function(data, blockSize = 128, layers = c(3, 5, 2), solver = 
"l-bfgs", maxIter = 100,
+   tol = 0.5, stepSize = 1, seed = 1, ...) {
+jobj <- 
callJStatic("org.apache.spark.ml.r.MultilayerPerceptronClassifierWrapper",
+"fit", data@sdf, as.integer(blockSize), 
as.array(layers),
+as.character(solver), as.integer(maxIter), 
as.numeric(tol),
+as.numeric(stepSize), as.integer(seed))
+return(new("MultilayerPerceptronClassificationModel", jobj = 
jobj))
+  })
+
+# Makes predictions from a model produced by spark.mlp().
+
+#' @param newData A SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted labeled in a column named
+#' "prediction"
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame-method
--- End diff --

wait, this alias doesn't seem right. It should be the name of the function
plus its signature, i.e.
`@aliases predict,MultilayerPerceptronClassificationModel-method`
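
A minimal sketch of the naming rule described above, assuming the usual roxygen convention of `<generic>,<class>-method` for S4 method aliases. The class and generic below are hypothetical stand-ins so the snippet runs in plain R without SparkR, and the method body is a placeholder rather than the real implementation.

```r
library(methods)

# Hypothetical stand-ins so the sketch runs without SparkR installed.
setClass("MultilayerPerceptronClassificationModel", representation(jobj = "list"))
setGeneric("predict", function(object, ...) standardGeneric("predict"))

#' @param newData A SparkDataFrame for testing
#' @rdname spark.mlp
#' @aliases predict,MultilayerPerceptronClassificationModel-method
#' @export
setMethod("predict", signature(object = "MultilayerPerceptronClassificationModel"),
          function(object, newData) {
            # Placeholder body: the real method calls "transform" on the JVM object.
            newData
          })
```

With the alias written this way, the help topic for the method is keyed by the function name and the dispatched class, rather than repeating the alias of `spark.mlp` itself.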





[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74692228
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,94 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", 
newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model 
against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, 
\code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to 
save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' 
\href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for 
model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Integer vector containing the number of nodes for each 
layer
+#' @param solver Solver parameter, supported options: "gd" (minibatch 
gradient descent) or "l-bfgs"
+#' @param maxIter Maximum iteration number
+#' @param tol Convergence tolerance of iterations
+#' @param stepSize StepSize parameter
+#' @param seed Seed parameter for weights initialization
+#' @return \code{spark.mlp} returns a fitted Multilayer Perceptron 
Classification Model
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame-method
+#' @name spark.mlp
+#' @seealso \link{read.ml}
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- read.df("data/mllib/sample_multiclass_classification_data.txt", 
source = "libsvm")
+#'
+#' # fit a Multilayer Perceptron Classification Model
+#' model <- spark.mlp(df, blockSize = 128, layers = c(4, 5, 4, 3), solver 
= "l-bfgs",
+#'maxIter = 100, tol = 0.5, stepSize = 1, seed = 1)
+#'
+#' # get the summary of the model
+#' summary(model)
+#'
+#' # make predictions
+#' predictions <- predict(model, df)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mlp since 2.1.0
+setMethod("spark.mlp", signature(data = "SparkDataFrame"),
+  function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100,
+   tol = 0.5, stepSize = 1, seed = 1, ...) {
--- End diff --

was there a reason for `...`? We are finding that the CRAN check warns about
`...` in a function definition when there is no matching `@param` for it. I
think it's best not to have it in the function definition.
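
A rough sketch of the suggestion above: the generic keeps `...`, but the method lists only named arguments, so each formal argument can have its own `@param` line. The class, the generic, and the placeholder body are hypothetical stand-ins (the real method calls into the JVM wrapper), and the argument list is shortened for illustration.

```r
library(methods)

# Hypothetical stand-ins so the sketch runs without SparkR installed.
setClass("SparkDataFrame", representation(sdf = "list"))
setGeneric("spark.mlp", function(data, ...) standardGeneric("spark.mlp"))

#' @param data A SparkDataFrame of observations and labels for model fitting
#' @param blockSize BlockSize parameter
#' @param layers Integer vector containing the number of nodes for each layer
#' @rdname spark.mlp
#' @aliases spark.mlp,SparkDataFrame-method
setMethod("spark.mlp", signature(data = "SparkDataFrame"),
          function(data, blockSize = 128, layers = c(3, 5, 2)) {
            # No `...` here, so every formal argument has a matching @param.
            list(blockSize = as.integer(blockSize), layers = as.integer(layers))
          })

# Named arguments still reach the method through the generic's `...`.
model <- spark.mlp(new("SparkDataFrame", sdf = list()), layers = c(4, 5, 4, 3))
```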





[GitHub] spark issue #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse proxy ...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13950
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse proxy ...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13950
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63731/





[GitHub] spark issue #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse proxy ...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13950
  
**[Test build #63731 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63731/consoleFull)**
 for PR 13950 at commit 
[`222fd1d`](https://github.com/apache/spark/commit/222fd1dc5944a851f4aa267a01c2a015b12b3ec9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14625
  
@gatorsmile the comment should apply not only to the data, but also to the
query (e.g. what case we are testing ...)





[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...

2016-08-13 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14629
  
Yes that's a good question. A 0-column DataFrame is valid, though that's a 
little different from being able to select 0 columns from a DataFrame. I don't 
have a database handy, but can you select no columns in any SQL syntax? Maybe 
best to emulate that?





[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-13 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14559
  
Oh... good point. I see the existing `truncate` option isn't documented 
either. Yes that should be done in `sql-programming-guide.md`. We can follow up 
on this one or make a new small issue for it.





[GitHub] spark issue #14607: [SPARK-16905] SQL DDL: MSCK REPAIR TABLE (follow-up)

2016-08-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14607
  
@davies Can you create a new JIRA ticket for this change? It is a
non-trivial follow-up.





[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...

2016-08-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14629
  
Why do we want to enforce this? It is valid to have a DataFrame without any 
columns.






[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14559
  
Are the options here documented anywhere?






[GitHub] spark issue #14527: [SPARK-16938][SQL] `drop/dropDuplicate` should handle th...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14527
  
**[Test build #63734 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63734/consoleFull)**
 for PR 14527 at commit 
[`fbd4c2c`](https://github.com/apache/spark/commit/fbd4c2ccfdf46be82584cd1469dc3c4517db8538).





[GitHub] spark issue #14583: [SPARK-16994][SQL] PushDownPredicate should not ignore l...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14583
  
**[Test build #63732 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63732/consoleFull)**
 for PR 14583 at commit 
[`91fb344`](https://github.com/apache/spark/commit/91fb344729d2d24ff2f1f968a59a87f5ca91c355).





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14580
  
**[Test build #63733 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63733/consoleFull)**
 for PR 14580 at commit 
[`11f2509`](https://github.com/apache/spark/commit/11f250921c6eef0c10915abbf26515f7599abd64).





[GitHub] spark issue #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse proxy ...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13950
  
**[Test build #63731 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63731/consoleFull)**
 for PR 13950 at commit 
[`222fd1d`](https://github.com/apache/spark/commit/222fd1dc5944a851f4aa267a01c2a015b12b3ec9).





[GitHub] spark issue #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse proxy ...

2016-08-13 Thread gurvindersingh
Github user gurvindersingh commented on the issue:

https://github.com/apache/spark/pull/13950
  
@vanzin addressed most of your comments. Wondering if the test for the location
header is really needed, as it is just two very simple `if` checks. Let me know
if you feel strongly about that or if any other changes are required.





[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...

2016-08-13 Thread gurvindersingh
Github user gurvindersingh commented on a diff in the pull request:

https://github.com/apache/spark/pull/13950#discussion_r74690638
  
--- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala ---
@@ -186,6 +188,67 @@ private[spark] object JettyUtils extends Logging {
 contextHandler
   }
 
+  /** Create a handler for proxying request to Workers and Application Drivers */
+  def createProxyHandler(
--- End diff --

added a test under UISuite, let me know if that is not the right place for 
it.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14625
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63730/





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14625
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14625
  
**[Test build #63730 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63730/consoleFull)**
 for PR 14625 at commit 
[`3fe55f1`](https://github.com/apache/spark/commit/3fe55f184e5e8771c88b826f9bcccb76d9817624).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14182: [SPARK-16444][SparkR]: Isotonic Regression wrappe...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14182#discussion_r74689929
  
--- Diff: R/pkg/R/mllib.R ---
@@ -299,6 +308,91 @@ setMethod("summary", signature(object = "NaiveBayesModel"),
 return(list(apriori = apriori, tables = tables))
   })
 
+#' Isotonic Regression Model
+#'
+#' Fits an Isotonic Regression model against a Spark DataFrame, similarly 
to R's isoreg().
+#' Users can print, make predictions on the produced model and save the 
model to the input path.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. 
Currently only a few formula
+#'operators are supported, including '~', '.', ':', '+', 
and '-'.
+#' @param isotonic Whether the output sequence should be 
isotonic/increasing (TRUE) or
+#' antitonic/decreasing (FALSE)
+#' @param featureIndex The index of the feature if \code{featuresCol} is a 
vector column (default: `0`),
+#' no effect otherwise
+#' @param weightCol The weight column name.
+#' @return \code{spark.isoreg} returns a fitted Isotonic Regression model
+#' @rdname spark.isoreg
+#' @aliases spark.isoreg,SparkDataFrame,formula-method
+#' @name spark.isoreg
+#' @export
+#' @examples
+#' \dontrun{
+#' sparkR.session()
+#' data <- list(list(7.0, 0.0), list(5.0, 1.0), list(3.0, 2.0),
+#' list(5.0, 3.0), list(1.0, 4.0))
+#' df <- createDataFrame(data, c("label", "feature"))
+#' model <- spark.isoreg(df, label ~ feature, isotonic = FALSE)
+#' # return model boundaries and prediction as lists
+#' result <- summary(model, df)
+#' # prediction based on fitted model
+#' predict_data <- list(list(-2.0), list(-1.0), list(0.5),
+#' list(0.75), list(1.0), list(2.0), list(9.0))
+#' predict_df <- createDataFrame(predict_data, c("feature"))
+#' # get prediction column
+#' predict_result <- collect(select(predict(model, predict_df), 
"prediction"))
+#'
+#' # save fitted model to input path
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#'
+#' # can also read back the saved model and print
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.isoreg since 2.1.0
+setMethod("spark.isoreg", signature(data = "SparkDataFrame", formula = 
"formula"),
+  function(data, formula, isotonic = TRUE, featureIndex = 0, 
weightCol = NULL) {
+formula <- paste0(deparse(formula), collapse = "")
+
+if (is.null(weightCol)) {
+  weightCol <- ""
+}
+
+jobj <- 
callJStatic("org.apache.spark.ml.r.IsotonicRegressionWrapper", "fit",
+data@sdf, formula, as.logical(isotonic), 
as.integer(featureIndex),
+  as.character(weightCol))
+return(new("IsotonicRegressionModel", jobj = jobj))
+  })
+
+#  Predicted values based on an isotonicRegression model
+
+#' @param object a fitted IsotonicRegressionModel
+#' @param newData SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted 
values
+#' @rdname spark.isoreg
+#' @export
+#' @note predict(IsotonicRegressionModel) since 2.1.0
+setMethod("predict", signature(object = "IsotonicRegressionModel"),
+  function(object, newData) {
+return(dataFrame(callJMethod(object@jobj, "transform", 
newData@sdf)))
+  })
+
+#  Get the summary of an IsotonicRegressionModel model
+
+#' @param object a fitted IsotonicRegressionModel
--- End diff --

The function is defined as `function(object, ...)`.
Even though `...` is not actually used and is not in the `signature()` field,
CRAN checks (which we are working on) will flag this since it doesn't have a
`@param` line.

To address this, add a line `@param ... unused additional parameters` or
similar. Please see other functions for examples.
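
A minimal sketch of that fix, assuming the method keeps `...` in its definition and documents it with its own `@param` line. The class, the generic, and the placeholder body are hypothetical stand-ins so the snippet runs in plain R; the real `summary` method reads its values from the JVM model.

```r
library(methods)

# Hypothetical stand-ins so the sketch runs without SparkR installed.
setClass("IsotonicRegressionModel", representation(jobj = "list"))
setGeneric("summary", function(object, ...) standardGeneric("summary"))

#' @param object a fitted IsotonicRegressionModel
#' @param ... unused additional parameters
#' @rdname spark.isoreg
#' @aliases summary,IsotonicRegressionModel-method
#' @export
setMethod("summary", signature(object = "IsotonicRegressionModel"),
          function(object, ...) {
            # Placeholder body: return whatever the stand-in object carries.
            object@jobj
          })

model <- new("IsotonicRegressionModel",
             jobj = list(boundaries = c(0, 1), predictions = c(7, 1)))
result <- summary(model)
```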





[GitHub] spark pull request #14613: [SPARK-16883][SparkR]:SQL decimal type is not pro...

2016-08-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14613#discussion_r74689825
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -354,6 +354,24 @@ setMethod("colnames<-",
 dataFrame(sdf)
   })
 
+specialtypeshandle <- function(type) {
--- End diff --

maybe this should go into types.R?
Could you add a documentation comment explaining what this does and why?





[GitHub] spark issue #14522: [Spark-16508][SparkR] Split docs for arrange and orderBy...

2016-08-13 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14522
  
Since this fixes some CRAN check warnings, let's merge this? @shivaram what 
do you think?






[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread zero323
Github user zero323 commented on the issue:

https://github.com/apache/spark/pull/14614
  
@srowen By all means. Here you are. 





[GitHub] spark issue #14630: [SPARK-16966] [SQL] [CORE] App Name is a randomUUID even...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14630
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63727/
Test PASSed.





[GitHub] spark issue #14630: [SPARK-16966] [SQL] [CORE] App Name is a randomUUID even...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14630
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14630: [SPARK-16966] [SQL] [CORE] App Name is a randomUUID even...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14630
  
**[Test build #63727 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63727/consoleFull)**
 for PR 14630 at commit 
[`bccfc7d`](https://github.com/apache/spark/commit/bccfc7d1a2c8329736122d570a685dfcce39447e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14614
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14614
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63729/
Test PASSed.





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14614
  
**[Test build #63729 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63729/consoleFull)**
 for PR 14614 at commit 
[`efc95c5`](https://github.com/apache/spark/commit/efc95c5839341c419b3bebd028f110d0256bcc1e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14625: [SPARK-17045] [SQL] Moving Auto_Joins from HiveCompatibi...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14625
  
**[Test build #63730 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63730/consoleFull)**
 for PR 14625 at commit 
[`3fe55f1`](https://github.com/apache/spark/commit/3fe55f184e5e8771c88b826f9bcccb76d9817624).





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14614
  
**[Test build #63729 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63729/consoleFull)**
 for PR 14614 at commit 
[`efc95c5`](https://github.com/apache/spark/commit/efc95c5839341c419b3bebd028f110d0256bcc1e).





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14614
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63728/
Test FAILed.





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14614
  
**[Test build #63728 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63728/consoleFull)**
 for PR 14614 at commit 
[`afcee39`](https://github.com/apache/spark/commit/afcee393701abbda3ac155207ff39503d4752737).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14614
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14614
  
**[Test build #63728 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63728/consoleFull)**
 for PR 14614 at commit 
[`afcee39`](https://github.com/apache/spark/commit/afcee393701abbda3ac155207ff39503d4752737).





[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...

2016-08-13 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14557#discussion_r74688576
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -798,6 +798,19 @@ private[spark] class TaskSetManager(
   }
 }
 maybeFinishTaskSet()
+
+// kill running task if stage failed
+if(reason.isInstanceOf[FetchFailed]) {
+  killTasks(runningTasksSet, taskInfos)
+}
+  }
+
+  def killTasks(tasks: HashSet[Long], taskInfo: HashMap[Long, TaskInfo]): 
Boolean = {
+tasks.foreach { task =>
+  val executorId = taskInfo(task).executorId
+  sched.sc.schedulerBackend.killTask(task, executorId, true)
--- End diff --

I think `true` would look nicer with the name of the parameter.
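
A minimal sketch of this suggestion, assuming a hypothetical `Backend` trait as a stand-in for the scheduler backend. The parameter name `interruptThread` is an assumption and should be checked against the actual `killTask` signature before use:

```scala
// Hypothetical stand-in for the scheduler backend; not Spark's actual trait.
trait Backend {
  def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit
}

object NamedArgumentSketch {
  // Records calls so the effect is observable without a real cluster.
  class RecordingBackend extends Backend {
    val killed = scala.collection.mutable.ArrayBuffer.empty[(Long, String, Boolean)]
    def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit =
      killed += ((taskId, executorId, interruptThread))
  }

  def main(args: Array[String]): Unit = {
    val backend = new RecordingBackend
    // Naming the Boolean argument makes its meaning clear at the call site.
    backend.killTask(42L, "exec-1", interruptThread = true)
    println(backend.killed)
  }
}
```

With the named argument, a reader does not need to look up the method to learn what the bare `true` controls.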





[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...

2016-08-13 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14557#discussion_r74688557
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -798,6 +798,19 @@ private[spark] class TaskSetManager(
   }
 }
 maybeFinishTaskSet()
+
+// kill running task if stage failed
+if(reason.isInstanceOf[FetchFailed]) {
+  killTasks(runningTasksSet, taskInfos)
+}
+  }
+
+  def killTasks(tasks: HashSet[Long], taskInfo: HashMap[Long, TaskInfo]): 
Boolean = {
+tasks.foreach { task =>
+  val executorId = taskInfo(task).executorId
+  sched.sc.schedulerBackend.killTask(task, executorId, true)
+}
+true
--- End diff --

Why are you returning `true` if you don't use it?
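
One way to address this, sketched under assumptions: declare the helper as returning `Unit` so that no unused value is produced. `TaskInfo` below is a simplified hypothetical stand-in, and the `kill` callback is a hypothetical placeholder for the backend call in the diff:

```scala
import scala.collection.mutable.{HashMap, HashSet}

object KillTasksSketch {
  // Hypothetical, simplified stand-in for TaskInfo; only the field used here.
  final case class TaskInfo(executorId: String)

  // `kill` is a hypothetical callback standing in for the backend call.
  // The method returns Unit because callers only need the side effect.
  def killTasks(
      tasks: HashSet[Long],
      taskInfo: HashMap[Long, TaskInfo])(kill: (Long, String) => Unit): Unit = {
    tasks.foreach { task =>
      kill(task, taskInfo(task).executorId)
    }
  }
}
```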





[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...

2016-08-13 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14557#discussion_r74688550
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -798,6 +798,19 @@ private[spark] class TaskSetManager(
   }
 }
 maybeFinishTaskSet()
+
+// kill running task if stage failed
+if(reason.isInstanceOf[FetchFailed]) {
--- End diff --

A space between `if` and `(`.




