[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-25 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143318263
  
Also, could you change the PR title to `[SPARK-10807][SPARKR]`? This
matches the format we use for all PRs;
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-PullRequest
has more information.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-25 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143306461
  
Thanks @olarayej -- I just had a couple of minor style comments.

One more thing: it would be good to add a unit test for this. You
could add a test case alongside the existing tests at
https://github.com/apache/spark/blob/922338812c03eba43f2f1a6c414d1b6b049811cf/R/pkg/inst/tests/test_sparkSQL.R#L215





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-25 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/8908#discussion_r40457649
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1848,3 +1848,28 @@ setMethod("crosstab",
 sct <- callJMethod(statFunctions, "crosstab", col1, col2)
 collect(dataFrame(sct))
   })
+
+
+#' This function downloads the contents of a DataFrame into an R's data.frame.
+#' Since data.frames are held in memory, ensure that you have enough memory
+#' in your system to accommodate the contents.
+#' 
+#' @title Download data from a DataFrame into a data.frame
+#' @param x a DataFrame
+#' @return a data.frame
+#' @rdname as.data.frame
+#' @examples \dontrun{
+#' 
+#' irisDF <- createDataFrame(sqlContext, iris)
+#' df <- as.data.frame(irisDF[irisDF$Species == "setosa", ])
+#' }
+setGeneric("as.data.frame")
--- End diff --

We keep all the generics used in SparkR in R/generics.R. Could you move
this one to that file as well?





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-25 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/8908#discussion_r40457173
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1848,3 +1848,28 @@ setMethod("crosstab",
 sct <- callJMethod(statFunctions, "crosstab", col1, col2)
 collect(dataFrame(sct))
   })
+
+
+#' This function downloads the contents of a DataFrame into an R's data.frame.
+#' Since data.frames are held in memory, ensure that you have enough memory
+#' in your system to accommodate the contents.
+#' 
+#' @title Download data from a DataFrame into a data.frame
+#' @param x a DataFrame
+#' @return a data.frame
+#' @rdname as.data.frame
+#' @examples \dontrun{
+#' 
+#' irisDF <- createDataFrame(sqlContext, iris)
+#' df <- as.data.frame(irisDF[irisDF$Species == "setosa", ])
+#' }
+setGeneric("as.data.frame")
+setMethod(f = "as.data.frame", signature = "DataFrame", definition =
--- End diff --

Minor style comments -- To keep this similar to the other methods in this
file, I think this can be
```
setMethod("as.data.frame",
          signature(x = "DataFrame"),
          function(x, ...) {
            # method body goes here
          })
```





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-25 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/8908#discussion_r40457233
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1848,3 +1848,28 @@ setMethod("crosstab",
 sct <- callJMethod(statFunctions, "crosstab", col1, col2)
 collect(dataFrame(sct))
   })
+
+
+#' This function downloads the contents of a DataFrame into an R's data.frame.
+#' Since data.frames are held in memory, ensure that you have enough memory
+#' in your system to accommodate the contents.
+#' 
+#' @title Download data from a DataFrame into a data.frame
+#' @param x a DataFrame
+#' @return a data.frame
+#' @rdname as.data.frame
+#' @examples \dontrun{
+#' 
+#' irisDF <- createDataFrame(sqlContext, iris)
+#' df <- as.data.frame(irisDF[irisDF$Species == "setosa", ])
+#' }
+setGeneric("as.data.frame")
+setMethod(f = "as.data.frame", signature = "DataFrame", definition =
+  function(x, ...) {
+    # Check if additional parameters have been passed
+    if (length(list(...)) > 0) {
+      stop(paste("Unused argument(s): ", paste(list(...), collapse=", ")))
+    }
+    return(collect(x))
--- End diff --

Unless explicitly required we don't use `return`, so this can just be 
`collect(x)`
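
Taken together, the review comments in this thread suggest a method shaped roughly like the sketch below. It is only an illustration under stated assumptions: `MockDataFrame` and the `collect` generic defined here are hypothetical stand-ins so that the code runs without a Spark installation, and they are not SparkR's actual classes or implementation.

```r
library(methods)

# Hypothetical stand-in for SparkR's DataFrame class (assumption: the real
# class wraps a JVM-side object; here we just hold a local data.frame).
setClass("MockDataFrame", representation(rows = "data.frame"))

# Hypothetical stand-in for SparkR's collect(), which brings the data into
# local R memory.
setGeneric("collect", function(x, ...) standardGeneric("collect"))
setMethod("collect",
          signature(x = "MockDataFrame"),
          function(x, ...) {
            x@rows
          })

# Promote base::as.data.frame to an S4 generic. In the PR the reviewer asks
# for this declaration to live in R/generics.R.
setGeneric("as.data.frame")

# Method written in the style the reviewer suggests: positional name,
# signature(x = ...), and no explicit return().
setMethod("as.data.frame",
          signature(x = "MockDataFrame"),
          function(x, ...) {
            # Reject additional arguments, as the patch under review does.
            if (length(list(...)) > 0) {
              stop(paste("Unused argument(s): ",
                         paste(list(...), collapse = ", ")))
            }
            # The value of the last expression is returned implicitly.
            collect(x)
          })

# Usage: mirrors the roxygen example in the diff, using the mock class.
mockDF <- new("MockDataFrame", rows = iris[iris$Species == "setosa", ])
localDF <- as.data.frame(mockDF)
```

A unit test along the lines requested earlier in the thread could check that `localDF` is a `data.frame` with the expected number of rows, and that passing an unknown named argument (for example `foo = 1`) raises an error.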





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143072477
  
[Test build #42989 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42989/consoleFull) for PR 8908 at commit [`cee871c`](https://github.com/apache/spark/commit/cee871c3a1a00b9da18da535f872ea752307fa92).





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143055725
  
[Test build #42981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42981/consoleFull) for PR 8908 at commit [`e9e34b5`](https://github.com/apache/spark/commit/e9e34b54f22ad99a80ee144774fd852a6634ed4e).





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143056182
  
[Test build #42981 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42981/console) for PR 8908 at commit [`e9e34b5`](https://github.com/apache/spark/commit/e9e34b54f22ad99a80ee144774fd852a6634ed4e).
 * This patch **fails R style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143056188
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143056191
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42981/





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143051220
  
Can one of the admins verify this patch?





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread olarayej
GitHub user olarayej opened a pull request:

https://github.com/apache/spark/pull/8908

SPARK-10807. Added as.data.frame as a synonym for collect().



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/olarayej/spark SPARK-10807

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8908.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8908


commit 461714d727457d0aa4a3f39c5ff046860a1c7b9a
Author: Oscar D. Lara Yejas 
Date:   2015-09-24T21:01:39Z

SPARK-10807. Added as.data.frame as a synonym for collect().







[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143066210
  
You can run `dev/lint-r` in your tree and it should do the style checks.
Also, the Jenkins log shows the error at the end:
```
R/DataFrame.R:1867:69: style: Trailing whitespace is superfluous.
setMethod(f = "as.data.frame", signature = "DataFrame", definition = 
```





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143072021
  
Merged build started.





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread olarayej
Github user olarayej commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143065273
  
How can I check the style in my local environment? I ran the unit tests but
got no issues. I tried the link below, but it only covers Python and Scala:
https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143053859
  
Jenkins, ok to test





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143054348
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143054374
  
Merged build started.





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143077368
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42989/





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143077224
  
[Test build #42989 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42989/console) for PR 8908 at commit [`cee871c`](https://github.com/apache/spark/commit/cee871c3a1a00b9da18da535f872ea752307fa92).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143077365
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread olarayej
Github user olarayej commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143071669
  
Yeah, I saw that one. Thank you so much! I have fixed it and run
`dev/lint-r` locally. It should be good now :)





[GitHub] spark pull request: SPARK-10807. Added as.data.frame as a synonym ...

2015-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8908#issuecomment-143072009
  
 Merged build triggered.

