[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread yu-iskw
Github user yu-iskw commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135624869
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135624510
  
Merged build started.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135624503
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-10049][SPARKR] Support collecting data ...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8458#issuecomment-135623885
  
  [Test build #41723 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41723/consoleFull)
 for   PR 8458 at commit 
[`2bc97ad`](https://github.com/apache/spark/commit/2bc97adc8a081301e0bc7394d35dd617a9ae49a5).





[GitHub] spark pull request: [SPARK-5753] [SQL] add JDBCRDD support for pos...

2015-08-27 Thread jakajancar
Github user jakajancar commented on the pull request:

https://github.com/apache/spark/pull/4549#issuecomment-135623856
  
@lepfhty Has any progress been made on this? Can you point me to a 
PR/branch/JIRA issue/... that I can subscribe to?





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread yu-iskw
Github user yu-iskw commented on a diff in the pull request:

https://github.com/apache/spark/pull/8495#discussion_r38168825
  
--- Diff: R/pkg/R/generics.R ---
@@ -413,7 +413,7 @@ setGeneric("dropna",
 #' @rdname nafunctions
 #' @export
 setGeneric("na.omit",
--- End diff --

I'm just testing whether we need this generic.





[GitHub] spark pull request: [SPARK-10049][SPARKR] Support collecting data ...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8458#issuecomment-135623380
  
Merged build started.





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135623342
  
OK. I guess the main question here is whether we want `array_contains` to 
have different semantics from Hive regarding `null`.





[GitHub] spark pull request: [SPARK-10049][SPARKR] Support collecting data ...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8458#issuecomment-135623371
  
 Merged build triggered.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135622466
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135622468
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41721/
Test FAILed.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135621721
  
  [Test build #41722 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41722/consoleFull)
 for   PR 8495 at commit 
[`4758a87`](https://github.com/apache/spark/commit/4758a87ea3b74914ffd2870e1a736472944c4a04).





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135621561
  
Merged build started.





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135621549
  
PostgreSQL's output regarding `in`:

```
yhuai=# select cast(null as char(10)) in ('1', cast(null as char(10)));
 ?column? 
----------
 
(1 row)

yhuai=# select cast(null as char(10)) in ('1', cast(null as char(10))) is null;
 ?column? 
----------
 t
(1 row)
```





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135621545
  
 Merged build triggered.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread yu-iskw
Github user yu-iskw commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135621454
  
@shivaram All right. I am checking this on my local machine. Please give me 
a few minutes.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135620933
  
@yu-iskw I also found a minor bug in lint-r that I just fixed. Please let 
me know if the fix looks good. With this change, lint-r passes on my machine.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135620803
  
 Merged build triggered.





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135620819
  
Merged build started.





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135620750
  
Here is the output of some sample tests using Hive 1.2.1:
```
hive> select cast(null as string) in ('1', cast(null as string));
OK
NULL
Time taken: 0.042 seconds, Fetched: 1 row(s)
hive> select array_contains(array('1', cast(null as string)), cast(null as string));
OK
false
Time taken: 0.042 seconds, Fetched: 1 row(s)
hive> 
```
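
The difference between the two results above can be sketched in Python. This is only a model of the observed behaviour, not code from Spark or Hive: the function names `sql_in` and `array_contains_two_valued` are hypothetical, `None` stands in for SQL `NULL`, and the second function assumes nothing beyond the single Hive 1.2.1 output shown above (a `NULL` needle yields `false`).

```python
from typing import Optional, Sequence


def sql_in(value: Optional[str], candidates: Sequence[Optional[str]]) -> Optional[bool]:
    """Three-valued IN: True on a match, None (NULL) when undecidable, else False."""
    if value is None:
        # A NULL left-hand side cannot be compared, so the result is NULL.
        return None
    saw_null = False
    for candidate in candidates:
        if candidate is None:
            saw_null = True  # An unknown element might have matched.
        elif candidate == value:
            return True
    # No match: NULL if some element was unknown, otherwise a definite False.
    return None if saw_null else False


def array_contains_two_valued(array: Sequence[Optional[str]], value: Optional[str]) -> bool:
    """Two-valued lookup modelled on the Hive output above: never returns NULL."""
    if value is None:
        return False  # Assumption taken from the single Hive result shown above.
    return any(item is not None and item == value for item in array)


if __name__ == "__main__":
    print(sql_in(None, ["1", None]))                     # models: null in ('1', null)
    print(array_contains_two_valued(["1", None], None))  # models: array_contains(array('1', null), null)
```

The question in this thread is which of these two behaviours `array_contains` should follow when a `null` is involved.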





[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread shivaram
GitHub user shivaram opened a pull request:

https://github.com/apache/spark/pull/8495

[SPARKR] [SPARK-10328] Fix generic for na.omit

The S3 function is documented at 
https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shivaram/spark-1 na-omit-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8495.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8495


commit ff733d2aef7bc204a9361bcbb0415b97841a71b1
Author: Shivaram Venkataraman 
Date:   2015-08-28T03:01:32Z

Fix generic for na.omit







[GitHub] spark pull request: [SPARKR] [SPARK-10328] Fix generic for na.omit

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8495#issuecomment-135620501
  
cc @yu-iskw 





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135620315
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135620319
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41718/
Test PASSed.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135620132
  
  [Test build #41718 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41718/console)
 for   PR 8343 at commit 
[`472c767`](https://github.com/apache/spark/commit/472c76714c25b909e281d8079b7ead6c152d4512).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-10327][SQL] Cache Table is not working ...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8494#issuecomment-135620007
  
  [Test build #41720 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41720/consoleFull)
 for   PR 8494 at commit 
[`bfd40d9`](https://github.com/apache/spark/commit/bfd40d999b6530bc04fc03ea6591c0093e10e534).





[GitHub] spark pull request: [SPARK-9741][SQL] Approximate Count Distinct u...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8362#issuecomment-135619890
  
  [Test build #41719 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41719/consoleFull)
 for   PR 8362 at commit 
[`1ea722b`](https://github.com/apache/spark/commit/1ea722b44745036ef568447f9db93a7ebade8b12).





[GitHub] spark pull request: [SPARK-10327][SQL] Cache Table is not working ...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8494#issuecomment-135619546
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-10327][SQL] Cache Table is not working ...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8494#issuecomment-135619597
  
Merged build started.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135619437
  
I think I found the problem. Our `setGeneric` for `na.omit` is wrong: it is 
too restrictive. We need a diff which looks like 

```
-   function(x, how = c("any", "all"), minNonNulls = NULL, cols = NULL) {
+   function(object, ...) {
```

I'll send a PR in a minute.
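
As a loose analogy only (this is Python, not R, and it does not reproduce S4 dispatch), the sketch below shows why a replacement with narrower formals can break callers written against the original signature. All names here (`base_na_omit`, `restrictive_na_omit`, `permissive_na_omit`, `call_like_existing_code`) are hypothetical and exist purely for illustration.

```python
def base_na_omit(object, **kwargs):
    """Stand-in for the pre-existing function: drops None entries from a list.

    The parameter is deliberately called `object` to mirror the `(object, ...)`
    formals in the diff above, even though it shadows a Python builtin.
    """
    return [value for value in object if value is not None]


def restrictive_na_omit(x, how="any", min_non_nulls=None, cols=None):
    """Replacement whose formals differ from the original (first argument renamed)."""
    return base_na_omit(x)


def permissive_na_omit(object, **kwargs):
    """Replacement that keeps the original formals and forwards extra arguments."""
    return base_na_omit(object, **kwargs)


def call_like_existing_code(na_omit):
    """A caller written against the original signature, passing the argument by name."""
    try:
        return na_omit(object=[1, None, 2])
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(call_like_existing_code(permissive_na_omit))
    print(call_like_existing_code(restrictive_na_omit))
```

Under these assumptions the permissive version keeps working for existing callers, while the restrictive one rejects the `object=` keyword.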





[GitHub] spark pull request: [SPARK-9741][SQL] Approximate Count Distinct u...

2015-08-27 Thread hvanhovell
Github user hvanhovell commented on the pull request:

https://github.com/apache/spark/pull/8362#issuecomment-135619245
  
Implemented initial non-sparse HLL++. I am going to take a look at the 
sparse version next week. The results are still equal to the Clearspring HLL+ 
implementation in non-sparse mode.

I also need to clean up the docs for the main HLL++ class a bit.





[GitHub] spark pull request: [SPARK-10327][SQL] Cache Table is not working ...

2015-08-27 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/8494#issuecomment-135618510
  
cc @marmbrus 





[GitHub] spark pull request: [SPARK-10327][SQL] Cache Table is not working ...

2015-08-27 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request:

https://github.com/apache/spark/pull/8494

[SPARK-10327][SQL] Cache Table is not working while subquery has alias in 
its project list

```scala
import org.apache.spark.sql.hive.execution.HiveTableScan
sql("select key, value, key + 1 from src").registerTempTable("abc")
cacheTable("abc")

val sparkPlan = sql(
  """select a.key, b.key, c.key from
|abc a join abc b on a.key=b.key
|join abc c on a.key=c.key""".stripMargin).queryExecution.sparkPlan

assert(sparkPlan.collect { case e: InMemoryColumnarTableScan => e }.size === 3) // failed
assert(sparkPlan.collect { case e: HiveTableScan => e }.size === 0) // failed
```

The actual plan is:

```
== Parsed Logical Plan ==
'Project [unresolvedalias('a.key),unresolvedalias('b.key),unresolvedalias('c.key)]
 'Join Inner, Some(('a.key = 'c.key))
  'Join Inner, Some(('a.key = 'b.key))
   'UnresolvedRelation [abc], Some(a)
   'UnresolvedRelation [abc], Some(b)
  'UnresolvedRelation [abc], Some(c)

== Analyzed Logical Plan ==
key: int, key: int, key: int
Project [key#14,key#61,key#66]
 Join Inner, Some((key#14 = key#66))
  Join Inner, Some((key#14 = key#61))
   Subquery a
    Subquery abc
     Project [key#14,value#15,(key#14 + 1) AS _c2#16]
      MetastoreRelation default, src, None
   Subquery b
    Subquery abc
     Project [key#61,value#62,(key#61 + 1) AS _c2#58]
      MetastoreRelation default, src, None
  Subquery c
   Subquery abc
    Project [key#66,value#67,(key#66 + 1) AS _c2#63]
     MetastoreRelation default, src, None

== Optimized Logical Plan ==
Project [key#14,key#61,key#66]
 Join Inner, Some((key#14 = key#66))
  Project [key#14,key#61]
   Join Inner, Some((key#14 = key#61))
    Project [key#14]
     InMemoryRelation [key#14,value#15,_c2#16], true, 1, StorageLevel(true, true, false, true, 1), (Project [key#14,value#15,(key#14 + 1) AS _c2#16]), Some(abc)
    Project [key#61]
     MetastoreRelation default, src, None
  Project [key#66]
   MetastoreRelation default, src, None

== Physical Plan ==
TungstenProject [key#14,key#61,key#66]
 BroadcastHashJoin [key#14], [key#66], BuildRight
  TungstenProject [key#14,key#61]
   BroadcastHashJoin [key#14], [key#61], BuildRight
    ConvertToUnsafe
     InMemoryColumnarTableScan [key#14], (InMemoryRelation [key#14,value#15,_c2#16], true, 1, StorageLevel(true, true, false, true, 1), (Project [key#14,value#15,(key#14 + 1) AS _c2#16]), Some(abc))
    ConvertToUnsafe
     HiveTableScan [key#61], (MetastoreRelation default, src, None)
  ConvertToUnsafe
   HiveTableScan [key#66], (MetastoreRelation default, src, None)
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenghao-intel/spark weird_cache

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8494.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8494


commit bfd40d999b6530bc04fc03ea6591c0093e10e534
Author: Cheng Hao 
Date:   2015-08-28T02:41:56Z

Cache Table is not working while subquery has alias in its project list







[GitHub] spark pull request: [SPARK-10065][SQL] Avoid triple copying of var...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8484#issuecomment-135618198
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41714/
Test PASSed.





[GitHub] spark pull request: [SPARK-10065][SQL] Avoid triple copying of var...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8484#issuecomment-135618193
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-10065][SQL] Avoid triple copying of var...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8484#issuecomment-135618028
  
  [Test build #41714 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41714/console)
 for   PR 8484 at commit 
[`35371fb`](https://github.com/apache/spark/commit/35371fb629217ee27ccda451c931d04137c05f93).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-9741][SQL] Approximate Count Distinct u...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8362#issuecomment-135617993
  
Merged build started.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread yu-iskw
Github user yu-iskw commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135617969
  
@shivaram I see. I'll investigate the cause. Thank you for letting me know. 





[GitHub] spark pull request: [SPARK-9741][SQL] Approximate Count Distinct u...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8362#issuecomment-135617983
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135617728
  
@yu-iskw I actually see an error in Jenkins, shown below. I guess this is from 
the PR that added na.omit to the NAMESPACE yesterday:
```
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function 'na.omit' for signature 
'"integer"'
Calls: lint_package ... ends -> as.igraph.es -> inherits -> na.omit -> 

```





[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-27 Thread yu-iskw
Github user yu-iskw commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135617611
  
@shivaram thank you for merging it. I'll keep watching Jenkins for the next 
couple of hours. If it goes well, I will inform the community about this lint 
script.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135617402
  
  [Test build #41718 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41718/consoleFull)
 for   PR 8343 at commit 
[`472c767`](https://github.com/apache/spark/commit/472c76714c25b909e281d8079b7ead6c152d4512).





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135616536
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135616551
  
Merged build started.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135616422
  
@lresende I'm just retesting this as we merged an R style checker. I'm sure 
this PR should be fine, but I just want to run it through to make sure things 
are working fine.

FYI @yu-iskw 





[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7883





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135616345
  
Jenkins, retest this please





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135616119
  
Thanks @lresende -- LGTM





[GitHub] spark pull request: [SPARK-8813][SQL] Combine files when there're ...

2015-08-27 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/8125#discussion_r38167285
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/sources/CombineSmallFile.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources
+
+import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.SQLContext
+
+object CombineSmallFile {
+  def combineWithFiles[T](rdd: RDD[T], sqlContext: SQLContext, inputFiles: Array[FileStatus])
+  : RDD[T] = {
+if (sqlContext.conf.combineSmallFile) {
+  val totalLen = inputFiles.map { file =>
+if (file.isDir) 0L else file.getLen
+  }.sum
+  val numPartitions = (totalLen / sqlContext.conf.splitSize + 1).toInt
+  rdd.coalesce(numPartitions)
--- End diff --

I think this is a very hacky way to solve this problem. We cannot tell how 
the data source will be split; even for Hadoop, the split size is just a 
hint, so using it to compute the number of partitions is probably too risky 
for a generic data processing framework.

Also, `RDD.coalesce` combines the splits in an arbitrary way, which will 
probably cause data skew, as we will most likely combine large partitions 
into a single task.

IMO, we should investigate more deeply how Hive combines small partitions 
using `CombineHiveInputFormat` or `HiveInputFormat`, which seems to have a 
strategy for selecting partitions according to the input format while also 
keeping them balanced.
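
To make the skew concern concrete, here is a minimal sketch in plain Scala with no Spark dependency. `InputFile`, `numPartitions`, and `packBySize` are hypothetical names introduced only for illustration. `numPartitions` mirrors the arithmetic in the diff above (without the directory check), and `packBySize` is a simple greedy heuristic for size-aware grouping; it is not what `RDD.coalesce` or Hive's `CombineHiveInputFormat` actually does.

```scala
object CombineSketch {
  // Hypothetical stand-in for a file status: a name and a length in bytes.
  final case class InputFile(name: String, len: Long)

  // Same arithmetic as the diff: total bytes / split size, plus one.
  def numPartitions(files: Seq[InputFile], splitSize: Long): Int =
    (files.map(_.len).sum / splitSize + 1).toInt

  // Greedy size-aware grouping: take files from largest to smallest and
  // put each one into the group that currently holds the fewest bytes.
  def packBySize(files: Seq[InputFile], n: Int): Vector[Vector[InputFile]] = {
    val bins = Array.fill(n)(Vector.empty[InputFile])
    val loads = Array.fill(n)(0L)
    files.sortBy(f => -f.len).foreach { f =>
      val i = loads.indexOf(loads.min) // index of the lightest group so far
      bins(i) = bins(i) :+ f
      loads(i) += f.len
    }
    bins.toVector
  }
}
```

For files of 60, 50, 40, 30 and 20 bytes and a split size of 100, `numPartitions` returns 3 even though the total is exactly two splits, and `packBySize(files, 2)` yields groups of 110 and 90 bytes. A grouping that ignores sizes gives no such bound on how uneven the groups can be.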





[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135615906
  
Alright, I'm going to merge this as it's better to do so before more breaking 
style changes get in. Will watch Jenkins for the next couple of hours to make 
sure things are fine.





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/8464#discussion_r38166499
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/local/LimitNode.scala ---
@@ -0,0 +1,45 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.local
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Attribute
+
+
+case class LimitNode(limit: Int, child: LocalNode) extends UnaryLocalNode {
--- End diff --

I think we still need `filter` or `map` for these iterator trees. @rxin is 
there anything I misunderstand about the `LocalNode` design?
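
As a rough illustration of what a `filter` node could look like next to a limit node in such an iterator tree, here is a self-contained sketch. The `Node` trait and the `SeqScan`, `Filter`, `Limit`, and `drain` names are hypothetical and simplified; this is not Spark's actual `LocalNode` API, only an assumed open/next/fetch/close shape.

```scala
object IteratorTreeSketch {
  // Hypothetical, simplified node interface (an assumption, not Spark's API).
  trait Node[A] {
    def open(): Unit
    def next(): Boolean // advance; true if a row is now available
    def fetch(): A      // current row; only valid after next() returned true
    def close(): Unit
  }

  // Leaf node that scans an in-memory sequence.
  final class SeqScan[A](rows: Seq[A]) extends Node[A] {
    private var it: Iterator[A] = Iterator.empty
    private var current: Option[A] = None
    def open(): Unit = { it = rows.iterator }
    def next(): Boolean = {
      current = if (it.hasNext) Some(it.next()) else None
      current.isDefined
    }
    def fetch(): A = current.get
    def close(): Unit = { it = Iterator.empty; current = None }
  }

  // Skips child rows until one satisfies the predicate.
  final class Filter[A](pred: A => Boolean, child: Node[A]) extends Node[A] {
    def open(): Unit = child.open()
    def next(): Boolean = {
      var found = false
      while (!found && child.next()) found = pred(child.fetch())
      found
    }
    def fetch(): A = child.fetch()
    def close(): Unit = child.close()
  }

  // Stops after `limit` rows have been produced.
  final class Limit[A](limit: Int, child: Node[A]) extends Node[A] {
    private var count = 0
    def open(): Unit = { count = 0; child.open() }
    def next(): Boolean =
      if (count < limit && child.next()) { count += 1; true } else false
    def fetch(): A = child.fetch()
    def close(): Unit = child.close()
  }

  // Runs a tree to completion and collects its rows.
  def drain[A](node: Node[A]): List[A] = {
    node.open()
    val out = List.newBuilder[A]
    try { while (node.next()) out += node.fetch() } finally node.close()
    out.result()
  }
}
```

Composing the nodes as `new Limit(2, new Filter[Int](_ % 2 == 0, new SeqScan(1 to 10)))` and draining the result gives the first two even numbers, which is the kind of composition the question above is about.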





[GitHub] spark pull request: [SPARK-10188] [Pyspark] Pyspark CrossValidator...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8399#issuecomment-135605403
  
  [Test build #1700 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1700/console)
 for   PR 8399 at commit 
[`bada453`](https://github.com/apache/spark/commit/bada4539227a3705337beea7e08bdc45183e2903).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-9545, SPARK-9547: Use Maven in PRB if ti...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7878#issuecomment-135604744
  
  [Test build #41711 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41711/console)
 for   PR 7878 at commit 
[`cf58c49`](https://github.com/apache/spark/commit/cf58c49c3be31c8e33639ba68eca16398f98c7f6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-9545, SPARK-9547: Use Maven in PRB if ti...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7878#issuecomment-135604771
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41711/
Test FAILed.





[GitHub] spark pull request: SPARK-9545, SPARK-9547: Use Maven in PRB if ti...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7878#issuecomment-135604767
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8464#issuecomment-135603859
  
  [Test build #41717 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41717/consoleFull)
 for   PR 8464 at commit 
[`7dcd502`](https://github.com/apache/spark/commit/7dcd502fc7278978fab5a233f4a81fefcca8bf72).





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8464#issuecomment-135603152
  
Merged build started.





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/8464#discussion_r38165750
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/local/LimitNode.scala ---
@@ -0,0 +1,45 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.local
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Attribute
+
+
+case class LimitNode(limit: Int, child: LocalNode) extends UnaryLocalNode {
+
+  private[this] var count = 0
+
+  override def output: Seq[Attribute] = child.output
+
+  override def open(): Unit = child.open()
+
+  override def close(): Unit = child.close()
+
+  override def get(): InternalRow = child.get()
--- End diff --

Renamed to `fetch`





[GitHub] spark pull request: [SPARK-10188] [Pyspark] Pyspark CrossValidator...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8399#issuecomment-135603184
  
  [Test build #1700 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1700/consoleFull)
 for   PR 8399 at commit 
[`bada453`](https://github.com/apache/spark/commit/bada4539227a3705337beea7e08bdc45183e2903).





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8464#issuecomment-135603138
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/8464#discussion_r38165742
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/local/LocalNodeTest.scala
 ---
@@ -0,0 +1,189 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.local
+
+import scala.util.control.NonFatal
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow}
+import org.apache.spark.sql.catalyst.util._
+import org.apache.spark.sql.{DataFrame, Row}
+import org.apache.spark.sql.types.StructType
+
+class LocalNodeTest extends SparkFunSuite {
+
+  /**
+   * Runs the LocalNode and makes sure the answer matches the expected 
result.
+   * @param input the input data to be used.
+   * @param nodeFunction a function which accepts the input LocalNode and 
uses it to instantiate
+   * the local physical operator that's being tested.
+   * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
+   * @param sortAnswers if true, the answers will be sorted by their 
toString representations prior
+   *to being compared.
+   */
+  protected def checkAnswer(
+  input: DataFrame,
+  nodeFunction: LocalNode => LocalNode,
+  expectedAnswer: Seq[Row],
+  sortAnswers: Boolean = true): Unit = {
+doCheckAnswer(
+  input :: Nil,
+  nodes => nodeFunction(nodes.head),
+  expectedAnswer,
+  sortAnswers)
+  }
+
+  /**
+   * Runs the LocalNode and makes sure the answer matches the expected 
result.
+   * @param left the left input data to be used.
+   * @param right the right input data to be used.
+   * @param nodeFunction a function which accepts the input LocalNode and 
uses it to instantiate
+   * the local physical operator that's being tested.
+   * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
+   * @param sortAnswers if true, the answers will be sorted by their 
toString representations prior
+   *to being compared.
+   */
+  protected def checkAnswer2(
+  left: DataFrame,
+  right: DataFrame,
+  nodeFunction: (LocalNode, LocalNode) => LocalNode,
+  expectedAnswer: Seq[Row],
+  sortAnswers: Boolean = true): Unit = {
+doCheckAnswer(
+  left :: right :: Nil,
+  nodes => nodeFunction(nodes(0), nodes(1)),
+  expectedAnswer,
+  sortAnswers)
+  }
+
+  /**
+   * Runs the `LocalNode`s and makes sure the answer matches the expected 
result.
+   * @param input the input data to be used.
+   * @param nodeFunction a function which accepts a sequence of input 
`LocalNode`s and uses them to
+   * instantiate the local physical operator that's 
being tested.
+   * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
+   * @param sortAnswers if true, the answers will be sorted by their 
toString representations prior
+   *to being compared.
+   */
+  protected def doCheckAnswer(
+input: Seq[DataFrame],
+nodeFunction: Seq[LocalNode] => LocalNode,
+expectedAnswer: Seq[Row],
+sortAnswers: Boolean = true): Unit = {
+LocalNodeTest.checkAnswer(
+  input.map(dataFrameToSeqScanNode), nodeFunction, expectedAnswer, sortAnswers) match {
+  case Some(errorMessage) => fail(errorMessage)
+  case None =>
+}
+  }
+
+  protected def dataFrameToSeqScanNode(df: DataFrame): SeqScanNode = {
+new SeqScanNode(
+  df.queryExecution.sparkPlan.output,
+  df.queryExecution.toRdd.map(_.copy()).collect())
+  }
+
+}
+
+/**
+ * Helper methods for writing tests of individual local physical operators.
+ */
+object LocalNodeTest {
+
+  /**
+   * Runs the `LocalNode`s and makes 

[GitHub] spark pull request: [SPARK-10188] [Pyspark] Pyspark CrossValidator...

2015-08-27 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/8399#issuecomment-135603112
  
Ping @mengxr In case I can't check this soon, it would be great to get this 
into 1.5 if there is an RC3.





[GitHub] spark pull request: [SPARK-10311][Streaming]Reload appId and attem...

2015-08-27 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135602834
  
Sorry that I didn't describe the problem clearly.

When an app starts from a checkpoint file using the [getOrCreate 
method](https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala#L829),
 the new AM process creates a new SparkContext object, but it just uses the [old 
SparkConf](https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala#L140),
 so the new attemptId set by the new AM process has no effect.

The appId also stays the same.
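
A minimal sketch of the idea, using plain maps instead of a real `SparkConf`: settings restored from the checkpoint are kept, except for the keys that describe the current run, which are taken from the new process. The `restore` helper is hypothetical, and the key names in `perAttemptKeys` are assumptions made for illustration; this is not the code in the pull request.

```scala
object ReloadConfSketch {
  // Keys assumed to describe the current run rather than the application.
  val perAttemptKeys: Set[String] =
    Set("spark.app.id", "spark.yarn.app.attemptId")

  // Start from the checkpointed settings, then overwrite only the per-attempt
  // keys with the values the new process was started with.
  def restore(
      checkpointed: Map[String, String],
      current: Map[String, String]): Map[String, String] =
    checkpointed ++ current.filter { case (key, _) => perAttemptKeys(key) }
}
```

With a checkpoint that recorded attempt `1` and a new process started as attempt `2`, `restore` returns the checkpointed settings with the attempt id replaced by `2`, while unrelated settings from the new process are ignored.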





[GitHub] spark pull request: [CORE][MINOR] Whitespace fixes in RangePartiti...

2015-08-27 Thread ihainan
Github user ihainan closed the pull request at:

https://github.com/apache/spark/pull/8480





[GitHub] spark pull request: [SPARK-10326] [yarn] Fix app submission on win...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8493#issuecomment-135602276
  
  [Test build #41716 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41716/consoleFull)
 for   PR 8493 at commit 
[`a14dba5`](https://github.com/apache/spark/commit/a14dba5233526f844a68d77c5d765d98b0534e2a).





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/8464#discussion_r38165165
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/local/LocalNodeTest.scala
 ---
@@ -0,0 +1,189 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.local
+
+import scala.util.control.NonFatal
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow}
+import org.apache.spark.sql.catalyst.util._
+import org.apache.spark.sql.{DataFrame, Row}
+import org.apache.spark.sql.types.StructType
+
+class LocalNodeTest extends SparkFunSuite {
+
+  /**
+   * Runs the LocalNode and makes sure the answer matches the expected result.
+   * @param input the input data to be used.
+   * @param nodeFunction a function which accepts the input LocalNode and uses it to instantiate
+   *                     the local physical operator that's being tested.
+   * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
+   * @param sortAnswers if true, the answers will be sorted by their toString representations prior
+   *                    to being compared.
+   */
+  protected def checkAnswer(
+      input: DataFrame,
+      nodeFunction: LocalNode => LocalNode,
+      expectedAnswer: Seq[Row],
+      sortAnswers: Boolean = true): Unit = {
+    doCheckAnswer(
+      input :: Nil,
+      nodes => nodeFunction(nodes.head),
+      expectedAnswer,
+      sortAnswers)
+  }
+
+  /**
+   * Runs the LocalNode and makes sure the answer matches the expected result.
+   * @param left the left input data to be used.
+   * @param right the right input data to be used.
+   * @param nodeFunction a function which accepts the input LocalNode and uses it to instantiate
+   *                     the local physical operator that's being tested.
+   * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
+   * @param sortAnswers if true, the answers will be sorted by their toString representations prior
+   *                    to being compared.
+   */
+  protected def checkAnswer2(
--- End diff --

It needs to be named `checkAnswer2` because `sortAnswers` has a default value, 
and Scala does not allow more than one overloaded alternative of a method to 
define default arguments.
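
For reference, a minimal sketch of the restriction. The object and method names here are hypothetical and much simpler than the test helpers in the diff; they only show why two names are needed.

``` scala
object DefaultArgsSketch {
  // If both methods below were named `check`, the compiler would reject the
  // object, because two overloaded alternatives would each define a default
  // argument. Giving the second method a distinct name avoids the clash.
  def check(input: Int, sortAnswers: Boolean = true): String =
    s"one input, sort=$sortAnswers"

  def check2(left: Int, right: Int, sortAnswers: Boolean = true): String =
    s"two inputs, sort=$sortAnswers"
}
```

The alternative would be to drop the default from one of the overloads and pass `sortAnswers` explicitly at every call site.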





[GitHub] spark pull request: [SPARK-10326] [yarn] Fix app submission on win...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8493#issuecomment-135601503
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-10326] [yarn] Fix app submission on win...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8493#issuecomment-135601511
  
Merged build started.





[GitHub] spark pull request: [SPARK-10326] [yarn] Fix app submission on win...

2015-08-27 Thread vanzin
GitHub user vanzin opened a pull request:

https://github.com/apache/spark/pull/8493

[SPARK-10326] [yarn] Fix app submission on windows.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vanzin/spark SPARK-10326

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8493.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8493


commit a14dba5233526f844a68d77c5d765d98b0534e2a
Author: Marcelo Vanzin 
Date:   2015-08-28T01:38:41Z

[SPARK-10326] [yarn] Fix app submission on windows.







[GitHub] spark pull request: [SPARK-10049][SPARKR] Support collecting data ...

2015-08-27 Thread sun-rui
Github user sun-rui commented on the pull request:

https://github.com/apache/spark/pull/8458#issuecomment-135601195
  
@davies, @shivaram, could you help review this?





[GitHub] spark pull request: [SPARK-9986][SPARK-9991][SPARK-9993][SQL]Creat...

2015-08-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/8464#discussion_r38164639
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/local/LimitNode.scala ---
@@ -0,0 +1,45 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.local
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Attribute
+
+
+case class LimitNode(limit: Int, child: LocalNode) extends UnaryLocalNode {
+
+  private[this] var count = 0
+
+  override def output: Seq[Attribute] = child.output
+
+  override def open(): Unit = child.open()
--- End diff --

A `LocalNode` cannot be reused, just like an `Iterator`, so it is not necessary 
to reset it.
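
As an illustration of the single-use contract, here is a small self-contained sketch built on a plain `Iterator`. `LimitIterator` and `LimitIteratorDemo` are hypothetical names, not part of the PR; the point is only that the counter is initialised once and never reset, because each instance is consumed at most once.

``` scala
// Hypothetical single-use operator that yields at most `limit` elements.
final class LimitIterator[A](child: Iterator[A], limit: Int) extends Iterator[A] {
  // Initialised once per instance; never reset because the instance is not reused.
  private[this] var count = 0

  override def hasNext: Boolean = count < limit && child.hasNext

  override def next(): A = {
    if (!hasNext) throw new NoSuchElementException("limit reached or child exhausted")
    count += 1
    child.next()
  }
}

object LimitIteratorDemo {
  // A fresh LimitIterator is created for every run instead of resetting an old one.
  def firstN[A](xs: Seq[A], n: Int): List[A] =
    new LimitIterator(xs.iterator, n).toList
}
```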





[GitHub] spark pull request: [SPARK-8057][Core]Call TaskAttemptContext.getT...

2015-08-27 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/6599#issuecomment-135598700
  
> I think that we should also backport this to branch-1.4.

+1, since we fixed it in 1.5.0. I just confirmed that this one has no conflicts 
with branch-1.4.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135598617
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41715/
Test PASSed.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135598611
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135598213
  
  [Test build #41715 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41715/console)
 for   PR 8343 at commit 
[`472c767`](https://github.com/apache/spark/commit/472c76714c25b909e281d8079b7ead6c152d4512).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135596210
  
  [Test build #41715 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41715/consoleFull)
 for   PR 8343 at commit 
[`472c767`](https://github.com/apache/spark/commit/472c76714c25b909e281d8079b7ead6c152d4512).





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135595941
  
Merged build started.





[GitHub] spark pull request: [SPARK-8952] [SPARKR] - Wrap normalizePath cal...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8343#issuecomment-135595930
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-08-27 Thread rotationsymmetry
Github user rotationsymmetry commented on a diff in the pull request:

https://github.com/apache/spark/pull/8022#discussion_r38163035
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingDecay.scala ---
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.regression
+
+import org.apache.spark.Logging
+import org.apache.spark.annotation.Experimental
+
+/**
+ * :: Experimental ::
+ * Supplies an interface for the discount value in
+ * the forgetful update rule in StreamingLinearAlgorithm.
+ * Actual implementation is provided in StreamingDecaySetter[T].
+ */
+@Experimental
+trait StreamingDecay {
+  /**
+   * Derive the discount factor.
+   *
+   * @param numNewDataPoints number of data points for the RDD arriving at time t.
+   * @return Discount factor
+   */
+  def getDiscount(numNewDataPoints: Long): Double
+}
+
+/**
+ * :: Experimental ::
+ * StreamingDecaySetter provides the concrete implementation
+ * of getDiscount in StreamingDecay and setters for decay factor
+ * and half-life.
+ */
+@Experimental
+private[mllib] trait StreamingDecaySetter[T] extends Logging {
+  self: T =>
+  private var decayFactor: Double = 0
+  private var timeUnit: String = StreamingDecay.BATCHES
+
+  /**
+   * Set the decay factor for the forgetful algorithms.
+   * The decay factor should be between 0 and 1, inclusive.
+   * decayFactor = 0: only the data from the most recent RDD will be used.
+   * decayFactor = 1: all data since the beginning of the DStream will be used.
+   * decayFactor is default to zero.
+   *
+   * @param decayFactor the decay factor
+   */
+  def setDecayFactor(decayFactor: Double): T = {
+    this.decayFactor = decayFactor
+    this
+  }
+
+
+  /**
+   * Set the half life and time unit ("batches" or "points") for the forgetful algorithm.
+   * The half life along with the time unit provides an alternative way to specify decay factor.
+   * The decay factor is calculated such that, for data acquired at time t,
+   * its contribution by time t + halfLife will have dropped to 0.5.
+   * The unit of time can be specified either as batches or points;
+   * see StreamingDecay companion object.
+   *
+   * @param halfLife the half life
+   * @param timeUnit the time unit
+   */
+  def setHalfLife(halfLife: Double, timeUnit: String): T = {
+    if (timeUnit != StreamingDecay.BATCHES && timeUnit != StreamingDecay.POINTS) {
+      throw new IllegalArgumentException("Invalid time unit for decay: " + timeUnit)
+    }
+    this.decayFactor = math.exp(math.log(0.5) / halfLife)
+    logInfo("Setting decay factor to: %g ".format (this.decayFactor))
+    this.timeUnit = timeUnit
+    this
+  }
+
+  /**
+   * Derive the discount factor.
+   *
+   * @param numNewDataPoints number of data points for the RDD arriving at time t.
+   * @return Discount factor
+   */
+  def getDiscount(numNewDataPoints: Long): Double = timeUnit match {
+    case StreamingDecay.BATCHES => decayFactor
+    case StreamingDecay.POINTS => math.pow(decayFactor, numNewDataPoints)
+  }
+}
+
+/**
+ * :: Experimental ::
+ * Provides the String constants for allowed time unit in the forgetful algorithm.
+ */
+@Experimental
+object StreamingDecay {
+  /**
+   * Each RDD in the DStream will be treated as 1 time unit.
+   *
+   */
+  final val BATCHES = "batches"
--- End diff --

I am all for this approach because it offers much better type safety and 
IDE support. 

The reason I used the `String` implementation is to follow 
[StreamingKMeans](http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.clustering.StreamingKMeans).
 

Shall we change StreamingKMeans to use case objects as well? If so, I will open 
a JIRA for that.
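
To make the comparison concrete, here is a rough sketch of what the case-object variant could look like. All names in it (`DecaySketch`, `TimeUnit`, `Batches`, `Points`) are hypothetical; only the half-life formula and the two discount rules are taken from the diff above.

``` scala
object DecaySketch {
  // Hypothetical typed replacement for the "batches" / "points" strings.
  sealed trait TimeUnit
  case object Batches extends TimeUnit
  case object Points extends TimeUnit

  // Same formula as in setHalfLife: after `halfLife` time units the
  // contribution of a data point has dropped to 0.5.
  def decayFactorFromHalfLife(halfLife: Double): Double = {
    require(halfLife > 0, "halfLife must be positive")
    math.exp(math.log(0.5) / halfLife)
  }

  // Same rule as getDiscount. Because TimeUnit is sealed, the compiler can
  // warn about a missing case, and no runtime check for an invalid unit is needed.
  def discount(decayFactor: Double, unit: TimeUnit, numNewDataPoints: Long): Double =
    unit match {
      case Batches => decayFactor
      case Points => math.pow(decayFactor, numNewDataPoints.toDouble)
    }
}
```

With a half life of 1, the decay factor is 0.5, so a batch of two points is discounted by 0.25 in `Points` mode and by 0.5 in `Batches` mode.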

[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-08-27 Thread rotationsymmetry
Github user rotationsymmetry commented on a diff in the pull request:

https://github.com/apache/spark/pull/8022#discussion_r38162690
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionWithSGD.scala
 ---
@@ -47,6 +52,7 @@ class StreamingLinearRegressionWithSGD private[mllib] (
     private var numIterations: Int,
     private var miniBatchFraction: Double)
   extends StreamingLinearAlgorithm[LinearRegressionModel, LinearRegressionWithSGD]
+  with StreamingDecaySetter[StreamingLinearRegressionWithSGD]
--- End diff --

Thanks for the suggestions. I agree we can consolidate `StreamingDecay` and 
`StreamingDecaySetter` into one trait. 

Regarding the implementation: if we do the following
``` scala
trait StreamingDecay[T] {
  self: T =>
  def setX: T = this
}
class StreamingLinearAlgorithm extends StreamingDecay[StreamingLinearAlgorithm]
class StreamingLinearRegressionWithSGD extends StreamingLinearAlgorithm
val s = new StreamingLinearRegressionWithSGD()
```
then the return type of `s.setX` will be `StreamingLinearAlgorithm`, since 
that is what the type parameter `T` is bound to.

So I propose we override the `setX` method to get the correct type. 

``` scala
trait StreamingDecay {
  def setX: this.type = this
}
class StreamingLinearAlgorithm extends StreamingDecay
class StreamingLinearRegressionWithSGD extends StreamingLinearAlgorithm {
  override def setX: this.type = {
    super.setX
    this
  }
}
val s = new StreamingLinearRegressionWithSGD()
```

I have run this proposed implementation and it works. Shall we 
proceed?
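
One thing that may be worth checking before we proceed: with a `this.type` return type, the setter already keeps the static type of the receiver, so the override in the subclass might not be needed. Below is a small self-contained sketch with hypothetical names (`Decay`, `Base`, `Derived`), not the actual MLlib classes.

``` scala
object ThisTypeSketch {
  trait Decay {
    private var x: Double = 0.0
    // `this.type` makes the result type follow the type of the receiver.
    def setX(value: Double): this.type = { x = value; this }
    def getX: Double = x
  }

  class Base extends Decay

  class Derived extends Base {
    def onlyOnDerived: String = "derived"
  }

  // Compiles without any override in Derived: the result of setX is still
  // typed as Derived, so a Derived-only method can be chained after it.
  def demo(): String = new Derived().setX(0.5).onlyOnDerived
}
```

If this holds for the real classes as well, the setters could live only in the trait.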






[GitHub] spark pull request: [SPARK-10065][SQL] Avoid triple copying of var...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8484#issuecomment-135592006
  
  [Test build #41714 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41714/consoleFull)
 for   PR 8484 at commit 
[`35371fb`](https://github.com/apache/spark/commit/35371fb629217ee27ccda451c931d04137c05f93).





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135591739
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135591740
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41712/
Test FAILed.





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135591715
  
  [Test build #41712 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41712/console)
 for   PR 8492 at commit 
[`c3c65f8`](https://github.com/apache/spark/commit/c3c65f864d2c39dc9bebd652cc009cfe56790c90).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-9905][ML][Doc] Adds LinearRegressionSum...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8491#issuecomment-135591495
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41713/
Test PASSed.





[GitHub] spark pull request: [SPARK-9905][ML][Doc] Adds LinearRegressionSum...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8491#issuecomment-135591494
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-10065][SQL] Avoid triple copying of var...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8484#issuecomment-135591410
  
Merged build started.





[GitHub] spark pull request: [SPARK-9905][ML][Doc] Adds LinearRegressionSum...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8491#issuecomment-135591430
  
  [Test build #41713 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41713/console)
 for   PR 8491 at commit 
[`4e5aaeb`](https://github.com/apache/spark/commit/4e5aaebcbef92287887906071637cf65407c85c9).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class LinearRegressionWithElasticNetExample `






[GitHub] spark pull request: [SPARK-10065][SQL] Avoid triple copying of var...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8484#issuecomment-135591401
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-10065][SQL] Avoid triple copying of var...

2015-08-27 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/8484#issuecomment-135591299
  
retest this please.





[GitHub] spark pull request: [SPARK-9905][ML][Doc] Adds LinearRegressionSum...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8491#issuecomment-135589491
  
  [Test build #41713 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41713/consoleFull)
 for   PR 8491 at commit 
[`4e5aaeb`](https://github.com/apache/spark/commit/4e5aaebcbef92287887906071637cf65407c85c9).





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135589035
  
  [Test build #41712 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41712/consoleFull)
 for   PR 8492 at commit 
[`c3c65f8`](https://github.com/apache/spark/commit/c3c65f864d2c39dc9bebd652cc009cfe56790c90).





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135588705
  
Merged build triggered.





[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135588718
  
Merged build started.





[GitHub] spark pull request: [SPARK-9905][ML][Doc] Adds LinearRegressionSum...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8491#issuecomment-135588731
  
Merged build started.





[GitHub] spark pull request: [SPARK-9905][ML][Doc] Adds LinearRegressionSum...

2015-08-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8491#issuecomment-135588711
  
Merged build triggered.





[GitHub] spark pull request: [SPARK-9905] Adds LinearRegressionSummary user...

2015-08-27 Thread feynmanliang
GitHub user feynmanliang opened a pull request:

https://github.com/apache/spark/pull/8491

[SPARK-9905] Adds LinearRegressionSummary user guide 

* Adds user guide for LinearRegressionSummary
* Fixes unresolved issues in #8197

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/feynmanliang/spark SPARK-9905

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8491.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8491


commit c5d731e5c5c5da24840ddd7394e075ea4c17128c
Author: Feynman Liang 
Date:   2015-08-27T23:29:11Z

Cleans up Manoj's work

commit 4e5aaebcbef92287887906071637cf65407c85c9
Author: Feynman Liang 
Date:   2015-08-28T00:03:04Z

Adds LinearRegressionSummary docs







[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread davies
GitHub user davies opened a pull request:

https://github.com/apache/spark/pull/8492

[SPARK-10323] [SQL] fix nullability of In/InSet/ArrayContain

After this PR, In/InSet/ArrayContain will return null if value is null, 
instead of false. They also will return null even if there is a null in the 
set/array.
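
A minimal sketch of one reading of the behavior described above, assuming it follows standard SQL three-valued logic. This is plain Python, not Spark code: `sql_in` is a hypothetical helper, and `None` stands in for SQL null.

```python
from typing import Any, Iterable, Optional


def sql_in(value: Any, candidates: Iterable[Any]) -> Optional[bool]:
    """Hypothetical three-valued IN check; None plays the role of SQL null."""
    # A null value can never be compared, so the result is unknown (null).
    if value is None:
        return None
    saw_null = False
    for candidate in candidates:
        if candidate is None:
            # Remember the null; it only matters if no match is found.
            saw_null = True
        elif candidate == value:
            return True
    # No match: the result is unknown if a null was present, otherwise false.
    return None if saw_null else False
```

Under this assumption, a definite match still yields true, and a null in the set/array only changes the "not found" result from false to null.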

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davies/spark fix_in

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8492.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8492


commit c3c65f864d2c39dc9bebd652cc009cfe56790c90
Author: Davies Liu 
Date:   2015-08-28T00:02:16Z

fix nullability of In/InSet/ArrayContain







[GitHub] spark pull request: [SPARK-10323] [SQL] fix nullability of In/InSe...

2015-08-27 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/8492#issuecomment-135588372
  
cc @yhuai 





[GitHub] spark pull request: [SPARK-10321] sizeInBytes in HadoopFsRelation

2015-08-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8490




