[GitHub] spark pull request: [SPARK-8126] [BUILD] Use custom temp directory...

2015-06-08 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/6674#issuecomment-110252820
  
All of the 1.4 builds have succeeded since this patch, some of them a few times. 
The one exception is 
https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.4-Maven-with-YARN/ which 
succeeded after the patch and then failed; the failure in the Kafka suite looks 
unrelated since it doesn't involve a temp file. I'm declaring victory and 
moving on to 1.3.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6583][SQL] Support aggregated function ...

2015-06-08 Thread watermen
Github user watermen commented on the pull request:

https://github.com/apache/spark/pull/5290#issuecomment-110248184
  
@cloud-fan Can you review it for me?





[GitHub] spark pull request: [SPARK-8202] [PYSPARK] fix infinite loop durin...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6714#issuecomment-110245779
  
  [Test build #34490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34490/consoleFull) for PR 6714 at commit [`e746aec`](https://github.com/apache/spark/commit/e746aeca630448b3bb9d425d8aefa496385f39ed).





[GitHub] spark pull request: [SPARK-8202] [PYSPARK] fix infinite loop durin...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6714#issuecomment-110245490
  
Merged build started.





[GitHub] spark pull request: [SPARK-8202] [PYSPARK] fix infinite loop durin...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6714#issuecomment-110245483
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8202] [PYSPARK] fix infinite loop durin...

2015-06-08 Thread davies
GitHub user davies opened a pull request:

https://github.com/apache/spark/pull/6714

[SPARK-8202] [PYSPARK] fix infinite loop during external sort in PySpark

The batch size during external sort will grow up to max 1, then shrink 
down to zero, causing an infinite loop.
Assuming that the items usually have similar sizes, we don't 
need to adjust the batch size after the first spill.
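
As a rough illustration of the idea above (not the code in this PR), a minimal Python sketch might look like the following. The helper name `next_batch_size`, the memory figures, and the `max_batch` cap are all hypothetical; the point is only that the size stops changing once a spill has happened, and never drops below one item.

```python
def next_batch_size(batch, used_mb, limit_mb, spilled, max_batch=10000):
    """Return the batch size for the next chunk (hypothetical helper).

    Before the first spill the size adapts to memory pressure. After the
    first spill it is frozen, on the assumption that items have similar
    sizes, so it can no longer shrink towards zero.
    """
    if spilled:
        # Frozen after the first spill: no further adjustment.
        return batch
    if used_mb > limit_mb:
        # Over the limit: halve the batch, but never go below one item.
        return max(1, batch // 2)
    # Under the limit: double the batch, capped at max_batch (arbitrary value).
    return min(max_batch, batch * 2)
```

With a batch size that cannot reach zero, each pass through the input always consumes at least one item, so the loop makes progress.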

cc @JoshRosen @rxin @angelini

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davies/spark batch_size

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6714.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6714


commit e746aeca630448b3bb9d425d8aefa496385f39ed
Author: Davies Liu 
Date:   2015-06-09T06:26:29Z

fix batch size during sort







[GitHub] spark pull request: Fix SPARK-8200

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6713#issuecomment-110244763
  
Can one of the admins verify this patch?





[GitHub] spark pull request: Fix SPARK-8200

2015-06-08 Thread pparkkin
GitHub user pparkkin opened a pull request:

https://github.com/apache/spark/pull/6713

Fix SPARK-8200

Test cases for both StreamingLinearRegression and 
StreamingLogisticRegression, and code fix.
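
For readers unfamiliar with the problem, the general pattern of skipping empty batches can be sketched in plain Python as below. This is only an illustration under assumptions: lists stand in for RDDs, and `train_on_batches` and `add_batch` are hypothetical names, not the functions changed in this PR.

```python
def train_on_batches(batches, update, model):
    """Apply `update` to every non-empty batch and return the final model."""
    for batch in batches:
        if not batch:
            # Empty batch: there is nothing to learn from, keep the model as-is.
            continue
        model = update(model, batch)
    return model


def add_batch(model, batch):
    """Toy update: the model is a (running total, item count) pair."""
    total, count = model
    return (total + sum(batch), count + len(batch))
```

Calling `train_on_batches([[1, 2], [], [3]], add_batch, (0, 0))` passes only the two non-empty batches to `add_batch`.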

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pparkkin/spark streamingmodel-empty-rdd

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6713.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6713


commit b4cda931892ac5cab580b5e89d90829f709d3bbb
Author: Paavo 
Date:   2015-06-09T01:59:11Z

Test case for empty stream.

commit 4cb7b0fc73c4ccc698465956254df4aa95ee1587
Author: Paavo 
Date:   2015-06-09T02:19:15Z

Ignore empty RDDs.

commit e3e358f362856a439c5f7f745e97665299a5f0a6
Author: Paavo 
Date:   2015-06-09T02:27:39Z

Test case for empty stream.







[GitHub] spark pull request: [SPARK-7990][SQL] Add methods to facilitate eq...

2015-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/6616





[GitHub] spark pull request: [SPARK-7990][SQL] Add methods to facilitate eq...

2015-06-08 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/6616#issuecomment-110244099
  
Thanks. I'm merging this.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110241035
  
  [Test build #34487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34487/console) for PR 6712 at commit [`9f3b75a`](https://github.com/apache/spark/commit/9f3b75a0377c3272f5e5ef27c4dfa11f24a82806).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class OverrideFunctionRegistry(underlying: FunctionRegistry) extends FunctionRegistry`
  * `class SimpleFunctionRegistry extends FunctionRegistry`
  * `case class Rand(seed: Long) extends RDG(seed)`
  * `case class Randn(seed: Long) extends RDG(seed)`
  * `class StringKeyHashMap[T](normalizer: (String) => String)`






[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110241043
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110239365
  
  [Test build #34489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34489/consoleFull) for PR 6710 at commit [`6930822`](https://github.com/apache/spark/commit/69308222d1b65bd75c3b40b1e4da8f6161958535).





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110239326
  
  [Test build #34488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34488/consoleFull) for PR 6712 at commit [`d554d60`](https://github.com/apache/spark/commit/d554d60438f5a71f8caef8f6e461ee659de46793).





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110239051
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110239056
  
Merged build started.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110239057
  
Merged build started.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110239050
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110238898
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110238873
  
  [Test build #34485 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34485/console) for PR 6712 at commit [`dea550b`](https://github.com/apache/spark/commit/dea550b3ddf594aca4640f73d585870dd0b38d68).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class OverrideFunctionRegistry(underlying: FunctionRegistry) extends FunctionRegistry`
  * `class SimpleFunctionRegistry extends FunctionRegistry`
  * `case class Rand(seed: Long) extends RDG(seed)`
  * `case class Randn(seed: Long) extends RDG(seed)`
  * `class StringKeyHashMap[T](normalizer: (String) => String)`






[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110232198
  
  [Test build #34484 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34484/console) for PR 6710 at commit [`b802c9a`](https://github.com/apache/spark/commit/b802c9a296596e3fc711baf352441516f59fb736).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SimpleFunctionRegistry extends FunctionRegistry`
  * `case class Rand(seed: Long) extends RDG(seed)`
  * `case class Randn(seed: Long) extends RDG(seed)`
  * `class StringKeyHashMap[T](normalizer: (String) => String)`






[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110232203
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6190#issuecomment-110231863
  
@hqzizania Not needed, never mind.





[GitHub] spark pull request: [SPARK-8162][BUILD] Run spark-shell cause Null...

2015-06-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at:

https://github.com/apache/spark/pull/6704





[GitHub] spark pull request: [SPARK-8162][BUILD] Run spark-shell cause Null...

2015-06-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request:

https://github.com/apache/spark/pull/6704#issuecomment-110231290
  
Closing this for now, since PR #6711 can fix the NPE. If we find the root cause of why 
the `@VisibleForTesting` annotation causes an NPE in the shell, we can reopen it.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110228780
  
  [Test build #34487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34487/consoleFull) for PR 6712 at commit [`9f3b75a`](https://github.com/apache/spark/commit/9f3b75a0377c3272f5e5ef27c4dfa11f24a82806).





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110228708
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110228716
  
Merged build started.





[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5748#issuecomment-110228546
  
  [Test build #34486 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34486/consoleFull) for PR 5748 at commit [`14ee596`](https://github.com/apache/spark/commit/14ee5960ced3079231543dfe103075ae12e40e05).





[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5748#issuecomment-110228363
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5748#issuecomment-110228369
  
Merged build started.





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/6190#issuecomment-110228254
  
It's fine - it will make one line of code more complex and remove one line 
of code. If you want, you can make a follow-up PR; it's up to you and @davies :)





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110228207
  
  [Test build #34485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34485/consoleFull) for PR 6712 at commit [`dea550b`](https://github.com/apache/spark/commit/dea550b3ddf594aca4640f73d585870dd0b38d68).





[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-08 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/5748#issuecomment-110228137
  
@jkbradley ping?





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread hqzizania
Github user hqzizania commented on the pull request:

https://github.com/apache/spark/pull/6190#issuecomment-110228092
  
@shivaram Oops, I haven't fixed the nit davies mentioned.





[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-08 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/5748#issuecomment-110228122
  
jenkins retest this please





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110227993
  
Merged build started.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6712#issuecomment-110227979
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7886] Use FunctionRegistry for built-in...

2015-06-08 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/6712

[SPARK-7886] Use FunctionRegistry for built-in expressions in HiveContext.

This builds on https://github.com/apache/spark/pull/6710



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark udf-registry-hive

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6712.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6712


commit 8616924ebb90eaf5e88a6a843dc99744b3dbc2b8
Author: Santiago M. Mola 
Date:   2015-05-28T17:42:59Z

[SPARK-7886] Add built-in expressions to FunctionRegistry.

- ExpressionBuilders is provided with helpers to create a function builder
  for each Expression.
- Built-in functions removed from SqlParser when possible. Added to
  FunctionRegistry.

TO DO:

- Decide between the reflection and macro implementations of the
  expression builder helpers.
- Fix Substring (whose constructor is not well suited for the helper).
- Apply changes to Hive.

commit 2a2a149672589e303f6a8dbc1ef295a7c2541825
Author: Reynold Xin 
Date:   2015-06-08T20:18:12Z

Merge pull request #6463 from smola/SPARK-7886

[SPARK-7886][SQL] Add built-in expressions to FunctionRegistry.

commit 77b46f18c9a563f0673ddf097bf1e08c7be0ca1d
Author: Reynold Xin 
Date:   2015-06-08T23:23:22Z

Simplified the code.

commit ff906f233c8c367d52254d58da1da388d4298f58
Author: Reynold Xin 
Date:   2015-06-09T00:41:09Z

More robust constructor calling.

commit ee7854f7eb3b0226ec3826afe11bcae0f1b0a250
Author: Reynold Xin 
Date:   2015-06-09T00:58:16Z

Improved error reporting.

commit 52ddabaaaf9ea13f51d44a2602991f37532a9106
Author: Reynold Xin 
Date:   2015-06-09T01:00:13Z

Fixed compilation.

commit e76a3c1c35197858036891102e4394aa0027
Author: Reynold Xin 
Date:   2015-06-09T03:26:41Z

Fixed parser.

commit 852f9c09d3653ae040100b813d2a9203470d41ee
Author: Reynold Xin 
Date:   2015-06-09T03:44:51Z

Fixed style violation.

commit e60d815cf18877018da46a055e326f535623f9de
Author: Reynold Xin 
Date:   2015-06-09T04:28:45Z

Made UDF case insensitive.

commit b802c9a296596e3fc711baf352441516f59fb736
Author: Reynold Xin 
Date:   2015-06-09T04:33:42Z

Made UDF case insensitive.

commit dea550b3ddf594aca4640f73d585870dd0b38d68
Author: Reynold Xin 
Date:   2015-06-09T05:15:29Z

[SPARK-7886] Use FunctionRegistry for built-in expressions in HiveContext.







[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread dibbhatt
Github user dibbhatt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6707#discussion_r31982518
  
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceivedBlockHandler.scala ---
@@ -79,7 +93,7 @@ private[streaming] class BlockManagerBasedBlockHandler(
   throw new SparkException(
 s"Could not store $blockId to block manager with storage level $storageLevel")
 }
-BlockManagerBasedStoreResult(blockId)
+BlockManagerBasedStoreResult(blockId, numRecords)
--- End diff --

@tdas @zsxwing what do you think? Is it fine to count a ByteBufferBlock as 
1 record?





[GitHub] spark pull request: [SPARK-4352][YARN][WIP] Incorporate locality p...

2015-06-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/6394#discussion_r31982517
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -225,12 +243,74 @@ private[yarn] class YarnAllocator(
   logInfo(s"Will request $missing executor containers, each with ${resource.getVirtualCores} " +
 s"cores and ${resource.getMemory} MB memory including $memoryOverhead MB overhead")
 
-  for (i <- 0 until missing) {
-val request = createContainerRequest(resource)
-amClient.addContainerRequest(request)
-val nodes = request.getNodes
-val hostStr = if (nodes == null || nodes.isEmpty) "Any" else nodes.last
-logInfo(s"Container request (host: $hostStr, capability: $resource)")
+  // Calculated the number of executors we expected to satisfy all the preferred locality tasks
--- End diff --

Hi @sryza, will this `getNumPendingAtLocation(ANY_HOST)` return all the 
pending requests, including those with a specified locality?





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110224579
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread dibbhatt
Github user dibbhatt commented on the pull request:

https://github.com/apache/spark/pull/6707#issuecomment-110224537
  
Addressed a couple of the comments from @harishreedharan.
Not sure what to do with the ByteBuffer case, as there is no way to count the 
number of messages in a ByteBufferBlock.





[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-08 Thread GeorgeDittmar
Github user GeorgeDittmar commented on the pull request:

https://github.com/apache/spark/pull/6112#issuecomment-110224081
  
@mengxr @MechCoder OK, this should be good to go, I think. I cleaned up the rest 
of the unit tests and fixed a few more style issues that I found.





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/6190





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110221655
  
[Test build #34484 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34484/consoleFull) for PR 6710 at commit [`b802c9a`](https://github.com/apache/spark/commit/b802c9a296596e3fc711baf352441516f59fb736).





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/6190#issuecomment-110221648
  
Thanks @hqzizania - LGTM. Merging this





[GitHub] spark pull request: [SPARK-8168] [MLLIB] Add Python friendly const...

2015-06-08 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/6709#issuecomment-110221571
  
Thanks. Merged in master.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110221504
  
Merged build started.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110221495
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8168] [MLLIB] Add Python friendly const...

2015-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/6709





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110221121
  
Merged build started.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110221102
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/6190#discussion_r31981889
  
--- Diff: R/pkg/R/serialize.R ---
@@ -37,6 +37,14 @@ writeObject <- function(con, object, writeType = TRUE) {
   # passing in vectors as arrays and instead require arrays to be passed
   # as lists.
   type <- class(object)[[1]]  # class of POSIXlt is c("POSIXlt", "POSIXt")
+  # Checking types is needed here, since ‘is.na’ only handles atomic vectors,
+  # lists and pairlists
+  if (type %in% c("integer", "character", "logical", "double", "numeric")) {
+if (is.na(object)) {
+  object <- NULL
+  type <- "NULL"
--- End diff --

I see, never mind.





[GitHub] spark pull request: [SPARK-8168] [MLLIB] Add Python friendly const...

2015-06-08 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/6709#issuecomment-110219781
  
LGTM





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110217328
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110217326
  
[Test build #34482 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34482/console) for PR 6710 at commit [`852f9c0`](https://github.com/apache/spark/commit/852f9c09d3653ae040100b813d2a9203470d41ee).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Rand(seed: Long) extends RDG(seed) `
  * `case class Randn(seed: Long) extends RDG(seed) `
  * `class StringKeyHashMap[T](normalizer: (String) => String) `
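
For readers wondering how a map built around a `normalizer` function could support the "Made UDF case insensitive" commits, here is a minimal sketch. Only the constructor signature `StringKeyHashMap[T](normalizer: (String) => String)` comes from the report above; the methods `put`, `get`, `contains` and the helper `caseInsensitive` are assumptions made for illustration and may not match the class in the patch.

```scala
import scala.collection.mutable

object StringKeyHashMapSketch {
  // Hypothetical re-creation for illustration only; the real class in the
  // patch may expose a different set of methods.
  class StringKeyHashMap[T](normalizer: String => String) {
    private val base = mutable.HashMap.empty[String, T]

    // Every key is passed through the normalizer before touching the map,
    // so lookups and inserts always agree on the stored form of the key.
    def put(key: String, value: T): Option[T] = base.put(normalizer(key), value)
    def get(key: String): Option[T] = base.get(normalizer(key))
    def contains(key: String): Boolean = base.contains(normalizer(key))
  }

  // Case-insensitive variant: keys are lower-cased before use.
  def caseInsensitive[T]: StringKeyHashMap[T] =
    new StringKeyHashMap[T](_.toLowerCase)
}
```

With this shape, a function registered as `SUBSTR` would presumably also be found under `substr` or `Substr`, because all three normalize to the same key.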






[GitHub] spark pull request: [Spark-7780][MLLIB] Intercept in logisticregre...

2015-06-08 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/6386#issuecomment-110216881
  
Oh, got you.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110216015
  
[Test build #34482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34482/consoleFull) for PR 6710 at commit [`852f9c0`](https://github.com/apache/spark/commit/852f9c09d3653ae040100b813d2a9203470d41ee).





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110215818
  
Merged build started.





[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread harishreedharan
Github user harishreedharan commented on a diff in the pull request:

https://github.com/apache/spark/pull/6707#discussion_r31981074
  
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceivedBlockHandler.scala ---
@@ -32,7 +32,10 @@ import org.apache.spark.{Logging, SparkConf, SparkException}
 
 /** Trait that represents the metadata related to storage of blocks */
 private[streaming] trait ReceivedBlockStoreResult {
-  def blockId: StreamBlockId  // Any implementation of this trait will store a block id
+  // Any implementation of this trait will store a block id
+  def blockId: StreamBlockId
+  // Any implementation of this trait will have to return the number of records
+  def numRecords: Option[Long]
--- End diff --

Ah, ok. I just find the `num*` method calls weird, when it could be called 
`*count`. But if it is consistent with everything else, then it is fine.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110215799
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread harishreedharan
Github user harishreedharan commented on a diff in the pull request:

https://github.com/apache/spark/pull/6707#discussion_r31981048
  
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceivedBlockHandler.scala ---
@@ -79,7 +93,7 @@ private[streaming] class BlockManagerBasedBlockHandler(
   throw new SparkException(
 s"Could not store $blockId to block manager with storage level $storageLevel")
 }
-BlockManagerBasedStoreResult(blockId)
+BlockManagerBasedStoreResult(blockId, numRecords)
--- End diff --

Well, technically it is a single record - though I agree that is not 
exactly right either, but it must count as at least 1, correct?
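
To make the option under discussion concrete, here is a rough sketch of a per-block record count in which an opaque byte buffer is reported as a single record. The types `Block`, `ArrayBlock` and `BytesBlock` are hypothetical stand-ins invented for this example, not the classes in `ReceivedBlockHandler.scala`; only the `Option[Long]` return type mirrors the `numRecords` signature in the diff.

```scala
import java.nio.ByteBuffer

object RecordCountSketch {
  // Hypothetical stand-ins for the receiver's block types; not Spark's classes.
  sealed trait Block
  final case class ArrayBlock(records: Seq[Any]) extends Block
  final case class BytesBlock(bytes: ByteBuffer) extends Block

  // Returns the record count for a block. A block backed by a collection
  // reports its size; an opaque byte buffer is counted as one record,
  // which is the choice being debated in this thread.
  def numRecords(block: Block): Option[Long] = block match {
    case ArrayBlock(records) => Some(records.size.toLong)
    case BytesBlock(_)       => Some(1L)
  }
}
```

Because the return type is an `Option`, returning `None` for the byte buffer case would be an equally easy alternative if "unknown" turns out to be more accurate than "1".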





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread hqzizania
Github user hqzizania commented on a diff in the pull request:

https://github.com/apache/spark/pull/6190#discussion_r31980891
  
--- Diff: R/pkg/R/serialize.R ---
@@ -37,6 +37,14 @@ writeObject <- function(con, object, writeType = TRUE) {
   # passing in vectors as arrays and instead require arrays to be passed
   # as lists.
   type <- class(object)[[1]]  # class of POSIXlt is c("POSIXlt", "POSIXt")
+  # Checking types is needed here, since ‘is.na’ only handles atomic vectors,
+  # lists and pairlists
+  if (type %in% c("integer", "character", "logical", "double", "numeric")) {
+if (is.na(object)) {
+  object <- NULL
+  type <- "NULL"
--- End diff --

But the "type" in the %in% line also needs to be changed to "class(object)".





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110212549
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6190#issuecomment-110212557
  
LGTM, just one minor comment, thanks!





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110212546
  
[Test build #34481 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34481/console) for PR 6710 at commit [`e76a3c1`](https://github.com/apache/spark/commit/e76a3c1c35197858036891102e4394aa0027).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Rand(seed: Long) extends RDG(seed) `
  * `case class Randn(seed: Long) extends RDG(seed) `
  * `class StringKeyHashMap[T](normalizer: (String) => String) `






[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/6190#discussion_r31980616
  
--- Diff: R/pkg/R/serialize.R ---
@@ -37,6 +37,14 @@ writeObject <- function(con, object, writeType = TRUE) {
   # passing in vectors as arrays and instead require arrays to be passed
   # as lists.
   type <- class(object)[[1]]  # class of POSIXlt is c("POSIXlt", "POSIXt")
+  # Checking types is needed here, since ‘is.na’ only handles atomic vectors,
+  # lists and pairlists
+  if (type %in% c("integer", "character", "logical", "double", "numeric")) {
+if (is.na(object)) {
+  object <- NULL
+  type <- "NULL"
--- End diff --

Nit: move these before `type <- class`, then this line is not needed 
anymore.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110212428
  
[Test build #34481 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34481/consoleFull) for PR 6710 at commit [`e76a3c1`](https://github.com/apache/spark/commit/e76a3c1c35197858036891102e4394aa0027).





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110211892
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110211959
  
Merged build started.





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6652#issuecomment-110204932
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6652#issuecomment-110204887
  
  [Test build #34480 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34480/console)
 for   PR 6652 at commit 
[`2ef2d39`](https://github.com/apache/spark/commit/2ef2d39344000bb1d08f37e3d889f3b8975c33c4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-7786][STREAMING] Allow StreamingListene...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6380#issuecomment-110203759
  
  [Test build #34479 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34479/console)
 for   PR 6380 at commit 
[`c94982f`](https://github.com/apache/spark/commit/c94982f25f57abf488bc75a253be44e3bfbab20d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class StatsReportListener(numBatchInfos: Int) extends 
StreamingListener `






[GitHub] spark pull request: [SPARK-7786][STREAMING] Allow StreamingListene...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6380#issuecomment-110203770
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8162] [HOTFIX] Fix NPE in spark-shell

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6711#issuecomment-110203159
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8162] [HOTFIX] Fix NPE in spark-shell

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6711#issuecomment-110203153
  
  [Test build #34477 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34477/console)
 for   PR 6711 at commit 
[`bf62ecc`](https://github.com/apache/spark/commit/bf62ecce7f021ccad67f3ed6b6e14292bd7f9129).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8156][SQL]create table to specific data...

2015-06-08 Thread baishuo
Github user baishuo commented on the pull request:

https://github.com/apache/spark/pull/6695#issuecomment-110202776
  
Hi @yhuai, would you please review this PR when you have time? I think it
may be the basis for https://github.com/apache/spark/pull/6494. Thanks :)





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6190#issuecomment-110202731
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-6820][SPARKR]Convert NAs to null type i...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6190#issuecomment-110202723
  
  [Test build #34476 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34476/console)
 for   PR 6190 at commit 
[`1641f9e`](https://github.com/apache/spark/commit/1641f9e03d99341e5b53f170c072b94678c544d2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6652#issuecomment-110199775
  
  [Test build #34474 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34474/console)
 for   PR 6652 at commit 
[`f5be578`](https://github.com/apache/spark/commit/f5be5784235813c0e28b13f234f89de04bc4849d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6652#issuecomment-110199784
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread dibbhatt
Github user dibbhatt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6707#discussion_r31976730
  
--- Diff: 
streaming/src/test/scala/org/apache/spark/streaming/ReceivedBlockHandlerSuite.scala
 ---
@@ -62,6 +61,19 @@ class ReceivedBlockHandlerSuite
   var blockManagerMaster: BlockManagerMaster = null
   var blockManager: BlockManager = null
   var tempDirectory: File = null
+  var storageLevel = StorageLevel.MEMORY_ONLY_SER
+
+  private def makeBlockManager(
--- End diff --

Sure, will change it.





[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-110192207
  
Hi @squito, I think I need your help: I am not sure how to write this test.





[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-110191346
  
@squito, I think you can help me: I am not sure how to write this test.
Thanks.





[GitHub] spark pull request: [SPARK-4352][YARN][WIP] Incorporate locality p...

2015-06-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/6394#discussion_r31976524
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -225,12 +243,74 @@ private[yarn] class YarnAllocator(
   logInfo(s"Will request $missing executor containers, each with 
${resource.getVirtualCores} " +
 s"cores and ${resource.getMemory} MB memory including 
$memoryOverhead MB overhead")
 
-  for (i <- 0 until missing) {
-val request = createContainerRequest(resource)
-amClient.addContainerRequest(request)
-val nodes = request.getNodes
-val hostStr = if (nodes == null || nodes.isEmpty) "Any" else 
nodes.last
-logInfo(s"Container request (host: $hostStr, capability: 
$resource)")
+  // Calculated the number of executors we expected to satisfy all the 
preferred locality tasks
+  val localityAwareTaskCores = localityAwarePendingTaskNum * 
CPUS_PER_TASK
+  val expectedLocalityAwareContainerNum =
+(localityAwareTaskCores + resource.getVirtualCores - 1) / 
resource.getVirtualCores
+
+  // Get the all the existed and locality matched containers
+  val existedMatchedContainers = allocatedHostToContainersMap.filter { 
case (host, _) =>
+preferredLocalityToCounts.contains(host)
+  }
+  val existedMatchedContainerNum = 
existedMatchedContainers.values.map(_.size).sum
+
+  // The number of containers to allocate, divided into two groups, 
one with node locality,
+  // and the other without locality preference.
+  var requiredLocalityFreeContainerNum: Int = 0
+  var requiredLocalityAwareContainerNum: Int = 0
+
+  if (expectedLocalityAwareContainerNum <= existedMatchedContainerNum) 
{
+// If the current allocated executor can satisfy all the locality 
preferred tasks,
--- End diff --

Oh, my bad, sorry for missing this part. I will change the strategy
accordingly :).
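
For readers following the diff above: the expected number of locality-aware containers is computed with an integer ceiling division. A minimal sketch of that arithmetic follows, assuming hypothetical names (`CeilDivSketch`, `containersNeeded`) that are not part of the pull request; it only illustrates the `(a + b - 1) / b` idiom and is not the allocator's actual strategy.

```scala
// Illustrative sketch only; names are hypothetical, not from YarnAllocator.
object CeilDivSketch {
  // Number of containers needed to run all pending locality-aware tasks,
  // rounding up so that a partially filled container is still requested.
  def containersNeeded(pendingTasks: Int, cpusPerTask: Int, coresPerContainer: Int): Int = {
    require(coresPerContainer > 0, "coresPerContainer must be positive")
    val cores = pendingTasks * cpusPerTask
    // Integer ceiling of cores / coresPerContainer.
    (cores + coresPerContainer - 1) / coresPerContainer
  }

  def main(args: Array[String]): Unit = {
    // 10 single-core tasks on 4-core containers need 3 containers.
    println(containersNeeded(pendingTasks = 10, cpusPerTask = 1, coresPerContainer = 4))
  }
}
```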





[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread dibbhatt
Github user dibbhatt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6707#discussion_r31976427
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceivedBlockHandler.scala
 ---
@@ -199,3 +221,16 @@ private[streaming] object 
WriteAheadLogBasedBlockHandler {
 new Path(checkpointDir, new Path("receivedData", 
streamId.toString)).toString
   }
 }
+
+/**
+ * A utility that will wrap the Iterator to get the count
+ */
+private class CountingIterator[T](iterator: Iterator[T]) extends 
Iterator[T] {
+   var count = 0
+   def hasNext(): Boolean = iterator.hasNext
+   def isFullyConsumed: Boolean = !iterator.hasNext
+   def next(): T = {
+count+=1
--- End diff --

Will change it, thanks.
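
As an illustration of the counting wrapper discussed in this diff, here is a minimal sketch. It is not the code from the pull request: the accessor name `numRecords` and the choice to increment only after the wrapped iterator returns an element are assumptions made for the example.

```scala
// Hypothetical sketch of an iterator wrapper that counts consumed elements.
class CountingIterator[T](underlying: Iterator[T]) extends Iterator[T] {
  private var consumed = 0L

  // Delegates to the wrapped iterator; has no side effects of its own.
  override def hasNext: Boolean = underlying.hasNext

  // Counts an element only after the wrapped iterator has produced it,
  // so a failing next() does not inflate the count.
  override def next(): T = {
    val elem = underlying.next()
    consumed += 1
    elem
  }

  // True once the wrapped iterator has no more elements.
  def isFullyConsumed: Boolean = !underlying.hasNext

  // The count is reported only when the iterator has been drained.
  def numRecords: Option[Long] = if (isFullyConsumed) Some(consumed) else None
}

object CountingIteratorSketch {
  def main(args: Array[String]): Unit = {
    val it = new CountingIterator(Iterator("a", "b", "c"))
    println(it.numRecords) // nothing consumed yet, so no count is reported
    it.foreach(_ => ())    // drain the iterator
    println(it.numRecords)
  }
}
```

Returning an `Option` keeps a partially consumed iterator from reporting a misleading count.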





[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread dibbhatt
Github user dibbhatt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6707#discussion_r31976373
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceivedBlockHandler.scala
 ---
@@ -32,7 +32,10 @@ import org.apache.spark.{Logging, SparkConf, 
SparkException}
 
 /** Trait that represents the metadata related to storage of blocks */
 private[streaming] trait ReceivedBlockStoreResult {
-  def blockId: StreamBlockId  // Any implementation of this trait will 
store a block id
+  // Any implementation of this trait will store a block id
+  def blockId: StreamBlockId
+  // Any implementation of this trait will have to return the number of 
records
+  def numRecords: Option[Long]
--- End diff --

Everywhere else the count is recorded (see
https://github.com/apache/spark/pull/6659/files), it is called numRecords. I
just wanted to keep the naming consistent across all classes.






[GitHub] spark pull request: [SPARK-8121] [SQL] Backports PR #6669 to branc...

2015-06-08 Thread liancheng
Github user liancheng closed the pull request at:

https://github.com/apache/spark/pull/6705





[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110188806
  
  [Test build #34478 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34478/console)
 for   PR 6710 at commit 
[`52ddaba`](https://github.com/apache/spark/commit/52ddabaaaf9ea13f51d44a2602991f37532a9106).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Rand(seed: Long) extends RDG(seed) `
  * `case class Randn(seed: Long) extends RDG(seed) `
  * `class StringKeyHashMap[T](normalizer: (String) => String) `






[GitHub] spark pull request: [SPARK-7886] Add built-in expressions to Funct...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6710#issuecomment-110188814
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-08 Thread dibbhatt
Github user dibbhatt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6707#discussion_r31976054
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceivedBlockHandler.scala
 ---
@@ -79,7 +93,7 @@ private[streaming] class BlockManagerBasedBlockHandler(
   throw new SparkException(
 s"Could not store $blockId to block manager with storage level 
$storageLevel")
 }
-BlockManagerBasedStoreResult(blockId)
+BlockManagerBasedStoreResult(blockId, numRecords)
--- End diff --

But how can we count a ByteBufferBlock? Counting one block as 1 message
would also be wrong.
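
One way to express "the count is unknown for raw bytes" is to return `None` for that case, which fits the `Option[Long]` type of `numRecords` in the diff. The sketch below assumes hypothetical block types (`RecordsBlock`, `RawBytesBlock`) that stand in for the real ones; it is not the handler's implementation.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-ins for received block types; not Spark's classes.
sealed trait SketchBlock
final case class RecordsBlock(records: Seq[String]) extends SketchBlock
final case class RawBytesBlock(bytes: ByteBuffer) extends SketchBlock

object RecordCountSketch {
  // A block of materialized records can be counted; a block of opaque
  // bytes cannot, so it reports None rather than a misleading 1.
  def numRecords(block: SketchBlock): Option[Long] = block match {
    case RecordsBlock(records) => Some(records.size.toLong)
    case RawBytesBlock(_)      => None
  }

  def main(args: Array[String]): Unit = {
    println(numRecords(RecordsBlock(Seq("a", "b"))))
    println(numRecords(RawBytesBlock(ByteBuffer.allocate(16))))
  }
}
```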





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6652#issuecomment-110188129
  
  [Test build #34480 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34480/consoleFull)
 for   PR 6652 at commit 
[`2ef2d39`](https://github.com/apache/spark/commit/2ef2d39344000bb1d08f37e3d889f3b8975c33c4).





[GitHub] spark pull request: [SPARK-4352][YARN][WIP] Incorporate locality p...

2015-06-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/6394#discussion_r31975860
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1374,12 +1374,15 @@ class SparkContext(config: SparkConf) extends 
Logging with ExecutorAllocationCli
* This can result in canceling pending requests or filing additional 
requests.
* This is currently only supported in YARN mode. Return whether the 
request is received.
*/
-  private[spark] override def requestTotalExecutors(numExecutors: Int): 
Boolean = {
+  private[spark] override def requestTotalExecutors(
+  numExecutors: Int,
+  localityAwarePendingTasks: Int,
+  preferredLocalityToCount: scala.Predef.Map[String, Int]): Boolean = {
--- End diff --

Hi @squito, thanks a lot for your explanation. Since the existing code
already imports `scala.collection.Map`, I had to write a fully qualified
name here. I will change it to an immutable map :).
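
The shadowing described here can be reproduced in isolation. The sketch below uses hypothetical names (`MapShadowingSketch`, `general`, `strict`) and shows that once `scala.collection.Map` is imported, an unqualified `Map` type no longer means the immutable map that `scala.Predef` aliases by default, so the immutable type has to be spelled out.

```scala
import scala.collection.Map

// Illustrative sketch only; not code from SparkContext.
object MapShadowingSketch {
  // With the import above, `Map` here is scala.collection.Map,
  // which accepts both mutable and immutable maps.
  def general(m: Map[String, Int]): Int = m.size

  // The immutable map must be named explicitly in this file.
  def strict(m: scala.collection.immutable.Map[String, Int]): Int = m.size

  def main(args: Array[String]): Unit = {
    val mutableCounts = scala.collection.mutable.Map("hostA" -> 2)
    println(general(mutableCounts))
    // strict(mutableCounts) would not compile; convert to an immutable copy first.
    println(strict(mutableCounts.toMap))
  }
}
```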





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6652#issuecomment-110187900
  
Merged build started.





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6652#issuecomment-110187889
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/6652#discussion_r31975738
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
---
@@ -137,6 +137,23 @@ class DAGScheduler(
   private[scheduler] val eventProcessLoop = new 
DAGSchedulerEventProcessLoop(this)
   taskScheduler.setDAGScheduler(this)
 
+  // Flag to control if reduce tasks are assigned preferred locations
+  private val shuffleLocalityEnabled =
+sc.getConf.getBoolean("spark.shuffle.reduceLocality.enabled", true)
+  // Number of map, reduce tasks above which we do not assign preferred 
locations
+  // based on map output sizes. We limit the size of jobs for which assign 
preferred locations
+  // as sorting the locations by size becomes expensive.
--- End diff --

Fixed now





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/6652#discussion_r31975749
  
--- Diff: 
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -800,6 +800,50 @@ class DAGSchedulerSuite
 assertDataStructuresEmpty()
   }
 
+  test("shuffle with reducer locality") {
--- End diff --

Done





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/6652#discussion_r31975752
  
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -284,6 +291,54 @@ private[spark] class MapOutputTrackerMaster(conf: 
SparkConf)
 cachedSerializedStatuses.contains(shuffleId) || 
mapStatuses.contains(shuffleId)
   }
 
+  /**
+   * Return a list of locations which have fraction of map output greater 
than specified threshold.
+   *
+   * @param shuffleId id of the shuffle
+   * @param reducerId id of the reduce task
+   * @param numReducers total number of reducers in the shuffle
+   * @param fractionThreshold fraction of total map output size that a 
location must have
+   *  for it to be considered large.
+   *
+   * This method is not thread-safe
+   */
+  def getLocationsWithLargestOutputs(
+  shuffleId: Int,
+  reducerId: Int,
+  numReducers: Int,
+  fractionThreshold: Double)
+: Option[Array[BlockManagerId]] = {
+
+if (mapStatuses.contains(shuffleId)) {
+  // Pre-compute the top locations for each reducer and cache it
+  val statuses = mapStatuses(shuffleId)
+  if (statuses.nonEmpty) {
+// HashMap to add up sizes of all blocks at the same location
+val locs = new HashMap[BlockManagerId, Long]
+var totalOutputSize = 0L
+var mapIdx = 0
+while (mapIdx < statuses.length) {
--- End diff --

Good idea. Done





[GitHub] spark pull request: [SPARK-2774] Set preferred locations for reduc...

2015-06-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/6652#discussion_r31975747
  
--- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala 
---
@@ -205,4 +205,36 @@ class MapOutputTrackerSuite extends SparkFunSuite {
 //masterTracker.stop() // this throws an exception
 rpcEnv.shutdown()
   }
+
+  test("getLocationsWithLargestOutputs with multiple outputs in same 
machine") {
+val rpcEnv = createRpcEnv("test")
+val tracker = new MapOutputTrackerMaster(conf)
+tracker.trackerEndpoint = 
rpcEnv.setupEndpoint(MapOutputTracker.ENDPOINT_NAME,
+  new MapOutputTrackerMasterEndpoint(rpcEnv, tracker, conf))
+// Setup 3 map tasks
+// on hostA with output size 2
+// on hostA with output size 2
+// on hostB with output size 3
+tracker.registerShuffle(10, 3)
+tracker.registerMapOutput(10, 0, MapStatus(BlockManagerId("a", 
"hostA", 1000),
+Array(2L)))
+tracker.registerMapOutput(10, 1, MapStatus(BlockManagerId("a", 
"hostA", 1000),
+Array(2L)))
+tracker.registerMapOutput(10, 2, MapStatus(BlockManagerId("b", 
"hostB", 1000),
+Array(3L)))
+
+val topLocs50 = tracker.getLocationsWithLargestOutputs(10, 0, 1, 0.5)
--- End diff --

Done




