[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5879#issuecomment-98975034
  
[Test build #31843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31843/consoleFull) for PR 5879 at commit [`b0b506c`](https://github.com/apache/spark/commit/b0b506c363968b1f0ef27d71986618bb514f0f3c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5879#issuecomment-98974685
  
Merged build started.





[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5879#issuecomment-98974640
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7314][SPARK-3524][PySpark] upgrade Pyro...

2015-05-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5850





[GitHub] spark pull request: [SPARK-7314][SPARK-3524][PySpark] upgrade Pyro...

2015-05-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5850#issuecomment-98971798
  
LGTM.






[GitHub] spark pull request: [SPARK-5657][Examples][PySpark] Add PySpark Av...

2015-05-04 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/4434#issuecomment-98971230
  
I've got no specific problem with adding such an example, as it could be a 
useful illustration of writing custom data via PySpark/Avro.

However, I guess the preferred mechanism for Avro would be via 
https://github.com/databricks/spark-avro

@marmbrus @pwendell @JoshRosen @davies thoughts?





[GitHub] spark pull request: [SPARK-4449][Core] Specify port range in spark

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5722#issuecomment-98971434
  
[Test build #31842 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31842/consoleFull) for PR 5722 at commit [`28a3adf`](https://github.com/apache/spark/commit/28a3adf1e58c03e0cb5c274ea5dfe60965878ae4).





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98971390
  
[Test build #31841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31841/consoleFull) for PR 5885 at commit [`25d7451`](https://github.com/apache/spark/commit/25d74513a1f145453f1d8b471a8109c42d0ee0ae).





[GitHub] spark pull request: [SPARK-4449][Core] Specify port range in spark

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5722#issuecomment-98970770
  
Merged build started.





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98970769
  
Merged build started.





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98970719
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-4449][Core] Specify port range in spark

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5722#issuecomment-98970729
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98970514
  
test this please





[GitHub] spark pull request: [SPARK-4449][Core] Specify port range in spark

2015-05-04 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request:

https://github.com/apache/spark/pull/5722#issuecomment-98970083
  
Jenkins, retest this please.





[GitHub] spark pull request: [MLLIB][SPARK-4675] Find similar products and ...

2015-05-04 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/3536#issuecomment-98969251
  
Not sure I follow completely. Do you mean you compared cosine similarity between 
raw (i.e. "rating") item vectors with cosine similarity computed from item factor 
vectors? I would imagine they would be quite different...

I always just use factor vectors.
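
For readers following along, here is a minimal sketch of what "cosine sim computed from item factor vectors" could look like. It is plain Python rather than Spark code, and the names `cosine_similarity`, `most_similar`, and the toy `item_factors` dictionary are assumptions made up for illustration; they are not part of this PR or of MLlib.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_u * norm_v)


def most_similar(item_id, item_factors, top_n=2):
    """Rank all other items by cosine similarity to item_id's factor vector."""
    target = item_factors[item_id]
    scored = [(other, cosine_similarity(target, vec))
              for other, vec in item_factors.items() if other != item_id]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical rank-2 item factors, standing in for the output of a factorization.
item_factors = {
    "a": [1.0, 0.0],
    "b": [2.0, 0.0],  # same direction as "a", different magnitude
    "c": [0.0, 3.0],  # orthogonal to "a"
}
ranking = most_similar("a", item_factors)
```

Because cosine similarity ignores vector length, "b" ranks above "c" for item "a" here even though its magnitude differs. The same function applied to raw rating vectors would operate in a different (and typically much sparser) space, which is one reason the two comparisons can disagree.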





[GitHub] spark pull request: [SPARK-2750][WEB UI]Add Https support for Web ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5664#issuecomment-98969210
  
[Test build #31831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31831/consoleFull) for PR 5664 at commit [`dfbe1d6`](https://github.com/apache/spark/commit/dfbe1d6f4cc2405067ec9f29c8ad6d9037304a89).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2750][WEB UI]Add Https support for Web ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5664#issuecomment-98969215
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31831/
Test PASSed.





[GitHub] spark pull request: [SPARK-2750][WEB UI]Add Https support for Web ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5664#issuecomment-98969214
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-98968516
  
@jkbradley ok, will close and reopen this.  Could you please tell us why 
this happened?





[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread FlytxtRnD
Github user FlytxtRnD closed the pull request at:

https://github.com/apache/spark/pull/5647





[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread FlytxtRnD
GitHub user FlytxtRnD reopened a pull request:

https://github.com/apache/spark/pull/5647

[SPARK-6612] [MLLib] [PySpark] Python KMeans parity

The following items are added to Python kmeans:

kmeans - setEpsilon, setInitializationSteps
KMeansModel - computeCost, k

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/FlytxtRnD/spark newPyKmeansAPI

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5647.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5647


commit b61939a685f5cbdd6b0ef655b1d5a825f5646782
Author: Hrishikesh Subramonian 
Date:   2015-04-22T10:29:48Z

Python Kmeans - setEpsilon, setInitializationSteps, k and computeCost added.

commit 990383761841b444506e91f3052c2de3736d6052
Author: Hrishikesh Subramonian 
Date:   2015-04-22T11:31:10Z

added arguments in python tests

commit 1084663d0217b7adac40fb63b991476086ebd1fa
Author: Hrishikesh Subramonian 
Date:   2015-04-28T04:59:15Z

python 3 fixes

commit 7ecfd000af37899a920cae838cc41bcc5ceca053
Author: Hrishikesh Subramonian 
Date:   2015-04-28T05:02:01Z

Merge remote-tracking branch 'upstream/master' into newPyKmeansAPI

commit 703e8f609a8eb81b2a1b2492611909a562b0fbed
Author: Hrishikesh Subramonian 
Date:   2015-04-29T03:53:17Z

doc test corrections

commit d6d3a093719fb5ba606996b35cb3da2dfbf90c1f
Author: Hrishikesh Subramonian 
Date:   2015-04-29T03:54:34Z

Merge remote-tracking branch 'upstream/master' into newPyKmeansAPI

commit 9351b62f16371b538ab0715461011bfcba2cea31
Author: Hrishikesh Subramonian 
Date:   2015-04-30T04:37:39Z

set seed to fixed value in doc test

commit 0319821db7406f3cca359af5bc021d2f3fd92a17
Author: Hrishikesh Subramonian 
Date:   2015-04-30T04:41:13Z

Merge remote-tracking branch 'upstream/master' into newPyKmeansAPI

commit ba49eb1625b1190d8aaf2c55dc1f6309ac3e080c
Author: DB Tsai 
Date:   2015-04-30T04:44:41Z

Some code clean up.

Author: DB Tsai 

Closes #5794 from dbtsai/clean and squashes the following commits:

ad639dd [DB Tsai] Indentation
834d527 [DB Tsai] Some code clean up.

commit 4459514497eb76e6f2465d071857854390453805
Author: Zhongshuai Pei <799203...@qq.com>
Date:   2015-04-30T05:44:14Z

[SPARK-7225][SQL] CombineLimits optimizer does not work

SQL
```
select key from (select key from src limit 100) t2 limit 10
```
Optimized Logical Plan before modifying
```
== Optimized Logical Plan ==
Limit 10
 Limit 100
  Project key#3
   MetastoreRelation default, src, None
```
Optimized Logical Plan after modifying
```
== Optimized Logical Plan ==
Limit 10
 Project [key#1]
  MetastoreRelation default, src, None
```
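
The rewrite shown in the two plans above can be sketched outside of Spark as follows. This is a toy model in plain Python, not the Catalyst rule itself: the `Limit`, `Project`, and `Relation` node types and the `combine_limits` function are hypothetical names introduced only for illustration, and the sketch assumes that two directly nested limits are equivalent to a single limit with the smaller bound.

```python
from collections import namedtuple

# Hypothetical, minimal plan nodes for illustration; these are not Catalyst classes.
Limit = namedtuple("Limit", ["n", "child"])
Project = namedtuple("Project", ["columns", "child"])
Relation = namedtuple("Relation", ["name"])


def combine_limits(plan):
    """Collapse directly nested Limit nodes into one Limit with the smaller bound."""
    if isinstance(plan, Limit):
        child = combine_limits(plan.child)  # rewrite bottom-up
        if isinstance(child, Limit):
            return Limit(min(plan.n, child.n), child.child)
        return Limit(plan.n, child)
    if isinstance(plan, Project):
        return Project(plan.columns, combine_limits(plan.child))
    return plan  # leaf node: nothing to rewrite


# Mirrors the "before" plan: Limit 10 over Limit 100 over a projection of src.
before = Limit(10, Limit(100, Project(["key"], Relation("src"))))
after = combine_limits(before)
```

In this model `after` is a single `Limit` of 10 directly over the projection, matching the shape of the plan printed after the change.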

Author: Zhongshuai Pei <799203...@qq.com>
Author: DoingDone9 <799203...@qq.com>

Closes #5770 from DoingDone9/limitOptimizer and squashes the following 
commits:

c68eaa7 [Zhongshuai Pei] Update CombiningLimitsSuite.scala
97e18cf [Zhongshuai Pei] Update Optimizer.scala
19ab875 [Zhongshuai Pei] Update CombiningLimitsSuite.scala
7db4566 [Zhongshuai Pei] Update CombiningLimitsSuite.scala
e2a491d [Zhongshuai Pei] Update Optimizer.scala
f03fe7f [Zhongshuai Pei] Merge pull request #12 from apache/master
f12fa50 [Zhongshuai Pei] Merge pull request #10 from apache/master
f61210c [Zhongshuai Pei] Merge pull request #9 from apache/master
34b1a9a [Zhongshuai Pei] Merge pull request #8 from apache/master
802261c [DoingDone9] Merge pull request #7 from apache/master
d00303b [DoingDone9] Merge pull request #6 from apache/master
98b134f [DoingDone9] Merge pull request #5 from apache/master
161cae3 [DoingDone9] Merge pull request #4 from apache/master
c87e8b6 [DoingDone9] Merge pull request #3 from apache/master
cb1852d [DoingDone9] Merge pull request #2 from apache/master
c3f046f [DoingDone9] Merge pull request #1 from apache/master

commit 254e0509762937acc9c72b432d5d953bf72c3c52
Author: Vincenzo Selvaggio 
Date:   2015-04-30T06:21:21Z

[SPARK-1406] Mllib pmml model export

See PDF attached to the JIRA issue 1406.

The contribution is my original work and I license the work to the project 
under the project's open source license.

Author: Vincenzo Selvaggio 
Author: Xiangrui Meng 
Author: selvinsource 

Closes #3062 from selvinsource/mllib_pmml_model_export_SPARK-1406 and 
squashes the following commits:

852aac6 [Vincenzo Selvaggio] [SPARK-1406] Update JPMML version to 1.1.15 in 
LICENSE file
085cf42 [Vincenzo Selvaggio] [SPARK-1406] Added Double Min and Max Fixed 
scala style
30165c4 [Vincenzo Selvaggio] [SPARK-1406] Fixed extreme cases for logit
7a5e0ec [Vincenzo Selvaggio] [SPARK-1406] Binary 

[GitHub] spark pull request: [SPARK-7015][MLLib][WIP] Multiclass to Binary ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5830#issuecomment-98968367
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-7318][Streaming] DStream cleans objects...

2015-05-04 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/5860#discussion_r29646250
  
--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -179,6 +179,11 @@ private[spark] object ClosureCleaner extends Logging {
       cleanTransitively: Boolean,
       accessedFields: Map[Class[_], Set[String]]): Unit = {
 
+    if (!isClosure(func.getClass)) {
+      logWarning("Expected a closure; got " + func.getClass.getName)
--- End diff --

Any reason not to make this an assertion? Isn't it simply invalid if we 
call this on something that is not a closure?





[GitHub] spark pull request: [SPARK-7015][MLLib][WIP] Multiclass to Binary ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5830#issuecomment-98968371
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31830/
Test PASSed.





[GitHub] spark pull request: [SPARK-7015][MLLib][WIP] Multiclass to Binary ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5830#issuecomment-98968323
  
[Test build #31830 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31830/consoleFull) for PR 5830 at commit [`26f1ddb`](https://github.com/apache/spark/commit/26f1ddb2c40538abf225483fac4e929030771e9b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-98966982
  
@FlytxtRnD The update confused GitHub.  Can you please try closing and 
re-opening this PR to force GitHub to recompute the diff?  Thanks!





[GitHub] spark pull request: [SPARK-2750][WEB UI]Add Https support for Web ...

2015-05-04 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request:

https://github.com/apache/spark/pull/5664#issuecomment-98966596
  
@vanzin I'm not sure whether we should modify `createAkkaConfig`. Could you 
please point me to a reference?





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98965193
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-7357 Improving HBaseTest example

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5904#issuecomment-98966159
  
Can one of the admins verify this patch?





[GitHub] spark pull request: SPARK-7357 Improving HBaseTest example

2015-05-04 Thread JihongMA
GitHub user JihongMA opened a pull request:

https://github.com/apache/spark/pull/5904

SPARK-7357 Improving HBaseTest example



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JihongMA/spark-1 SPARK-7357

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5904.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5904


commit 7d6153a4d2bdd1530c5b68776cd21afa497e9cd4
Author: Jihong MA 
Date:   2015-05-05T05:52:35Z

SPARK-7357 Improving HBaseTest example







[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98965421
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98965420
  
[Test build #31840 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31840/consoleFull) for PR 5885 at commit [`25d7451`](https://github.com/apache/spark/commit/25d74513a1f145453f1d8b471a8109c42d0ee0ae).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class BinaryClassificationEvaluator(JavaEvaluator, HasLabelCol, HasRawPredictionCol):`
  * `class HasRawPredictionCol(Params):`
  * `class Evaluator(object):`
  * `class JavaEvaluator(Evaluator, JavaWrapper):`






[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98965422
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31840/
Test FAILed.





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98965304
  
[Test build #31840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31840/consoleFull) for PR 5885 at commit [`25d7451`](https://github.com/apache/spark/commit/25d74513a1f145453f1d8b471a8109c42d0ee0ae).





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5885#discussion_r29645700
  
--- Diff: python/pyspark/sql/_types.py ---
@@ -652,7 +652,7 @@ def _python_to_sql_converter(dataType):
 
 if isinstance(dataType, StructType):
 names, types = zip(*[(f.name, f.dataType) for f in 
dataType.fields])
-converters = map(_python_to_sql_converter, types)
+converters = [_python_to_sql_converter(t) for t in types]
--- End diff --

In Python 3, `map` returns a lazy map object instead of a list, so I changed it 
to a list comprehension (`[...]`), which works on both Python 2 and 3.
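
As a small illustration of this point (a sketch, not code from the patch), the snippet below contrasts the two forms. The converter `to_upper` and the sample `types` list are hypothetical stand-ins for `_python_to_sql_converter` and the real field types.

```python
def to_upper(value):
    """Hypothetical stand-in converter; not part of the patch."""
    return value.upper()


types = ["int", "string"]

# A list comprehension yields a real list on both Python 2 and Python 3,
# so it can be indexed, measured with len(), and iterated more than once.
converters = [to_upper(t) for t in types]

# On Python 3, map() returns a one-shot iterator rather than a list.
lazy = map(to_upper, types)
first_pass = list(lazy)
second_pass = list(lazy)  # empty on Python 3: the iterator is already exhausted
```

Code that later indexes `converters` or loops over it several times is therefore safe with the comprehension, while the `map` form would silently produce nothing on the second pass under Python 3.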





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5885#issuecomment-98965210
  
Merged build started.





[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5879#discussion_r29645695
  
--- Diff: 
external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
 ---
@@ -301,6 +301,49 @@ class DirectKafkaStreamSuite
 ssc.stop()
   }
 
+  test("Direct Kafka stream report input information") {
+val topic = "report-test"
+val data = Map("a" -> 7, "b" -> 9)
+kafkaTestUtils.createTopic(topic)
+kafkaTestUtils.sendMessages(topic, data)
+
+val totalSent = data.values.sum
+val kafkaParams = Map(
+  "metadata.broker.list" -> kafkaTestUtils.brokerAddress,
+  "auto.offset.reset" -> "smallest"
+)
+
+import DirectKafkaStreamSuite._
+ssc = new StreamingContext(sparkConf, Milliseconds(200))
+val collector = new InputInfoCollector
+ssc.addStreamingListener(collector)
+
+val stream = withClue("Error creating direct stream") {
+  KafkaUtils.createDirectStream[String, String, StringDecoder, 
StringDecoder](
+ssc, kafkaParams, Set(topic))
+}
+
+val allReceived = new ArrayBuffer[(String, String)]
+
+stream.foreachRDD { rdd => allReceived ++= rdd.collect() }
+ssc.start()
+eventually(timeout(2.milliseconds), interval(200.milliseconds)) {
+  assert(allReceived.size === totalSent,
+"didn't get expected number of messages, messages:\n" + 
allReceived.mkString("\n"))
+}
+ssc.stop()
+
+// Calculate all the record number collected in the StreamingListener.
+val numRecordsSubmitted = 
collector.streamIdToNumRecordsSubmitted.map(_.values.sum).sum
--- End diff --

OK, I get it, thanks a lot for your explanation.





[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5879#discussion_r29645626
  
--- Diff: 
external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
 ---
@@ -313,4 +356,24 @@ class DirectKafkaStreamSuite
 object DirectKafkaStreamSuite {
   val collectedData = new mutable.ArrayBuffer[String]()
   var total = -1L
+
+  class InputInfoCollector extends StreamingListener {
+val streamIdToNumRecordsSubmitted = new ArrayBuffer[Map[Int, Long]]()
--- End diff --

Yeah, that's not necessary, I will change it.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98964986
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31839/
Test FAILed.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98964982
  
[Test build #31839 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31839/consoleFull) for PR 5903 at commit [`1e6f13e`](https://github.com/apache/spark/commit/1e6f13e0d75f1bfaeaa6a0203e55c1a492778019).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98964985
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5879#discussion_r29645575
  
--- Diff: 
external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
 ---
@@ -301,6 +301,49 @@ class DirectKafkaStreamSuite
 ssc.stop()
   }
 
+  test("Direct Kafka stream report input information") {
+val topic = "report-test"
+val data = Map("a" -> 7, "b" -> 9)
+kafkaTestUtils.createTopic(topic)
+kafkaTestUtils.sendMessages(topic, data)
+
+val totalSent = data.values.sum
+val kafkaParams = Map(
+  "metadata.broker.list" -> kafkaTestUtils.brokerAddress,
+  "auto.offset.reset" -> "smallest"
+)
+
+import DirectKafkaStreamSuite._
+ssc = new StreamingContext(sparkConf, Milliseconds(200))
+val collector = new InputInfoCollector
+ssc.addStreamingListener(collector)
+
+val stream = withClue("Error creating direct stream") {
+  KafkaUtils.createDirectStream[String, String, StringDecoder, 
StringDecoder](
+ssc, kafkaParams, Set(topic))
+}
+
+val allReceived = new ArrayBuffer[(String, String)]
+
+stream.foreachRDD { rdd => allReceived ++= rdd.collect() }
+ssc.start()
+eventually(timeout(2.milliseconds), interval(200.milliseconds)) {
+  assert(allReceived.size === totalSent,
+"didn't get expected number of messages, messages:\n" + 
allReceived.mkString("\n"))
+}
+ssc.stop()
+
+// Calculate all the record number collected in the StreamingListener.
+val numRecordsSubmitted = 
collector.streamIdToNumRecordsSubmitted.map(_.values.sum).sum
--- End diff --

1. The JVM does not guarantee that different threads will see the same values 
within any bounded period of time unless some kind of synchronization is used. 
This has caused flakiness in the past. 

2. The `StreamingListener` events are sent on an async thread, so there is 
a time gap between when the last job finishes and when the 
`StreamingListenerBatchCompleted` event is posted. In the current code, the 
system may satisfy the `eventually` and stop the StreamingContext before the 
event is dispatched and `InputInfoCollector.onBatchCompleted()` is called, in 
which case the test will fail. This will probably be fine 99.99% of the time, 
but on a place like Jenkins, that 0.01% chance causes annoying flakiness.
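
The second point can be sketched with a rough Python analogy (the code under review is Scala, so this is an illustration under assumptions, not the Spark implementation). The names `eventually`, `Collector`, and `run_demo`, the queue-based "listener bus", and the 50 ms delay are all hypothetical. The job's output becomes visible at once, the completion event arrives later on another thread, and the test only passes reliably because both assertions sit inside the retry loop.

```python
import queue
import threading
import time


def eventually(check, timeout=5.0, interval=0.01):
    """Retry `check` until it stops raising AssertionError or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return check()
        except AssertionError:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


class Collector:
    """Hypothetical listener that totals record counts from completion events."""

    def __init__(self):
        self._lock = threading.Lock()  # synchronizes writer and reader threads
        self._total = 0

    def on_batch_completed(self, num_records):
        with self._lock:
            self._total += num_records

    def total(self):
        with self._lock:
            return self._total


def run_demo():
    events = queue.Queue()
    collector = Collector()
    received = []

    def dispatch():
        # Simulated async listener bus: events reach the listener some time
        # after the job output is already visible to the test thread.
        while True:
            num_records = events.get()
            if num_records is None:
                break
            time.sleep(0.05)
            collector.on_batch_completed(num_records)

    bus = threading.Thread(target=dispatch)
    bus.start()

    received.extend(range(16))  # job output is visible immediately
    events.put(16)              # completion event is delivered later

    def check():
        # Both conditions are retried together, so the test cannot stop
        # between the data arriving and the event being dispatched.
        assert len(received) == 16
        assert collector.total() == 16

    eventually(check)
    events.put(None)  # shut the bus down only after the assertions hold
    bus.join()
    return collector.total()
```

If only the `len(received)` assertion were retried and the bus were shut down right after it passed, the collector could still report 0 at the moment it is read, which is the failure mode described above.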





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98964860
  
Merged build started.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98964865
  
[Test build #31839 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31839/consoleFull) for PR 5903 at commit [`1e6f13e`](https://github.com/apache/spark/commit/1e6f13e0d75f1bfaeaa6a0203e55c1a492778019).





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98964849
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6939][Streaming][WebUI] Add timeline an...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5533#issuecomment-98964569
  
[Test build #31838 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31838/consoleFull) for PR 5533 at commit [`deacc3f`](https://github.com/apache/spark/commit/deacc3fccbc2c28ab02dae36152884c6904a8d03).





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98964510
  
retest this please.





[GitHub] spark pull request: [SPARK-6939][Streaming][WebUI] Add timeline an...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5533#issuecomment-98964485
  
Merged build started.





[GitHub] spark pull request: [SPARK-6939][Streaming][WebUI] Add timeline an...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5533#issuecomment-98964477
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread brkyvz
Github user brkyvz commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98963848
  
Retest this please
On May 4, 2015 10:41 PM, "UCB AMPLab" wrote:

> Merged build finished. Test FAILed.






[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-98963782
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31828/
Test PASSed.





[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-98963781
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-98963773
  
[Test build #31828 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31828/consoleFull) for PR 5647 at commit [`8aac002`](https://github.com/apache/spark/commit/8aac002c67e44ec3520740fd92a89dda4b885700).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class Evaluator extends Params`
  * `abstract class PipelineStage extends Params with Logging`
  * `class BinaryClassificationEvaluator extends Evaluator with HasRawPredictionCol with HasLabelCol`
  * `case class Not(child: Expression) extends UnaryExpression with Predicate with ExpectsInputTypes`
  * `case class And(left: Expression, right: Expression)`
  * `case class Or(left: Expression, right: Expression)`
  * `abstract class BinaryComparison extends BinaryExpression with Predicate`
  * `trait StringRegexExpression extends ExpectsInputTypes`
  * `trait CaseConversionExpression extends ExpectsInputTypes`
  * `case class Substring(str: Expression, pos: Expression, len: Expression)`






[GitHub] spark pull request: [HOTFIX] [TEST] Ignoring flaky tests

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5901#issuecomment-98963355
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [HOTFIX] [TEST] Ignoring flaky tests

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5901#issuecomment-98963346
  
[Test build #31827 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31827/consoleFull) for PR 5901 at commit [`9cd8667`](https://github.com/apache/spark/commit/9cd866733c43e538b10c4eb9b4d9280d6ce3b5de).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [HOTFIX] [TEST] Ignoring flaky tests

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5901#issuecomment-98963357
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31827/
Test PASSed.





[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5879#discussion_r29645169
  
--- Diff: 
external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
 ---
@@ -301,6 +301,49 @@ class DirectKafkaStreamSuite
 ssc.stop()
   }
 
+  test("Direct Kafka stream report input information") {
+val topic = "report-test"
+val data = Map("a" -> 7, "b" -> 9)
+kafkaTestUtils.createTopic(topic)
+kafkaTestUtils.sendMessages(topic, data)
+
+val totalSent = data.values.sum
+val kafkaParams = Map(
+  "metadata.broker.list" -> kafkaTestUtils.brokerAddress,
+  "auto.offset.reset" -> "smallest"
+)
+
+import DirectKafkaStreamSuite._
+ssc = new StreamingContext(sparkConf, Milliseconds(200))
+val collector = new InputInfoCollector
+ssc.addStreamingListener(collector)
+
+val stream = withClue("Error creating direct stream") {
+  KafkaUtils.createDirectStream[String, String, StringDecoder, 
StringDecoder](
+ssc, kafkaParams, Set(topic))
+}
+
+val allReceived = new ArrayBuffer[(String, String)]
+
+stream.foreachRDD { rdd => allReceived ++= rdd.collect() }
+ssc.start()
+eventually(timeout(2.milliseconds), interval(200.milliseconds)) {
+  assert(allReceived.size === totalSent,
+"didn't get expected number of messages, messages:\n" + 
allReceived.mkString("\n"))
+}
+ssc.stop()
+
+// Calculate all the record number collected in the StreamingListener.
+val numRecordsSubmitted = 
collector.streamIdToNumRecordsSubmitted.map(_.values.sum).sum
--- End diff --

> There is probably a race condition here. The collector may not have received 
> the batch completed signal when the allReceived.size == totalSent is satisfied 
> and the context is stopped. Better to put all of these asserts under the 
> eventually.

I am not sure why all the asserts should be in the `eventually`. From my 
understanding, it is OK if the last signal is missed, since we only test the 
total number of completed records.





[GitHub] spark pull request: [SPARK-7333][MLLIB] Add BinaryClassificationEv...

2015-05-04 Thread oefirouz
Github user oefirouz commented on a diff in the pull request:

https://github.com/apache/spark/pull/5885#discussion_r29645163
  
--- Diff: python/pyspark/ml/param/shared.py ---
@@ -165,6 +165,35 @@ def getPredictionCol(self):
 return self.getOrDefault(self.predictionCol)
 
 
+class HasRawPredictionCol(Params):
+"""
+Mixin for param rawPredictionCol: raw prediction column name.
+"""
+
+# a placeholder to make it appear in the generated doc
+rawPredictionCol = Param(Params._dummy(), "rawPredictionCol", "raw 
prediction column name")
+
+def __init__(self):
+super(HasRawPredictionCol, self).__init__()
+#: param for raw prediction column name
+self.rawPredictionCol = Param(self, "rawPredictionCol", "raw 
prediction column name")
+if 'rawPrediction' is not None:
--- End diff --

Ah, my mistake!





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98960452
  
[Test build #31837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31837/consoleFull) for PR 5900 at commit [`b30ace2`](https://github.com/apache/spark/commit/b30ace27be6e26ae536215a165d9db2fdde1fcf8).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98960461
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98960464
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31837/
Test FAILed.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98959912
  
[Test build #31837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31837/consoleFull) for PR 5900 at commit [`b30ace2`](https://github.com/apache/spark/commit/b30ace27be6e26ae536215a165d9db2fdde1fcf8).





[GitHub] spark pull request: [SPARK-7113][Streaming] Support input informat...

2015-05-04 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5879#discussion_r29644906
  
--- Diff: 
external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
 ---
@@ -301,6 +301,49 @@ class DirectKafkaStreamSuite
 ssc.stop()
   }
 
+  test("Direct Kafka stream report input information") {
+val topic = "report-test"
+val data = Map("a" -> 7, "b" -> 9)
+kafkaTestUtils.createTopic(topic)
+kafkaTestUtils.sendMessages(topic, data)
+
+val totalSent = data.values.sum
+val kafkaParams = Map(
+  "metadata.broker.list" -> kafkaTestUtils.brokerAddress,
+  "auto.offset.reset" -> "smallest"
+)
+
+import DirectKafkaStreamSuite._
+ssc = new StreamingContext(sparkConf, Milliseconds(200))
+val collector = new InputInfoCollector
+ssc.addStreamingListener(collector)
+
+val stream = withClue("Error creating direct stream") {
+  KafkaUtils.createDirectStream[String, String, StringDecoder, 
StringDecoder](
+ssc, kafkaParams, Set(topic))
+}
+
+val allReceived = new ArrayBuffer[(String, String)]
+
+stream.foreachRDD { rdd => allReceived ++= rdd.collect() }
+ssc.start()
+eventually(timeout(2.milliseconds), interval(200.milliseconds)) {
+  assert(allReceived.size === totalSent,
+"didn't get expected number of messages, messages:\n" + 
allReceived.mkString("\n"))
+}
+ssc.stop()
+
+// Calculate all the record number collected in the StreamingListener.
+val numRecordsSubmitted = 
collector.streamIdToNumRecordsSubmitted.map(_.values.sum).sum
--- End diff --

I don't think there is a multi-threading issue: I only check the number of 
records once the StreamingContext is stopped, so at that point no other 
thread should be accessing the collector object.

Anyway, I only test the total number of records, so an AtomicLong is enough. 
I will change it to that.
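
A minimal sketch of the AtomicLong approach mentioned above, assuming a hypothetical `RecordCounter` class and `onBatchSubmitted` callback (neither is Spark's `StreamingListener` API; they only stand in for it). The listener thread adds to the counter and the test thread reads it, with no extra locking:

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for a listener that tracks submitted records.
// The names here are illustrative and are not part of Spark.
final class RecordCounter {
  private val total = new AtomicLong(0L)

  // Called once per submitted batch, possibly from another thread.
  def onBatchSubmitted(numRecords: Long): Unit = {
    total.addAndGet(numRecords)
  }

  // Safe to read from the test thread at any time.
  def numRecordsSubmitted: Long = total.get()
}

object RecordCounterDemo {
  def main(args: Array[String]): Unit = {
    val counter = new RecordCounter
    // Four threads each report 1000 batches of 4 records.
    val threads = (1 to 4).map { _ =>
      new Thread(new Runnable {
        override def run(): Unit = {
          (1 to 1000).foreach(_ => counter.onBatchSubmitted(4L))
        }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    assert(counter.numRecordsSubmitted == 16000L)
  }
}
```

Since only the total is asserted, a single atomic counter is probably simpler than a per-stream map of counts.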





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98959836
  
Merged build started.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98959829
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6939][Streaming][WebUI] Add timeline an...

2015-05-04 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/5533#issuecomment-98957826
  
LGTM. Except that one unnecessary import in spark.ui.UIUtils. :)





[GitHub] spark pull request: [SPARK-6939][Streaming][WebUI] Add timeline an...

2015-05-04 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5533#discussion_r29644658
  
--- Diff: core/src/main/scala/org/apache/spark/ui/UIUtils.scala ---
@@ -18,6 +18,7 @@
 package org.apache.spark.ui
 
 import java.text.SimpleDateFormat
+import java.util.concurrent.TimeUnit
--- End diff --

This is not needed any more since Spark's UIUtils is not modified.





[GitHub] spark pull request: [SPARK-7139][Streaming] Allow received block m...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5732#issuecomment-98954711
  
  [Test build #31836 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31836/consoleFull)
 for   PR 5732 at commit 
[`575476e`](https://github.com/apache/spark/commit/575476ef71113d3b2448593f48fd844b9e57e1c5).





[GitHub] spark pull request: Use UNSAFE.getLong() to speed up BitSetMethods...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5897#issuecomment-98954646
  
  [Test build #31835 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31835/consoleFull)
 for   PR 5897 at commit 
[`093b7a4`](https://github.com/apache/spark/commit/093b7a408905b0d760e1f4e6413cb7e7ec54f9d8).





[GitHub] spark pull request: [SPARK-7139][Streaming] Allow received block m...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5732#issuecomment-98954031
  
Merged build started.





[GitHub] spark pull request: [SPARK-7139][Streaming] Allow received block m...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5732#issuecomment-98954018
  
 Merged build triggered.





[GitHub] spark pull request: Use UNSAFE.getLong() to speed up BitSetMethods...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5897#issuecomment-98954015
  
 Merged build triggered.





[GitHub] spark pull request: Use UNSAFE.getLong() to speed up BitSetMethods...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5897#issuecomment-98954023
  
Merged build started.





[GitHub] spark pull request: [SPARK-7294][SQL] ADD BETWEEN

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5839#issuecomment-98953819
  
  [Test build #31834 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31834/consoleFull)
 for   PR 5839 at commit 
[`d2e7f72`](https://github.com/apache/spark/commit/d2e7f722bbe32ec1b5e0adce2f749feb67102926).





[GitHub] spark pull request: Use UNSAFE.getLong() to speed up BitSetMethods...

2015-05-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5897#issuecomment-98953784
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-7139][Streaming] Allow received block m...

2015-05-04 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/5732#issuecomment-98953767
  
retest this please.





[GitHub] spark pull request: [SPARK-7294][SQL] ADD BETWEEN

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5839#issuecomment-98953700
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-7294][SQL] ADD BETWEEN

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5839#issuecomment-98953706
  
Merged build started.





[GitHub] spark pull request: [SPARK-7294][SQL] ADD BETWEEN

2015-05-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5839#issuecomment-98953643
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98953338
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31833/





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98953321
  
  [Test build #31833 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31833/consoleFull)
 for   PR 5903 at commit 
[`1e6f13e`](https://github.com/apache/spark/commit/1e6f13e0d75f1bfaeaa6a0203e55c1a492778019).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98953336
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98952910
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98952946
  
  [Test build #31833 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31833/consoleFull)
 for   PR 5903 at commit 
[`1e6f13e`](https://github.com/apache/spark/commit/1e6f13e0d75f1bfaeaa6a0203e55c1a492778019).





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98952941
  
Merged build started.





[GitHub] spark pull request: [SPARK-5938][SPARK-5443][SQL] Improve JsonRDD ...

2015-05-04 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/5801#issuecomment-98952772
  
@NathanHowell I will do a final check tomorrow. Could you also add the 
performance numbers for selecting all columns to the description? You can use 
`df.rdd.count` as the command to compare the two versions.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5900#discussion_r29644349
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala 
---
@@ -102,9 +102,9 @@ private[sql] object StatFunctions extends Logging {
   /** Generate a table of frequencies for the elements of two columns. */
   private[sql] def crossTabulate(df: DataFrame, col1: String, col2: 
String): DataFrame = {
 val tableName = s"${col1}_$col2"
-val counts = df.groupBy(col1, col2).agg(col(col1), col(col2), 
count("*")).take(1e8.toInt)
-if (counts.length == 1e8.toInt) {
-  logWarning("The maximum limit of 1e8 pairs have been collected, 
which may not be all of " +
+val counts = df.groupBy(col1, col2).agg(col(col1), col(col2), 
count("*")).take(1e6.toInt)
+if (counts.length == 1e6.toInt) {
--- End diff --

We should also update the user-facing Javadoc and Python docstring to say 
that we return at most 1 million entries.
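
As a rough illustration of the capped collection in the diff above, here is a sketch over a plain iterator rather than a DataFrame. The `CappedCollect` object and `takeCapped` helper are hypothetical names, not part of Spark:

```scala
// Hypothetical helper; `limit` plays the role of the 1e6 cap in the diff.
object CappedCollect {
  /** Returns at most `limit` elements and whether the cap was reached. */
  def takeCapped[T](rows: Iterator[T], limit: Int): (Seq[T], Boolean) = {
    val taken = rows.take(limit).toVector
    // As in the diff, reaching the cap only means the result *may* be
    // truncated: an input of exactly `limit` rows also sets the flag.
    (taken, taken.length == limit)
  }

  def main(args: Array[String]): Unit = {
    val (pairs, capped) = takeCapped(Iterator.range(0, 10), 4)
    if (capped) {
      println(s"Collected the maximum of ${pairs.length} entries; " +
        "the result may be incomplete.")
    }
  }
}
```

Whatever limit is chosen, the documented maximum and the constant in the code need to stay in sync, which is the point of the review comment.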





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98952209
  
retest this please.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98951448
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31832/





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98951445
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98951437
  
  [Test build #31832 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31832/consoleFull)
 for   PR 5903 at commit 
[`1e6f13e`](https://github.com/apache/spark/commit/1e6f13e0d75f1bfaeaa6a0203e55c1a492778019).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98950573
  
  [Test build #31832 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31832/consoleFull)
 for   PR 5903 at commit 
[`1e6f13e`](https://github.com/apache/spark/commit/1e6f13e0d75f1bfaeaa6a0203e55c1a492778019).





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98950051
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5903#issuecomment-98950066
  
Merged build started.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/5903#discussion_r29644079
  
--- Diff: 
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -261,8 +265,9 @@ class DAGSchedulerSuite
   override def taskSucceeded(partition: Int, value: Any) = numResults 
+= 1
   override def jobFailed(exception: Exception) = throw exception
 }
-submit(new MyRDD(sc, 0, Nil), Array(), listener = fakeListener)
+val jobId = submit(new MyRDD(sc, 0, Nil), Array(), listener = 
fakeListener)
 assert(numResults === 0)
+cancel(jobId)
--- End diff --

We need to cancel the job; otherwise `scheduler.stop()` will trigger 
`jobFailed` and make this test fail.





[GitHub] spark pull request: [SPARK-5074][Core][Tests] Fix the flakey test ...

2015-05-04 Thread zsxwing
GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/5903

[SPARK-5074][Core][Tests] Fix the flakey test 'run shuffle with map stage 
failure' in DAGSchedulerSuite

Test failure: 
https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This happens because all tests share the same `JobListener`, and `scheduler` 
isn't stopped after each test, so it is actually still running. While the test 
`run shuffle with map stage failure` runs, a previous test may trigger the 
[ResubmitFailedStages](https://github.com/apache/spark/blob/ebc25a4ddfe07a67668217cec59893bc3b8cf730/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1120)
 logic, report `jobFailed`, and override the global `failure` variable.

This PR uses `after` to call `scheduler.stop()` for each test.
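
The failure mode described above can be sketched with a toy model. Everything here (`SharedListener`, `ToyScheduler`, `SharedStateDemo`) is hypothetical and much simpler than `DAGScheduler`; in particular it does not model `stop()` failing jobs that are still active. It only shows how a scheduler left running by one test can overwrite state that a later test reads:

```scala
import scala.collection.mutable

// Hypothetical, simplified model; none of these classes are Spark's.
final class SharedListener {
  var failure: Option[String] = None
  def jobFailed(reason: String): Unit = {
    failure = Some(reason)
  }
}

final class ToyScheduler(listener: SharedListener) {
  private val pending = mutable.Queue.empty[String]
  private var stopped = false

  // Queues an event to be delivered later, like a delayed resubmission.
  def scheduleFailure(reason: String): Unit = {
    if (!stopped) pending.enqueue(reason)
  }

  // Delivers queued events unless the scheduler was stopped.
  def deliverPending(): Unit = {
    if (!stopped) {
      while (pending.nonEmpty) listener.jobFailed(pending.dequeue())
    }
  }

  def stop(): Unit = {
    stopped = true
    pending.clear()
  }
}

object SharedStateDemo {
  // Runs two "tests" against one shared listener and returns the
  // failure observed by the second one.
  def run(stopAfterEachTest: Boolean): Option[String] = {
    val listener = new SharedListener

    // Test 1 leaves an event pending in its scheduler.
    val first = new ToyScheduler(listener)
    first.scheduleFailure("late event from test 1")
    if (stopAfterEachTest) first.stop()

    // Test 2 resets the shared state, then the old scheduler fires.
    listener.failure = None
    first.deliverPending()
    listener.failure
  }

  def main(args: Array[String]): Unit = {
    assert(run(stopAfterEachTest = false) == Some("late event from test 1"))
    assert(run(stopAfterEachTest = true).isEmpty)
  }
}
```

Stopping the scheduler in a per-test teardown hook removes the cross-test interference without changing what each test asserts.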



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark SPARK-5074

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5903.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5903


commit 1e6f13e0d75f1bfaeaa6a0203e55c1a492778019
Author: zsxwing 
Date:   2015-05-05T04:48:07Z

Fix the flakey test 'run shuffle with map stage failure' in 
DAGSchedulerSuite

Test failure: 
https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/2240/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/run_shuffle_with_map_stage_failure/

This happens because all tests share the same `JobListener`, and `scheduler` 
isn't stopped after each test, so it is actually still running. While the test 
`run shuffle with map stage failure` runs, a previous test may trigger the 
`ResubmitFailedStages` logic and override the global `failure` variable.







[GitHub] spark pull request: [SPARK-5938][SPARK-5443][SQL] Improve JsonRDD ...

2015-05-04 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/5801#issuecomment-98949753
  
Yeah, given that there is a flag I think we can still include this.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98949664
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98949660
  
  [Test build #31821 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31821/consoleFull)
 for   PR 5900 at commit 
[`a417ba5`](https://github.com/apache/spark/commit/a417ba5ec0f0e04e66b5de88dcbc1cead8fe937c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Not(child: Expression) extends UnaryExpression with 
Predicate with ExpectsInputTypes `
  * `case class And(left: Expression, right: Expression)`
  * `case class Or(left: Expression, right: Expression)`
  * `abstract class BinaryComparison extends BinaryExpression with 
Predicate `
  * `trait StringRegexExpression extends ExpectsInputTypes `
  * `trait CaseConversionExpression extends ExpectsInputTypes `
  * `case class Substring(str: Expression, pos: Expression, len: 
Expression)`






[GitHub] spark pull request: [SPARK-7243][SQL] Reduce size for Contingency ...

2015-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5900#issuecomment-98949666
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31821/




