date:20160211

[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183219842
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183219843
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51174/
Test FAILed.



[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183219699
  
**[Test build #51174 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51174/consoleFull)**
 for PR 9893 at commit 
[`fe79873`](https://github.com/apache/spark/commit/fe79873ef416f3fd4ca29b6970cc2991fb43d017).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [Documentation] Added pygments.rb dependancy

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11180#issuecomment-183219234
  
Can one of the admins verify this patch?



[GitHub] spark pull request: Added pygments.rb dependancy

2016-02-11 Thread amitdev

GitHub user amitdev opened a pull request:

https://github.com/apache/spark/pull/11180

Added pygments.rb dependancy

It looks like the pygments.rb gem is also required for the Jekyll build to work. At 
least on Ubuntu/RHEL, I could not run the build without this dependency, so I added 
it to the steps.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amitdev/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11180


commit f705e9bbe7f1e6a6393062c07e239b23ebf53ac8
Author: Amit Dev 
Date:   2016-02-12T07:43:13Z

Added pygments.rb dependancy





[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183216520
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183216521
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51172/
Test FAILed.



[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183216221
  
**[Test build #51172 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51172/consoleFull)**
 for PR 9893 at commit 
[`e61ec6a`](https://github.com/apache/spark/commit/e61ec6a4a3b603d34c6f7de697d61ee559786337).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183211245
  
**[Test build #51176 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51176/consoleFull)**
 for PR 11178 at commit 
[`bef62eb`](https://github.com/apache/spark/commit/bef62ebb8ec5065061ff0ca49a4cb7e0182c47b6).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183211271
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51176/
Test FAILed.



[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183211270
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183206102
  
**[Test build #51176 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51176/consoleFull)**
 for PR 11178 at commit 
[`bef62eb`](https://github.com/apache/spark/commit/bef62ebb8ec5065061ff0ca49a4cb7e0182c47b6).



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183206172
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183206177
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51173/
Test PASSed.



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183205820
  
**[Test build #51173 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51173/consoleFull)**
 for PR 11179 at commit 
[`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: SPARK-12729 PhantomReferences to replace Final...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11140#issuecomment-183201556
  
**[Test build #51175 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51175/consoleFull)**
 for PR 11140 at commit 
[`837252a`](https://github.com/apache/spark/commit/837252a74ec87e8f1ac07e80406bf0410c9088d7).



[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...

2016-02-11 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/10838#issuecomment-183200705
  
@redsanket what's your JIRA account name? I want to assign it to you.



[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...

2016-02-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10838



[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...

2016-02-11 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/10838#issuecomment-183200242
  
Merging to master. Thanks, @redsanket



[GitHub] spark pull request: SPARK-12729 PhantomReferences to replace Final...

2016-02-11 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/11140#issuecomment-183198830
  
retest this please



[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

2016-02-11 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8#issuecomment-183198008
  
Just saw this got merged. I'm probably missing some context, but can 
somebody explain to me why something so conceptually simple leads to such a big 
patch?



[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...

2016-02-11 Thread ygcao

Github user ygcao commented on the pull request:

https://github.com/apache/spark/pull/10152#issuecomment-183197942
  
Addressed the new comments. I still kept the if statement, as I explained with 
sample code.
Reran the tests and the lint check. Jenkins should still be happy :fireworks: 



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183197817
  
**[Test build #51173 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51173/consoleFull)**
 for PR 11179 at commit 
[`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7).



[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...

2016-02-11 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11153#issuecomment-183197064
  
LGTM



[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11100#issuecomment-183195272
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51169/
Test PASSed.



[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183195289
  
**[Test build #51174 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51174/consoleFull)**
 for PR 9893 at commit 
[`fe79873`](https://github.com/apache/spark/commit/fe79873ef416f3fd4ca29b6970cc2991fb43d017).



[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11100#issuecomment-183195269
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11100#issuecomment-183194619
  
**[Test build #51169 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51169/consoleFull)**
 for PR 11100 at commit 
[`79c11de`](https://github.com/apache/spark/commit/79c11de8954e137e134d3a8645b6936cd625f38e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13196] [MLlib] Optimize the iterator in...

2016-02-11 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/11078#issuecomment-183194684
  
@hhbyyh Did you test it? `Iterator` is lazy. I think the new version would 
consume more memory because `modified` would store all the values.
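
The laziness point can be illustrated with a minimal plain-Scala sketch (no Spark needed). It is not the code from the pull request: `lazyDouble`, `eagerDouble`, and `pulledAfterFirst` are hypothetical names, and the buffer is called `modified` only to echo the comment above. The sketch counts how many source elements are pulled before the first result becomes available.

```scala
import scala.collection.mutable.ArrayBuffer

object LazyIteratorSketch {
  // Lazy variant: wraps the source iterator; each element is transformed
  // only when the consumer asks for it.
  def lazyDouble(it: Iterator[Int]): Iterator[Int] = it.map(_ * 2)

  // Eager variant (hypothetical): copies every transformed value into a
  // buffer first, so the whole input is held in memory at once.
  def eagerDouble(it: Iterator[Int]): Iterator[Int] = {
    val modified = ArrayBuffer.empty[Int]
    it.foreach(x => modified += x * 2)
    modified.iterator
  }

  // Returns how many source elements were pulled in order to produce
  // the first output element of `transform`.
  def pulledAfterFirst(n: Int)(transform: Iterator[Int] => Iterator[Int]): Int = {
    var pulled = 0
    val source = Iterator.range(0, n).map { i => pulled += 1; i }
    transform(source).next()
    pulled
  }

  def main(args: Array[String]): Unit = {
    println(s"lazy:  ${pulledAfterFirst(1000)(lazyDouble)} element(s) pulled")
    println(s"eager: ${pulledAfterFirst(1000)(eagerDouble)} element(s) pulled")
  }
}
```

The lazy variant touches one element at a time, while the eager variant has to drain the source into the buffer before returning anything, which is where the extra memory would go.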



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183194055
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...

2016-02-11 Thread ygcao

Github user ygcao commented on a diff in the pull request:

https://github.com/apache/spark/pull/10152#discussion_r52708705
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -289,24 +301,20 @@ class Word2Vec extends Serializable with Logging {
 val expTable = sc.broadcast(createExpTable())
 val bcVocab = sc.broadcast(vocab)
 val bcVocabHash = sc.broadcast(vocabHash)
-
-val sentences: RDD[Array[Int]] = words.mapPartitions { iter =>
-  new Iterator[Array[Int]] {
-def hasNext: Boolean = iter.hasNext
-
-def next(): Array[Int] = {
-  val sentence = ArrayBuilder.make[Int]
-  var sentenceLength = 0
-  while (iter.hasNext && sentenceLength < MAX_SENTENCE_LENGTH) {
-val word = bcVocabHash.value.get(iter.next())
-word match {
-  case Some(w) =>
-sentence += w
-sentenceLength += 1
-  case None =>
-}
+// each partition is a collection of sentences,
+// will be translated into arrays of Index integer
+val sentences: RDD[Array[Int]] = dataset.mapPartitions { sentenceIter =>
+  // Each sentence will map to 0 or more Array[Int]
+  sentenceIter.flatMap { sentence => {
+  // Sentence of words, some of which map to a word index
+  val wordIndexes = sentence.flatMap(bcVocabHash.value.get)
+  if (wordIndexes.nonEmpty) {
--- End diff --

Sorry, I was still not quite sure about this, so I ran a test. It turns out I am 
right :grinning: 
scala> val sentences=List("test sen 1","","testsen 2")
sentences: List[String] = List(test sen 1, "", testsen 2)

scala> val rdd=sc.parallelize(sentences)
rdd: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at <console>:23

scala> val results=rdd.flatMap(sen=>sen.split(" ").grouped(1))
results: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[1] at flatMap at <console>:25

scala> results.collect
res0: Array[Array[String]] = Array(Array(test), Array(sen), Array(1), **Array("")**, Array(testsen), Array(2))

If we don't have the if statement, we'll end up with empty entries, which could 
cause trouble for the following steps. I'd like to be on the safe side, and the if 
statement is cheap enough.
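
The guard discussed above can be sketched in plain Scala collections, without Spark. This is an illustration under assumptions, not the code from the pull request: `vocabHash` stands in for `bcVocabHash.value`, and `maxSentenceLength`, `toIndexChunks`, and the vocabulary contents are hypothetical.

```scala
object EmptySentenceSketch {
  // Hypothetical vocabulary: word -> index (stands in for bcVocabHash.value).
  val vocabHash: Map[String, Int] = Map("test" -> 0, "sen" -> 1, "testsen" -> 2)

  // Hypothetical chunk size (stands in for the maximum sentence length).
  val maxSentenceLength = 2

  // Maps one sentence to zero or more index arrays. Words missing from the
  // vocabulary are dropped; a sentence with no known words yields nothing.
  def toIndexChunks(sentence: Iterable[String]): Iterator[Array[Int]] = {
    val wordIndexes = sentence.flatMap(vocabHash.get).toArray
    if (wordIndexes.nonEmpty) wordIndexes.grouped(maxSentenceLength)
    else Iterator.empty
  }

  def main(args: Array[String]): Unit = {
    val sentences = List("test sen 1", "", "testsen 2")
    // "".split(" ") returns Array(""), so the empty sentence still
    // produces one (empty-string) token, as in the REPL session above.
    val tokens = sentences.map(_.split(" ").toSeq)
    val chunks = tokens.flatMap(s => toIndexChunks(s))
    chunks.foreach(c => println(c.mkString("[", ", ", "]")))
  }
}
```

With this hypothetical vocabulary, the empty-string token is not found and the empty sentence contributes no arrays. Note that in plain Scala collections `grouped` on an empty array already returns an empty iterator, so in this sketch the `nonEmpty` check mainly makes the intent explicit.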



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183194060
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51170/
Test PASSed.



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183193865
  
**[Test build #51170 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51170/consoleFull)**
 for PR 11179 at commit 
[`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183191566
  
@yanboliang Could you take a look?



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183191594
  
ok to test



[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183189659
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51171/
Test FAILed.



[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183189658
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9893#issuecomment-183189627
  
**[Test build #51172 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51172/consoleFull)**
 for PR 9893 at commit 
[`e61ec6a`](https://github.com/apache/spark/commit/e61ec6a4a3b603d34c6f7de697d61ee559786337).



[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...

2016-02-11 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/11153#discussion_r52707357
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -572,98 +572,64 @@ class Analyzer(
   // Skip sort with aggregate. This will be handled in ResolveAggregateFunctions
   case sa @ Sort(_, _, child: Aggregate) => sa
 
-  case s @ Sort(_, _, child) if !s.resolved && child.resolved =>
-val (newOrdering, missingResolvableAttrs) = collectResolvableMissingAttrs(s.order, child)
-
-if (missingResolvableAttrs.isEmpty) {
-  val unresolvableAttrs = s.order.filterNot(_.resolved)
-  logDebug(s"Failed to find $unresolvableAttrs in ${child.output.mkString(", ")}")
-  s // Nothing we can do here. Return original plan.
-} else {
-  // Add the missing attributes into projectList of Project/Window or
-  //   aggregateExpressions of Aggregate, if they are in the inputSet
-  //   but not in the outputSet of the plan.
-  val newChild = child transformUp {
-case p: Project =>
-  p.copy(projectList = p.projectList ++
-missingResolvableAttrs.filter((p.inputSet -- p.outputSet).contains))
-case w: Window =>
-  w.copy(projectList = w.projectList ++
-missingResolvableAttrs.filter((w.inputSet -- w.outputSet).contains))
-case a: Aggregate =>
-  val resolvableAttrs = missingResolvableAttrs.filter(a.groupingExpressions.contains)
-  val notResolvedAttrs = resolvableAttrs.filterNot(a.aggregateExpressions.contains)
-  val newAggregateExpressions = a.aggregateExpressions ++ notResolvedAttrs
-  a.copy(aggregateExpressions = newAggregateExpressions)
-case o => o
-  }
-
+  case s @ Sort(order, _, child) if !s.resolved && child.resolved =>
+val newOrder = order.map(resolveExpressionRecursively(_, child).asInstanceOf[SortOrder])
+val requiredAttrs = AttributeSet(newOrder).filter(_.resolved)
+val missingAttrs = requiredAttrs -- child.outputSet
+if (missingAttrs.nonEmpty) {
   // Add missing attributes and then project them away after the sort.
   Project(child.output,
-Sort(newOrdering, s.global, newChild))
+Sort(newOrder, s.global, addMissingAttr(child, missingAttrs)))
+} else if (newOrder != order) {
+  s.copy(order = newOrder)
+} else {
+  s
 }
 }
 
 /**
- * Traverse the tree until resolving the sorting attributes
- * Return all the resolvable missing sorting attributes
- */
-@tailrec
-private def collectResolvableMissingAttrs(
-ordering: Seq[SortOrder],
-plan: LogicalPlan): (Seq[SortOrder], Seq[Attribute]) = {
+  * Add the missing attributes into projectList of Project/Window or aggregateExpressions of
+  * Aggregate.
+  */
+private def addMissingAttr(plan: LogicalPlan, missingAttrs: AttributeSet): LogicalPlan = {
+  if (missingAttrs.isEmpty) {
+return plan
+  }
   plan match {
-// Only Windows and Project have projectList-like attribute.
-case un: UnaryNode if un.isInstanceOf[Project] || un.isInstanceOf[Window] =>
-  val (newOrdering, missingAttrs) = resolveAndFindMissing(ordering, un, un.child)
-  // If missingAttrs is non empty, that means we got it and return it;
-  // Otherwise, continue to traverse the tree.
-  if (missingAttrs.nonEmpty) {
-(newOrdering, missingAttrs)
-  } else {
-collectResolvableMissingAttrs(ordering, un.child)
-  }
+case p: Project =>
+  val missing = missingAttrs -- p.child.outputSet
+  Project(p.projectList ++ missingAttrs, addMissingAttr(p.child, missing))
+case w: Window =>
+  val missing = missingAttrs -- w.child.outputSet
+  w.copy(projectList = w.projectList ++ missingAttrs,
+child = addMissingAttr(w.child, missing))
 case a: Aggregate =>
-  val (newOrdering, missingAttrs) = resolveAndFindMissing(ordering, a, a.child)
-  // For Aggregate, all the order by columns must be specified in group by clauses
-  if (missingAttrs.nonEmpty &&
-  missingAttrs.forall(ar => a.groupingExpressions.exists(_.semanticEquals(ar)))) {
-(newOrdering, missingAttrs)
-  } else {
-// If missingAttrs is empty, we are unable to res

[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...

2016-02-11 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/11153#discussion_r52707329
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -572,98 +572,64 @@ class Analyzer(
   // Skip sort with aggregate. This will be handled in ResolveAggregateFunctions
   case sa @ Sort(_, _, child: Aggregate) => sa
 
-  case s @ Sort(_, _, child) if !s.resolved && child.resolved =>
-val (newOrdering, missingResolvableAttrs) = collectResolvableMissingAttrs(s.order, child)
-
-if (missingResolvableAttrs.isEmpty) {
-  val unresolvableAttrs = s.order.filterNot(_.resolved)
-  logDebug(s"Failed to find $unresolvableAttrs in ${child.output.mkString(", ")}")
-  s // Nothing we can do here. Return original plan.
-} else {
-  // Add the missing attributes into projectList of Project/Window or
-  //   aggregateExpressions of Aggregate, if they are in the inputSet
-  //   but not in the outputSet of the plan.
-  val newChild = child transformUp {
-case p: Project =>
-  p.copy(projectList = p.projectList ++
-missingResolvableAttrs.filter((p.inputSet -- p.outputSet).contains))
-case w: Window =>
-  w.copy(projectList = w.projectList ++
-missingResolvableAttrs.filter((w.inputSet -- w.outputSet).contains))
-case a: Aggregate =>
-  val resolvableAttrs = missingResolvableAttrs.filter(a.groupingExpressions.contains)
-  val notResolvedAttrs = resolvableAttrs.filterNot(a.aggregateExpressions.contains)
-  val newAggregateExpressions = a.aggregateExpressions ++ notResolvedAttrs
-  a.copy(aggregateExpressions = newAggregateExpressions)
-case o => o
-  }
-
+  case s @ Sort(order, _, child) if !s.resolved && child.resolved =>
+val newOrder = order.map(resolveExpressionRecursively(_, child).asInstanceOf[SortOrder])
+val requiredAttrs = AttributeSet(newOrder).filter(_.resolved)
+val missingAttrs = requiredAttrs -- child.outputSet
+if (missingAttrs.nonEmpty) {
   // Add missing attributes and then project them away after the sort.
   Project(child.output,
-Sort(newOrdering, s.global, newChild))
+Sort(newOrder, s.global, addMissingAttr(child, missingAttrs)))
+} else if (newOrder != order) {
+  s.copy(order = newOrder)
+} else {
+  s
 }
 }
 
 /**
- * Traverse the tree until resolving the sorting attributes
- * Return all the resolvable missing sorting attributes
- */
-@tailrec
-private def collectResolvableMissingAttrs(
-ordering: Seq[SortOrder],
-plan: LogicalPlan): (Seq[SortOrder], Seq[Attribute]) = {
+  * Add the missing attributes into projectList of Project/Window or aggregateExpressions of
+  * Aggregate.
+  */
+private def addMissingAttr(plan: LogicalPlan, missingAttrs: AttributeSet): LogicalPlan = {
--- End diff --

It makes sense to me. Thanks!



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-183185609
  
**[Test build #51170 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51170/consoleFull)**
 for PR 11179 at commit 
[`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7).



[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-11 Thread NarineK

GitHub user NarineK opened a pull request:

https://github.com/apache/spark/pull/11179

[SPARK-13295] [ ML, MLlib ] AFTSurvivalRegression.AFTAggregator 
improvements - Avoids creating new instances of arrays/vectors for each record

As the TODO in AFTAggregator.AFTAggregator.add(data: AFTPoint) already notes, 
a new array is created for the intercept value and concatenated with another 
array that contains the betas. The resulting array is converted into a dense 
vector, which in turn is converted into a Breeze vector.
This happens for each record, which is expensive and not necessarily beautiful.

I've tried to solve this problem with a simple algebraic decomposition: 
keeping the intercept separate and treating it independently.
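
To illustrate the idea, here is a minimal, self-contained sketch in plain Scala. It is not the code from this pull request: the object `InterceptDecomposition` and the methods `marginConcat` and `marginSplit` are hypothetical names, and plain `Array[Double]` values stand in for the MLlib and Breeze vectors. It only shows that computing `beta . x + intercept` directly gives the same value as a dot product over the concatenated arrays, without allocating anything per record.

```scala
object InterceptDecomposition {

  /** Allocating variant: prepend the intercept to the betas and a constant
    * 1.0 to the features, then take one dot product. Two new arrays are
    * created on every call, i.e. for every record. */
  def marginConcat(intercept: Double, beta: Array[Double], x: Array[Double]): Double = {
    require(beta.length == x.length, "beta and x must have the same length")
    val params = Array(intercept) ++ beta // new array per record
    val augmented = Array(1.0) ++ x       // new array per record
    var sum = 0.0
    var i = 0
    while (i < params.length) {
      sum += params(i) * augmented(i)
      i += 1
    }
    sum
  }

  /** Allocation-free variant: keep the intercept separate and add it to the
    * dot product of the betas and the features. */
  def marginSplit(intercept: Double, beta: Array[Double], x: Array[Double]): Double = {
    require(beta.length == x.length, "beta and x must have the same length")
    var sum = intercept
    var i = 0
    while (i < beta.length) {
      sum += beta(i) * x(i)
      i += 1
    }
    sum
  }
}
```

Both methods return the same margin for the same inputs; only the second avoids the per-record arrays. The same split would presumably carry over to the gradient, where the intercept component can be accumulated in its own scalar, but that part is an assumption and is not shown here.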

Please let me know what you think and if you have any questions.

Thanks,
Narine


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NarineK/spark survivaloptim

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11179.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11179


commit 8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7
Author: Narine Kokhlikyan 
Date:   2016-02-12T02:42:08Z

Initial commit - AFTSurvivalRegression improvements





[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183181035
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183181025
  
**[Test build #51168 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51168/consoleFull)**
 for PR 11178 at commit 
[`5528c48`](https://github.com/apache/spark/commit/5528c48a7524952d3cc1f2d2a2bd303696c07f59).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183181036
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51168/
Test FAILed.



[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11100#issuecomment-183180286
  
**[Test build #51169 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51169/consoleFull)**
 for PR 11100 at commit 
[`79c11de`](https://github.com/apache/spark/commit/79c11de8954e137e134d3a8645b6936cd625f38e).



[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183179580
  
**[Test build #51168 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51168/consoleFull)**
 for PR 11178 at commit 
[`5528c48`](https://github.com/apache/spark/commit/5528c48a7524952d3cc1f2d2a2bd303696c07f59).



[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...

2016-02-11 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11100#issuecomment-183178340
  
Thank you! @davies @aray 

Yeah, my first fix is very similar to what you proposed above. Will 
remember what you said regarding `GROUPING__ID`. After the release of 2.0, I 
will try to deprecate it and issue an error message. 

BTW, I just tried the code changes and they work well in my local environment. 
I've updated the code. Thanks!



[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10894#issuecomment-183172788
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51167/
Test PASSed.



[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10894#issuecomment-183172787
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10894#issuecomment-183172697
  
**[Test build #51167 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51167/consoleFull)**
 for PR 10894 at commit 
[`7151a73`](https://github.com/apache/spark/commit/7151a737f36eacf9f367068e025c63f281c7d8c5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10705#issuecomment-183170957
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10705#issuecomment-183170958
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51146/
Test FAILed.



[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10705#issuecomment-183170927
  
**[Test build #51146 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51146/consoleFull)**
 for PR 10705 at commit 
[`ef7d885`](https://github.com/apache/spark/commit/ef7d88508af04b81d6671fd7ccf55111ca3e7856).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2016-02-11 Thread squito

Github user squito commented on the pull request:

https://github.com/apache/spark/pull/6935#issuecomment-183169031
  
ps can you close this one now?



[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

2016-02-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8



[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

2016-02-11 Thread squito

Github user squito commented on the pull request:

https://github.com/apache/spark/pull/8#issuecomment-183168840
  
merged to master, thanks @steveloughran!



[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11125#issuecomment-183159966
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51166/
Test PASSed.



[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11125#issuecomment-183159907
  
**[Test build #51166 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51166/consoleFull)**
 for PR 11125 at commit 
[`7e3ea32`](https://github.com/apache/spark/commit/7e3ea32fdd51f2e5a631602b23576b6330d9f112).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11125#issuecomment-183159964
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...

2016-02-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the pull request:

https://github.com/apache/spark/pull/11169#issuecomment-183158218
  
@rxin Can we maybe merge this for now and then take the optimisation into 
account in another PR?

This optimisation would also apply to all pruned scans, so I think I 
should deal with it in another PR.



[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11125#issuecomment-183158035
  
**[Test build #51166 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51166/consoleFull)**
 for PR 11125 at commit 
[`7e3ea32`](https://github.com/apache/spark/commit/7e3ea32fdd51f2e5a631602b23576b6330d9f112).



[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10894#issuecomment-183154802
  
**[Test build #51167 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51167/consoleFull)**
 for PR 10894 at commit 
[`7151a73`](https://github.com/apache/spark/commit/7151a737f36eacf9f367068e025c63f281c7d8c5).



[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...

2016-02-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11043



[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...

2016-02-11 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/11125#issuecomment-183153830
  
ok to test



[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...

2016-02-11 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/11043#issuecomment-183153739
  
Merged into master and branch-1.6. Thanks!



[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...

2016-02-11 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/10697#issuecomment-183153557
  
Merged into master. Thanks! @Earthson I didn't merge it into branch-1.6 
because `checkColumnTypes` is not available on branch-1.6. I don't think this 
is a critical bug for backporting. But if you have time to prepare a PR for 
branch-1.6, I'm happy to merge it.



[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand

2016-02-11 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11177#discussion_r52700130
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala ---
@@ -71,9 +80,76 @@ case class Expand(
 idx = 0
   }
 
+  numOutputRows += 1
   result
 }
   }
 }
   }
+
+  override def upstream(): RDD[InternalRow] = {
+    child.asInstanceOf[CodegenSupport].upstream()
+  }
+
+  protected override def doProduce(ctx: CodegenContext): String = {
+    child.asInstanceOf[CodegenSupport].produce(ctx, this)
+  }
+
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = {
+    val uniqExprs: IndexedSeq[Set[Expression]] = output.indices.map { i =>
+      projections.map(p => p(i)).toSet
+    }
+
+    ctx.currentVars = input
+    val resultVars = uniqExprs.zipWithIndex.map { case (exprs, i) =>
+      val expr = exprs.head
+      if (exprs.size == 1) {
+        // it's common to have same expression for some columns in all the projections, for example,
+        // GroupingSet will copy all the output from child as the first part of output.
+        // We should only generate the columns once.
+        BindReferences.bindReference(expr, child.output).gen(ctx)
+      } else {
+        val isNull = ctx.freshName("isNull")
+        val value = ctx.freshName("value")
+        val code =
+          s"""
+             |boolean $isNull = true;
+             |${ctx.javaType(expr.dataType)} $value = ${ctx.defaultValue(expr.dataType)};
+           """.stripMargin
+        ExprCode(code, isNull, value)
+      }
+    }
+
+    // In order to prevent code exploration, we can't call `consume()` many times, so we call
+    // that in a loop, and use swith/case to select the projections.
+    val projectCodes = projections.zipWithIndex.map { case (exprs, i) =>
--- End diff --

I find the body of this loop pretty hard to understand. Can we add a
high-level comment to explain what's going on?
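
The setup that precedes the loop in the diff can be illustrated in isolation. This is a minimal sketch, assuming plain strings stand in for Catalyst expressions; the object name `ExpandColumnsSketch` and the sample projections are hypothetical and not part of the PR. For each output column it collects the distinct expressions across all projections, so a column with a single distinct expression only needs to be generated once.

```scala
object ExpandColumnsSketch {
  // Each inner Seq is one projection (one output row template).
  // Strings are hypothetical stand-ins for Catalyst expressions.
  val projections: Seq[Seq[String]] = Seq(
    Seq("a", "b", "null", "0"),
    Seq("a", "null", "c", "1")
  )

  // For every output column, the set of distinct expressions across projections.
  val uniqExprs: IndexedSeq[Set[String]] =
    projections.head.indices.map(i => projections.map(p => p(i)).toSet)

  // Columns whose expression is identical in every projection.
  val sharedColumns: Seq[Int] =
    uniqExprs.zipWithIndex.collect { case (exprs, i) if exprs.size == 1 => i }
}
```

In this sample only column 0 is shared; the remaining columns differ per projection and would need a per-projection assignment.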




[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10705#issuecomment-183153291
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51132/
Test FAILed.



[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10705#issuecomment-183153257
  
**[Test build #51132 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51132/consoleFull)**
 for PR 10705 at commit 
[`a0c5bb3`](https://github.com/apache/spark/commit/a0c5bb336c0dc06ec9ffdf2ff12cb4f7aae3bc1d).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10705#issuecomment-183153290
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand

2016-02-11 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11177#discussion_r52699940
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala ---
@@ -71,9 +80,76 @@ case class Expand(
 idx = 0
   }
 
+  numOutputRows += 1
   result
 }
   }
 }
   }
+
+  override def upstream(): RDD[InternalRow] = {
+    child.asInstanceOf[CodegenSupport].upstream()
+  }
+
+  protected override def doProduce(ctx: CodegenContext): String = {
+    child.asInstanceOf[CodegenSupport].produce(ctx, this)
+  }
+
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = {
+    val uniqExprs: IndexedSeq[Set[Expression]] = output.indices.map { i =>
+      projections.map(p => p(i)).toSet
+    }
+
+    ctx.currentVars = input
+    val resultVars = uniqExprs.zipWithIndex.map { case (exprs, i) =>
+      val expr = exprs.head
+      if (exprs.size == 1) {
+        // it's common to have same expression for some columns in all the projections, for example,
+        // GroupingSet will copy all the output from child as the first part of output.
+        // We should only generate the columns once.
+        BindReferences.bindReference(expr, child.output).gen(ctx)
+      } else {
+        val isNull = ctx.freshName("isNull")
+        val value = ctx.freshName("value")
+        val code =
+          s"""
+             |boolean $isNull = true;
+             |${ctx.javaType(expr.dataType)} $value = ${ctx.defaultValue(expr.dataType)};
+           """.stripMargin
+        ExprCode(code, isNull, value)
+      }
+    }
+
+    // In order to prevent code exploration, we can't call `consume()` many times, so we call
--- End diff --

What do you mean by "code exploration"?



[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...

2016-02-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10697



[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...

2016-02-11 Thread ueshin

Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10894#discussion_r52699952
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala
 ---
@@ -138,3 +138,32 @@ object GenerateOrdering extends 
CodeGenerator[Seq[SortOrder], Ordering[InternalR
 
CodeGenerator.compile(code).generate(ctx.references.toArray).asInstanceOf[BaseOrdering]
   }
 }
+
+/**
+ * A lazily generate row ordering comparator.
+ */
+class LazilyGenerateOrdering(val ordering: Seq[SortOrder]) extends Ordering[InternalRow] {
+
+  def this(ordering: Seq[SortOrder], inputSchema: Seq[Attribute]) =
+    this(ordering.map(BindReferences.bindReference(_, inputSchema)))
+
+  @transient
+  lazy val generatedOrdering = GenerateOrdering.generate(ordering)
--- End diff --

Ah, yes, it might cause a performance penalty.
I'll try to rewrite as you mentioned.
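
For context, the `@transient lazy val` in the diff above is a common way to keep a derived, possibly non-serializable value out of the serialized form and rebuild it on first use after deserialization. The following is only a minimal sketch of that general pattern, assuming Java serialization and using hypothetical names (`Generated`, `LazyOrdering`, `roundTrip`); it does not reproduce the rewrite discussed in this review, which is not shown in the thread.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object LazyOrderingSketch {
  // Hypothetical stand-in for an expensive, non-serializable generated comparator.
  final class Generated(descending: Boolean) {
    def compare(a: Int, b: Int): Int =
      if (descending) Integer.compare(b, a) else Integer.compare(a, b)
  }

  // scala.math.Ordering is already Serializable.
  class LazyOrdering(val descending: Boolean) extends Ordering[Int] {
    // Skipped during serialization; rebuilt on first use after deserialization.
    @transient private lazy val generated: Generated = new Generated(descending)

    override def compare(a: Int, b: Int): Int = generated.compare(a, b)
  }

  // Java-serialization round trip, standing in for shipping the object to another JVM.
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

Only `descending` travels with the object; the copy recreates its `Generated` instance the first time `compare` is called.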



[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11177#issuecomment-183152706
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51156/
Test PASSed.



[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11177#issuecomment-183152703
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11177#issuecomment-183152562
  
**[Test build #51156 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51156/consoleFull)**
 for PR 11177 at commit 
[`22ceda9`](https://github.com/apache/spark/commit/22ceda9a82c050abbe0d885513a713e9c2dceb29).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13277][BUILD] Follow-up ANTLR warnings ...

2016-02-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11174



[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11043#issuecomment-183151944
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11043#issuecomment-183151946
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51161/
Test PASSed.



[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11043#issuecomment-183151833
  
**[Test build #51161 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51161/consoleFull)**
 for PR 11043 at commit 
[`06e06f7`](https://github.com/apache/spark/commit/06e06f701886916f4710079962e6deae081dc872).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13277][BUILD] Follow-up ANTLR warnings ...

2016-02-11 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11174#issuecomment-183151703
  
Thanks - going to merge this in master.




[GitHub] spark pull request: [SPARK-13277][BUILD] Follow-up ANTLR warnings ...

2016-02-11 Thread viirya

Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/11174#issuecomment-183151336
  
LGTM. It looks like Jenkins generated the proposed warning and caught it
correctly.



[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10697#issuecomment-183151260
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10697#issuecomment-183151263
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51163/
Test PASSed.



[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10697#issuecomment-183151137
  
**[Test build #51163 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51163/consoleFull)**
 for PR 10697 at commit 
[`9cd7ced`](https://github.com/apache/spark/commit/9cd7ced823eeaaf27e793959e8e8e8ad34ee1443).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand

2016-02-11 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11177#issuecomment-183150208
  
As always, can you paste the generated code? :)




[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...

2016-02-11 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11178#issuecomment-183150046
  
LGTM provided tests pass.




[GitHub] spark pull request: [SPARK-13154][PYTHON] Add linting for pydocs

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11109#issuecomment-183149783
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51148/
Test PASSed.



[GitHub] spark pull request: [SPARK-13154][PYTHON] Add linting for pydocs

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11109#issuecomment-183149781
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13154][PYTHON] Add linting for pydocs

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11109#issuecomment-183149629
  
**[Test build #51148 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51148/consoleFull)**
 for PR 11109 at commit 
[`c086135`](https://github.com/apache/spark/commit/c086135cdaf7b80ca7abba54986d8347c52b1ac9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13017][Docs] Replace example code in ml...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11142#issuecomment-183149434
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51164/
Test PASSed.



[GitHub] spark pull request: [SPARK-13017][Docs] Replace example code in ml...

2016-02-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11142#issuecomment-183149432
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13017][Docs] Replace example code in ml...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11142#issuecomment-183149328
  
**[Test build #51164 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51164/consoleFull)**
 for PR 11142 at commit 
[`6c3122a`](https://github.com/apache/spark/commit/6c3122a91bc637446ff8ba8cfce53e12a1718e58).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...

2016-02-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11170



[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...

2016-02-11 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11170#issuecomment-183147622
  
I'm going to merge this.




[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...

2016-02-11 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11170#issuecomment-183147129
  
LGTM



[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...

2016-02-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the pull request:

https://github.com/apache/spark/pull/11169#issuecomment-183146848
  
This code,
[CSVRelation.scala#L193-L199](https://github.com/HyukjinKwon/spark/blob/SPARK-13260/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala#L193-L199),
makes sure everything is parsed in drop-malformed mode, but not in other
modes.



[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...

2016-02-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11170#issuecomment-183146671
  
**[Test build #2537 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2537/consoleFull)**
 for PR 11170 at commit 
[`ec716ea`](https://github.com/apache/spark/commit/ec716ea3be977a18d63713d31de738fb80a135cc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...

2016-02-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the pull request:

https://github.com/apache/spark/pull/11169#issuecomment-183146606
  
I should have described this in more detail. This works identically to the
original CSV datasource.

When the parsing mode is drop-malformed, it tries to parse everything; in
other modes, it does not.

A similar issue was reported at
https://github.com/databricks/spark-csv/issues/218 and fixed in
https://github.com/databricks/spark-csv/pull/220.
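
The mode-dependent behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the code in `CSVRelation.scala`: the object `CsvCountSketch`, the `ParseMode` type, and the comma-splitting check are all hypothetical. The idea is that a row count in drop-malformed mode has to tokenize every line to decide whether to drop it, while other modes can count lines without parsing them.

```scala
object CsvCountSketch {
  // Hypothetical parse modes for illustration only.
  sealed trait ParseMode
  case object Permissive extends ParseMode
  case object DropMalformed extends ParseMode

  // Counts rows when no columns are requested, as in count(*).
  def countRows(lines: Seq[String], expectedFields: Int, mode: ParseMode): Long =
    mode match {
      case DropMalformed =>
        // Every line must be tokenized to know whether it is malformed.
        lines.count(_.split(",", -1).length == expectedFields).toLong
      case Permissive =>
        // No column is needed, so parsing can be skipped entirely.
        lines.size.toLong
    }
}
```

For example, with the lines `1,2,3`, `4,5`, and `6,7,8` and three expected fields, the drop-malformed count excludes the second line while the permissive count keeps all three.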


