[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183219842 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183219843 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51174/ Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183219699 **[Test build #51174 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51174/consoleFull)** for PR 9893 at commit [`fe79873`](https://github.com/apache/spark/commit/fe79873ef416f3fd4ca29b6970cc2991fb43d017). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Documentation] Added pygments.rb dependancy
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11180#issuecomment-183219234 Can one of the admins verify this patch?
[GitHub] spark pull request: Added pygments.rb dependancy
GitHub user amitdev opened a pull request: https://github.com/apache/spark/pull/11180
Added pygments.rb dependancy
It looks like the pygments.rb gem is also required for the jekyll build to work. At least on Ubuntu/RHEL, I could not run the build without this dependency, so I added it to the steps.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/amitdev/spark master
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11180.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11180
commit f705e9bbe7f1e6a6393062c07e239b23ebf53ac8 Author: Amit Dev Date: 2016-02-12T07:43:13Z Added pygments.rb dependancy
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183216520 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183216521 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51172/ Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183216221 **[Test build #51172 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51172/consoleFull)** for PR 9893 at commit [`e61ec6a`](https://github.com/apache/spark/commit/e61ec6a4a3b603d34c6f7de697d61ee559786337). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183211245 **[Test build #51176 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51176/consoleFull)** for PR 11178 at commit [`bef62eb`](https://github.com/apache/spark/commit/bef62ebb8ec5065061ff0ca49a4cb7e0182c47b6). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183211271 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51176/ Test FAILed.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183211270 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183206102 **[Test build #51176 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51176/consoleFull)** for PR 11178 at commit [`bef62eb`](https://github.com/apache/spark/commit/bef62ebb8ec5065061ff0ca49a4cb7e0182c47b6).
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183206172 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183206177 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51173/ Test PASSed.
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183205820 **[Test build #51173 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51173/consoleFull)** for PR 11179 at commit [`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-12729 PhantomReferences to replace Final...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11140#issuecomment-183201556 **[Test build #51175 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51175/consoleFull)** for PR 11140 at commit [`837252a`](https://github.com/apache/spark/commit/837252a74ec87e8f1ac07e80406bf0410c9088d7).
[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10838#issuecomment-183200705 @redsanket what's your JIRA account name? I want to assign it to you.
[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10838
[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10838#issuecomment-183200242 Merging to master. Thanks, @redsanket
[GitHub] spark pull request: SPARK-12729 PhantomReferences to replace Final...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/11140#issuecomment-183198830 retest this please
[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/8#issuecomment-183198008 Just saw this got merged. I'm probably missing some context, but can somebody explain to me why something so conceptually simple leads to such a big patch?
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user ygcao commented on the pull request: https://github.com/apache/spark/pull/10152#issuecomment-183197942 Addressed the new comments. I still kept the if statement, as I explained with sample code. Reran the tests and the lint check. Jenkins should still be happy :fireworks:
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183197817 **[Test build #51173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51173/consoleFull)** for PR 11179 at commit [`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7).
[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11153#issuecomment-183197064 LGTM
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183195272 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51169/ Test PASSed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183195289 **[Test build #51174 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51174/consoleFull)** for PR 9893 at commit [`fe79873`](https://github.com/apache/spark/commit/fe79873ef416f3fd4ca29b6970cc2991fb43d017).
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183195269 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183194619 **[Test build #51169 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51169/consoleFull)** for PR 11100 at commit [`79c11de`](https://github.com/apache/spark/commit/79c11de8954e137e134d3a8645b6936cd625f38e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13196] [MLlib] Optimize the iterator in...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11078#issuecomment-183194684 @hhbyyh Did you test it? `Iterator` is lazy. I think the new version would consume more memory because `modified` would store all the values.
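The laziness point in the comment above can be illustrated with a minimal sketch in plain Scala (no Spark). The patch under review is not shown in this thread, so everything here is hypothetical: the object and method names are invented, and the buffer is called `modified` only to echo the comment. The sketch counts how many source elements are pulled in order to produce the first few outputs.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration; none of these names come from the Spark code base.
object LazyIteratorSketch {
  // Lazy: map on an Iterator wraps the source; an element is only
  // transformed when the consumer asks for it.
  def lazyDoubled(source: Iterator[Int]): Iterator[Int] =
    source.map(_ * 2)

  // Eager: every transformed element is stored in a buffer before the
  // first one is handed to the consumer.
  def bufferedDoubled(source: Iterator[Int]): Iterator[Int] = {
    val modified = ArrayBuffer.empty[Int]
    source.foreach(x => modified += x * 2)
    modified.iterator
  }

  // Returns how many source elements were pulled to produce the first n outputs.
  def elementsConsumed(transform: Iterator[Int] => Iterator[Int], n: Int): Int = {
    var pulled = 0
    val source = Iterator.range(0, 1000).map { x => pulled += 1; x }
    transform(source).take(n).foreach(_ => ())
    pulled
  }
}
```

With the lazy version only the requested elements are pulled from the source, while the buffered version drains all 1000 elements (and holds them in memory) before returning anything.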
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183194055 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user ygcao commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52708705
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -289,24 +301,20 @@ class Word2Vec extends Serializable with Logging {
     val expTable = sc.broadcast(createExpTable())
     val bcVocab = sc.broadcast(vocab)
     val bcVocabHash = sc.broadcast(vocabHash)
-
-    val sentences: RDD[Array[Int]] = words.mapPartitions { iter =>
-      new Iterator[Array[Int]] {
-        def hasNext: Boolean = iter.hasNext
-
-        def next(): Array[Int] = {
-          val sentence = ArrayBuilder.make[Int]
-          var sentenceLength = 0
-          while (iter.hasNext && sentenceLength < MAX_SENTENCE_LENGTH) {
-            val word = bcVocabHash.value.get(iter.next())
-            word match {
-              case Some(w) =>
-                sentence += w
-                sentenceLength += 1
-              case None =>
-            }
+    // each partition is a collection of sentences,
+    // will be translated into arrays of Index integer
+    val sentences: RDD[Array[Int]] = dataset.mapPartitions { sentenceIter =>
+      // Each sentence will map to 0 or more Array[Int]
+      sentenceIter.flatMap { sentence => {
+        // Sentence of words, some of which map to a word index
+        val wordIndexes = sentence.flatMap(bcVocabHash.value.get)
+        if (wordIndexes.nonEmpty) {
--- End diff --
Sorry, I'm still not quite sure about this. I did a test, and it turns out I am right :grinning:
scala> val sentences=List("test sen 1","","testsen 2")
sentences: List[String] = List(test sen 1, "", testsen 2)
scala> val rdd=sc.parallelize(sentences)
rdd: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at :23
scala> val results=rdd.flatMap(sen=>sen.split(" ").grouped(1))
results: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[1] at flatMap at :25
scala> results.collect
res0: Array[Array[String]] = Array(Array(test), Array(sen), Array(1), **Array("")**, Array(testsen), Array(2))
Without the if statement, we would emit empty entries, which could cause trouble for the following steps. I'd like to be on the safe side; the if statement is cheap enough.
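The REPL observation above can be reproduced without a Spark context. The following is a minimal sketch using plain Scala collections; the object and method names are hypothetical and are not part of `Word2Vec.scala`. It shows that splitting an empty string yields one empty token, that skipping empty sentences removes that entry, and, for comparison, that `grouped` on an already empty collection yields no groups at all.

```scala
// Hypothetical illustration; these helpers do not exist in Spark.
object EmptySentenceSketch {
  // Plain-collection version of the REPL test: split each sentence on
  // spaces, then group the words one at a time.
  def splitAndGroup(sentences: List[String]): List[List[String]] =
    sentences.flatMap(sen => sen.split(" ").toList.grouped(1))

  // Same pipeline, but empty sentences are skipped first
  // (loosely analogous to the nonEmpty guard in the diff).
  def splitAndGroupGuarded(sentences: List[String]): List[List[String]] =
    sentences.filter(_.nonEmpty).flatMap(sen => sen.split(" ").toList.grouped(1))

  // Grouping a list of word indexes; an empty list produces no groups.
  def groupIndexes(indexes: List[Int], size: Int): List[List[Int]] =
    indexes.grouped(size).toList
}
```

Note that the empty entry in the REPL test comes from `"".split(" ")` returning a one-element array containing the empty string. Whether the guard is strictly required in the patch therefore depends on what actually reaches `grouped`: a collection of looked-up indexes that is already empty produces no groups by itself, so the guard there acts as a cheap safety net.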
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183194060 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51170/ Test PASSed.
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183193865 **[Test build #51170 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51170/consoleFull)** for PR 11179 at commit [`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183191566 @yanboliang Could you take a look?
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183191594 ok to test
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183189659 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51171/ Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183189658 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183189627 **[Test build #51172 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51172/consoleFull)** for PR 9893 at commit [`e61ec6a`](https://github.com/apache/spark/commit/e61ec6a4a3b603d34c6f7de697d61ee559786337).
[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11153#discussion_r52707357 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -572,98 +572,64 @@ class Analyzer( // Skip sort with aggregate. This will be handled in ResolveAggregateFunctions case sa @ Sort(_, _, child: Aggregate) => sa - case s @ Sort(_, _, child) if !s.resolved && child.resolved => -val (newOrdering, missingResolvableAttrs) = collectResolvableMissingAttrs(s.order, child) - -if (missingResolvableAttrs.isEmpty) { - val unresolvableAttrs = s.order.filterNot(_.resolved) - logDebug(s"Failed to find $unresolvableAttrs in ${child.output.mkString(", ")}") - s // Nothing we can do here. Return original plan. -} else { - // Add the missing attributes into projectList of Project/Window or - // aggregateExpressions of Aggregate, if they are in the inputSet - // but not in the outputSet of the plan. - val newChild = child transformUp { -case p: Project => - p.copy(projectList = p.projectList ++ -missingResolvableAttrs.filter((p.inputSet -- p.outputSet).contains)) -case w: Window => - w.copy(projectList = w.projectList ++ -missingResolvableAttrs.filter((w.inputSet -- w.outputSet).contains)) -case a: Aggregate => - val resolvableAttrs = missingResolvableAttrs.filter(a.groupingExpressions.contains) - val notResolvedAttrs = resolvableAttrs.filterNot(a.aggregateExpressions.contains) - val newAggregateExpressions = a.aggregateExpressions ++ notResolvedAttrs - a.copy(aggregateExpressions = newAggregateExpressions) -case o => o - } - + case s @ Sort(order, _, child) if !s.resolved && child.resolved => +val newOrder = order.map(resolveExpressionRecursively(_, child).asInstanceOf[SortOrder]) +val requiredAttrs = AttributeSet(newOrder).filter(_.resolved) +val missingAttrs = requiredAttrs -- child.outputSet +if (missingAttrs.nonEmpty) { // Add missing attributes and then project them away after the sort. 
Project(child.output, -Sort(newOrdering, s.global, newChild)) +Sort(newOrder, s.global, addMissingAttr(child, missingAttrs))) +} else if (newOrder != order) { + s.copy(order = newOrder) +} else { + s } } /** - * Traverse the tree until resolving the sorting attributes - * Return all the resolvable missing sorting attributes - */ -@tailrec -private def collectResolvableMissingAttrs( -ordering: Seq[SortOrder], -plan: LogicalPlan): (Seq[SortOrder], Seq[Attribute]) = { + * Add the missing attributes into projectList of Project/Window or aggregateExpressions of + * Aggregate. + */ +private def addMissingAttr(plan: LogicalPlan, missingAttrs: AttributeSet): LogicalPlan = { + if (missingAttrs.isEmpty) { +return plan + } plan match { -// Only Windows and Project have projectList-like attribute. -case un: UnaryNode if un.isInstanceOf[Project] || un.isInstanceOf[Window] => - val (newOrdering, missingAttrs) = resolveAndFindMissing(ordering, un, un.child) - // If missingAttrs is non empty, that means we got it and return it; - // Otherwise, continue to traverse the tree. - if (missingAttrs.nonEmpty) { -(newOrdering, missingAttrs) - } else { -collectResolvableMissingAttrs(ordering, un.child) - } +case p: Project => + val missing = missingAttrs -- p.child.outputSet + Project(p.projectList ++ missingAttrs, addMissingAttr(p.child, missing)) +case w: Window => + val missing = missingAttrs -- w.child.outputSet + w.copy(projectList = w.projectList ++ missingAttrs, +child = addMissingAttr(w.child, missing)) case a: Aggregate => - val (newOrdering, missingAttrs) = resolveAndFindMissing(ordering, a, a.child) - // For Aggregate, all the order by columns must be specified in group by clauses - if (missingAttrs.nonEmpty && - missingAttrs.forall(ar => a.groupingExpressions.exists(_.semanticEquals(ar { -(newOrdering, missingAttrs) - } else { -// If missingAttrs is empty, we are unable to res
[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11153#discussion_r52707329 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -572,98 +572,64 @@ class Analyzer( // Skip sort with aggregate. This will be handled in ResolveAggregateFunctions case sa @ Sort(_, _, child: Aggregate) => sa - case s @ Sort(_, _, child) if !s.resolved && child.resolved => -val (newOrdering, missingResolvableAttrs) = collectResolvableMissingAttrs(s.order, child) - -if (missingResolvableAttrs.isEmpty) { - val unresolvableAttrs = s.order.filterNot(_.resolved) - logDebug(s"Failed to find $unresolvableAttrs in ${child.output.mkString(", ")}") - s // Nothing we can do here. Return original plan. -} else { - // Add the missing attributes into projectList of Project/Window or - // aggregateExpressions of Aggregate, if they are in the inputSet - // but not in the outputSet of the plan. - val newChild = child transformUp { -case p: Project => - p.copy(projectList = p.projectList ++ -missingResolvableAttrs.filter((p.inputSet -- p.outputSet).contains)) -case w: Window => - w.copy(projectList = w.projectList ++ -missingResolvableAttrs.filter((w.inputSet -- w.outputSet).contains)) -case a: Aggregate => - val resolvableAttrs = missingResolvableAttrs.filter(a.groupingExpressions.contains) - val notResolvedAttrs = resolvableAttrs.filterNot(a.aggregateExpressions.contains) - val newAggregateExpressions = a.aggregateExpressions ++ notResolvedAttrs - a.copy(aggregateExpressions = newAggregateExpressions) -case o => o - } - + case s @ Sort(order, _, child) if !s.resolved && child.resolved => +val newOrder = order.map(resolveExpressionRecursively(_, child).asInstanceOf[SortOrder]) +val requiredAttrs = AttributeSet(newOrder).filter(_.resolved) +val missingAttrs = requiredAttrs -- child.outputSet +if (missingAttrs.nonEmpty) { // Add missing attributes and then project them away after the sort. 
Project(child.output, -Sort(newOrdering, s.global, newChild)) +Sort(newOrder, s.global, addMissingAttr(child, missingAttrs))) +} else if (newOrder != order) { + s.copy(order = newOrder) +} else { + s } } /** - * Traverse the tree until resolving the sorting attributes - * Return all the resolvable missing sorting attributes - */ -@tailrec -private def collectResolvableMissingAttrs( -ordering: Seq[SortOrder], -plan: LogicalPlan): (Seq[SortOrder], Seq[Attribute]) = { + * Add the missing attributes into projectList of Project/Window or aggregateExpressions of + * Aggregate. + */ +private def addMissingAttr(plan: LogicalPlan, missingAttrs: AttributeSet): LogicalPlan = { --- End diff -- It makes sense to me. Thanks!
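The rewrite quoted in the diff above (add the missing sort attributes below the Sort, then project them away afterwards) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Catalyst code: `Plan`, `Relation`, `Project`, `Sort`, `resolveSort` and `execute` are simplified, hypothetical stand-ins that use column names and `Map` rows, and only `Project` is handled (the real rule also covers Window and Aggregate).

```scala
object SortMissingAttrSketch {
  type Row = Map[String, Int]

  // Hypothetical, heavily simplified logical plan nodes.
  sealed trait Plan { def output: Seq[String] }
  case class Relation(output: Seq[String], rows: Seq[Row]) extends Plan
  case class Project(projectList: Seq[String], child: Plan) extends Plan {
    def output: Seq[String] = projectList
  }
  case class Sort(orderBy: String, child: Plan) extends Plan {
    def output: Seq[String] = child.output
  }

  // Push the missing columns down through Project nodes until a child provides them.
  def addMissingAttr(plan: Plan, missing: Set[String]): Plan =
    if (missing.isEmpty) plan
    else plan match {
      case Project(list, child) =>
        val stillMissing = missing -- child.output
        Project(list ++ missing.toSeq.sorted, addMissingAttr(child, stillMissing))
      case other => other
    }

  // If the sort column is not in the child's output, add it below the Sort
  // and wrap the result in a Project that restores the original output.
  def resolveSort(s: Sort): Plan = {
    val missing = Set(s.orderBy) -- s.child.output
    if (missing.isEmpty) s
    else Project(s.child.output, Sort(s.orderBy, addMissingAttr(s.child, missing)))
  }

  // Tiny interpreter so the effect of the rewrite can be observed.
  def execute(plan: Plan): Seq[Row] = plan match {
    case Relation(_, rows) => rows
    case Project(list, child) => execute(child).map(row => list.map(c => c -> row(c)).toMap)
    case Sort(orderBy, child) => execute(child).sortBy(row => row(orderBy))
  }
}
```

In this toy model, sorting `Project(Seq("a"), table)` by column `b` only becomes executable after `resolveSort`, and the final output still contains column `a` alone.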
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183185609 **[Test build #51170 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51170/consoleFull)** for PR 11179 at commit [`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7).
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
GitHub user NarineK opened a pull request: https://github.com/apache/spark/pull/11179 [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegression.AFTAggregator improvements - Avoids creating new instances of arrays/vectors for each record As noted by the TODO in AFTAggregator.AFTAggregator.add(data: AFTPoint), a new array is created for the intercept value and concatenated with another array that contains the betas. The resulting Array is converted into a Dense vector, which in turn is converted into a breeze vector. This is expensive and not necessarily beautiful. I've tried to solve this problem with a simple algebraic decomposition: keeping and treating the intercept independently. Please let me know what you think and if you have any questions. Thanks, Narine You can merge this pull request into a Git repository by running: $ git pull https://github.com/NarineK/spark survivaloptim Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11179.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11179 commit 8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7 Author: Narine Kokhlikyan Date: 2016-02-12T02:42:08Z Initial commit - AFTSurvivalRegression improvements
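The idea of treating the intercept independently can be sketched roughly as follows. This is a minimal illustration and not the code from the pull request: `marginConcat`, `marginSplit` and the plain `Array[Double]` representation are hypothetical names chosen for the example. It only shows that the linear predictor can be computed without allocating a concatenated array for every record.

```scala
object InterceptSketch {
  // Allocating variant: builds two new arrays per record so that the
  // intercept can be handled as "coefficient 0" of a single dot product.
  def marginConcat(intercept: Double, betas: Array[Double], x: Array[Double]): Double = {
    val coefficients = Array(intercept) ++ betas // new array on every call
    val features = Array(1.0) ++ x               // new array on every call
    var sum = 0.0
    var i = 0
    while (i < coefficients.length) {
      sum += coefficients(i) * features(i)
      i += 1
    }
    sum
  }

  // Allocation-free variant: intercept + betas . x, with the intercept kept separate.
  def marginSplit(intercept: Double, betas: Array[Double], x: Array[Double]): Double = {
    require(betas.length == x.length, "betas and features must have the same length")
    var sum = intercept
    var i = 0
    while (i < betas.length) {
      sum += betas(i) * x(i)
      i += 1
    }
    sum
  }
}
```

Both functions return the same value for the same inputs; the second one simply avoids the per-record array construction.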
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183181035 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183181025 **[Test build #51168 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51168/consoleFull)** for PR 11178 at commit [`5528c48`](https://github.com/apache/spark/commit/5528c48a7524952d3cc1f2d2a2bd303696c07f59). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183181036 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51168/ Test FAILed.
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183180286 **[Test build #51169 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51169/consoleFull)** for PR 11100 at commit [`79c11de`](https://github.com/apache/spark/commit/79c11de8954e137e134d3a8645b6936cd625f38e).
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183179580 **[Test build #51168 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51168/consoleFull)** for PR 11178 at commit [`5528c48`](https://github.com/apache/spark/commit/5528c48a7524952d3cc1f2d2a2bd303696c07f59).
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183178340 Thank you! @davies @aray Yeah, my first fix is very similar to what you proposed above. Will remember what you said regarding `GROUPING__ID`. After the release of 2.0, I will try to deprecate it and issue an error message. BTW, I just tried the code changes and they work well in my local environment. Updated the code. Thanks!
[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10894#issuecomment-183172788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51167/ Test PASSed.
[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10894#issuecomment-183172787 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10894#issuecomment-183172697 **[Test build #51167 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51167/consoleFull)** for PR 10894 at commit [`7151a73`](https://github.com/apache/spark/commit/7151a737f36eacf9f367068e025c63f281c7d8c5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-183170957 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-183170958 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51146/ Test FAILed.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-183170927 **[Test build #51146 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51146/consoleFull)** for PR 10705 at commit [`ef7d885`](https://github.com/apache/spark/commit/ef7d88508af04b81d6671fd7ccf55111ca3e7856). * This patch **fails from timeout after a configured wait of \`250m\`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-183169031 ps can you close this one now?
[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8
[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/8#issuecomment-183168840 merged to master, thanks @steveloughran!
[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11125#issuecomment-183159966 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51166/ Test PASSed.
[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11125#issuecomment-183159907 **[Test build #51166 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51166/consoleFull)** for PR 11125 at commit [`7e3ea32`](https://github.com/apache/spark/commit/7e3ea32fdd51f2e5a631602b23576b6330d9f112). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11125#issuecomment-183159964 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11169#issuecomment-183158218 @rxin Can we maybe merge this for now and then take the optimisation into account in another PR? This optimisation would apply to all the pruned scans as well, and I think I should deal with this in another PR.
[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11125#issuecomment-183158035 **[Test build #51166 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51166/consoleFull)** for PR 11125 at commit [`7e3ea32`](https://github.com/apache/spark/commit/7e3ea32fdd51f2e5a631602b23576b6330d9f112).
[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10894#issuecomment-183154802 **[Test build #51167 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51167/consoleFull)** for PR 10894 at commit [`7151a73`](https://github.com/apache/spark/commit/7151a737f36eacf9f367068e025c63f281c7d8c5).
[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11043
[GitHub] spark pull request: [SPARK-13257] [Improvement] Refine naive Bayes...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11125#issuecomment-183153830 ok to test
[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11043#issuecomment-183153739 Merged into master and branch-1.6. Thanks!
[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/10697#issuecomment-183153557 Merged into master. Thanks! @Earthson I didn't merge it into branch-1.6 because `checkColumnTypes` is not available on branch-1.6. I don't think this is a critical bug for backporting. But if you have time to prepare a PR for branch-1.6, I'm happy to merge it.
[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/11177#discussion_r52700130
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala ---
@@ -71,9 +80,76 @@ case class Expand(
             idx = 0
           }
 
+          numOutputRows += 1
           result
         }
       }
     }
   }
+
+  override def upstream(): RDD[InternalRow] = {
+    child.asInstanceOf[CodegenSupport].upstream()
+  }
+
+  protected override def doProduce(ctx: CodegenContext): String = {
+    child.asInstanceOf[CodegenSupport].produce(ctx, this)
+  }
+
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = {
+    val uniqExprs: IndexedSeq[Set[Expression]] = output.indices.map { i =>
+      projections.map(p => p(i)).toSet
+    }
+
+    ctx.currentVars = input
+    val resultVars = uniqExprs.zipWithIndex.map { case (exprs, i) =>
+      val expr = exprs.head
+      if (exprs.size == 1) {
+        // it's common to have same expression for some columns in all the projections, for example,
+        // GroupingSet will copy all the output from child as the first part of output.
+        // We should only generate the columns once.
+        BindReferences.bindReference(expr, child.output).gen(ctx)
+      } else {
+        val isNull = ctx.freshName("isNull")
+        val value = ctx.freshName("value")
+        val code =
+          s"""
+            |boolean $isNull = true;
+            |${ctx.javaType(expr.dataType)} $value = ${ctx.defaultValue(expr.dataType)};
+          """.stripMargin
+        ExprCode(code, isNull, value)
+      }
+    }
+
+    // In order to prevent code exploration, we can't call `consume()` many times, so we call
+    // that in a loop, and use swith/case to select the projections.
+    val projectCodes = projections.zipWithIndex.map { case (exprs, i) =>
--- End diff --
I find the body of this loop pretty hard to understand. Can we add some high-level comment to explain what's going on?
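The first half of the quoted `doConsume` groups, for every output column, the expressions used by all projections, and treats a column whose set has exactly one element as "generate once". A minimal, self-contained sketch of that grouping step follows. It assumes plain strings as stand-ins for Catalyst `Expression` objects, and the names `UniqueColumnsSketch`, `uniqueExprsPerColumn`, and `sharedColumns` are hypothetical, not part of Spark.

```scala
object UniqueColumnsSketch {
  // Assumption: a String stands in for a Catalyst Expression.
  type Expr = String

  /** For every output column, collect the distinct expressions used across all projections. */
  def uniqueExprsPerColumn(projections: Seq[Seq[Expr]]): IndexedSeq[Set[Expr]] = {
    val numColumns = projections.head.length
    (0 until numColumns).map(i => projections.map(p => p(i)).toSet)
  }

  /** Indices of columns whose expression is identical in every projection. */
  def sharedColumns(projections: Seq[Seq[Expr]]): Seq[Int] =
    uniqueExprsPerColumn(projections).zipWithIndex.collect {
      case (exprs, i) if exprs.size == 1 => i
    }

  def main(args: Array[String]): Unit = {
    // Two projections over four output columns; only column 0 is the same in both.
    val projections = Seq(
      Seq("a", "b", "null", "1"),
      Seq("a", "null", "c", "2")
    )
    println(sharedColumns(projections)) // prints Vector(0)
  }
}
```

In this toy input only column 0 would be evaluated once up front; the remaining columns would need a per-projection assignment, which is what the rest of the quoted method appears to handle.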
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-183153291 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51132/ Test FAILed.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-183153257 **[Test build #51132 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51132/consoleFull)** for PR 10705 at commit [`a0c5bb3`](https://github.com/apache/spark/commit/a0c5bb336c0dc06ec9ffdf2ff12cb4f7aae3bc1d). * This patch **fails from timeout after a configured wait of `250m`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-183153290 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/11177#discussion_r52699940
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala ---
@@ -71,9 +80,76 @@ case class Expand(
             idx = 0
           }
 
+          numOutputRows += 1
           result
         }
       }
     }
   }
+
+  override def upstream(): RDD[InternalRow] = {
+    child.asInstanceOf[CodegenSupport].upstream()
+  }
+
+  protected override def doProduce(ctx: CodegenContext): String = {
+    child.asInstanceOf[CodegenSupport].produce(ctx, this)
+  }
+
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = {
+    val uniqExprs: IndexedSeq[Set[Expression]] = output.indices.map { i =>
+      projections.map(p => p(i)).toSet
+    }
+
+    ctx.currentVars = input
+    val resultVars = uniqExprs.zipWithIndex.map { case (exprs, i) =>
+      val expr = exprs.head
+      if (exprs.size == 1) {
+        // it's common to have same expression for some columns in all the projections, for example,
+        // GroupingSet will copy all the output from child as the first part of output.
+        // We should only generate the columns once.
+        BindReferences.bindReference(expr, child.output).gen(ctx)
+      } else {
+        val isNull = ctx.freshName("isNull")
+        val value = ctx.freshName("value")
+        val code =
+          s"""
+            |boolean $isNull = true;
+            |${ctx.javaType(expr.dataType)} $value = ${ctx.defaultValue(expr.dataType)};
+          """.stripMargin
+        ExprCode(code, isNull, value)
+      }
+    }
+
+    // In order to prevent code exploration, we can't call `consume()` many times, so we call
--- End diff --
What do you mean by "code exploration"?
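The code comment under review says `consume()` is called once inside a loop, with switch/case selecting the projection, instead of once per projection. A rough sketch of that shape is below, on the assumption that the concern is the size of the generated Java source (the comment's "code exploration" may be intended as "code explosion", but the thread does not confirm this). `SwitchCodegenSketch`, `generate`, and the statement strings are hypothetical and do not reproduce Spark's actual generator.

```scala
object SwitchCodegenSketch {
  /**
   * Build Java-like source that loops over the projections and picks one with switch/case,
   * so the downstream code (`consumeCode`) appears only once in the output.
   * `assignments(i)` is the statement that sets the output variables for projection i.
   */
  def generate(assignments: Seq[String], consumeCode: String): String = {
    // One `case` block per projection; each only assigns variables.
    val cases = assignments.zipWithIndex.flatMap { case (body, i) =>
      Seq(s"    case $i:", s"      $body", "      break;")
    }
    val lines =
      Seq(s"for (int i = 0; i < ${assignments.length}; i++) {", "  switch (i) {") ++
        cases ++
        Seq("  }", s"  $consumeCode", "}")
    lines.mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical per-projection assignments and a placeholder for the consumer's code.
    println(generate(Seq("value = a;", "value = b;", "value = c;"), "emit(value);"))
  }
}
```

With this layout the consumer's code is emitted once regardless of how many projections there are, whereas inlining it after every projection would repeat it once per projection.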
[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10697
[GitHub] spark pull request: [SPARK-12976][SQL] Add LazilyGenerateOrdering ...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/10894#discussion_r52699952
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala ---
@@ -138,3 +138,32 @@ object GenerateOrdering extends CodeGenerator[Seq[SortOrder], Ordering[InternalR
     CodeGenerator.compile(code).generate(ctx.references.toArray).asInstanceOf[BaseOrdering]
   }
 }
+
+/**
+ * A lazily generate row ordering comparator.
+ */
+class LazilyGenerateOrdering(val ordering: Seq[SortOrder]) extends Ordering[InternalRow] {
+
+  def this(ordering: Seq[SortOrder], inputSchema: Seq[Attribute]) =
+    this(ordering.map(BindReferences.bindReference(_, inputSchema)))
+
+  @transient
+  lazy val generatedOrdering = GenerateOrdering.generate(ordering)
--- End diff --
Ah, yes, it might cause a performance penalty. I'll try to rewrite as you mentioned.
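For readers unfamiliar with the `@transient lazy val` pattern in the quoted diff, here is a small stand-alone sketch of how it behaves across Java serialization. It assumes a trivial `Ordering[Int]` in place of the generated row comparator; `LazyOrdering`, `builds`, and `roundTrip` are hypothetical names for illustration. The thread does not say which rewrite the reviewer proposed; one plausible cost of the pattern is that every access to a `lazy val` goes through an initialization check.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for a comparator that is expensive to build.
// Ordering already extends Serializable, so instances can be shipped as-is.
class LazyOrdering(val descending: Boolean) extends Ordering[Int] {
  // Not written during serialization; rebuilt on first use in the deserialized copy.
  @transient private lazy val generated: Ordering[Int] = {
    LazyOrdering.builds += 1
    if (descending) Ordering.Int.reverse else Ordering.Int
  }

  // Every call reads the lazy val, which includes an "already initialized?" check.
  override def compare(a: Int, b: Int): Int = generated.compare(a, b)
}

object LazyOrdering {
  // Counts how often the comparator is constructed (for illustration only).
  var builds = 0

  /** Serialize and deserialize a value with plain Java serialization. */
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    in.readObject().asInstanceOf[T]
  }
}
```

Calling `compare` on an instance, passing it through `roundTrip`, and calling `compare` on the copy builds the inner comparator twice: once per object, on first use.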
[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11177#issuecomment-183152706 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51156/ Test PASSed.
[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11177#issuecomment-183152703 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11177#issuecomment-183152562 **[Test build #51156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51156/consoleFull)** for PR 11177 at commit [`22ceda9`](https://github.com/apache/spark/commit/22ceda9a82c050abbe0d885513a713e9c2dceb29). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13277][BUILD] Follow-up ANTLR warnings ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11174
[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11043#issuecomment-183151944 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11043#issuecomment-183151946 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51161/ Test PASSed.
[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11043#issuecomment-183151833 **[Test build #51161 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51161/consoleFull)** for PR 11043 at commit [`06e06f7`](https://github.com/apache/spark/commit/06e06f701886916f4710079962e6deae081dc872). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13277][BUILD] Follow-up ANTLR warnings ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11174#issuecomment-183151703 Thanks - going to merge this in master.
[GitHub] spark pull request: [SPARK-13277][BUILD] Follow-up ANTLR warnings ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11174#issuecomment-183151336 LGTM, looks like Jenkins has generated the proposed warning and caught it correctly.
[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10697#issuecomment-183151260 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10697#issuecomment-183151263 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51163/ Test PASSed.
[GitHub] spark pull request: [SPARK-12746][ML] ArrayType(_, true) should al...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10697#issuecomment-183151137 **[Test build #51163 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51163/consoleFull)** for PR 10697 at commit [`9cd7ced`](https://github.com/apache/spark/commit/9cd7ced823eeaaf27e793959e8e8e8ad34ee1443). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13293] [SQL] generate Expand
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11177#issuecomment-183150208 As always, can you paste the generated code? :)
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183150046 LGTM provided tests pass.
[GitHub] spark pull request: [SPARK-13154][PYTHON] Add linting for pydocs
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11109#issuecomment-183149783 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51148/ Test PASSed.
[GitHub] spark pull request: [SPARK-13154][PYTHON] Add linting for pydocs
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11109#issuecomment-183149781 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13154][PYTHON] Add linting for pydocs
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11109#issuecomment-183149629 **[Test build #51148 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51148/consoleFull)** for PR 11109 at commit [`c086135`](https://github.com/apache/spark/commit/c086135cdaf7b80ca7abba54986d8347c52b1ac9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13017][Docs] Replace example code in ml...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11142#issuecomment-183149434 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51164/ Test PASSed.
[GitHub] spark pull request: [SPARK-13017][Docs] Replace example code in ml...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11142#issuecomment-183149432 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13017][Docs] Replace example code in ml...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11142#issuecomment-183149328 **[Test build #51164 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51164/consoleFull)** for PR 11142 at commit [`6c3122a`](https://github.com/apache/spark/commit/6c3122a91bc637446ff8ba8cfce53e12a1718e58). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11170
[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11170#issuecomment-183147622 I'm going to merge this.
[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11170#issuecomment-183147129 LGTM
[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11169#issuecomment-183146848 This [CSVRelation.scala#L193-L199](https://github.com/HyukjinKwon/spark/blob/SPARK-13260/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala#L193-L199) makes sure everything is parsed in drop-malformed mode, but not in other modes.
[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics of numOutp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11170#issuecomment-183146671 **[Test build #2537 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2537/consoleFull)** for PR 11170 at commit [`ec716ea`](https://github.com/apache/spark/commit/ec716ea3be977a18d63713d31de738fb80a135cc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11169#issuecomment-183146606 I think I should have described this in more detail. This works identically to the original CSV datasource. When the parsing mode is drop-malformed, it tries to parse every row; in other modes it does not. A similar issue was found here https://github.com/databricks/spark-csv/issues/218 and it was fixed here https://github.com/databricks/spark-csv/pull/220.
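The distinction described above can be illustrated with a toy model: when no columns are selected, as with `count(*)`, a row count in drop-malformed mode still has to inspect each line, because malformed lines must be excluded, while other modes can count lines without parsing them. The sketch below is an assumption-laden illustration, not the code at the linked `CSVRelation.scala` lines: `CsvCountSketch`, the two mode objects, and the "wrong number of fields" definition of malformed are all hypothetical.

```scala
object CsvCountSketch {
  // Hypothetical stand-ins for the parse modes discussed in the thread.
  sealed trait ParseMode
  case object Permissive extends ParseMode
  case object DropMalformed extends ParseMode

  /** Toy notion of "well formed": the line has the expected number of comma-separated fields. */
  def isWellFormed(line: String, numFields: Int): Boolean =
    line.split(",", -1).length == numFields

  /** Count rows without selecting any column, as a `count(*)` query would. */
  def countRows(lines: Seq[String], numFields: Int, mode: ParseMode): Long = mode match {
    // Every line must be tokenized so that malformed ones can be dropped.
    case DropMalformed => lines.count(isWellFormed(_, numFields)).toLong
    // No line needs to be tokenized just to be counted.
    case Permissive => lines.length.toLong
  }

  def main(args: Array[String]): Unit = {
    val lines = Seq("1,a", "2", "3,c") // the second line is missing a field
    println(countRows(lines, numFields = 2, mode = DropMalformed)) // prints 2
    println(countRows(lines, numFields = 2, mode = Permissive))    // prints 3
  }
}
```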