[GitHub] spark pull request: [SPARK-4504][Examples] fix run-example failure...

2014-11-20 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3377#discussion_r20689898 --- Diff: bin/run-example --- @@ -35,9 +35,9 @@ else fi if [ -f "$FWDIR/RELEASE" ]; then - export SPARK_EXAMPLES_JAR="`ls "$FWDIR"/lib/spar

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63906253 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63906246 [Test build #23687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23687/consoleFull) for PR 3009 at commit [`2bbf41a`](https://gith

[GitHub] spark pull request: [SPARK-2309][MLlib] Generalize the binary logi...

2014-11-20 Thread avulanov
Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1379#issuecomment-63906173 @dbtsai Thanks for the explanation! Do I understand correctly that if I want to get (num_features+1)*(num_classes) parameters from your model, I need to concatenate a vector

[GitHub] spark pull request: [SPARK-4244] [SQL] Support Hive Generic UDFs w...

2014-11-20 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3109#discussion_r20689585 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -298,6 +298,8 @@ private[hive] trait HiveInspectors {

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63904511 [Test build #23692 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23692/consoleFull) for PR 3375 at commit [`9e7d5b5`](https://gith

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63904518 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-20 Thread jongyoul
Github user jongyoul commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-63904354 When launching from spark-submit with a Mesos master, the first time resourceOffers is called there are no tasks yet, because taskScheduler has not submitted a job.

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-63904285 [Test build #23696 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23696/consoleFull) for PR 3393 at commit [`f20f1b3`](https://githu

[GitHub] spark pull request: [SPARK-2309][MLlib] Generalize the binary logi...

2014-11-20 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/1379#issuecomment-63904113 @avulanov I will merge this in Spark 1.3; sorry for the delay, I have been very busy recently. Yes, the branch you found should work, but it cannot be cleanly merged in up

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-20 Thread jongyoul
GitHub user jongyoul opened a pull request: https://github.com/apache/spark/pull/3393 [SPARK-4525] MesosSchedulerBackend.resourceOffers cannot decline unused offers from acceptedOffers - Added code for declining unused offers among acceptedOffers - Edited testCase

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63903647 I just took another pass and LGTM from a correctness perspective. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark pull request: [SPARK-4358][SQL] Let BigDecimal do checking t...

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3208#discussion_r20688295 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -339,18 +339,15 @@ class SqlParser extends AbstractSparkSQLParser

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20688285 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V

[GitHub] spark pull request: [SPARK-4244] [SQL] Support Hive Generic UDFs w...

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3109#discussion_r20688112 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala --- @@ -162,9 +161,8 @@ private[hive] case class HiveGenericUdf(functionClassName

[GitHub] spark pull request: [SPARK-4431][MLlib] Implement efficient active...

2014-11-20 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/3288#discussion_r20688070 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala --- @@ -173,4 +173,63 @@ class VectorsSuite extends FunSuite { val v =

[GitHub] spark pull request: [SPARK-4244] [SQL] Support Hive Generic UDFs w...

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3109#discussion_r20687942 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -298,6 +298,8 @@ private[hive] trait HiveInspectors { })

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2982#issuecomment-63902253 [Test build #23695 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23695/consoleFull) for PR 2982 at commit [`322f882`](https://githu

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2014-11-20 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/2982#discussion_r20687545 --- Diff: bin/compute-classpath.cmd --- @@ -1,3 +1,4 @@ +<<< HEAD --- End diff -- It's a windows file, nobody uses those. :-) I'll fix it.

[GitHub] spark pull request: [SPARK-4431][MLlib] Implement efficient active...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3288#issuecomment-63901964 [Test build #23694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23694/consoleFull) for PR 3288 at commit [`1907ae1`](https://githu

[GitHub] spark pull request: [SQL] fix function description mistake

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3344#issuecomment-63901843 Thanks, merged to master and 1.2

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2014-11-20 Thread markgrover
Github user markgrover commented on a diff in the pull request: https://github.com/apache/spark/pull/2982#discussion_r20687473 --- Diff: bin/compute-classpath.cmd --- @@ -1,3 +1,4 @@ +<<< HEAD --- End diff -- This looks like a left-behind merge typo :-)

[GitHub] spark pull request: [SPARK-4431][MLlib] Implement efficient active...

2014-11-20 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/3288#discussion_r20687461 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala --- @@ -95,22 +93,7 @@ class MultivariateOnlineSummarizer extend

[GitHub] spark pull request: [SPARK-2918] [SQL] Support the CTAS in EXPLAIN...

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3357#issuecomment-63901639 Thanks, merged to master and 1.2

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63901611 [Test build #23686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23686/consoleFull) for PR 3351 at commit [`5c438d7`](https://gith

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63901624 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63901427 [Test build #23693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23693/consoleFull) for PR 1222 at commit [`3f4500f`](https://githu

[GitHub] spark pull request: [SPARK-4318][SQL] Fix empty sum distinct.

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3184#issuecomment-63901022 Thanks for cleaning all of this up. I've merged to master and 1.2

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63900854 [Test build #23692 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23692/consoleFull) for PR 3375 at commit [`9e7d5b5`](https://githu

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63900803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4318][SQL] Fix empty sum distinct.

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3184#discussion_r20687030 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala --- @@ -117,15 +119,15 @@ case class GeneratedAggregate(

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63900797 [Test build #23685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23685/consoleFull) for PR 3351 at commit [`c5b9252`](https://gith

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63900710 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-4493][SQL] Don't pushdown Eq, NotEq, Lt...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3367#issuecomment-63900674 [Test build #530 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/530/consoleFull) for PR 3367 at commit [`de7de28`](https://github

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63900513 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63900462 ok to test

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63900346 Thanks a lot @davies

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63900213 Thanks, merged to master and 1.2

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63899911 Merged into master and branch-1.2. Thanks!

[GitHub] spark pull request: [SPARK-4522][SQL] Parse schema with missing me...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3392#issuecomment-63899973 [Test build #23691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23691/consoleFull) for PR 3392 at commit [`bcc6626`](https://githu

[GitHub] spark pull request: [SPARK-4522][SQL] Parse schema with missing me...

2014-11-20 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/3392 [SPARK-4522][SQL] Parse schema with missing metadata. This is just a quick fix for 1.2. SPARK-4523 describes a more complete solution. You can merge this pull request into a Git repository by run

[GitHub] spark pull request: [SPARK-4522][SQL] Parse schema with missing me...

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3392#issuecomment-63899682 /cc @mengxr

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63899649 [Test build #23690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23690/consoleFull) for PR 1222 at commit [`3f4500f`](https://githu

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63899537 LGTM. Waiting for Jenkins ...

[GitHub] spark pull request: [SPARK-3697] Ignore event directories that can...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3391#issuecomment-63897935 [Test build #23688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23688/consoleFull) for PR 3391 at commit [`5616fcd`](https://githu

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63897855 By the way, there are new Selenium tests for these examples of the new behavior.

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63897673 It's not too hard; I'm already done! :smiley: Here's what a job details page looks like when stages were skipped: ![image](https://cloud.githubuserco

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63897403 [Test build #23687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23687/consoleFull) for PR 3009 at commit [`2bbf41a`](https://githu

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20685450 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V: Cl

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20685400 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V: Cl

[GitHub] spark pull request: [SPARK-3697] Ignore event directories that can...

2014-11-20 Thread vanzin
GitHub user vanzin opened a pull request: https://github.com/apache/spark/pull/3391 [SPARK-3697] Ignore event directories that cannot be read. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vanzin/spark SPARK-3697 Alternatively

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3009#discussion_r20685273 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -214,6 +264,14 @@ class JobProgressListener(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63896661 I don't mind the current behavior, but IMO skipped is better if it's not too hard to implement.

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63896387 Overall this looks good - I might want to take one more pass to just make sure there are no corner cases I can think of. But all of the comments were around naming, etc

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20683755 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63893828 [Test build #23684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23684/consoleFull) for PR 3320 at commit [`8003dfc`](https://gith

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63893839 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20683611 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V: Clas

[GitHub] spark pull request: [SQL] set spark.sql.hive.convertMetastoreParqu...

2014-11-20 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3352#issuecomment-63892891 @marmbrus, sure, I will open JIRAs for this, and I am now working on these two issues. I suggest setting it to false until these two are fixed.

[GitHub] spark pull request: SPARK-2630 Input data size of CoalescedRDD cou...

2014-11-20 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2310#issuecomment-63892481 @ash211 Hopefully this could be merged into 1.2, once you rebase it to master and keep the API unchanged, thanks!

[GitHub] spark pull request: SPARK-2630 Input data size of CoalescedRDD cou...

2014-11-20 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2310#discussion_r20682906 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -167,7 +169,7 @@ case class InputMetrics(readMethod: DataReadMethod.Value) {

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682849 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V:

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63891707 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63891697 [Test build #23683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23683/consoleFull) for PR 3387 at commit [`7198e90`](https://gith

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682620 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.streami

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63891580 @mengxr I have updated the test results; it's now as fast as before (small difference).

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682591 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V: Clas

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682617 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.streami

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63891139 [Test build #23686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23686/consoleFull) for PR 3351 at commit [`5c438d7`](https://githu

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682363 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V:

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63890984 micro benchmark for Poisson (with fraction=0.1): old one: ``` $ python -m timeit -s from pyspark.rddsampler import RDDSamplerBase; b = RDDSamplerBase(True,

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682042 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.str

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20681845 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.str

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63890108 [Test build #23685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23685/consoleFull) for PR 3351 at commit [`c5b9252`](https://githu

[GitHub] spark pull request: [SPARK-3938][SQL] Names in-memory columnar RDD...

2014-11-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3383

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3213

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63887426 [Test build #23682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23682/consoleFull) for PR 3389 at commit [`f2136dd`](https://gith

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63887437 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63886166 Jenkins test this

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63885101 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63885088 [Test build #23680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23680/consoleFull) for PR 3351 at commit [`ee17d78`](https://gith

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63885023 @davies I sent you a PR with a faster version of the Poisson generator: https://github.com/davies/spark/pull/1 . Could you test the performance and update the result? Thanks!

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread dwmclary
Github user dwmclary commented on the pull request: https://github.com/apache/spark/pull/3213#issuecomment-63884730 Michael, thanks for being willing to pick up the final changes! I'm happy to get a chance to contribute again. Hopefully the next PR won't require so much of

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3213#issuecomment-63884435 Since we are about to cut a 1.2 preview I'll make the final changes while merging. Thanks for working on this! I think it'll be a pretty popular feature. I used it la

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3213#discussion_r20679073 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/java/JavaSchemaRDD.scala --- @@ -126,6 +126,12 @@ class JavaSchemaRDD( // Transformations (

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3213#discussion_r20679092 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala --- @@ -35,6 +37,8 @@ import org.apache.spark.sql.catalyst.analysis._ import org.a

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3213#discussion_r20679083 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala --- @@ -131,6 +135,20 @@ class SchemaRDD( */ lazy val schema: StructType

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63883863 Regarding "phantom" stages that are skipped: What do you think about adding a "skipped" state to visually convey that there were stage dependencies that _might_

[GitHub] spark pull request: [SPARK-3974][MLlib] Distributed Block Matrix A...

2014-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3200#issuecomment-63883302 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-3974][MLlib] Distributed Block Matrix A...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3200#issuecomment-63883289 [Test build #23679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23679/consoleFull) for PR 3200 at commit [`9ae85aa`](https://gith

[GitHub] spark pull request: [SPARK-4487][SQL] Fix attribute reference reso...

2014-11-20 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3363#discussion_r20678655 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -181,7 +181,7 @@ class Analyzer(catalog: Catalog, registr

[GitHub] spark pull request: [SPARK-1021] Defer the data-driven computation...

2014-11-20 Thread erikerlandson
Github user erikerlandson commented on the pull request: https://github.com/apache/spark/pull/3079#issuecomment-63881800 For reference, this other issue has some overlap: https://issues.apache.org/jira/browse/SPARK-4514

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63881779 LGTM

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63880964 [Test build #23684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23684/consoleFull) for PR 3320 at commit [`8003dfc`](https://githu

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3009#discussion_r20677793 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -144,11 +146,30 @@ class JobProgressListener(conf: SparkConf) exten

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3009#discussion_r20677746 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: [SPARK-3938][SQL] Names in-memory columnar RDD...

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3383#issuecomment-63880636 Thanks! I think a lot of people will be happy with this change :) Merged to master and 1.2

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63880444 @jkbradley done.

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63880221 [Test build #23683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23683/consoleFull) for PR 3387 at commit [`7198e90`](https://githu

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63879332 ok to test
