date:20141120

[GitHub] spark pull request: [SPARK-4504][Examples] fix run-example failure...

2014-11-20 Thread vanzin

Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3377#discussion_r20689898 --- Diff: bin/run-example --- @@ -35,9 +35,9 @@ else fi if [ -f "$FWDIR/RELEASE" ]; then - export SPARK_EXAMPLES_JAR="`ls "$FWDIR"/lib/spar

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63906253 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63906246 [Test build #23687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23687/consoleFull) for PR 3009 at commit [`2bbf41a`](https://gith

[GitHub] spark pull request: [SPARK-2309][MLlib] Generalize the binary logi...

2014-11-20 Thread avulanov

Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1379#issuecomment-63906173 @dbtsai Thanks for the explanation! Do I understand correctly that if I want to get (num_features+1)*(num_classes) parameters from your model, I need to concatenate a vector

[GitHub] spark pull request: [SPARK-4244] [SQL] Support Hive Generic UDFs w...

2014-11-20 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3109#discussion_r20689585 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -298,6 +298,8 @@ private[hive] trait HiveInspectors {

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63904511 [Test build #23692 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23692/consoleFull) for PR 3375 at commit [`9e7d5b5`](https://gith

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63904518 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-20 Thread jongyoul

Github user jongyoul commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-63904354 When launching from spark-submit with a Mesos master, the first time resourceOffers is called there are no tasks yet, because taskScheduler has not submitted a job.

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-63904285 [Test build #23696 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23696/consoleFull) for PR 3393 at commit [`f20f1b3`](https://githu

[GitHub] spark pull request: [SPARK-2309][MLlib] Generalize the binary logi...

2014-11-20 Thread dbtsai

Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/1379#issuecomment-63904113 @avulanov I will merge this in Spark 1.3; sorry for the delay, I have been very busy recently. Yes, the branch you found should work, but it cannot be cleanly merged in up

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-20 Thread jongyoul

GitHub user jongyoul opened a pull request: https://github.com/apache/spark/pull/3393 [SPARK-4525] MesosSchedulerBackend.resourceOffers cannot decline unused ... ...offers from acceptedOffers - Added code for declining unused offers among acceptedOffers - Edited testCase

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63903647 I just took another pass and LGTM from a correctness perspective. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark pull request: [SPARK-4358][SQL] Let BigDecimal do checking t...

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3208#discussion_r20688295 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -339,18 +339,15 @@ class SqlParser extends AbstractSparkSQLParser

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20688285 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V

[GitHub] spark pull request: [SPARK-4244] [SQL] Support Hive Generic UDFs w...

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3109#discussion_r20688112 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala --- @@ -162,9 +161,8 @@ private[hive] case class HiveGenericUdf(functionClassName

[GitHub] spark pull request: [SPARK-4431][MLlib] Implement efficient active...

2014-11-20 Thread dbtsai

Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/3288#discussion_r20688070 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala --- @@ -173,4 +173,63 @@ class VectorsSuite extends FunSuite { val v =

[GitHub] spark pull request: [SPARK-4244] [SQL] Support Hive Generic UDFs w...

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3109#discussion_r20687942 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -298,6 +298,8 @@ private[hive] trait HiveInspectors { })

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2982#issuecomment-63902253 [Test build #23695 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23695/consoleFull) for PR 2982 at commit [`322f882`](https://githu

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2014-11-20 Thread vanzin

Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/2982#discussion_r20687545 --- Diff: bin/compute-classpath.cmd --- @@ -1,3 +1,4 @@ +<<< HEAD --- End diff -- It's a windows file, nobody uses those. :-) I'll fix it.

[GitHub] spark pull request: [SPARK-4431][MLlib] Implement efficient active...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3288#issuecomment-63901964 [Test build #23694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23694/consoleFull) for PR 3288 at commit [`1907ae1`](https://githu

[GitHub] spark pull request: [SQL] fix function description mistake

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3344#issuecomment-63901843 Thanks, merged to master and 1.2

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2014-11-20 Thread markgrover

Github user markgrover commented on a diff in the pull request: https://github.com/apache/spark/pull/2982#discussion_r20687473 --- Diff: bin/compute-classpath.cmd --- @@ -1,3 +1,4 @@ +<<< HEAD --- End diff -- This looks like a left behind merge typo:-)

[GitHub] spark pull request: [SPARK-4431][MLlib] Implement efficient active...

2014-11-20 Thread dbtsai

Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/3288#discussion_r20687461 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala --- @@ -95,22 +93,7 @@ class MultivariateOnlineSummarizer extend

[GitHub] spark pull request: [SPARK-2918] [SQL] Support the CTAS in EXPLAIN...

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3357#issuecomment-63901639 Thanks, merged to master and 1.2

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63901611 [Test build #23686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23686/consoleFull) for PR 3351 at commit [`5c438d7`](https://gith

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63901624 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63901427 [Test build #23693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23693/consoleFull) for PR 1222 at commit [`3f4500f`](https://githu

[GitHub] spark pull request: [SPARK-4318][SQL] Fix empty sum distinct.

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3184#issuecomment-63901022 Thanks for cleaning all of this up. I've merged to master and 1.2

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63900854 [Test build #23692 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23692/consoleFull) for PR 3375 at commit [`9e7d5b5`](https://githu

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63900803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4318][SQL] Fix empty sum distinct.

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3184#discussion_r20687030 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala --- @@ -117,15 +119,15 @@ case class GeneratedAggregate(

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63900797 [Test build #23685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23685/consoleFull) for PR 3351 at commit [`c5b9252`](https://gith

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread vanzin

Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63900710 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-4493][SQL] Don't pushdown Eq, NotEq, Lt...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3367#issuecomment-63900674 [Test build #530 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/530/consoleFull) for PR 3367 at commit [`de7de28`](https://github

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63900513 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-63900462 ok to test

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread manishamde

Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63900346 Thanks a lot @davies

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63900213 Thanks, merged to master and 1.2

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread mengxr

Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63899911 Merged into master and branch-1.2. Thanks!

[GitHub] spark pull request: [SPARK-4522][SQL] Parse schema with missing me...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3392#issuecomment-63899973 [Test build #23691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23691/consoleFull) for PR 3392 at commit [`bcc6626`](https://githu

[GitHub] spark pull request: [SPARK-4522][SQL] Parse schema with missing me...

2014-11-20 Thread marmbrus

GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/3392 [SPARK-4522][SQL] Parse schema with missing metadata. This is just a quick fix for 1.2. SPARK-4523 describes a more complete solution. You can merge this pull request into a Git repository by run

[GitHub] spark pull request: [SPARK-4522][SQL] Parse schema with missing me...

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3392#issuecomment-63899682 /cc @mengxr

[GitHub] spark pull request: [SPARK-2261] Make event logger use a single fi...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1222#issuecomment-63899649 [Test build #23690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23690/consoleFull) for PR 1222 at commit [`3f4500f`](https://githu

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread mengxr

Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63899537 LGTM. Waiting for Jenkins ...

[GitHub] spark pull request: [SPARK-3697] Ignore event directories that can...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3391#issuecomment-63897935 [Test build #23688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23688/consoleFull) for PR 3391 at commit [`5616fcd`](https://githu

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen

Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63897855 By the way, there are new Selenium tests for these examples of the new behavior.

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen

Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63897673 It's not too hard; I'm already done! :smiley: Here's what a job details page looks like when stages were skipped: ![image](https://cloud.githubuserco

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63897403 [Test build #23687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23687/consoleFull) for PR 3009 at commit [`2bbf41a`](https://githu

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20685450 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V: Cl

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20685400 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V: Cl

[GitHub] spark pull request: [SPARK-3697] Ignore event directories that can...

2014-11-20 Thread vanzin

GitHub user vanzin opened a pull request: https://github.com/apache/spark/pull/3391 [SPARK-3697] Ignore event directories that cannot be read. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vanzin/spark SPARK-3697 Alternatively

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen

Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3009#discussion_r20685273 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -214,6 +264,14 @@ class JobProgressListener(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread pwendell

Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63896661 I don't mind the current behavior, but IMO skipped is better if it's not too hard to implement.

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63896387 Overall this looks good - I might want to take one more pass to just make sure there are no corner cases I can think of. But all of the comments were around naming, etc

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20683755 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -155,83 +186,81 @@ class FileInputDStream[K: ClassTag, V

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63893828 [Test build #23684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23684/consoleFull) for PR 3320 at commit [`8003dfc`](https://gith

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63893839 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20683611 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V: Clas

[GitHub] spark pull request: [SQL] set spark.sql.hive.convertMetastoreParqu...

2014-11-20 Thread scwf

Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3352#issuecomment-63892891 @marmbrus, sure, I will open JIRAs for this, and I am now working on these two issues. In the meantime I suggest setting it to false until these two are fixed.

[GitHub] spark pull request: SPARK-2630 Input data size of CoalescedRDD cou...

2014-11-20 Thread davies

Github user davies commented on the pull request: https://github.com/apache/spark/pull/2310#issuecomment-63892481 @ash211 Hopefully this could be merged into 1.2, once you rebase it to master and keep the API unchanged, thanks!

[GitHub] spark pull request: SPARK-2630 Input data size of CoalescedRDD cou...

2014-11-20 Thread davies

Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2310#discussion_r20682906 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -167,7 +169,7 @@ case class InputMetrics(readMethod: DataReadMethod.Value) {

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682849 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V:

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63891707 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63891697 [Test build #23683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23683/consoleFull) for PR 3387 at commit [`7198e90`](https://gith

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682620 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.streami

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread davies

Github user davies commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63891580 @mengxr I have updated the test results; now it's as fast as before (small difference).

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682591 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V: Clas

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682617 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.streami

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63891139 [Test build #23686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23686/consoleFull) for PR 3351 at commit [`5c438d7`](https://githu

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682363 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -37,22 +69,24 @@ class FileInputDStream[K: ClassTag, V:

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread davies

Github user davies commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63890984 micro benchmark for Poisson (with fraction=0.1): old one: ``` $ python -m timeit -s from pyspark.rddsampler import RDDSamplerBase; b = RDDSamplerBase(True,

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20682042 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.str

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread pwendell

Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3389#discussion_r20681845 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -17,18 +17,50 @@ package org.apache.spark.str

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63890108 [Test build #23685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23685/consoleFull) for PR 3351 at commit [`c5b9252`](https://githu

[GitHub] spark pull request: [SPARK-3938][SQL] Names in-memory columnar RDD...

2014-11-20 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3383

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3213

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63887426 [Test build #23682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23682/consoleFull) for PR 3389 at commit [`f2136dd`](https://gith

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63887437 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4518][SPARK-4519][Streaming] Refactored...

2014-11-20 Thread tdas

Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3389#issuecomment-63886166 Jenkins test this

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63885101 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63885088 [Test build #23680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23680/consoleFull) for PR 3351 at commit [`ee17d78`](https://gith

[GitHub] spark pull request: [SPARK-4477] [PySpark] remove numpy from RDDSa...

2014-11-20 Thread mengxr

Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3351#issuecomment-63885023 @davies I sent you a PR with a faster version of poisson generator: https://github.com/davies/spark/pull/1 . Could you test the performance and update the result? Thanks!

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread dwmclary

Github user dwmclary commented on the pull request: https://github.com/apache/spark/pull/3213#issuecomment-63884730 Michael; thanks for being willing to pick up the final changes! I'm happy to get a chance to contribute again. Hopefully the next PR won't require so much of

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3213#issuecomment-63884435 Since we are about to cut a 1.2 preview I'll make the final changes while merging. Thanks for working on this! I think it'll be a pretty popular feature. I used it la

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3213#discussion_r20679073 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/java/JavaSchemaRDD.scala --- @@ -126,6 +126,12 @@ class JavaSchemaRDD( // Transformations (

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3213#discussion_r20679092 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala --- @@ -35,6 +37,8 @@ import org.apache.spark.sql.catalyst.analysis._ import org.a

[GitHub] spark pull request: SPARK-4228 SchemaRDD to JSON

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3213#discussion_r20679083 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala --- @@ -131,6 +135,20 @@ class SchemaRDD( */ lazy val schema: StructType

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread JoshRosen

Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3009#issuecomment-63883863 Regarding "phantom" stages that are skipped: What do you think about adding a "skipped" state to visually convey that there were stage dependencies that _might_

[GitHub] spark pull request: [SPARK-3974][MLlib] Distributed Block Matrix A...

2014-11-20 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3200#issuecomment-63883302 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-3974][MLlib] Distributed Block Matrix A...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3200#issuecomment-63883289 [Test build #23679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23679/consoleFull) for PR 3200 at commit [`9ae85aa`](https://gith

[GitHub] spark pull request: [SPARK-4487][SQL] Fix attribute reference reso...

2014-11-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3363#discussion_r20678655 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -181,7 +181,7 @@ class Analyzer(catalog: Catalog, registr

[GitHub] spark pull request: [SPARK-1021] Defer the data-driven computation...

2014-11-20 Thread erikerlandson

Github user erikerlandson commented on the pull request: https://github.com/apache/spark/pull/3079#issuecomment-63881800 For reference, this other issue has some overlap: https://issues.apache.org/jira/browse/SPARK-4514

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread jkbradley

Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63881779 LGTM

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63880964 [Test build #23684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23684/consoleFull) for PR 3320 at commit [`8003dfc`](https://githu

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3009#discussion_r20677793 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -144,11 +146,30 @@ class JobProgressListener(conf: SparkConf) exten

[GitHub] spark pull request: [SPARK-4145] [WIP] Web UI job pages

2014-11-20 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/3009#discussion_r20677746 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: [SPARK-3938][SQL] Names in-memory columnar RDD...

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3383#issuecomment-63880636 Thanks! I think a lot of people will be happy with this change :) Merged to master and 1.2

[GitHub] spark pull request: [SPARK-4439] [MLlib] add python api for random...

2014-11-20 Thread davies

Github user davies commented on the pull request: https://github.com/apache/spark/pull/3320#issuecomment-63880444 @jkbradley done.

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread SparkQA

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63880221 [Test build #23683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23683/consoleFull) for PR 3387 at commit [`7198e90`](https://githu

[GitHub] spark pull request: [SPARK-4513][SQL] Support relational operator ...

2014-11-20 Thread marmbrus

Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3387#issuecomment-63879332 ok to test
