[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64322096 @liancheng thanks for your help! My local compilation with -Phive-0.13.1 is totally OK. Anyway, it is OK now, with that import line back again. :) --- If you

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/3435#discussion_r20847415 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala --- @@ -97,30 +97,57 @@ class StandardScalerModel private[mllib] ( ov

[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3446#issuecomment-64321564 [Test build #23824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23824/consoleFull) for PR 3446 at commit [`e20a2b9`](https://githu

[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...

2014-11-24 Thread dbtsai
GitHub user dbtsai opened a pull request: https://github.com/apache/spark/pull/3446 [SPARK-4596][MLLib] Refactorize Normalizer to make code cleaner This refactoring slightly improves performance by removing the overhead of breeze vectors. The bottleneck is

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3435#discussion_r20847175 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala --- @@ -97,30 +97,57 @@ class StandardScalerModel private[mllib] ( ov

[GitHub] spark pull request: [SQL] enable empty aggr test case

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3445#issuecomment-64320853 [Test build #23823 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23823/consoleFull) for PR 3445 at commit [`982575e`](https://githu

[GitHub] spark pull request: [SQL] enable empty aggr test case

2014-11-24 Thread adrian-wang
GitHub user adrian-wang opened a pull request: https://github.com/apache/spark/pull/3445 [SQL] enable empty aggr test case This is fixed by SPARK-4318 #3184 You can merge this pull request into a Git repository by running: $ git pull https://github.com/adrian-wang/spark emptya

[GitHub] spark pull request: [SPARK-4595][Core] Fix MetricsServlet not work...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3444#issuecomment-64320122 [Test build #23822 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23822/consoleFull) for PR 3444 at commit [`f779fe0`](https://githu

[GitHub] spark pull request: [SPARK-4595][Core] Fix MetricsServlet not work...

2014-11-24 Thread jerryshao
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/3444 [SPARK-4595][Core] Fix MetricsServlet not work issue The `MetricsServlet` handler should be added to the web UI after it is initialized by `MetricsSystem`; otherwise the servlet handler cannot be attached. Yo

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-64319669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-64319659 [Test build #23819 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23819/consoleFull) for PR 3375 at commit [`99c5c97`](https://gith

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64319649 [Test build #23815 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23815/consoleFull) for PR 3393 at commit [`63855bf`](https://gith

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64319656 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64319392 And the compilation failure is caused by this line https://github.com/apache/spark/pull/3381/files#diff-ff50aea397a607b79df9bec6f2a841dbL23 --- If your project is set

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64319310 @adrian-wang Actually I can reproduce this compilation error with this: ```bash ./sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -P

[GitHub] spark pull request: SPARK-2624 add datanucleus jars to the contain...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3238#issuecomment-64318723 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: SPARK-2624 add datanucleus jars to the contain...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3238#issuecomment-64318717 [Test build #23818 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23818/consoleFull) for PR 3238 at commit [`fe95125`](https://gith

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread jongyoul
Github user jongyoul closed the pull request at: https://github.com/apache/spark/pull/3393 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is en

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread jongyoul
Github user jongyoul commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64318571 @pwendell Yes, I reopened because Jenkins triggered. I'll close again.

[GitHub] spark pull request: [SPARK-4593] [SQL] return null when divider is...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3443#issuecomment-64318454 [Test build #23821 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23821/consoleFull) for PR 3443 at commit [`2dfe50f`](https://githu

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64318446 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64318444 [Test build #23814 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23814/consoleFull) for PR 3393 at commit [`63855bf`](https://gith

[GitHub] spark pull request: [SPARK-4526][MLLIB]GradientDescent get a wrong...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3399#issuecomment-64317956 [Test build #23820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23820/consoleFull) for PR 3399 at commit [`13cb228`](https://githu

[GitHub] spark pull request: [SPARK-4593] [SQL] return null when divider is...

2014-11-24 Thread adrian-wang
GitHub user adrian-wang opened a pull request: https://github.com/apache/spark/pull/3443 [SPARK-4593] [SQL] return null when divider is 0 of Double type SELECT max(1/0) FROM src would return a very large number, which is obviously not right. For hive-0.12, hive would return `
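The behavior proposed in SPARK-4593 can be sketched as follows. This is a minimal illustration, not the code from the pull request: `NullSafeDivide` and `divide` are hypothetical names, and `None` stands in for SQL NULL. It relies only on the fact that JVM floating-point division by zero yields `Infinity` instead of throwing; whether that is the exact source of the "very large number" mentioned above is an assumption.

```scala
// Hypothetical helper for illustration only; not Spark's implementation.
object NullSafeDivide {
  // Plain JVM double division by zero does not throw; it yields Infinity.
  def rawDivide(dividend: Double, divisor: Double): Double =
    dividend / divisor

  // SQL-style division: None (standing in for NULL) when the divisor is zero.
  def divide(dividend: Double, divisor: Double): Option[Double] =
    if (divisor == 0.0) None else Some(dividend / divisor)
}
```

With this sketch, `NullSafeDivide.rawDivide(1.0, 0.0)` is positive infinity, while `NullSafeDivide.divide(1.0, 0.0)` is `None`, so an aggregate such as `max` would have no infinite value to pick up.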

[GitHub] spark pull request: [WIP][SPARK-2926][Shuffle]Add MR style sort-me...

2014-11-24 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3438#issuecomment-64317326 The main changes we implemented here are: * When a shuffle operation has a key ordering, sort records by key on the map side in addition to sorting by partition. * O
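The first change listed above, sorting by key in addition to partition on the map side, can be pictured with a composite ordering. This is only a sketch under assumptions: the `Record` tuple layout and the names `MapSideSort` and `sortForShuffle` are hypothetical and do not come from the pull request, which works on Spark's internal shuffle structures.

```scala
object MapSideSort {
  // Hypothetical record layout for illustration: (partitionId, key, value).
  type Record = (Int, String, Int)

  // Order by partition first, then by key within each partition.
  val byPartitionThenKey: Ordering[Record] =
    Ordering.by((r: Record) => (r._1, r._2))

  // Sort map-side output so each partition's records are contiguous and key-ordered.
  def sortForShuffle(records: Seq[Record]): Seq[Record] =
    records.sorted(byPartitionThenKey)
}
```

Because each partition's run is already ordered by key, a reducer could in principle merge the runs it receives instead of re-sorting them, which is the idea behind an MR-style sort-merge shuffle.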

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64317323 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64317319 [Test build #23817 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23817/consoleFull) for PR 3435 at commit [`daf2b06`](https://gith

[GitHub] spark pull request: [SPARK-911] allow efficient queries for a rang...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1381#issuecomment-64317259 Can one of the admins verify this patch?

[GitHub] spark pull request: [DOC][Build] Wrong cmd for build spark with ap...

2014-11-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3335

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread shaneknapp
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64316830 i'll have the time to take a closer look at this tomorrow. On Mon, Nov 24, 2014 at 9:59 PM, Daoyuan Wang wrote: > @liancheng

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-24 Thread tianyi
Github user tianyi commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-64316338 I made some performance optimizations. Here is the result: the test data is generated by the following script generator.sh ``` for((i=1;i<=$

[GitHub] spark pull request: [DOC][Build] Wrong cmd for build spark with ap...

2014-11-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3335#issuecomment-64316277 Pulling this in - thanks!

[GitHub] spark pull request: [DOC][Build] Wrong cmd for build spark with ap...

2014-11-24 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3335#issuecomment-64316241 Hi @pwendell, is this ok to go?

[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-11-24 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-64316156 It looks like this patch may have introduced a race-condition / bug during multi-master failover: https://issues.apache.org/jira/browse/SPARK-4592. I'm working on a fi

[GitHub] spark pull request: [WIP][SPARK-2926][Shuffle]Add MR style sort-me...

2014-11-24 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/3438#discussion_r20845166 --- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala --- @@ -17,6 +17,7 @@ package org.apache.spark.rdd +import scala.collectio

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64315254 You should close this PR. I already merged my PR and gave you author credit: https://github.com/apache/spark/commit/f0afb623dc51fd3008bd80496b8d1eaa991323d6

[GitHub] spark pull request: [SPARK-4483][SQL]Optimization about reduce mem...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3375#issuecomment-64315173 [Test build #23819 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23819/consoleFull) for PR 3375 at commit [`99c5c97`](https://githu

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64314147 @liancheng can you reproduce the build error locally? I think it is a bug in Jenkins.

[GitHub] spark pull request: [DOC][Build] Wrong cmd for build spark with ap...

2014-11-24 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3335#issuecomment-64314157 Ping

[GitHub] spark pull request: [SPARK-4504][Examples] fix run-example failure...

2014-11-24 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3377#discussion_r20844568 --- Diff: bin/run-example --- @@ -35,9 +35,9 @@ else fi if [ -f "$FWDIR/RELEASE" ]; then - export SPARK_EXAMPLES_JAR="`ls "$FWDIR"/lib/sp

[GitHub] spark pull request: SPARK-2624 add datanucleus jars to the contain...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3238#issuecomment-64313146 [Test build #23818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23818/consoleFull) for PR 3238 at commit [`fe95125`](https://githu

[GitHub] spark pull request: SPARK-2624 add datanucleus jars to the contain...

2014-11-24 Thread jimjh
Github user jimjh commented on the pull request: https://github.com/apache/spark/pull/3238#issuecomment-64312996 @vanzin Thanks for the review!

[GitHub] spark pull request: [SPARK-4475] change "localhost" to "127.0.0.1"...

2014-11-24 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/3425#issuecomment-64312914 @lvsoft I agree that 127.0.0.1 is better than localhost in some cases, but it will not work on machines that only have IPv6, which happens at Facebook.

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64312383 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64312381 [Test build #23816 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23816/consoleFull) for PR 3381 at commit [`f7c704a`](https://gith

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64312182 [Test build #23817 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23817/consoleFull) for PR 3435 at commit [`daf2b06`](https://githu

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64312017 [Test build #23816 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23816/consoleFull) for PR 3381 at commit [`f7c704a`](https://githu

[GitHub] spark pull request: [SQL] Compute timeTaken correctly

2014-11-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3423

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64312011 [Test build #23815 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23815/consoleFull) for PR 3393 at commit [`63855bf`](https://githu

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread jongyoul
Github user jongyoul commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64311727 @pwendell SparkQA triggered this issue for testing. Please check it and close this PR again. I did check your PR.

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread jongyoul
GitHub user jongyoul reopened a pull request: https://github.com/apache/spark/pull/3393 [SPARK-4525] MesosSchedulerBackend.resourceOffers cannot decline unused ... ...offers from acceptedOffers - Added code for declining unused offers among acceptedOffers - Edited testCa

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64311648 retest this please

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64311644 The history of this build has already timed out...

[GitHub] spark pull request: [SQL] Compute timeTaken correctly

2014-11-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3423#issuecomment-64311548 Ok I'm merging this in master & branch-1.2.

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64311461 [Test build #23814 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23814/consoleFull) for PR 3393 at commit [`63855bf`](https://githu

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread jongyoul
Github user jongyoul closed the pull request at: https://github.com/apache/spark/pull/3393

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread jongyoul
Github user jongyoul commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64311415 @pwendell Oh, I'm a little late. I patched that code similarly to yours. I'll close this issue.

[GitHub] spark pull request: [SPARK-4485][SQL]Add BroadcastHashOuterJoin

2014-11-24 Thread tianyi
Github user tianyi commented on a diff in the pull request: https://github.com/apache/spark/pull/3362#discussion_r20843639 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashOuterJoin.scala --- @@ -0,0 +1,164 @@ +/* + * Licensed to the Apac

[GitHub] spark pull request: [SPARK-4409][MLlib] Additional Linear Algebra ...

2014-11-24 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3319#issuecomment-64310622 @brkyvz Two comments on the API: 1) For the APIs we provide, could you add a Java test suite and verify that all methods work in Java? 2) `horzCat` and `vertCa

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3435#discussion_r20843304 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala --- @@ -97,30 +97,51 @@ class StandardScalerModel private[mllib] ( ov

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3435#discussion_r20843294 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala --- @@ -97,30 +97,51 @@ class StandardScalerModel private[mllib] ( ov

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3435#discussion_r20843292 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala --- @@ -87,6 +85,8 @@ class StandardScalerModel private[mllib] ( f

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3435#discussion_r20843299 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala --- @@ -97,30 +97,51 @@ class StandardScalerModel private[mllib] ( ov

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64309996 By default, Scala generates Java methods for members, no matter whether you use `val` or `def`. That's why you saw `invokespecial` for `shift` and `factor`. But if a membe
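The difference described above can be sketched with a small class. This is an illustration under assumptions, not the `StandardScalerModel` code from the pull request: `Scaler`, `scale`, and the field values are hypothetical. In Scala 2, a plain `val` compiles to a private field plus an accessor method, whereas a `private[this] val` is object-private and is read as a field directly.

```scala
// Hypothetical class for illustration; not the StandardScalerModel from the PR.
class Scaler(variance: Array[Double]) {
  // Compiled to a private field plus a public accessor method `shift()`;
  // reads inside the class go through that accessor.
  val shift: Double = 1.0

  // Object-private: no accessor is generated, so reads inside the class
  // are direct field accesses.
  private[this] val factor: Array[Double] =
    variance.map(v => if (v != 0.0) 1.0 / math.sqrt(v) else 0.0)

  // Scale a single value by the precomputed factor for feature `i`.
  def scale(i: Int, x: Double): Double = x * factor(i)
}
```

Running `javap -p -c` on the compiled class is one way to check which reads become method calls and which become `getfield`; the JIT may inline trivial accessors anyway, so any measured gain depends on the workload.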

[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-24 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64309762 @marmbrus @rxin @liancheng Can you help verify this build error? I cloned this branch separately and the build is successful.

[GitHub] spark pull request: [SPARK-4258][SQL][DOC] Documents spark.sql.par...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3440#issuecomment-64309423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4258][SQL][DOC] Documents spark.sql.par...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3440#issuecomment-64309419 [Test build #23812 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23812/consoleFull) for PR 3440 at commit [`2104311`](https://gith

[GitHub] spark pull request: [SPARK-4525] MesosSchedulerBackend.resourceOff...

2014-11-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3393#issuecomment-64308978 @jongyoul can you close this issue now? I pulled in your commits already.

[GitHub] spark pull request: [SPARK-3575][SQL][WIP] Removes the Metastore P...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3441#issuecomment-64308762 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-3575][SQL][WIP] Removes the Metastore P...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3441#issuecomment-64308760 [Test build #23813 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23813/consoleFull) for PR 3441 at commit [`681cf87`](https://gith

[GitHub] spark pull request: [SPARK-4475] change "localhost" to "127.0.0.1"...

2014-11-24 Thread lvsoft
Github user lvsoft commented on the pull request: https://github.com/apache/spark/pull/3425#issuecomment-64308603 I ran a doctest in aggregation.py to confirm this fix is OK if `localhost` cannot be resolved. However, I'm not fully confident that Spark will work well totally in s

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64308394 Wow, with ```scala private[this] val factor: Array[Double] = { val f = Array.ofDim[Double](variance.size) var i = 0 while (i < f.size) {

[GitHub] spark pull request: [SPARK-4570][SQL]add BroadcastLeftSemiJoinHash

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3442#issuecomment-64307902 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-4570][SQL]add BroadcastLeftSemiJoinHash

2014-11-24 Thread wangxiaojing
GitHub user wangxiaojing opened a pull request: https://github.com/apache/spark/pull/3442 [SPARK-4570][SQL]add BroadcastLeftSemiJoinHash JIRA issue: [SPARK-4570](https://issues.apache.org/jira/browse/SPARK-4570) We are planning to create a `BroadcastLeftSemiJoinHash` to implement

[GitHub] spark pull request: [SPARK-4582][MLLIB] get raw vectors for furthe...

2014-11-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3437

[GitHub] spark pull request: get raw vectors for further processing in Word...

2014-11-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3309

[GitHub] spark pull request: [SQL] Compute timeTaken correctly

2014-11-24 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3423#issuecomment-64307108 LGTM, thanks!

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64307020 @dbtsai What if we mark `factor` and `shift` as `private[this]`?

[GitHub] spark pull request: [SPARK-3575][SQL][WIP] Removes the Metastore P...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3441#issuecomment-64306598 [Test build #23813 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23813/consoleFull) for PR 3441 at commit [`681cf87`](https://githu

[GitHub] spark pull request: [SPARK-3575][SQL][WIP] Removes the Metastore P...

2014-11-24 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/3441 [SPARK-3575][SQL][WIP] Removes the Metastore Parquet conversion hack Still a WIP. This PR tries to remove the Metastore Parquet conversion hack by moving the conversion logic to `HiveMeta

[GitHub] spark pull request: [SPARK-4583] [mllib] LogLoss for GradientBoost...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3439#issuecomment-64306124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [WIP][SPARK-2926][Shuffle]Add MR style sort-me...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3438#issuecomment-64306134 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [WIP][SPARK-2926][Shuffle]Add MR style sort-me...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3438#issuecomment-64306132 [Test build #23809 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23809/consoleFull) for PR 3438 at commit [`7d839cd`](https://gith

[GitHub] spark pull request: [SPARK-4583] [mllib] LogLoss for GradientBoost...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3439#issuecomment-64306120 [Test build #23811 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23811/consoleFull) for PR 3439 at commit [`7c38962`](https://gith

[GitHub] spark pull request: [SPARK-3133] embed small object in broadcast t...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2681#issuecomment-64305586 [Test build #23808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23808/consoleFull) for PR 2681 at commit [`1ffd763`](https://gith

[GitHub] spark pull request: [SPARK-3133] embed small object in broadcast t...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2681#issuecomment-64305591 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4553][SQL] Query for parquet table with...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3414#issuecomment-64305067 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4525] Mesos should decline unused offer...

2014-11-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3436

[GitHub] spark pull request: [SPARK-4553][SQL] Query for parquet table with...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3414#issuecomment-64305062 [Test build #23810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23810/consoleFull) for PR 3414 at commit [`9c85c22`](https://gith

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64304881 PS, we may want to go through the MLlib codebase and find other things like this. This issue impacts performance quite a lot.

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64304769 @mengxr Without the local reference copies of the `factor` and `shift` arrays, the runtime is almost three times slower. DenseVector withMean and withStd: 18.1
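The pattern discussed in this thread can be sketched as follows. This is a minimal illustration, not the code from the pull request: `ScalerSketch` and both method names are hypothetical, and it assumes the slowdown comes from reading the `factor` and `shift` members through their accessors on every loop iteration, which the visible part of the thread does not confirm. Copying the array references into local `val`s before the loop avoids those repeated member reads; marking the fields `private[this]`, as suggested in the reply, is another way to get direct field access in Scala 2.

```scala
// Hypothetical stand-in for a scaler model; NOT the actual StandardScalerModel.
class ScalerSketch(val factor: Array[Double], val shift: Array[Double]) {

  // Reads the members `factor` and `shift` on every iteration.
  def transformViaFields(values: Array[Double]): Array[Double] = {
    val out = new Array[Double](values.length)
    var i = 0
    while (i < values.length) {
      out(i) = (values(i) - shift(i)) * factor(i)
      i += 1
    }
    out
  }

  // Copies the array references into locals once, then loops over the locals.
  def transformViaLocals(values: Array[Double]): Array[Double] = {
    val localFactor = factor
    val localShift = shift
    val out = new Array[Double](values.length)
    var i = 0
    while (i < values.length) {
      out(i) = (values(i) - localShift(i)) * localFactor(i)
      i += 1
    }
    out
  }
}
```

Both methods return the same values; any difference is purely in how the arrays are reached inside the hot loop, so the effect should be measured with a benchmark rather than assumed.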

[GitHub] spark pull request: [SPARK-4525] Mesos should decline unused offer...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3436#issuecomment-64304097 [Test build #23806 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23806/consoleFull) for PR 3436 at commit [`58c35b5`](https://gith

[GitHub] spark pull request: [SPARK-4525] Mesos should decline unused offer...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3436#issuecomment-64304100 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23

[GitHub] spark pull request: [SPARK-4258][SQL][DOC] Documents spark.sql.par...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3440#issuecomment-64304043 [Test build #23812 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23812/consoleFull) for PR 3440 at commit [`2104311`](https://githu

[GitHub] spark pull request: [SPARK-4258][SQL][DOC] Documents spark.sql.par...

2014-11-24 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/3440 [SPARK-4258][SQL][DOC] Documents spark.sql.parquet.filterPushdown Documents `spark.sql.parquet.filterPushdown`, explains why it's turned off by default and when it's safe to turn on. You can

[GitHub] spark pull request: [SPARK-4266] [Web-UI] Reduce stage page load t...

2014-11-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3328

[GitHub] spark pull request: [SPARK-4505][Core] Add a ClassTag parameter to...

2014-11-24 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3378#issuecomment-64303243 @rxin Is it OK to merge?

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64302802 [Test build #23805 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23805/consoleFull) for PR 3435 at commit [`cdb5cef`](https://gith

[GitHub] spark pull request: [SPARK-2313] PySpark pass port rather than std...

2014-11-24 Thread lvsoft
Github user lvsoft commented on the pull request: https://github.com/apache/spark/pull/3424#issuecomment-64302794 I think this is a better solution. However, passing the port back via a socket will affect py4j too. Currently, stdin is the only supported method in py4j to pass back t

[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64302807 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23
