[GitHub] spark pull request: [SPARK-1477]: Add the lifecycle interface

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/991#issuecomment-49062165 QA tests have started for PR 991. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16686/consoleFull --- If y

[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1112#issuecomment-49062792 QA tests have started for PR 1112. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16687/consoleFull

[GitHub] spark pull request: SPARK-2480: Resolve sbt warnings "NOTE: SPARK_...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1404#issuecomment-49063020 QA results for PR 1404: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test output: https://amplab.c

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49063007 But *why*? If those need to be cleaned up and are not currently, then you need to explicitly clean them up, not go through some gc-based hack.

[GitHub] spark pull request: [SPARK-1477]: Add the lifecycle interface

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/991#issuecomment-49063796 QA results for PR 991: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): trait Lifecycle extends Servi

[GitHub] spark pull request: SPARK-1890 and SPARK-1891- add admin and modif...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1196#issuecomment-49064466 QA results for PR 1196: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test output: https://amplab.c

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49065703 BTW, there is no `finalize()` that I can find in the Spark tree, so the only thing `System.gc()` is achieving here is freeing memory.

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49065130 I'm not sure I understood your last comment, but anyway: these objects had references to them at some point. What you're suggesting is that, at some point, those references

[GitHub] spark pull request: SPARK-1097: Do not introduce deadlock while fi...

2014-07-15 Thread koeninger
Github user koeninger commented on the pull request: https://github.com/apache/spark/pull/1409#issuecomment-49066268 Testing that patch, it seems to have fixed the deadlock we were seeing in production.

[GitHub] spark pull request: [SPARK-2314] Override collect and take in Java...

2014-07-15 Thread staple
GitHub user staple opened a pull request: https://github.com/apache/spark/pull/1421 [SPARK-2314] Override collect and take in JavaSchemaRDD, forwarding to SchemaRDD implementations. You can merge this pull request into a Git repository by running: $ git pull https://github.co

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49064468 Explicit cleanup means keeping track of every referenced object, which is very unfriendly for Java programmers.

[GitHub] spark pull request: [SPARK-2314] Override collect and take in Java...

2014-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1421#issuecomment-49066976 Can one of the admins verify this patch?

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49067472 Yes, `System.gc()` is only a hint and may not actually free resources. But an RDD has no close method, so it can only be cleaned up by `ContextCleaner`.

[GitHub] spark pull request: SPARK-2480: Resolve sbt warnings "NOTE: SPARK_...

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1404#issuecomment-49067503 LGTM - thanks for cleaning this up.

[GitHub] spark pull request: SPARK-2480: Resolve sbt warnings "NOTE: SPARK_...

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1404

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49068556 So that's the problem. If ContextCleaner is relying on the gc to clean up after these references, and leaving those references "uncleaned" causes problems, then ContextCle

[GitHub] spark pull request: [SPARK-2471] remove runtime scope for jets3t

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1402#issuecomment-49069460 LGTM for now - I opened a separate JIRA to improve the sbt plug-in to fix this.

[GitHub] spark pull request: SPARK-2465. Use long as user / item ID for ALS

2014-07-15 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1393#issuecomment-49069526 No, MLlib is not experimental, only the parts annotated with @Experimental are. The reason is that we felt we could continue supporting these low-level APIs indefinitely an

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49069644 This is related to a bug: https://issues.apache.org/jira/browse/SPARK-2491

[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-07-15 Thread tsudukim
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1094#issuecomment-49069859 OK, thank you for your reply.

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49070018 So your whole issue is with the `FileNotFoundException` being logged? Or, aside from that, is there a user-visible side-effect, such as the wrong job status being reported

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49072342 Just as a thought exercise: what if you change `ContextCleaner.stop()` to interrupt the cleaning thread, wait for it to finish, and then manually clean all buffered refere
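The shutdown order suggested in this comment (stop the cleaning thread, wait for it to finish, then clean whatever is still buffered) can be sketched roughly as follows. This is only an illustrative analogue in Python, not Spark's `ContextCleaner`: the `Cleaner` class, its `register` method, and the `cleanup` callback are hypothetical names, and a stop flag stands in for a JVM thread interrupt.

```python
import queue
import threading


class Cleaner:
    """Hypothetical stand-in for a background cleaner; not Spark's ContextCleaner."""

    def __init__(self, cleanup):
        self._cleanup = cleanup              # callback applied to each buffered reference
        self._pending = queue.Queue()        # references waiting to be cleaned
        self._stopped = threading.Event()    # plays the role of a thread interrupt
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def register(self, ref):
        """Buffer a reference for later cleanup."""
        self._pending.put(ref)

    def _run(self):
        # Background loop: clean references until asked to stop.
        while not self._stopped.is_set():
            try:
                ref = self._pending.get(timeout=0.05)
            except queue.Empty:
                continue
            self._cleanup(ref)

    def stop(self):
        self._stopped.set()      # 1. signal the cleaning thread
        self._thread.join()      # 2. wait for it to finish
        while True:              # 3. clean everything still buffered, without relying on GC
            try:
                ref = self._pending.get_nowait()
            except queue.Empty:
                break
            self._cleanup(ref)
```

With this ordering, every reference registered before `stop()` is cleaned exactly once, either by the background thread or by the final drain.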

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49072476 Yes, this solution is not perfect. I have been thinking about this problem. BTW, the `runGC` method runs GC and makes sure it has actually run. Reference: https://github

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1387#discussion_r14953652 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskEventListener.scala --- @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-2465. Use long as user / item ID for ALS

2014-07-15 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1393#issuecomment-49072664 Yeah API stability is very important. I keep banging on about the flip-side -- freezing an API that may still need to change. You get a different important problem. I'm su

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49072778 @witgo the problem is that there's no reliable way to make sure the gc has run. Have you tried with all available gcs in the Oracle vm? Have you tried with different vms?

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/1387#discussion_r14953877 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskEventListener.scala --- @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] spark pull request: [SPARK-1477]: Add the lifecycle interface

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/991#issuecomment-49074313 QA results for PR 991: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): trait Lifecycle extends Servi

[GitHub] spark pull request: [WIP] SPARK-2360: CSV import to SchemaRDDs

2014-07-15 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1351#issuecomment-49074527 Can you add [SQL] to the title please?

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49074742 I'm sorry, my English is poor. The problem now is that we do not have a reliable way to ensure the RDD is cleaned up. Should we close this for now?

[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1112#issuecomment-49075062 QA results for PR 1112: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): case class AddWebUIFilter(f

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49075444 The main problem with the `runGC` method is that it may run for a long time and still not work.

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49076040 In my tests, the `runGC` method works normally on jdk7_45.

[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...

2014-07-15 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1112#issuecomment-49076295 Looks good. Thanks @witgo

[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1112

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49076858 @witgo you do know that the Oracle VM has 3 different GCs, configurable using command line arguments, right? Have you tried all of them (and combinations thereof, since yo

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49077092 @manishamde It looks like MIMA is complaining about binary compatibility. Could you please check out the errors in the Jenkins log, which give commands, e.g.:

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-15 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49077845 OK, tomorrow or the day after I will try it the way you said. I have only tested the default GC configuration, and I will test the others.

[GitHub] spark pull request: [SPARK-2494] [PySpark] hijack hash to make has...

2014-07-15 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1371#issuecomment-49079782 Created issue #SPARK-2494 to track this.

[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-07-15 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1094#issuecomment-49083255 @rahulsinghaliitd can you please upmerge

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49084154 QA tests have started for PR 886. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16688/consoleFull

[GitHub] spark pull request: SPARK-2482: Resolve sbt warnings during build

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1330#issuecomment-49085302 Jenkins, retest this please.

[GitHub] spark pull request: SPARK-2482: Resolve sbt warnings during build

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1330#issuecomment-49085313 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1421#issuecomment-49085915 Jenkins, test this please.

[GitHub] spark pull request: Added LZ4 to compression codec in configuratio...

2014-07-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1417#issuecomment-49085969 Merging this in master.

[GitHub] spark pull request: SPARK-2482: Resolve sbt warnings during build

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1330#issuecomment-49085897 QA tests have started for PR 1330. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16690/consoleFull

[GitHub] spark pull request: SPARK-1215: Clustering: Index out of bounds er...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1407#issuecomment-49085926 QA tests have started for PR 1407. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16689/consoleFull

[GitHub] spark pull request: Added LZ4 to compression codec in configuratio...

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1417

[GitHub] spark pull request: [WIP][SQL] SPARK-2360: CSV import to SchemaRDD...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1351#issuecomment-49086571 QA tests have started for PR 1351. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16692/consoleFull

[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1421#issuecomment-49086561 QA tests have started for PR 1421. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16691/consoleFull

[GitHub] spark pull request: [SPARK-2469] Use Snappy (instead of LZF) for d...

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1415#issuecomment-49090832 @rxin IIRC at one point we changed this before and it caused a performance regression for our perf suite so we reverted it. At the time I think we were running on smalle

[GitHub] spark pull request: [SPARK-2412] CoalescedRDD throws exception wit...

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1337#issuecomment-49091029 okay LGTM

[GitHub] spark pull request: replace println to log4j

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1372#issuecomment-49091169 @fireflyc will you have a chance to address the comments so we can merge this?

[GitHub] spark pull request: [SPARK-2471] remove runtime scope for jets3t

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1402

[GitHub] spark pull request: [SPARK-2483][SQL] Fix parsing of repeated, nes...

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1411

[GitHub] spark pull request: [SPARK-2483][SQL] Fix parsing of repeated, nes...

2014-07-15 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1411#issuecomment-49092118 Thanks for reviewing @yhuai. I've merged this into master and 1.0

[GitHub] spark pull request: [SQL] Whitelist more Hive tests.

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1396

[GitHub] spark pull request: [SQL] Attribute equality comparisons should be...

2014-07-15 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1414#issuecomment-49092504 @liancheng, please review.

[GitHub] spark pull request: [SPARK-2474][SQL] For a registered table in Ov...

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1406

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-15 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1359#issuecomment-49093085 I'm going to go ahead and merge this. Someone can make a follow-up with nullability narrowing.

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-15 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1359#issuecomment-49093280 Thanks! Merged into master and 1.0.

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1359

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-15 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14964217 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...

2014-07-15 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-49094992 ping

[GitHub] spark pull request: SPARK-1715: Ensure actor is self-contained in ...

2014-07-15 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/637#issuecomment-49094948 ping...

[GitHub] spark pull request: SPARK-2404: don't overwrite SPARK_HOME when it...

2014-07-15 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/1331#issuecomment-49094964 ping

[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-49095245 QA tests have started for PR 731. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16693/consoleFull

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49095921 QA results for PR 886: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test output: https://amplab.cs.

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49097254 @manishamde The "@DeveloperApi" does not seem to have fixed MIMA's complaints. I think that modifying the spark/project/MimaExcludes.scala file should do the trick.

[GitHub] spark pull request: SPARK-1215: Clustering: Index out of bounds er...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1407#issuecomment-49098158 QA results for PR 1407: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test output: https://amplab.c

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49098644 @jkbradley I will make the changes that you mentioned. The documentation in MimaExcludes.scala says that the DeveloperApi classes will be skipped. Maybe

[GitHub] spark pull request: follow pep8 None should be compared using is o...

2014-07-15 Thread giwa
GitHub user giwa opened a pull request: https://github.com/apache/spark/pull/1422 follow pep8 None should be compared using is or is not http://legacy.python.org/dev/peps/pep-0008/ ## Programming Recommendations - Comparisons to singletons like None should always be
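As a brief illustration of the PEP 8 recommendation this pull request cites, the sketch below shows why `is None` is preferred over `== None`. It is not code from the patch; the `AlwaysEqual` class and the `describe` function are hypothetical and exist only to make the difference visible.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# Equality can be overridden, so it may report "equal to None" for a real object.
equality_says_none = (obj == None)  # noqa: E711 (deliberately non-idiomatic)

# Identity cannot be overridden: only the None singleton itself passes this check.
identity_says_none = (obj is None)


def describe(value=None):
    """Return a label, treating only None (not 0 or '') as 'missing'."""
    if value is None:
        return "no value supplied"
    return f"value: {value!r}"
```

The same reasoning applies to `is not None`, which also avoids treating falsy values such as `0` or an empty string as missing.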

[GitHub] spark pull request: follow pep8 None should be compared using is o...

2014-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1422#issuecomment-49098978 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1421#issuecomment-49099033 QA results for PR 1421: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test output: https://amplab.c

[GitHub] spark pull request: [WIP][SQL] SPARK-2360: CSV import to SchemaRDD...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1351#issuecomment-49098508 QA results for PR 1351: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test output: https://amplab.c

[GitHub] spark pull request: SPARK-2465. Use long as user / item ID for ALS

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1393#issuecomment-49099667 QA tests have started for PR 1393. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16694/consoleFull

[GitHub] spark pull request: [SPARK-2469] Use Snappy (instead of LZF) for d...

2014-07-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1415#issuecomment-49099980 Yea - stability seems much more important than a small performance gain

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49100250 @manishamde Thanks! I'll look into the MIMA issue. The code looks good, and it tests fine vs. scikit-learn on some datasets I've tried. My only comment now is that

[GitHub] spark pull request: [SPARK-2469] Use Snappy (instead of LZF) for d...

2014-07-15 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1415#issuecomment-49100370 Only the codec names are stored in the event logs; no other information is currently recorded. But this change isn't really breaking anything in that area. (And, by defaul

[GitHub] spark pull request: [SPARK-2469] Use Snappy (instead of LZF) for d...

2014-07-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1415#issuecomment-49100573 FYI filed JIRA: https://issues.apache.org/jira/browse/SPARK-2496 Compression streams should write its codec info to the stream
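The idea named in the JIRA title, a compressed stream that records which codec produced it, might look roughly like the sketch below. This is an assumption-laden illustration, not Spark's format and not what SPARK-2496 specifies: the one-byte tags, the `CODECS` table, and the `write_block`/`read_block` helpers are all hypothetical, and Python's standard `zlib` and `bz2` modules stand in for Spark's codecs.

```python
import bz2
import io
import zlib

# Hypothetical one-byte tags mapping to (compress, decompress) pairs.
CODECS = {
    b"Z": (zlib.compress, zlib.decompress),
    b"B": (bz2.compress, bz2.decompress),
}


def write_block(stream, tag, payload):
    """Write the codec tag first, then the payload compressed with that codec."""
    compress, _ = CODECS[tag]
    stream.write(tag)
    stream.write(compress(payload))


def read_block(stream):
    """Read the tag, pick the matching codec, and decompress the rest of the stream."""
    tag = stream.read(1)
    if tag not in CODECS:
        raise ValueError(f"unknown codec tag: {tag!r}")
    _, decompress = CODECS[tag]
    return decompress(stream.read())
```

Because the reader selects the codec from the stream itself, changing the writer's default codec would not prevent older data from being read back.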

[GitHub] spark pull request: follow pep8 None should be compared using is o...

2014-07-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1422#issuecomment-49100610 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2119][SQL] Improved Parquet performance...

2014-07-15 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1370#discussion_r14967550 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala --- @@ -365,20 +366,23 @@ private[parquet] object ParquetTypesConverter ext

[GitHub] spark pull request: follow pep8 None should be compared using is o...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1422#issuecomment-49101045 QA tests have started for PR 1422. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16695/consoleFull

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49102109 @jkbradley Thanks a lot for taking a look at the MIMA issue. Also many thanks for testing out versus scikit-learn! I will update the documentation to speak about multic

[GitHub] spark pull request: [SPARK-1667] Jobs never finish successfully on...

2014-07-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1383#issuecomment-49102405 Thanks for submitting this. Is there any way we can construct a unit test for this as well?

[GitHub] spark pull request: [SPARK-1667] Jobs never finish successfully on...

2014-07-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1383#discussion_r14968203 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1223,6 +1223,8 @@ private[spark] object Utils extends Logging { /** Returns true if

[GitHub] spark pull request: [SPARK-1667] Jobs never finish successfully on...

2014-07-15 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/1383#issuecomment-49104351 OK. I will add a comment for my change. And I will also add a test case for this issue to FailureSuite.scala. Is that proper?

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49104473 @manishamde Apparently MIMA is a bit fragile currently, but is being worked on: [https://issues.apache.org/jira/browse/SPARK-2069] I added a subtask for this specific

[GitHub] spark pull request: [SPARK-1667] Jobs never finish successfully on...

2014-07-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1383#issuecomment-49104510 That's a good place to add it. Thanks!

[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-49104687 QA results for PR 731: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test output: https://amplab.cs.

[GitHub] spark pull request: [SQL] Attribute equality comparisons should be...

2014-07-15 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/1414#issuecomment-49104773 LGTM

[GitHub] spark pull request: SPARK-2465. Use long as user / item ID for ALS

2014-07-15 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1393#issuecomment-49104793 I didn't suggest having a new implementation for long IDs, only a new API. They can run on the same implementation (e.g. the current Int-based one transforms the Ints to L

[GitHub] spark pull request: SPARK-2465. Use long as user / item ID for ALS

2014-07-15 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1393#issuecomment-49104938 Wise words, Matei! Anyway, here's another cut that preserves the original API. Tests are still running. Up to you guys' judgment on whether it's worthwhile.

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread concretevitamin
GitHub user concretevitamin opened a pull request: https://github.com/apache/spark/pull/1423 [SQL] Synchronize on a lock when using scala reflection inside data type objects. You can merge this pull request into a Git repository by running: $ git pull https://github.com/concr

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1423#issuecomment-49105217 Jenkins, add to whitelist please.

[GitHub] spark pull request: SPARK-2482: Resolve sbt warnings during build

2014-07-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1330#issuecomment-49105352 Could you paste exactly which warnings you are eliminating? I don't see any warnings in our master jenkins build that seem relevant to these changes.

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1423#issuecomment-49105361 Jenkins, test this please.

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1423#issuecomment-49105451 Can you open a JIRA too?

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1423#issuecomment-49105573 QA tests have started for PR 1423. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16696/consoleFull

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-15 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49105740 @jkbradley Thanks!
