[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-19 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r17790219 --- Diff: core/src/main/scala/org/apache/spark/rdd/IndexedRDDLike.scala --- @@ -0,0 +1,338 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-19 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r17791303 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ImmutableLongOpenHashSet.scala --- @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-19 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56199798 This looks great! My comments are minor. I know it's early to be discussing example docs, but I just wanted to mention that I can see caching being an area

[GitHub] spark pull request: SPARK-4276 fix for two working thread

2014-11-09 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3141#issuecomment-62320554 I agree w/ TD, I don't think this change is necessary. I think we should close this and, @svar29, maybe you can discuss the problem you are running into on the spark

[GitHub] spark pull request: [SPARK-4260] Httpbroadcast should set connecti...

2014-11-09 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3122#issuecomment-62320949 This looks good, but could you also explain what necessitates this change? Did you observe some error? If nothing else, just putting the error you observed in the JIRA

[GitHub] spark pull request: [SPARK-3936] Add aggregateMessages, which supe...

2014-11-09 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3100#discussion_r20062181 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/TripletFields.scala --- @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-1344 [DOCS] Scala API docs for top metho...

2014-11-09 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3168#issuecomment-62323279 lgtm --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request: SPARK-971 [DOCS] Link to Confluence wiki from ...

2014-11-09 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3169#issuecomment-62323284 lgtm

[GitHub] spark pull request: [SPARK-1021] Defer the data-driven computation...

2014-11-09 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3079#discussion_r20062337 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -113,8 +117,12 @@ class RangePartitioner[K : Ordering : ClassTag, V]( private

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-11-09 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r20063810 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/GMMExpectationMaximization.scala --- @@ -0,0 +1,246 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-09 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3029#issuecomment-62330615 this is awesome!

[GitHub] spark pull request: [SPARK-4087] use broadcast for task only when ...

2014-11-09 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/2933#issuecomment-62330965 I agree with @pwendell . It seems like the right thing to do is just fix Broadcast ... and if we can't, then wouldn't you also want to turn off Broadcast even for big

[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3029#issuecomment-62475673 I was just about to suggest the same thing. So I admit it seemed a lot cooler to have the console keep updating, but I agree with their concerns. As a slight

[GitHub] spark pull request: [SPARK-3936] Add aggregateMessages, which supe...

2014-11-10 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3100#discussion_r20130288 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/impl/EdgePartition.scala --- @@ -285,50 +337,126 @@ class EdgePartition

[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3029#issuecomment-62499404 I totally see the appeal of the one-progress bar (hence my initial excitement when I tried this out). But if it doesn't play nicely with logging multiple stages

[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3029#issuecomment-63149772 Sorry for my delay in responding ... (a) I think this DOES add a lot of value over the std INFO logging. One log line per task completion is *much* noisier than

[GitHub] spark pull request: SPARK-5199. Input metrics should show up for I...

2015-01-26 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4050#discussion_r23555820 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -218,13 +219,14 @@ class HadoopRDD[K, V]( // Find a function

[GitHub] spark pull request: SPARK-4337. [YARN] Add ability to cancel pendi...

2015-01-26 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4141#discussion_r23557845 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala --- @@ -192,15 +186,32 @@ private[yarn] class YarnAllocator

[GitHub] spark pull request: SPARK-4337. [YARN] Add ability to cancel pendi...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4141#issuecomment-71523335 just a minor comment, otherwise lgtm

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-71512674 I figured out the magic combination to make sbt, scalatest, junit, and the sbt-pom-reader all play nicely together. I had to introduce a new config (or scope

[GitHub] spark pull request: [SPARK-3454] Expose JSON representation of dat...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/2333#issuecomment-71532457 Hi @sarutak thanks for your work on this. Josh's other PR https://github.com/apache/spark/pull/2696 has been merged for a while now. I'm gonna take another crack

[GitHub] spark pull request: [SQL] SPARK-5309: Add support for dictionaries...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4187#issuecomment-71534196 thanks for all the extra detail @MickDavies. lgtm

[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI

2015-02-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3486#discussion_r23938653 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -382,7 +382,8 @@ private[spark] object JsonProtocol { def

[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI

2015-02-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3486#discussion_r23938252 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsPage.scala --- @@ -26,7 +26,7 @@ import org.apache.spark.ui.{ToolTips, UIUtils, WebUIPage

[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

2015-02-08 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/4435 [SPARK-3454] [WIP] separate json endpoints for data in the UI Exposes data available in the UI as json over http. There are some TODOs and the code needs cleanup, but there is enough here to get

[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

2015-02-08 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4435#discussion_r24285254 --- Diff: core/pom.xml --- @@ -215,6 +215,11 @@ version3.2.10/version /dependency dependency

[GitHub] spark pull request: assumePartitioned

2015-02-08 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/4449 assumePartitioned https://issues.apache.org/jira/browse/SPARK-1061 If you partition an RDD, save to hdfs, then reload it in a separate SparkContext, you've lost the info that the RDD

[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

2015-02-08 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4435#discussion_r24278046 --- Diff: .rat-excludes --- @@ -65,3 +65,6 @@ logs .*scalastyle-output.xml .*dependency-reduced-pom.xml known_translations +json_expectation

[GitHub] spark pull request: [SPARK-5574] use given name prefix in dir

2015-02-03 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4344#issuecomment-72752833 whoops, sorry I forgot about the title, just updated.

[GitHub] spark pull request: use given name prefix in dir

2015-02-03 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/4344 use given name prefix in dir https://issues.apache.org/jira/browse/SPARK-5574 very minor, doesn't affect external behavior at all. Note that after this change, some of these dirs

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23942254 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -203,8 +206,11 @@ private[ui] class StagePage(parent: StagesTab) extends

[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI

2015-02-02 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-72495719 other than adding a test case, I think the code looks good.

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23948987 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -472,12 +512,12 @@ private[ui] class StagePage(parent: StagesTab) extends

[GitHub] spark pull request: [SPARK-1061] assumePartitioned

2015-02-08 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4449#issuecomment-73449430 @pwendell it's a good question, I was wondering the same thing a little bit as I was writing those unit tests and was going to comment on the jira about this a little

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-69990018 oh good point Marcelo -- I forgot to add that I've only done this for `core` in this PR. I wanted to ask others whether it's worthwhile to do in other projects

[GitHub] spark pull request: SPARK-4747 make it easy to skip IntegrationTes...

2015-01-14 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/4048 SPARK-4747 make it easy to skip IntegrationTests * create an `IntegrationTest` tag * label all tests in core as an `IntegrationTest` if they use a `local-cluster` * make a `unit-test` task

[GitHub] spark pull request: [Minor] Fix tiny typo in BlockManager

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4046#issuecomment-70013728 lgtm

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70015231 lgtm. I was going to suggest that pending stages should be sorted with oldest submission time first, not reversed ... but I guess we want the completed stages

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-70008794 @pwendell To be honest I'd never had the patience to run all the tests before on my laptop. But I just tried them both again: 237 seconds vs. 852 seconds (just for core

[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70014846 can you add a unit test for what this fixes? I don't see how this avoids the exceptions, just seems to push them down into `MutableValue.update`. A test case would help

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-70017200 Jenkins, retest this please

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-15 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-70181484 so, this doesn't actually work quite the way I wanted it to. It turns out it's skipping all the JUnit tests as well. The JUnit tests are run if you run with `test-only

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-16 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-70292578 @pwendell I like the idea of just getting tests to run faster in general, but I think it's gonna be hard to make that happen. (Not the most exciting tasks for beginners

[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70036427 btw, while you're mucking around in there ... it might be nice to change the `SpecificMutableRow` constructor to take varargs. Change this constructor

[GitHub] spark pull request: [SPARK-733] Add documentation on use of accumu...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4022#issuecomment-70036623 lgtm

[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70036236 I think finding and fixing a bug in current behavior is a great reason to add a unit test. Some part of the implementation is confusing enough to have allowed a bug

[GitHub] spark pull request: [SPARK-5236] Fix ClassCastException in Specifi...

2015-01-14 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4039#issuecomment-70037269 Back to the question of something deeper being wrong ... I think we'll need to wait for input from somebody more familiar w/ this code. @marmbrus ? But one

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-12 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24642522 --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala --- @@ -81,19 +84,38 @@ class StorageStatus(val blockManagerId: BlockManagerId, val

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-12 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24642582 --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala --- @@ -118,8 +140,20 @@ class StorageStatus(val blockManagerId: BlockManagerId

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-12 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24642955 --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala --- @@ -118,8 +140,20 @@ class StorageStatus(val blockManagerId: BlockManagerId

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24829369 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListenerBus.scala --- @@ -24,7 +24,13 @@ import org.apache.spark.util.ListenerBus

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24831053 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala --- @@ -522,7 +523,9 @@ private[spark] class BlockManagerInfo

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24830276 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListenerBus.scala --- @@ -24,7 +24,13 @@ import org.apache.spark.util.ListenerBus

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24833405 --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala --- @@ -49,9 +50,40 @@ class RDDInfo( } } + private[spark

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24833690 --- Diff: core/src/main/scala/org/apache/spark/ui/storage/InMemoryObjectPage.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24831483 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala --- @@ -522,7 +523,9 @@ private[spark] class BlockManagerInfo

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24833566 --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala --- @@ -271,4 +368,19 @@ private[spark] object StorageUtils

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24831712 --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala --- @@ -21,13 +21,14 @@ import org.apache.spark.annotation.DeveloperApi import

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-74698831 Hi @CodingCat thanks for making all the updates. Sorry I hadn't realized the subtlety w/ `Int` vs `Long` on the `RDDBlockId` and `BroadcastBlockId`. Still, I

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/2851#discussion_r24835658 --- Diff: core/src/main/scala/org/apache/spark/ui/storage/BroadcastPage.scala --- @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-17 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-74713997 can you also post a screenshot of the detailed page for a broadcast var? Ideally involving a broadcast var that gets turned into multiple blocks by `TorrentBroadcast`, I

[GitHub] spark pull request: [SPARK-5785] [PySpark] narrow dependency for c...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4629#discussion_r24852684 --- Diff: python/pyspark/tests.py --- @@ -740,6 +739,27 @@ def test_multiple_python_java_RDD_conversions(self): converted_rdd = RDD

[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

2015-02-16 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/2851#issuecomment-74510287 well I was leaning towards using a `ThreadLocal` to get the broadcast blocks into the task end event ... but I forgot that broadcast blocks are also created by the driver

[GitHub] spark pull request: [SPARK-5785] [PySpark] narrow dependency for c...

2015-02-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4629#discussion_r24861247 --- Diff: python/pyspark/tests.py --- @@ -740,6 +739,27 @@ def test_multiple_python_java_RDD_conversions(self): converted_rdd = RDD

[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-71577982 @koeninger I doubt that we want to go this route in this case, but just in case you're interested, I think a much better way to handle multiple errors gracefully

[GitHub] spark pull request: [SPARK-3298][SQL] Add flag control overwrite r...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4175#issuecomment-71577489 I think these failures are real, looks like you need to do a similar updating of the args to `registerTempTable` in the pyspark tests, eg. [here](https://github.com

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2015-01-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4214#discussion_r23652781 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -113,12 +129,12 @@ private[history] class FsHistoryProvider(conf

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2015-01-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4214#discussion_r23652649 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -163,9 +179,6 @@ private[history] class FsHistoryProvider(conf

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-01-28 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4220#issuecomment-71855275 retest this please. Hopefully those test failures were random, let's see. btw, I think that if you want the exact same patch applied to multiple branches

[GitHub] spark pull request: [SPARK-4879] Use the Spark driver to authorize...

2015-01-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4155#discussion_r23703042 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -0,0 +1,177 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-01-28 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4220#issuecomment-71923273 I kinda see what is going on with the tests now. A [test case in SparkSubmitSuite](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/deploy

[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...

2015-01-28 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4216#issuecomment-71924525 I made a comment at one spot in the code, but throughout I find the name stable confusing. It implies the other one is unstable, and without the context from the JIRA

[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...

2015-01-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4216#discussion_r23724419 --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolMessage.scala --- @@ -0,0 +1,201 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...

2015-01-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4216#discussion_r23735817 --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolMessage.scala --- @@ -0,0 +1,201 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-5324][SQL] Results of describe can't be...

2015-01-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4249#discussion_r23696141 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala --- @@ -63,6 +63,37 @@ class HiveQuerySuite extends

[GitHub] spark pull request: SPARK-5300 Add LocalFileSystem which will retu...

2015-01-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4204#discussion_r23694090 --- Diff: core/src/main/scala/org/apache/spark/storage/LocalFileSystem.scala --- @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-01-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4220#discussion_r23776203 --- Diff: core/src/test/scala/org/apache/spark/util/ResetSystemProperties.scala --- @@ -42,7 +43,7 @@ private[spark] trait ResetSystemProperties extends

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-01-29 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4220#issuecomment-72079687 lgtm

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-01-29 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4220#issuecomment-72082093 retest this please

[GitHub] spark pull request: [SPARK-3298][SQL] Add `allowExisting` flag to ...

2015-01-29 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4271#issuecomment-72083678 ok to test

[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...

2015-01-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4216#discussion_r23875609 --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolField.scala --- @@ -0,0 +1,30 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...

2015-01-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4216#discussion_r23880057 --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/SubmitDriverRequest.scala --- @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SQL] SPARK-5309: Add support for dictionaries...

2015-01-24 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4187#discussion_r23501217 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala --- @@ -426,6 +423,33 @@ private[parquet] class

[GitHub] spark pull request: [SPARK-3298][SQL] Add flag control overwrite r...

2015-01-24 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4175#issuecomment-71348013 Jenkins this is OK to test

[GitHub] spark pull request: [SQL] SPARK-5309: Add support for dictionaries...

2015-01-24 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4187#discussion_r23501242 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala --- @@ -426,6 +423,33 @@ private[parquet] class

[GitHub] spark pull request: [SPARK-3298][SQL] Add flag control overwrite r...

2015-01-24 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4175#issuecomment-71348345 let's try this again ... Jenkins this is OK to test

[GitHub] spark pull request: [SPARK-3298][SQL] Add flag control overwrite r...

2015-01-24 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4175#issuecomment-71348093 This is mentioned in the JIRA, but it's worth noting again here that this changes the behavior slightly, since it wouldn't throw an exception before.

[GitHub] spark pull request: [SQL] SPARK-5309: Add support for dictionaries...

2015-01-25 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4187#discussion_r23505185 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala --- @@ -426,6 +423,33 @@ private[parquet] class

[GitHub] spark pull request: [SQL] SPARK-5309: Add support for dictionaries...

2015-01-24 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4187#issuecomment-71323376 Jenkins this is ok to test

[GitHub] spark pull request: [SQL] SPARK-5309: Add support for dictionaries...

2015-01-24 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4187#issuecomment-71323709 Thanks @MickDavies! Thanks for investigating and also putting the performance comparison into the JIRA. I think the code looks fine, but I'm not super-familiar w

[GitHub] spark pull request: [SQL] SPARK-5309: Add support for dictionaries...

2015-01-25 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4187#discussion_r23504700 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala --- @@ -426,6 +423,33 @@ private[parquet] class

[GitHub] spark pull request: [SQL] SPARK-5309: Use Dictionary for Binary-S...

2015-01-23 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4139#issuecomment-71303531 Jenkins retest this please

[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...

2015-01-26 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3798#discussion_r23569836 --- Diff: external/kafka/src/main/scala/org/apache/spark/streaming/kafka/DeterministicKafkaInputDStream.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-3298][SQL] Add flag control overwrite r...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4175#issuecomment-71551408 ok to test

[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...

2015-01-26 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/3798#discussion_r23570025 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -788,6 +788,20 @@ abstract class RDD[T: ClassTag

[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-71555904 I'm not very knowledgeable about streaming, but from my limited perspective it looks good

[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...

2015-01-26 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-71546698 ok to test

[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...

2015-01-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4216#discussion_r23709166 --- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala --- @@ -148,15 +148,22 @@ private[deploy] object DeployMessages

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-01-28 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/4220#issuecomment-71884820 retest this please

[GitHub] spark pull request: [SPARK-5291][CORE] Add timestamp and reason wh...

2015-01-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4082#discussion_r23708804 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala --- @@ -369,7 +369,7 @@ private[spark] class
