[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-01 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21482 Jenkins, ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #21376: [SPARK-24250][SQL] support accessing SQLConf inside task...

2018-06-01 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21376 I'm still seeing a lot of build failures which seem to be related to this (accessing a conf in a task in turn accesses the LiveListenerBus). Is this something new? Or related to this change? eg

[GitHub] spark pull request #21474: [SPARK-24297][CORE] Fetch-to-disk by default for ...

2018-06-01 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21474#discussion_r192487033 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -429,7 +429,11 @@ package object config { "external sh

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-05-31 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21068 hey sorry I have been meaning to respond to this but keep getting sidetracked. As Tom and I are going to meet in person next week anyway, I figure at this point it makes sense to just wait till we

[GitHub] spark pull request #21474: [SPARK-24297][CORE] Fetch-to-disk by default for ...

2018-05-31 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/21474 [SPARK-24297][CORE] Fetch-to-disk by default for > 2gb Fetch-to-mem is guaranteed to fail if the message is bigger than 2 GB, so we might as well use fetch-to-disk in that case. The mess

[GitHub] spark issue #21456: [SPARK-24356] [CORE] Duplicate strings in File.path mana...

2018-05-31 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21456 Re: normalization -- if I understand correctly, it's not that you know that the normalization definitely *does* change the strings for the heap dump you have. It's just to make sure that your change

[GitHub] spark pull request #21451: [SPARK-24296][CORE][WIP] Replicate large blocks a...

2018-05-31 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21451#discussion_r192136490 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/RpcHandler.java --- @@ -38,15 +38,24 @@ * * This method

[GitHub] spark issue #21456: [SPARK-24356] [CORE] Duplicate strings in File.path mana...

2018-05-30 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21456 From my understanding, that option is only available with G1GC, which is not really a good fit for spark (forget the exact details but something about humongous allocations which are common with all

[GitHub] spark issue #21346: [SPARK-6237][NETWORK] Network-layer changes to allow str...

2018-05-30 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21346 > is this effectively dead code at this point? yes, that's right. this PR by itself is not useful. It's a step towards https://github.com/apache/spark/pull/21451 This is a g

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21346#discussion_r191981552 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/RpcHandler.java --- @@ -38,15 +38,24 @@ * * This method

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21346#discussion_r191979425 --- Diff: common/network-common/src/main/java/org/apache/spark/network/protocol/UploadStream.java --- @@ -0,0 +1,107 @@ +/* + * Licensed

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21346#discussion_r191979019 --- Diff: common/network-common/src/main/java/org/apache/spark/network/protocol/UploadStream.java --- @@ -0,0 +1,107 @@ +/* + * Licensed

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21346#discussion_r191978545 --- Diff: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java --- @@ -141,26 +141,14 @@ public void fetchChunk

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21346#discussion_r191978140 --- Diff: common/network-common/src/main/java/org/apache/spark/network/client/StreamInterceptor.java --- @@ -50,16 +52,22 @@ @Override

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21346#discussion_r191976952 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/StreamData.java --- @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #21346: [SPARK-6237][NETWORK] Network-layer changes to allow str...

2018-05-30 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21346 yeah I see what you're saying about better error handling, but I'd really rather not take that on here. I think some prior attempts at solving the 2gb limit have tried to take on too much, and I'd

[GitHub] spark issue #21456: [SPARK-24356] [CORE] Duplicate strings in File.path mana...

2018-05-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21456 Ok to test

[GitHub] spark issue #21451: [SPARK-24296][CORE][WIP] Replicate large blocks as a str...

2018-05-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21451 This is the change for SPARK-24296, on top of https://github.com/apache/spark/pull/21346 and https://github.com/apache/spark/pull/21440 Posting here for testing. Reviews are welcome

[GitHub] spark pull request #21451: [SPARK-24296][CORE][WIP] Replicate large blocks a...

2018-05-29 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/21451 [SPARK-24296][CORE][WIP] Replicate large blocks as a stream. When replicating large cached RDD blocks, it can be helpful to replicate them as a stream, to avoid using large amounts of memory

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r191489062 --- Diff: core/src/main/scala/org/apache/spark/scheduler/PeakExecutorMetrics.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #21440: [SPARK-24307][CORE] Support reading remote cached partit...

2018-05-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21440 thanks for the reviews @markhamstra @Ngone51, I've updated the pr

[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...

2018-05-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r191476566 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -659,6 +659,11 @@ private[spark] class BlockManager( * Get block

[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...

2018-05-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r191472949 --- Diff: core/src/test/scala/org/apache/spark/io/ChunkedByteBufferFileRegionSuite.scala --- @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...

2018-05-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r191472540 --- Diff: core/src/test/scala/org/apache/spark/io/ChunkedByteBufferFileRegionSuite.scala --- @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...

2018-05-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r191471289 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBufferFileRegion.scala --- @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...

2018-05-26 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/21440 [SPARK-24307][CORE] Support reading remote cached partitions > 2gb (1) Netty's ByteBuf cannot support data > 2gb. So to transfer data from a ChunkedByteBuffer over the network, we use a

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-26 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21346#discussion_r191059478 --- Diff: common/network-common/src/test/java/org/apache/spark/network/StreamTestHelper.java --- @@ -0,0 +1,101 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #21346: [SPARK-6237][NETWORK] Network-layer changes to allow str...

2018-05-25 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21346 Last failures are known flakies. A few updates here from my last set of comments. I've posted an overall design doc, and shared the tests I'm running on a cluster. I think the tests cover

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190405223 --- Diff: core/src/main/scala/org/apache/spark/scheduler/PeakExecutorMetrics.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190363971 --- Diff: core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala --- @@ -251,6 +261,233 @@ class EventLoggingListenerSuite extends

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190364971 --- Diff: core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala --- @@ -251,6 +261,233 @@ class EventLoggingListenerSuite extends

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190351033 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -169,6 +183,35 @@ private[spark] class EventLoggingListener

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190345745 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -800,26 +812,50 @@ private[spark] class Executor

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190359417 --- Diff: core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala --- @@ -251,6 +261,233 @@ class EventLoggingListenerSuite extends

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190346570 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -81,7 +84,7 @@ private[spark] class EventLoggingListener

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190363619 --- Diff: core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala --- @@ -251,6 +261,233 @@ class EventLoggingListenerSuite extends

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190346995 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -93,6 +96,10 @@ private[spark] class EventLoggingListener

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-05-23 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r190353891 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -209,6 +210,16 @@ class DAGScheduler( private[scheduler] val

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-05-23 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21068 I mean when `YarnAllocatorBlacklistTracker` decides to blacklist because of allocation failures, it doesn't send any message back to the driver -- so the driver doesn't have a msg in the logs, nor

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-05-23 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21068 I totally understand your motivation for wanting the limit. But I'm trying to balance that against behavior which might not really achieve the desired effect and be even more confusing in some

[GitHub] spark pull request #21356: [SPARK-24309][CORE] AsyncEventQueue should stop o...

2018-05-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21356#discussion_r189702478 --- Diff: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala --- @@ -130,7 +129,11 @@ private class AsyncEventQueue(val name: String, conf

[GitHub] spark issue #20640: [SPARK-19755][Mesos] Blacklist is always active for Meso...

2018-05-21 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20640 that's fine with me, but as I'm neither a user of mesos nor am I in touch w/ many mesos users, I'd like to wait a bit for more opinions, given the ramifications of this change. (that shouldn't block

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-05-21 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21068 ping @tgravescs . honestly I still don't love the blacklist limit, especially since it makes reporting back to the driver pretty confusing, and I don't think it buys us much. But I can live

[GitHub] spark pull request #21356: [SPARK-24309][CORE] AsyncEventQueue should stop o...

2018-05-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21356#discussion_r189604443 --- Diff: core/src/main/scala/org/apache/spark/util/ListenerBus.scala --- @@ -80,7 +89,16 @@ private[spark] trait ListenerBus[L <: AnyRef, E] exte

[GitHub] spark pull request #21356: [SPARK-24309][CORE] AsyncEventQueue should stop o...

2018-05-20 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21356#discussion_r189483397 --- Diff: core/src/main/scala/org/apache/spark/util/ListenerBus.scala --- @@ -80,6 +89,11 @@ private[spark] trait ListenerBus[L <: AnyRef, E] exte

[GitHub] spark issue #21356: [SPARK-24309][CORE] AsyncEventQueue should stop on inter...

2018-05-18 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21356 the problem was not actually an interrupted exception from the listener, it was that the Thread's state was getting set to interrupted, and then there would be a failure later in `queue.take

[GitHub] spark pull request #21356: [SPARK-24309][CORE] AsyncEventQueue should stop o...

2018-05-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21356#discussion_r189384006 --- Diff: core/src/main/scala/org/apache/spark/util/ListenerBus.scala --- @@ -80,6 +89,11 @@ private[spark] trait ListenerBus[L <: AnyRef, E] exte

[GitHub] spark issue #21356: [SPARK-24309][CORE] AsyncEventQueue should stop on inter...

2018-05-18 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21356 > As an alternative design option, PoisonPill could be handled differently, since some msgs should have higher priority and can be considered them as part of your "control plane".

[GitHub] spark issue #21356: [SPARK-24309][CORE] AsyncEventQueue should stop on inter...

2018-05-18 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21356 I pushed an update which only removes the listener which was active at the interrupt. Note that is not the same thing as the listener which *caused* the interrupt, necessarily -- we have no idea

[GitHub] spark issue #21356: [SPARK-24309][CORE] AsyncEventQueue should stop on inter...

2018-05-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21356 > does this mean a problematic listener can kill the queue and crash other listeners in the same queue? Shall we do some isolation? yeah I think marcelo was asking about this above ht

[GitHub] spark pull request #21356: [SPARK-24309][CORE] AsyncEventQueue should stop o...

2018-05-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21356#discussion_r189152423 --- Diff: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala --- @@ -97,6 +98,11 @@ private class AsyncEventQueue(val name: String, conf

[GitHub] spark pull request #21356: [SPARK-24309][CORE] AsyncEventQueue should stop o...

2018-05-17 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/21356 [SPARK-24309][CORE] AsyncEventQueue should stop on interrupt. EventListeners can interrupt the event queue thread. In particular, when the EventLoggingListener writes to hdfs, hdfs can

[GitHub] spark issue #21346: [SPARK-6237][NETWORK] Network-layer changes to allow str...

2018-05-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21346 btw I may have made the pull-based approach sound more complex than I meant to, I'm happy to take that approach if you think it's better. The fact that the replication is synchronous doesn't really

[GitHub] spark issue #21346: [SPARK-6237][NETWORK] Network-layer changes to allow str...

2018-05-16 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21346 All good questions and stuff I had wondered about too -- I should actually be sure to comment on these on the jira as well: > I recall that the problem with large shuffle blo

[GitHub] spark pull request #21346: [SPARK-6237][NETWORK] Network-layer changes to al...

2018-05-16 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/21346 [SPARK-6237][NETWORK] Network-layer changes to allow stream upload. These changes allow an RPCHandler to receive an upload as a stream of data, without having to buffer the entire message

[GitHub] spark pull request #21299: [SPARK-24250][SQL] support accessing SQLConf insi...

2018-05-16 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21299#discussion_r188650358 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala --- @@ -90,13 +92,42 @@ object SQLExecution { * thread from

[GitHub] spark issue #21341: Revert "[SPARK-22938][SQL][FOLLOWUP] Assert that SQLConf...

2018-05-16 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21341 sure this is fine, but we'll see the flakiness back in the builds till https://github.com/apache/spark/pull/21299 is merged, right

[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-05-09 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21221 Jenkins, retest this please

[GitHub] spark issue #21280: [SPARK-19181][Core] Fixing flaky "SparkListenerSuite.loc...

2018-05-09 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21280 Jenkins, add to whitelist

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-08 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 @jinxing64 > I guess your concern is ArrayBuffer will do lots of copy as size of elements grows, and we don't need fast random access in ShuffleBlockFetcherIterator my concern was

[GitHub] spark pull request #21185: [SPARK-23894][CORE][SQL] Defensively clear Active...

2018-05-08 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21185#discussion_r186772234 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -229,6 +229,23 @@ private[spark] class Executor

[GitHub] spark pull request #21185: [SPARK-23894][CORE][SQL] Defensively clear Active...

2018-05-08 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/21185

[GitHub] spark issue #21185: [SPARK-23894][CORE][SQL] Defensively clear ActiveSession...

2018-05-08 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21185 I'm closing this in favor of https://github.com/apache/spark/pull/21190

[GitHub] spark issue #21131: [SPARK-23433][CORE] Late zombie task completions update ...

2018-05-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21131 merged to master / 2.3 / 2.2

[GitHub] spark pull request #21131: [SPARK-23433][CORE] Late zombie task completions ...

2018-05-03 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/21131

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 can you add a test in MapOutputTrackerSuite and update the pr description to include all the changes? but overall looks good

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 If you have a heap dump, there are tools that can check for wasted space in ArrayBuffer. Eg. [jxray](http://www.jxray.com/) or [YourKit Memory Inspections](https://www.yourkit.com/docs/java/help

[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-05-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20940 thanks, can you close this pull request now?

[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-05-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21221 Jenkins, ok to test

[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-05-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20940 @edwinalu the diff is really weird in github now -- can you merge in the latest master? Or if github is just confused, maybe close this and open a new PR. thanks

[GitHub] spark issue #21185: [SPARK-23894][CORE][SQL] Defensively clear ActiveSession...

2018-05-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21185 @jkbradley still seeing flakiness in R tests, in other PRs too. I'm not even sure how to interpret the failure. is it safe to ignore those? I can't see how this would be affecting R tests

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185570861 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala --- @@ -328,4 +328,19 @@ package object config

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r18979 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTrackerSuite.scala --- @@ -0,0 +1,144

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185534467 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala --- @@ -354,7 +358,7 @@ class YarnAllocatorSuite

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185528716 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -273,7 +273,7 @@ private[scheduler] class BlacklistTracker

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185556534 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTrackerSuite.scala --- @@ -0,0 +1,144

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185529548 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/FailureTracker.scala --- @@ -0,0 +1,85 @@ +/* + * Licensed

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185556226 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTrackerSuite.scala --- @@ -0,0 +1,144

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185539768 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala --- @@ -602,8 +569,7 @@ private[yarn] class YarnAllocator

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185549149 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTrackerSuite.scala --- @@ -0,0 +1,144

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185558661 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTrackerSuite.scala --- @@ -0,0 +1,144

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185543512 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTracker.scala --- @@ -0,0 +1,145

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185532512 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTracker.scala --- @@ -0,0 +1,145

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185543919 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTracker.scala --- @@ -0,0 +1,145

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185545685 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTrackerSuite.scala --- @@ -0,0 +1,144

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-05-02 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r185533750 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala --- @@ -102,18 +102,14 @@ private[yarn] class

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 This makes sense to me. You should update the comment on `MapOutputTracker.getMapSizesByExecutorId` to mention that it excludes the zero-sized blocks, and also remove the filter

[GitHub] spark issue #21185: [SPARK-23894][CORE][SQL] Defensively clear ActiveSession...

2018-05-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21185 > Yea I think this is the root cause. I'm making a PR to ban SQLConf.get at executor side, shall we do the same thing for SparkSession? And fixes all the places that mistakenly access SparkSess

[GitHub] spark issue #21185: [SPARK-23894][CORE][SQL] Defensively clear ActiveSession...

2018-05-01 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21185 I don't understand the comments about "leaked threads". The executor thread pool is allowed to create threads whenever it wants. You can play around with this example if you li

[GitHub] spark issue #21185: [SPARK-23894][CORE][SQL] Defensively clear ActiveSession...

2018-04-30 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21185 Jenkins, retest this please

[GitHub] spark pull request #20604: [SPARK-23365][CORE] Do not adjust num executors w...

2018-04-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20604#discussion_r185005638 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -1643,7 +1646,10 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #21185: [SPARK-23894][CORE][SQL] Defensively clear Active...

2018-04-30 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21185#discussion_r184993529 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -299,6 +316,9 @@ private[spark] class Executor

[GitHub] spark issue #21185: [SPARK-23894][CORE][SQL] Defensively clear ActiveSession...

2018-04-30 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21185 @cloud-fan > I think SparkSession is driver only, how do we access it in executor? That's the whole problem. It's only meant to be available on the driver, but it ends up getting

[GitHub] spark issue #21041: [SPARK-23962][SQL][TEST] Fix race in currentExecutionIds...

2018-04-27 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21041 I submitted a PR for SPARK-23894, https://github.com/apache/spark/pull/21185, please take a look.

[GitHub] spark pull request #21185: [SPARK-23894][CORE][SQL] Defensively clear Active...

2018-04-27 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/21185 [SPARK-23894][CORE][SQL] Defensively clear ActiveSession in Executors Because SparkSession.getActiveSession uses an InheritableThreadLocal, the ThreadPool in executors might end up inheriting
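
The inheritance behaviour described in this pull request can be reproduced outside Spark with plain JDK classes. The following is a minimal sketch, not Spark's code: the class name `InheritedThreadLocalDemo`, the `ACTIVE` slot, and the `"driver-session"` value are hypothetical stand-ins for the active-session thread local. It assumes a cached thread pool whose worker threads are created lazily by whichever thread submits the first task, so the worker copies that thread's value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InheritedThreadLocalDemo {
    // Hypothetical stand-in for an "active session" slot; not Spark's actual field.
    static final InheritableThreadLocal<String> ACTIVE = new InheritableThreadLocal<>();

    // Returns { value seen by a pool thread, value seen after defensive clearing }.
    static String[] observe() {
        ACTIVE.set("driver-session");
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // The worker thread is created during submit(), by the calling thread,
            // so it inherits the caller's InheritableThreadLocal value.
            Callable<String> read = ACTIVE::get;
            String inherited = pool.submit(read).get();

            // Defensive clearing: remove the value at the start of the task
            // before anything else reads it.
            Callable<String> clearThenRead = () -> {
                ACTIVE.remove();
                return ACTIVE.get();
            };
            String cleared = pool.submit(clearThenRead).get();

            return new String[] { inherited, cleared };
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
            ACTIVE.remove(); // leave the calling thread clean as well
        }
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println("inherited by pool thread: " + seen[0]);
        System.out.println("after clearing in task:   " + seen[1]);
    }
}
```

The first value shows that the pool thread sees state it never set itself; the second shows that clearing the slot inside the task removes it for that thread only, which is the general idea behind clearing defensively on the executor side.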

[GitHub] spark issue #21041: [SPARK-23962][SQL][TEST] Fix race in currentExecutionIds...

2018-04-27 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21041 @dongjoon-hyun thanks for reporting this. I think this is the same as https://issues.apache.org/jira/browse/SPARK-23894 . I am nearly certain it's not directly caused by this change, but some

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-04-26 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r184575258 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -170,8 +170,7 @@ class

[GitHub] spark pull request #21165: Spark 20087: Attach accumulators / metrics to 'Ta...

2018-04-26 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r184404291 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -287,6 +287,33 @@ private[spark] class Executor( notifyAll

[GitHub] spark issue #21165: Spark 20087: Attach accumulators / metrics to 'TaskKille...

2018-04-26 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21165 Ok to test

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-25 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21068 A couple more high-level thoughts: 1) Do we want to have an event posted about the node getting blacklisted? I think it would be useful. But then there needs to be a message from
