[GitHub] spark issue #20514: [SPARK-23310][CORE][FOLLOWUP] Fix Java style check issue...

2018-02-05 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/20514 LGTM, thanks for fixing this. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #20492: [SPARK-23310][CORE] Turn off read ahead input str...

2018-02-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/20492#discussion_r165874317 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java --- @@ -77,7 +77,7 @@ public

[GitHub] spark pull request #20492: [SPARK-21113][CORE] Turn off read ahead input str...

2018-02-02 Thread sitalkedia
GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/20492 [SPARK-21113][CORE] Turn off read ahead input stream for unsafe shuffle reader. To fix regression for TPC-DS queries. You can merge this pull request into a Git repository by running

[GitHub] spark pull request #20014: [SPARK-22827][CORE] Avoid throwing OutOfMemoryErr...

2017-12-19 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/20014#discussion_r157841590 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java --- @@ -341,7 +342,7 @@ private void growPointerArrayIfNecessary

[GitHub] spark pull request #20014: [SPARK-22827][CORE] Avoid throwing OutOfMemoryErr...

2017-12-18 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/20014#discussion_r157682902 --- Diff: core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java --- @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #20014: [SPARK-22827][CORE] Avoid throwing OutOfMemoryError in c...

2017-12-18 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/20014 Jenkins test this please.

[GitHub] spark issue #20014: [SPARK-22827][CORE] Avoid throwing OutOfMemoryError in c...

2017-12-18 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/20014 cc - @rxin, @sameeragarwal, @zsxwing

[GitHub] spark pull request #20014: [SPARK-22827][CORE] Avoid throwing OutOfMemoryErr...

2017-12-18 Thread sitalkedia
GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/20014 [SPARK-22827][CORE] Avoid throwing OutOfMemoryError in case of exception in spill ## What changes were proposed in this pull request? Currently, the task memory manager throws

[GitHub] spark issue #19955: [SPARK-21867][CORE] Support async spilling in UnsafeShuf...

2017-12-13 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19955 @ericvandenbergfb - In the testing section, let's put the benchmark numbers for large shuffle heavy jobs we ran internally

[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-10-31 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r148184755 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -216,3 +218,33 @@ private final class SnappyOutputStreamWrapper(os

[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-10-29 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r147618796 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -216,3 +218,33 @@ private final class SnappyOutputStreamWrapper(os

[GitHub] spark pull request #19580: [SPARK-11334][CORE] Fix bug in Executor allocatio...

2017-10-27 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/19580#discussion_r147328834 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -267,6 +267,10 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #19580: [SPARK-11334][CORE] Fix bug in Executor allocatio...

2017-10-26 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/19580#discussion_r147320096 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -267,6 +267,10 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-10-26 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r147215825 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -216,3 +218,33 @@ private final class SnappyOutputStreamWrapper(os

[GitHub] spark issue #19580: [SPARK-22312][CORE] Fix bug in Executor allocation manag...

2017-10-26 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19580 duplicate of https://github.com/apache/spark/pull/19534 cc - @vanzin

[GitHub] spark pull request #19580: [SPARK-22312][CORE] Fix bug in Executor allocatio...

2017-10-26 Thread sitalkedia
GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/19580 [SPARK-22312][CORE] Fix bug in Executor allocation manager in running tasks calculation ## What changes were proposed in this pull request? We often see the issue of Spark jobs stuck

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-10-26 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 ping.

[GitHub] spark pull request #19534: [SPARK-22312][CORE] Fix bug in Executor allocatio...

2017-10-23 Thread sitalkedia
Github user sitalkedia closed the pull request at: https://github.com/apache/spark/pull/19534

[GitHub] spark issue #19534: [SPARK-22312][CORE] Fix bug in Executor allocation manag...

2017-10-20 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19534 I think the other PR is fixing one more issue on top of runningTasks being negative, so we can proceed with the other one. What do you think @jerryshao

[GitHub] spark issue #19534: [SPARK-22312][CORE] Fix bug in Executor allocation manag...

2017-10-19 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19534 Jenkins retest this please.

[GitHub] spark issue #19534: [SPARK-22312][CORE] Fix bug in Executor allocation manag...

2017-10-19 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19534 @jiangxb1987 - yes that is the issue and you are right, we can avoid it by checking if the stageId is valid when we get a task end event. But I like this approach better because we can clean up

[GitHub] spark issue #19534: [SPARK-22312][CORE] Fix bug in Executor allocation manag...

2017-10-18 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19534 cc - @vanzin

[GitHub] spark pull request #19534: [SPARK-22312][CORE] Fix bug in Executor allocatio...

2017-10-18 Thread sitalkedia
GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/19534 [SPARK-22312][CORE] Fix bug in Executor allocation manager in running… ## What changes were proposed in this pull request? We often see the issue of Spark jobs stuck because

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-10-18 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Created https://github.com/luben/zstd-jni/issues/47.

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-10-11 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Yes, the binary distribution is included in the zstd-jni jar file.

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-10-11 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 @vanzin - Maybe the test timeouts are related to one test failure - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82608/testReport/org.apache.spark.scheduler

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-28 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 @srowen - Did you get a chance to look into this?

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-22 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Thanks for looking into this @srowen. It's weird, I don't understand that either. Also, I am not able to reproduce this issue on my laptop

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-18 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Updated with zstd-jni version 1.3.1-1 and also updated the license to include the zstd-jni license. @srowen - How does that look from a licensing perspective

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-09-12 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 Thanks for the change. Left a few comments there.

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-09-12 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r138502867 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,313 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-09-12 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r138473231 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,315 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-09-11 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r138230590 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java --- @@ -72,10 +72,15 @@ public

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-09-11 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r138231043 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,315 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-09-11 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r138230901 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,315 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-09-11 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r138230596 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java --- @@ -72,10 +72,15 @@ public

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-09-09 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 Hi @zsxwing, could you find some time to take a look at this PR?

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-08-31 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 ping @zsxwing ! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135950441 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 Created https://github.com/apache/spark/pull/19081.

[GitHub] spark pull request #19081: [SPARK-21834] Incorrect executor request in case ...

2017-08-29 Thread sitalkedia
GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/19081 [SPARK-21834] Incorrect executor request in case of dynamic allocation ## What changes were proposed in this pull request? killExecutor api currently does not allow killing an executor

[GitHub] spark pull request #19048: [SPARK-21834] Incorrect executor request in case ...

2017-08-29 Thread sitalkedia
Github user sitalkedia closed the pull request at: https://github.com/apache/spark/pull/19048

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 jenkins retest this please.

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135889304 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135884679 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135882122 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,317 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135882063 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,317 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135881985 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,317 @@ +/* + * Licensed under the Apache License

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-29 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 Not sure why the test failed. Maybe the build is unstable? cc - @vanzin

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689398 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689367 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689239 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689225 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689194 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689169 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689113 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135689067 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135688838 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r135688946 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,292 @@ +/* + * Licensed under the Apache License

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 @jiangxb1987 - I agree with you. I do not have the context or history to comment on that. Unfortunately, the api has been designed that way and book keeping of target number of executors is done

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 Sure, I will address @mridulm's comment in the next few days.

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 >>Or it can call killExecutors() like it does today and then call requestTotalExecutors right after, same result without the awkwardness of the parameter name, but that adds

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-28 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 On a high level I agree that keeping the states in 3 places is creating a mess but changing that would require a big refactoring which is probably outside of the scope of this change

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 >> this code in the EAM: Should be changed to account for the current number of executors, so that the EAM doesn't tell the CGSB that it wants less executors than currently

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 >> I think this will be OK but we do need to add these two licenses to licenses/ (see the convention there) and also add a line for each in LICENSE here. @srowen - Does tha

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 >> Why? Because of the idle timeout? If that's your point, then the change I referenced above should avoid that. Yes because of idle timeout. Note that the `numExecutorsTarget

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 To be clear there is no issue on EAM side. Consider the following situation - - 10 executors are running, each executor can run 4 tasks at max. - 20 tasks are running so EAM sets

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 That's not really true. The EAM uses the `requestTotalExecutors` api to set the target for the scheduler. - 10 executors are running, each executor can run 4 tasks at max. - 20

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 Looking at the scheduler and the dynamic executor allocator code, this is my understanding; correct me if I am wrong. Let's say the dynamic executor allocator is ramping down

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 Jenkins retest this please.

[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-24 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/19048 cc - @markhamstra, @sameeragarwal, @rxin, @vanzin

[GitHub] spark pull request #19048: [SPARK-21834] Incorrect executor request in case ...

2017-08-24 Thread sitalkedia
GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/19048 [SPARK-21834] Incorrect executor request in case of dynamic allocation ## What changes were proposed in this pull request? killExecutor api currently does not allow killing an executor

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-23 Thread sitalkedia
Github user sitalkedia closed the pull request at: https://github.com/apache/spark/pull/18317

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-23 Thread sitalkedia
GitHub user sitalkedia reopened a pull request: https://github.com/apache/spark/pull/18317 [SPARK-21113][CORE] Read ahead input stream to amortize disk IO cost … Profiling some of our big jobs, we see that around 30% of the time is being spent in reading the spill files from disk

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-08-23 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 @jiangxb1987 - Made a change to wrap the read ahead input stream around the compressed input stream so that we can amortize the cost of decompression as well.

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-23 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r134821455 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,288 @@ +/* + * Licensed under the Apache License

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-21 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Relevant PR - https://github.com/facebook/zstd/pull/801/files

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-21 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 We have released a new zstd version (https://github.com/facebook/zstd/releases) with a modified BSD + GPLv2 license. @rxin, @srowen, @markhamstra - Can you confirm this looks fine from

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-13 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 https://github.com/facebook/zstd/issues/775

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-08-11 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 ping @jiangxb1987

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-11 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Just an update on this - I am in talks with our internal team to relicense the zstd library. This might take some time though. I will keep you updated. @discipleforteen - Unfortunately, we do

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-05 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131516266 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,279 @@ +/* + * Licensed under the Apache License

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 @jiangxb1987, @kiszk Addressed review comments, lmk what you guys think. BTW, this idea can be applied to other places when we block on reading the input stream like HDFS reading. What

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131494491 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,279 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131494505 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,279 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131494464 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,279 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131493686 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java --- @@ -73,7 +73,9 @@ public

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131493511 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,279 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131493491 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,279 @@ +/* + * Licensed under the Apache License

[GitHub] spark pull request #18317: [SPARK-21113][CORE] Read ahead input stream to am...

2017-08-04 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18317#discussion_r131493469 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -0,0 +1,279 @@ +/* + * Licensed under the Apache License

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-02 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 @rxin - Sure, let me talk to folks internally to see if it is possible to relicense. Otherwise, we might have to upgrade to Hadoop 2.9.0, which will come with its own zstd implementation

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-02 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 >> How big is the dependency that's getting pulled in? The zstd-jni library is actually very thin and does not pull in any dependencies of its own, so I would not be worried

[GitHub] spark issue #18317: [SPARK-21113][CORE] Read ahead input stream to amortize ...

2017-08-02 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18317 @jiangxb1987 - Sorry, I haven't had time to work on this lately. Will address comments in the next few days.

[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-01 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r130787262 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -50,13 +51,14 @@ private[spark] object CompressionCodec

[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-01 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r130787287 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -216,3 +218,30 @@ private final class SnappyOutputStreamWrapper(os

[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-01 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r130787269 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -216,3 +218,30 @@ private final class SnappyOutputStreamWrapper(os

[GitHub] spark pull request #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-01 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/18805#discussion_r130787205 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -216,3 +218,30 @@ private final class SnappyOutputStreamWrapper(os

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-01 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Any idea what the build failure is about?

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-01 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 jenkins retest this please.

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-01 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Please note a few minor improvements I have made compared to the old PR, #17303: 1. Use zstd compression level 1 instead of 3, which is significantly faster. 2. Wrap the zstd input
