[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-31 Thread sitalkedia

Github user sitalkedia closed the pull request at:

https://github.com/apache/spark/pull/12074


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-31 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-204146086
  
Changed the SPARK-14277 JIRA's description, closing this PR.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-31 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-204125633
  
Great! @sitalkedia, do you mind closing this PR in favor of #12096 and 
updating the SPARK-14277 JIRA's description to match your new PR so that it 
accurately describes the change that we're going to commit? Thanks!



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-31 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-204117055
  
Thanks @xerial. I tested the change and saw 7.5% CPU savings. Opened PR 
https://github.com/apache/spark/pull/12096 to upgrade snappy.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-31 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-204089164
  
Thanks @xerial! @sitalkedia, feel free to open a new PR for the dep. bump 
after you finish testing this new version.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread xerial

Github user xerial commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203748806
  
Released snappy-java-1.1.2.4 with this fix. Thanks for letting me know. 



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread xerial

Github user xerial commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203734165
  
@sitalkedia Sure. I'll do that.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203727586
  
@xerial - I am seeing a similar issue on the snappy write path as well. Can we 
fix the write code path too?

Stack trace -

org.xerial.snappy.SnappyNative.arrayCopy(Native Method)
org.xerial.snappy.Snappy.arrayCopy(Snappy.java:85)
org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:273)
org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:115)
org.apache.spark.io.SnappyOutputStreamWrapper.write(CompressionCodec.scala:202)
org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:220)
org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.write(UnsafeSorterSpillWriter.java:126)
org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:192)
org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:175)
org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:249)
org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.growPointerArrayIfNecessary(UnsafeExternalSorter.java:298)
org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:338)
org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:93)
org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:179)
org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:90)
org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:64)
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728)
org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
org.apache.spark.scheduler.Task.run(Task.scala:89)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203715133
  
@JoshRosen - thanks, working on it. Will update soon.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203711830
  
@sitalkedia, if you confirm that the updated `snappy-java` fixes the 
performance issue for you, then I'd open a different pull request to upgrade 
Spark to the newer version.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203707032
  
@JoshRosen - I guess that after @xerial's change we won't need this 
change, right?



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203704164
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54564/
Test FAILed.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203704159
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203703541
  
**[Test build #54564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54564/consoleFull)** for PR 12074 at commit [`5ad27f4`](https://github.com/apache/spark/commit/5ad27f47f4b452e17067424b2eda480b1a9ac454).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203703076
  
That's great. Thanks a lot for the quick fix. 



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread xerial

Github user xerial commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203702958
  
I have just deployed snappy-java-1.1.2.3 with this fix; it will be 
synchronized to Maven Central soon.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203701547
  
Thanks @xerial, this is going to fix all of the snappy read/write 
inefficiency due to small writes.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread xerial

Github user xerial commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203687378
  
One reason snappy-java's SnappyInputStream uses Snappy.arrayCopy (a JNI method) 
is to load the uncompressed data into primitive-type arrays (e.g., float[], 
int[]), since there is no standard Java method for doing this. 

When writing data to byte[], it is possible to replace the implementation with 
a non-JNI one (using System.arraycopy).
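
To illustrate the distinction above, here is a minimal sketch (not snappy-java's actual code; the class and method names `ArrayCopySketch`, `copyBytes`, and `rejectsPrimitiveTypeMismatch` are made up for this example). It shows that the JDK's `System.arraycopy` handles a `byte[]` to `byte[]` copy, but refuses to copy raw bytes into an `int[]`, which is the case the JNI helper is meant to cover.

```java
import java.util.Arrays;

public class ArrayCopySketch {

    /** Copies a byte range with the JDK's System.arraycopy instead of a call into a native library. */
    static void copyBytes(byte[] src, int srcOffset, byte[] dst, int dstOffset, int length) {
        System.arraycopy(src, srcOffset, dst, dstOffset, length);
    }

    /** Returns true if System.arraycopy refuses to copy from a byte[] into an int[]. */
    static boolean rejectsPrimitiveTypeMismatch() {
        byte[] src = new byte[8];
        int[] dst = new int[2];
        try {
            // The component types differ (byte vs. int), so the JDK throws here.
            System.arraycopy(src, 0, dst, 0, 2);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] uncompressed = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] out = new byte[4];
        copyBytes(uncompressed, 2, out, 0, 4);
        System.out.println(Arrays.toString(out));
        System.out.println(rejectsPrimitiveTypeMismatch());
    }
}
```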



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203683427
  
**[Test build #54564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54564/consoleFull)** for PR 12074 at commit [`5ad27f4`](https://github.com/apache/spark/commit/5ad27f47f4b452e17067424b2eda480b1a9ac454).



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203682157
  
Jenkins, this is ok to test.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread sitalkedia

Github user sitalkedia commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203680866
  
@JoshRosen - There might be other places where buffering would help, but I 
did not notice any other hotspot during my job run. Also, as you mentioned, 
pushing this into `wrapForCompression` has the undesirable effect of double 
buffering.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203656991
  
Also, /cc @xerial, who may be able to comment on whether `snappy-java` 
performs any of its own buffering.



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203656188
  
Is this the only place where buffering helps or would it make sense to do 
buffered reads from Snappy streams in other circumstances as well? In other 
words, should this buffering perhaps either be done at more call-sites of 
`wrapForCompression` or in `wrapForCompression` itself? (Note that pushing this 
into `wrapForCompression` risks accidental double-buffering, which might be 
undesirable).
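
One way to picture the double-buffering concern is a guard that adds a buffer only when the stream handed in is not already buffered. This is a rough sketch, not Spark code: `BufferOnceSketch` and `bufferOnce` are hypothetical names, and the check only detects a `BufferedInputStream` at the outermost layer.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class BufferOnceSketch {

    /** Hypothetical helper: wraps the stream in a buffer unless it is already a BufferedInputStream. */
    static InputStream bufferOnce(InputStream in, int bufferSize) {
        if (in instanceof BufferedInputStream) {
            // Already buffered; wrapping again would only add a redundant buffer layer.
            return in;
        }
        return new BufferedInputStream(in, bufferSize);
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[16]);
        InputStream once = bufferOnce(raw, 8192);
        InputStream twice = bufferOnce(once, 8192);
        // The second call returns the same object instead of stacking a second buffer.
        System.out.println(once == twice);
    }
}
```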



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12074#issuecomment-203651711
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-14277] UnsafeSorterSpillReader should d...

2016-03-30 Thread sitalkedia

GitHub user sitalkedia opened a pull request:

https://github.com/apache/spark/pull/12074

[SPARK-14277] UnsafeSorterSpillReader should do buffered read from un…

## What changes were proposed in this pull request?

While running a Spark job that spills a lot of data in the reduce phase, 
we see that a significant amount of CPU is consumed in the native Snappy 
arrayCopy method (see the stack trace below).

Stack trace - 
org.xerial.snappy.SnappyNative.$$YJP$$arrayCopy(Native Method)
org.xerial.snappy.SnappyNative.arrayCopy(SnappyNative.java)
org.xerial.snappy.Snappy.arrayCopy(Snappy.java:85)
org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:190)
org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:163)
java.io.DataInputStream.readFully(DataInputStream.java:195)
java.io.DataInputStream.readLong(DataInputStream.java:416)
org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillReader.loadNext(UnsafeSorterSpillReader.java:71)
org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillMerger$2.loadNext(UnsafeSorterSpillMerger.java:79)
org.apache.spark.sql.execution.UnsafeExternalRowSorter$1.next(UnsafeExternalRowSorter.java:136)
org.apache.spark.sql.execution.UnsafeExternalRowSorter$1.next(UnsafeExternalRowSorter.java:123)

The reason is that the SpillReader does a lot of small reads from the 
underlying snappy-compressed stream, and we pay a heavy cost in JNI calls for 
these small reads. The SpillReader should instead do buffered reads from the 
underlying snappy-compressed stream.
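
The effect described above can be sketched in isolation. The following is not the patch itself: `CountingInputStream` is a made-up stand-in for the compressed stream that simply counts how many read calls reach it, and the 64 KB buffer size is an arbitrary assumption. It compares how many calls hit the underlying stream when a `DataInputStream` reads longs directly versus through a `BufferedInputStream`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BufferedSpillReadSketch {

    /** Hypothetical stand-in for the compressed stream: counts the read calls that reach it. */
    static final class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads count longs and returns how many read calls reached the underlying stream. */
    static int underlyingReadCalls(byte[] data, int count, boolean buffered) {
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(data));
        // Assumed buffer size; the right value would depend on the workload.
        InputStream source = buffered ? new BufferedInputStream(counting, 64 * 1024) : counting;
        try (DataInputStream in = new DataInputStream(source)) {
            for (int i = 0; i < count; i++) {
                in.readLong(); // an 8-byte read, like the small reads in the stack trace
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counting.calls;
    }

    public static void main(String[] args) {
        int count = 1000;
        byte[] data = new byte[count * Long.BYTES];
        System.out.println("unbuffered calls: " + underlyingReadCalls(data, count, false));
        System.out.println("buffered calls:   " + underlyingReadCalls(data, count, true));
    }
}
```

With the buffer in place, each small read is served from memory and the underlying stream is only asked for large chunks, which is where the savings on per-call overhead would come from.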


## How was this patch tested?

Tested by running the job; we saw more than 10% CPU savings.



…derlying compression stream

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sitalkedia/spark bufferedReader

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12074.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12074






