GitHub user gentlewangyu opened a pull request:
https://github.com/apache/spark/pull/21417
Branch 2.0
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise,
remove this)
Please review http://spark.apache.org/contributing.html before opening a
pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-2.0
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21417.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21417
commit 050b8177e27df06d33a6f6f2b3b6a952b0d03ba6
Author: cody koeninger
Date: 2016-10-12T22:22:06Z
[SPARK-17782][STREAMING][KAFKA] alternative eliminate race condition of
poll twice
## What changes were proposed in this pull request?
Alternative approach to https://github.com/apache/spark/pull/15387
Author: cody koeninger
Closes #15401 from koeninger/SPARK-17782-alt.
(cherry picked from commit f9a56a153e0579283160519065c7f3620d12da3e)
Signed-off-by: Shixiong Zhu
commit 5903dabc57c07310573babe94e4f205bdea6455f
Author: Brian Cho
Date: 2016-10-13T03:43:18Z
[SPARK-16827][BRANCH-2.0] Avoid reporting spill metrics as shuffle metrics
## What changes were proposed in this pull request?
Fix a bug where spill metrics were being reported as shuffle metrics.
Eventually these spill metrics should be reported (SPARK-3577), but separate
from shuffle metrics. The fix itself basically reverts the line to what it was
in 1.6.
## How was this patch tested?
Cherry-picked from master (#15347)
Author: Brian Cho
Closes #15455 from dafrista/shuffle-metrics-2.0.
commit ab00e410c6b1d7dafdfabcea1f249c78459b94f0
Author: Burak Yavuz
Date: 2016-10-13T04:40:45Z
[SPARK-17876] Write StructuredStreaming WAL to a stream instead of
materializing all at once
## What changes were proposed in this pull request?
The CompactibleFileStreamLog materializes the whole metadata log in memory
as a String. This can cause issues when there are lots of files that are being
committed, especially during a compaction batch.
You may come across stack traces that look like:
```
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.lang.StringCoding.encode(StringCoding.java:350)
at java.lang.String.getBytes(String.java:941)
at org.apache.spark.sql.execution.streaming.FileStreamSinkLog.serialize(FileStreamSinkLog.scala:127)
```
The safer way is to write to an output stream so that we don't have to
materialize a huge string.
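The difference between the two approaches can be sketched roughly as follows. This is a minimal illustration and not the actual Spark change: the class `LogSerializationSketch`, both method names, and the newline-delimited entry format are assumptions made for the example. The first method builds the whole log as one `String` and then calls `getBytes`, which is the call that fails in the stack trace above; the second encodes one entry at a time into an `OutputStream`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not Spark's FileStreamSinkLog.
final class LogSerializationSketch {

    // Materializing approach: the full log is held as one String and then
    // copied into one byte[]; both allocations grow with the entry count.
    static byte[] serializeAllAtOnce(List<String> entries) {
        StringBuilder sb = new StringBuilder();
        for (String entry : entries) {
            sb.append(entry).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Streaming approach: only one entry is encoded at a time, so the
    // largest temporary array is bounded by the largest single entry.
    static void serializeToStream(List<String> entries, OutputStream out)
            throws IOException {
        for (String entry : entries) {
            out.write(entry.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> entries = Arrays.asList("file-1", "file-2", "file-3");

        // An in-memory stream stands in for the real file output stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        serializeToStream(entries, out);

        // Both approaches are expected to produce identical bytes.
        boolean same = Arrays.equals(out.toByteArray(), serializeAllAtOnce(entries));
        System.out.println("identical output: " + same);
    }
}
```

In a real log writer the `OutputStream` would be the file stream itself, so the serialized form never needs to exist in memory as a whole.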
## How was this patch tested?
Existing unit tests
Author: Burak Yavuz
Closes #15437 from brkyvz/ser-to-stream.
(cherry picked from commit edeb51a39d76d64196d7635f52be1b42c7ec4341)
Signed-off-by: Shixiong Zhu
commit d38f38a093b4dff32c686675d93ab03e7a8f4908
Author: buzhihuojie
Date: 2016-10-13T05:51:54Z
minor doc fix for Row.scala
## What changes were proposed in this pull request?
minor doc fix for "getAnyValAs" in class Row
## How was this patch tested?
None.
Author: buzhihuojie
Closes #15452 from david-weiluo-ren/minorDocFixForRow.
(cherry picked from commit 7222a25a11790fa9d9d1428c84b6f827a785c9e8)
Signed-off-by: Reynold Xin
commit d7fa3e32421c73adfa522adfeeb970edd4c22eb3
Author: Shixiong Zhu
Date: 2016-10-13T20:31:50Z
[SPARK-17834][SQL] Fetch the earliest offsets manually in KafkaSource
instead of counting on KafkaConsumer
## What changes were proposed in this pull request?
Because `KafkaConsumer.poll(0)` may update the partition offsets, this PR
just calls `seekToBeginning` to manually set the earliest offsets for the
KafkaSource initial offsets.
## How was this patch tested?
Existing tests.
Author: Shixiong Zhu
Closes #15397 from zsxwing/SPARK-17834.
(cherry picked from commit 08eac356095c7faa2b19d52f2fb0cbc47eb7d1d1)
Signed-off-by: Shixiong Zhu
commit c53b8374911e801ed98c1436c384f0aef076eaab
Author: Davies Liu