GitHub user wankunde opened a pull request:
https://github.com/apache/spark/pull/15578
Branch 2.0
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wankunde/spark branch-2.0
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15578.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15578
commit 72d9fba26c19aae73116fd0d00b566967934c6fc
Author: WeichenXu <weichenxu...@outlook.com>
Date: 2016-09-22T11:35:54Z
[SPARK-17281][ML][MLLIB] Add treeAggregateDepth parameter for
AFTSurvivalRegression
## What changes were proposed in this pull request?
Add a treeAggregateDepth parameter to AFTSurvivalRegression to keep it
consistent with LiR/LoR; a usage sketch follows this commit entry.
## How was this patch tested?
Existing tests.
Author: WeichenXu <weichenxu...@outlook.com>
Closes #14851 from
WeichenXu123/add_treeAggregate_param_for_survival_regression.
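For context, a minimal usage sketch of the new parameter. The setter name
`setAggregationDepth` is an assumption based on the shared aggregation-depth
param that LiR/LoR expose; the digest itself gives only the commit title.
```
import org.apache.spark.ml.regression.AFTSurvivalRegression

// Assumed setter for the new tree-aggregate depth (shared param, as in
// LiR/LoR). A larger depth spreads partial aggregation over more levels,
// easing driver-side memory pressure for wide feature vectors.
val aft = new AFTSurvivalRegression()
  .setAggregationDepth(3) // the shared param's default is 2
```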
commit 8a02410a92429bff50d6ce082f873cea9e9fa91e
Author: Wenchen Fan <wenc...@databricks.com>
Date: 2016-09-22T15:25:32Z
[SQL][MINOR] correct the comment of SortBasedAggregationIterator.safeProj
## What changes were proposed in this pull request?
This comment went stale a long time ago; this PR fixes it according to my
understanding.
## How was this patch tested?
N/A
Author: Wenchen Fan <wenc...@databricks.com>
Closes #15095 from cloud-fan/update-comment.
commit 17b72d31e0c59711eddeb525becb8085930eadcc
Author: Dhruve Ashar <das...@yahoo-inc.com>
Date: 2016-09-22T17:10:37Z
[SPARK-17365][CORE] Remove/Kill multiple executors together to reduce RPC
call time.
## What changes were proposed in this pull request?
We now kill multiple executors together instead of iterating over expensive
RPC calls that each kill a single executor; a sketch of the batched call
follows this commit entry.
## How was this patch tested?
Executed a sample Spark job and observed executors being killed/removed with
dynamic allocation enabled.
Author: Dhruve Ashar <das...@yahoo-inc.com>
Author: Dhruve Ashar <dhruveas...@gmail.com>
Closes #15152 from dhruve/impr/SPARK-17365.
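A hedged sketch of what the batched call looks like through the public API:
`SparkContext.killExecutors` takes a sequence of executor IDs, so one request
covers all of them instead of one `killExecutor` RPC per ID.
```
import org.apache.spark.SparkContext

// One batched request instead of looping over sc.killExecutor(id).
def killBatch(sc: SparkContext, executorIds: Seq[String]): Boolean =
  sc.killExecutors(executorIds) // true if the request was acknowledged
```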
commit 9f24a17c59b1130d97efa7d313c06577f7344338
Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Date: 2016-09-22T18:52:42Z
Skip building R vignettes if Spark is not built
## What changes were proposed in this pull request?
When we build the docs separately, we don't have the JAR files from the
Spark build in the same tree. As the SparkR vignettes need to launch a
SparkContext to be built, we skip building them if the JAR files don't exist.
## How was this patch tested?
To test this, we can run the following:
```
build/mvn -DskipTests -Psparkr clean
./R/create-docs.sh
```
You should see the line `Skipping R vignettes as Spark JARs not found` at the
end.
Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Closes #15200 from shivaram/sparkr-vignette-skip.
commit 85d609cf25c1da2df3cd4f5d5aeaf3cbcf0d674c
Author: Burak Yavuz <brk...@gmail.com>
Date: 2016-09-22T20:05:41Z
[SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames
## What changes were proposed in this pull request?
Suppose you have a bucket `s3a://some-bucket` that contains the files:
```
s3a://some-bucket/file1.parquet
s3a://some-bucket/file2.parquet
```
Getting the parent path of `s3a://some-bucket/file1.parquet` yields
`s3a://some-bucket/`, and the `ListingFileCatalog` uses this as the key in
its hash map. When `catalog.allFiles` is called, we use `s3a://some-bucket`
(no slash at the end) to look up the list of files, and we're left with an
empty list!
This PR fixes the issue by adding a `/` at the end of the `URI` iff the given
`Path` doesn't have a parent, i.e. is the root. This is a no-op if the path
already ends with a `/`, and is handled through Hadoop `Path`'s path-merging
semantics (see the sketch after this commit entry).
## How was this patch tested?
Unit test in `FileCatalogSuite`.
Author: Burak Yavuz <brk...@gmail.com>
Closes #15169 from brkyvz/SPARK-17613.
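A rough sketch of the idea (an assumed helper, not the literal patch): append
a trailing `/` only when the path has no parent, so the hash-map key and the
lookup key agree for bucket roots.
```
import org.apache.hadoop.fs.Path

// Assumed illustration: normalize a root path such as s3a://some-bucket so
// its string form always carries a trailing slash.
def rootUriWithSlash(path: Path): String = {
  val uri = path.toUri.toString
  // getParent == null means `path` is a root (e.g. the bucket itself).
  if (path.getParent == null && !uri.endsWith("/")) uri + "/" else uri
}
```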
commit 3cdae0ff2f45643df7bc198cb48623526c7eb1a6
Author: Shixiong Zhu <shixi...@databricks.com>
Date: 2016-09-22T21:26:45Z
[SPARK-17638][STREAMING] Stop JVM StreamingContext when the Python process
is dead
## What changes were proposed in this pull request?
When the Python process is dead, the JVM StreamingContext is still running,
so we see a lot of Py4JException errors before the JVM process exits. It's
better to stop the JVM StreamingContext to avoid those annoying logs.
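Conceptually the fix is a watchdog on the JVM side; the sketch below is only
illustrative, and the `isPythonAlive` probe is hypothetical rather than the
actual patch.
```
import org.apache.spark.streaming.StreamingContext

// Illustrative watchdog: once the Python driver is gone, stop the JVM-side
// StreamingContext instead of letting it log Py4JException endlessly.
def watchPythonDriver(ssc: StreamingContext, isPythonAlive: () => Boolean): Unit = {
  val t = new Thread("python-driver-watchdog") {
    override def run(): Unit = {
      while (isPythonAlive()) Thread.sleep(1000)
      ssc.stop(stopSparkContext = true, stopGracefully = false)
    }
  }
  t.setDaemon(true)
  t.start()
}
```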
## How was this patch tested?