[GitHub] spark pull request #19070: Branch 2.2

2017-09-13 Thread asfgit
GitHub user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19070


---




[GitHub] spark pull request #19070: Branch 2.2

2017-08-28 Thread wind-org
GitHub user wind-org opened a pull request:

https://github.com/apache/spark/pull/19070

Branch 2.2

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/spark branch-2.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19070.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19070


commit 0bd918f67630f83cdc2922a2f48bd28b023ef821
Author: Wenchen Fan 
Date:   2017-05-15T16:22:06Z

[SPARK-12837][SPARK-20666][CORE][FOLLOWUP] getting name should not fail if 
accumulator is garbage collected

## What changes were proposed in this pull request?

After https://github.com/apache/spark/pull/17596 , we do not send internal 
accumulator name to executor side anymore, and always look up the accumulator 
name in `AccumulatorContext`.

This caused a regression when the accumulator had already been garbage 
collected; this PR fixes it by still sending the accumulator name for `SQLMetrics`.
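The failure mode can be sketched in Python (illustrative only; `Accumulator`, `registry`, and `lookup_name` are hypothetical stand-ins, not Spark's actual `AccumulatorContext` API): a registry that holds only weak references loses the name once the original object is collected, which is why the fix ships the name alongside the metric.

```python
import gc
import weakref

class Accumulator:
    def __init__(self, acc_id, name):
        self.id = acc_id
        self.name = name

# The registry holds only weak references, mirroring a context that must
# not keep accumulators alive: after garbage collection, lookups fail.
registry = {}

def register(acc):
    registry[acc.id] = weakref.ref(acc)

def lookup_name(acc_id):
    ref = registry.get(acc_id)
    acc = ref() if ref is not None else None
    return acc.name if acc is not None else None

acc = Accumulator(1, "records.read")
register(acc)
assert lookup_name(1) == "records.read"

del acc        # drop the only strong reference
gc.collect()   # simulate the accumulator being garbage collected
assert lookup_name(1) is None  # name is lost; hence: send the name explicitly
```

The registry entry is still present after collection, but dereferencing the weak reference yields nothing, so any code that relies on the registry for display names fails exactly when the object has been reclaimed.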

## How was this patch tested?

N/A

Author: Wenchen Fan 

Closes #17931 from cloud-fan/bug.

(cherry picked from commit e1aaab1e277b1b07c26acea75ade78e39bdac209)
Signed-off-by: Marcelo Vanzin 

commit 82ae1f0aca9c00fddba130c144adfe0777172cc8
Author: Tathagata Das 
Date:   2017-05-15T17:46:38Z

[SPARK-20716][SS] StateStore.abort() should not throw exceptions

## What changes were proposed in this pull request?

StateStore.abort() should make a best-effort attempt to clean up temporary 
resources. It should not throw errors, especially because it is called in a 
TaskCompletionListener, where such an error could hide earlier, real errors in 
the task.
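The best-effort pattern can be sketched as follows (a Python illustration under assumed names; this is not Spark's `StateStore` API): cleanup errors are logged and swallowed so a completion callback cannot replace the task's original failure with a secondary one.

```python
import logging

log = logging.getLogger("statestore-sketch")

class StateStore:
    """Sketch: abort() cleans up on a best-effort basis and never raises."""
    def __init__(self, tmp_resource):
        self.tmp_resource = tmp_resource

    def abort(self):
        try:
            self.tmp_resource.close()
        except Exception as e:  # swallow: cleanup must not mask task errors
            log.warning("Error aborting state store: %s", e)

class FailingResource:
    """Simulates a resource whose cleanup itself fails."""
    def close(self):
        raise IOError("disk gone")

# Even with a failing resource, abort() completes without raising.
StateStore(FailingResource()).abort()
```

The design choice is the usual one for completion-time callbacks: an exception thrown there would propagate in place of the task's first, more informative error.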

## How was this patch tested?
No unit test.

Author: Tathagata Das 

Closes #17958 from tdas/SPARK-20716.

(cherry picked from commit 271175e2bd0f7887a068db92de73eff60f5ef2b2)
Signed-off-by: Shixiong Zhu 

commit a79a120a8fc595045b32f16663286b32dadc53ed
Author: Tathagata Das 
Date:   2017-05-15T17:48:10Z

[SPARK-20717][SS] Minor tweaks to the MapGroupsWithState behavior

## What changes were proposed in this pull request?

Timeout and state data are two independent entities and should be settable 
independently. Therefore, in the same call of the user-defined function, one 
should be able to set the timeout before initializing the state and also after 
removing the state. Whether timeouts can be set or not, should not depend on 
the current state, and vice versa.

However, a limitation of the current implementation is that state cannot be 
null while timeout is set. This is checked lazily after the function call has 
completed.
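The intended contract can be sketched in Python (hypothetical names; not Spark's actual `GroupState` interface): timeout and state are set independently during the user function call, and the one disallowed combination is only validated after the call completes.

```python
class GroupState:
    """Sketch: timeout and state are independent during the call;
    'timeout set but no state' is checked lazily, afterwards."""
    def __init__(self, state=None):
        self._state = state
        self._timeout_set = False

    def update(self, value):
        self._state = value

    def remove(self):
        self._state = None

    def set_timeout_duration(self, ms):
        self._timeout_set = True  # allowed regardless of the current state

    def validate_after_call(self):  # runs after the user function returns
        if self._timeout_set and self._state is None:
            raise ValueError("timeout set but no state defined")

# Legal: set the timeout first, then initialize the state.
s = GroupState()
s.set_timeout_duration(1000)
s.update("running")
s.validate_after_call()  # passes: state exists

# Disallowed: timeout set while the state was removed.
bad = GroupState(state="old")
bad.set_timeout_duration(500)
bad.remove()
try:
    bad.validate_after_call()
    failed = False
except ValueError:
    failed = True
```

Deferring the check is what makes the ordering flexible: the user function may set the timeout before, after, or without touching the state, and only the final combination is judged.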

## How was this patch tested?
- Updated existing unit tests that test the behavior of 
GroupState.setTimeout*** with respect to the current state
- Added new tests that verify the disallowed cases where state is undefined 
but timeout is set.

Author: Tathagata Das 

Closes #17957 from tdas/SPARK-20717.

(cherry picked from commit 499ba2cb47efd6a860e74e6995412408efc5238d)
Signed-off-by: Shixiong Zhu 

commit e84e9dd54cf67369c75fc38dc60d758ee8930240
Author: Dongjoon Hyun 
Date:   2017-05-15T18:24:30Z

[SPARK-20735][SQL][TEST] Enable cross join in TPCDSQueryBenchmark

## What changes were proposed in this pull request?

Since [SPARK-17298](https://issues.apache.org/jira/browse/SPARK-17298), 
some queries (q28, q61, q77, q88, q90) in the test suites fail with a message 
"_Use the CROSS JOIN syntax to allow cartesian products between these 
relations_".

This benchmark is used as a reference model for Spark TPC-DS, so this PR 
aims to enable the correct configuration in `TPCDSQueryBenchmark.scala`.
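The guard behavior can be sketched in Python (an illustration, not Spark's planner; the configuration key `spark.sql.crossJoin.enabled` is the real Spark 2.x setting, the rest is hypothetical): cartesian products are rejected with the quoted message unless the flag is enabled, which is what the benchmark configuration change does.

```python
class AnalysisException(Exception):
    pass

def check_cartesian(conf):
    """Reject cartesian products unless the cross-join flag is enabled."""
    if conf.get("spark.sql.crossJoin.enabled", "false") != "true":
        raise AnalysisException(
            "Use the CROSS JOIN syntax to allow cartesian products "
            "between these relations")

# With the flag enabled (as the benchmark now sets), the check passes.
check_cartesian({"spark.sql.crossJoin.enabled": "true"})

# Without it, the guard raises the error seen in queries q28/q61/q77/q88/q90.
try:
    check_cartesian({})
    blocked = False
except AnalysisException:
    blocked = True
```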

## How was this patch tested?

Manual. (Run TPCDSQueryBenchmark)

Author: Dongjoon Hyun 

Closes #17977 from dongjoon-hyun/SPARK-20735.

(cherry picked from commit bbd163d589e7503c5cb150d934e7565b18a908f2)
Signed-off-by: Xiao Li 

commit