[GitHub] spark pull request #15827: [SPARK-18187][STREAMING] CompactibleFileStreamLog...

2016-11-20 Thread uncleGen
GitHub user uncleGen closed the pull request at:

https://github.com/apache/spark/pull/15827


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15827: [SPARK-18187][STREAMING] CompactibleFileStreamLog...

2016-11-09 Thread uncleGen
GitHub user uncleGen opened a pull request:

https://github.com/apache/spark/pull/15827

[SPARK-18187][STREAMING] CompactibleFileStreamLog should not use
"compactInterval" directly with user setting.

## What changes were proposed in this pull request?

CompactibleFileStreamLog relies on "compactInterval" to detect a compaction
batch. If the user resets "compactInterval", CompactibleFileStreamLog will
return the wrong answer, resulting in data loss. This PR provides a way to
check the validity of 'compactInterval' and calculate an appropriate value.
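
To illustrate the failure mode described above, here is a minimal sketch in Python. It is not code from this PR. It assumes that a batch counts as a compaction batch when `(batch_id + 1) % compact_interval == 0`, and the helper names are hypothetical. It shows that a log written with an interval of 4 and then read with an interval of 5 is searched for a compaction file that was never written.

```python
def is_compaction_batch(batch_id: int, compact_interval: int) -> bool:
    # Assumed rule: every compact_interval-th batch is a compaction batch.
    return (batch_id + 1) % compact_interval == 0


def compaction_batches(latest_batch_id: int, compact_interval: int) -> list:
    # All compaction batch ids in 0..latest_batch_id under the given interval.
    return [b for b in range(latest_batch_id + 1)
            if is_compaction_batch(b, compact_interval)]


def latest_compaction_batch_before(batch_id: int, compact_interval: int) -> int:
    # Largest compaction batch id strictly below batch_id, or -1 if none exists.
    return (batch_id // compact_interval) * compact_interval - 1


# Compaction files actually written while the stream ran with interval 4.
on_disk = compaction_batches(9, 4)

# After a restart with interval 5, a reader of batch 9 looks for this batch.
expected_after_restart = latest_compaction_batch_before(9, 5)

# The expected compaction batch was written as an ordinary batch, not a
# compaction batch, so entries compacted earlier would be missed.
mismatch = expected_after_restart not in on_disk
```

Under these assumptions, `on_disk` holds batches 3 and 7 while the restarted reader expects batch 4, which is the kind of inconsistency a validity check on 'compactInterval' would need to catch.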

## How was this patch tested?

When restarting a stream, we change
'spark.sql.streaming.fileSource.log.compactInterval' to a value different
from the former one.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uncleGen/spark SPARK-18187

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15827


commit 65395dddb505f6084db471430da1486d75a77e2a
Author: genmao.ygm 
Date:   2016-11-09T08:21:09Z

SPARK-18187: CompactibleFileStreamLog should not rely on "compactInterval" 
to detect a compaction batch

commit d556933e0f039d661989e07f381aff185c9fac1b
Author: genmao.ygm 
Date:   2016-11-09T08:24:53Z

comment update

commit 8b56f70b2dffd69dbc37007e923f3d5a56fce039
Author: genmao.ygm 
Date:   2016-11-09T08:34:11Z

revert

commit 4a7e28c4e372caa3b16b979273577bd6aa2c11f3
Author: genmao.ygm 
Date:   2016-11-09T08:35:13Z

unit test - compact metadata log
change compactInterval from 4 to 5



