GitHub user watermen opened a pull request:

    https://github.com/apache/carbondata/pull/978

    Cover the case when the last page is not consumed at the end

    First, the write step uses a producer-consumer model with n producers 
(the default is 2 and it can be configured) and one consumer. The task that 
generates the last page (fewer than 32000 rows) is added to the thread pool 
last, but it is not guaranteed to finish and be added to BlockletDataHolder 
last, because n tasks run concurrently.
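The ordering problem can be reproduced outside CarbonData. The sketch below is only an illustration, not code from this PR: `CompletionOrderSketch`, the page names, and the latch that deliberately slows the first task are all assumptions made for the demo. It shows that a task submitted last to a pool of 2 threads can still hand over its result first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; nothing here is CarbonData API.
class CompletionOrderSketch {

  /** Returns the page names in the order the producers finished them. */
  static List<String> run() throws InterruptedException {
    // Stand-in for the holder that the consumer reads from.
    ConcurrentLinkedQueue<String> holder = new ConcurrentLinkedQueue<>();
    CountDownLatch lastPageDone = new CountDownLatch(1);
    ExecutorService producers = Executors.newFixedThreadPool(2);

    // Submitted first: a full page whose producer is slow.
    producers.execute(() -> {
      try {
        lastPageDone.await(); // artificial delay, only to make the demo deterministic
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      holder.add("full-page");
    });

    // Submitted last: the short last page, which finishes first.
    producers.execute(() -> {
      holder.add("last-page");
      lastPageDone.countDown();
    });

    producers.shutdown();
    producers.awaitTermination(10, TimeUnit.SECONDS);
    return new ArrayList<>(holder);
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(run()); // prints [last-page, full-page]
  }
}
```

In the real write step the delay comes from normal scheduling rather than a latch, so the reordering is intermittent instead of guaranteed.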
    Second, `writeDataToFile` is invoked in two cases: when the size of 
`DataWriterHolder` reaches the blocklet size, and when the page is the last 
page.
    So if the last page is not consumed last, we lose the pages that are 
consumed after it, since nothing triggers another write for them.
    This PR adds a flag named isLastPageWrited to make sure every page is 
written.
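A minimal sketch of the idea, assuming the flag is used to keep flushing once the last page has been written. The patch itself may use isLastPageWrited differently. `LastPageFlagSketch`, `Consumer`, `consume`, and the blocklet size counted in pages are hypothetical names and units chosen for illustration, and the sketch ignores page ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the consumer side; not the actual CarbonData classes.
class LastPageFlagSketch {

  static final class Consumer {
    private final int blockletSize;  // pages per blocklet (assumed unit)
    private final boolean useFlag;   // false reproduces the lost-page case
    private final List<String> holder = new ArrayList<>();
    private final List<String> written = new ArrayList<>();
    private boolean lastPageWritten = false; // plays the role of isLastPageWrited

    Consumer(int blockletSize, boolean useFlag) {
      this.blockletSize = blockletSize;
      this.useFlag = useFlag;
    }

    void consume(String page, boolean isLastPage) {
      holder.add(page);
      if (isLastPage) {
        lastPageWritten = true;
      }
      // Original triggers: the holder is full, or this is the last page.
      boolean flush = holder.size() >= blockletSize || isLastPage;
      // Extra trigger: the last page is already out, so do not wait for more.
      flush = flush || (useFlag && lastPageWritten);
      if (flush) {
        writeDataToFile();
      }
    }

    private void writeDataToFile() {
      written.addAll(holder);
      holder.clear();
    }

    List<String> written() {
      return written;
    }
  }

  /** Feeds pages in an order where the last page is not consumed last. */
  static List<String> run(boolean useFlag) {
    Consumer consumer = new Consumer(4, useFlag);
    consumer.consume("page-1", false);
    consumer.consume("last-page", true);
    consumer.consume("page-2", false); // arrives after the last page
    return consumer.written();
  }

  public static void main(String[] args) {
    System.out.println(run(false)); // prints [page-1, last-page]
    System.out.println(run(true));  // prints [page-1, last-page, page-2]
  }
}
```

Without the flag, page-2 stays in the holder and is never written. With it, the late page is flushed as soon as it arrives.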

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/watermen/incubator-carbondata CARBONDATA-1109

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #978
    
----
commit 6e34389cbb011734078d8c2431065d1f04fc891f
Author: Yadong Qi <qiyadong2...@gmail.com>
Date:   2017-05-31T09:35:45Z

    Cover the case when last page is not be consumed at the end.

----


