GitHub user watermen opened a pull request: https://github.com/apache/carbondata/pull/978
Cover the case where the last page is not consumed last

First, the write step uses a producer-consumer model with n producers (n defaults to 2 and is configurable) and one consumer. The task that generates the last page (less than 32000) is the last one added to the thread pool, but because n tasks run concurrently, it is not guaranteed to be the last one to finish and be added to BlockletDataHolder.

Second, `writeDataToFile` is invoked in two cases: when the size of `DataWriterHolder` reaches the blocklet size, and when the page is the last page. So if the last page is not consumed last, any page consumed after it is lost.

This PR adds a flag named isLastPageWrited to make sure every page is written.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/watermen/incubator-carbondata CARBONDATA-1109

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #978

----
commit 6e34389cbb011734078d8c2431065d1f04fc891f
Author: Yadong Qi <qiyadong2...@gmail.com>
Date:   2017-05-31T09:35:45Z

    Cover the case when last page is not be consumed at the end.

----
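
To make the failure mode described in this PR concrete, here is a minimal single-threaded sketch of the consumer side. It is not the CarbonData code: the class LastPageFlushSketch, the method names, and the blocklet capacity of 3 pages are all hypothetical, and the way isLastPageWrited is checked here is only one plausible reading of the description, not necessarily how the patch uses it. The array arrivalOrder stands in for the order in which the concurrent producers happen to finish.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the consumer. Only the flag name isLastPageWrited
// comes from the PR description; everything else is illustrative.
class LastPageFlushSketch {
    static final int PAGES_PER_BLOCKLET = 3; // assumed blocklet capacity, in pages

    private final List<Integer> holder = new ArrayList<>();  // buffered pages, not yet written
    private final List<Integer> written = new ArrayList<>(); // pages flushed to the "file"
    private final boolean useFlag;                           // false = behaviour before the fix
    private boolean isLastPageWrited = false;

    LastPageFlushSketch(boolean useFlag) {
        this.useFlag = useFlag;
    }

    // Called once per page, in the order the producers finish.
    void consume(int pageId, boolean isLastPage) {
        holder.add(pageId);
        boolean blockletFull = holder.size() >= PAGES_PER_BLOCKLET;
        // With the fix, a page that arrives after the last page is flushed too.
        boolean lateArrival = useFlag && isLastPageWrited;
        if (blockletFull || isLastPage || lateArrival) {
            written.addAll(holder);
            holder.clear();
            if (isLastPage) {
                isLastPageWrited = true;
            }
        }
    }

    // Replays one arrival order and returns the pages that reached the file.
    static List<Integer> run(boolean useFlag, int[] arrivalOrder, int lastPageId) {
        LastPageFlushSketch sketch = new LastPageFlushSketch(useFlag);
        for (int pageId : arrivalOrder) {
            sketch.consume(pageId, pageId == lastPageId);
        }
        return sketch.written;
    }

    public static void main(String[] args) {
        // Page 4 is the last page, but the task for page 3 finishes after it.
        int[] arrivalOrder = {0, 1, 2, 4, 3};
        System.out.println("without flag: " + run(false, arrivalOrder, 4)); // [0, 1, 2, 4]
        System.out.println("with flag:    " + run(true, arrivalOrder, 4));  // [0, 1, 2, 4, 3]
    }
}
```

Without the flag, page 3 stays in the holder because the blocklet is not full and page 3 is not the last page, so it never reaches the file. With the flag, any page consumed after the last page triggers a write of its own.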