[ https://issues.apache.org/jira/browse/CARBONDATA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
xuchuanyin resolved CARBONDATA-2238.
------------------------------------
    Resolution: Fixed

> Optimization in unsafe sort during data loading
> -----------------------------------------------
>
>                 Key: CARBONDATA-2238
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2238
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: data-load
>            Reporter: xuchuanyin
>            Assignee: xuchuanyin
>            Priority: Major
>          Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Inspired by batch_sort: in local_sort with the unsafe property enabled, if
> enough memory is available we can hold all the row pages in memory and spill
> pages to disk as sort temp files only when memory is unavailable.
> Before spilling the pages, we can do an in-memory merge sort of them.
> Each time we request an unsafe row page and memory is unavailable, we can
> trigger a merge sort of the in-memory pages and spill the result to disk as
> a single sort temp file. The incoming pages are then held in memory instead
> of being spilled to disk directly.
> With this implementation, each spill writes more data than before, which
> benefits disk IO.
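
The spill policy described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not CarbonData's actual code:
the class SpillOnPressureSketch and its methods are hypothetical, a row page is
modeled as an already sorted int array, the memory budget is counted in rows,
and a "sort temp file" is represented by an in-memory array rather than a real
file on disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: keep sorted pages in memory, merge and spill only under memory pressure. */
class SpillOnPressureSketch {
    private final long memoryBudget;   // assumed budget, counted in rows
    private long memoryUsed = 0;
    private final List<int[]> inMemoryPages = new ArrayList<>();
    // Each entry stands in for one sort temp file written to disk.
    final List<int[]> spilledRuns = new ArrayList<>();

    SpillOnPressureSketch(long memoryBudget) {
        this.memoryBudget = memoryBudget;
    }

    /** Accepts a page whose rows are already sorted. */
    void addSortedPage(int[] page) {
        // Memory unavailable: merge what is held so far and spill it as one run.
        if (memoryUsed + page.length > memoryBudget && !inMemoryPages.isEmpty()) {
            spilledRuns.add(mergeSorted(inMemoryPages));
            inMemoryPages.clear();
            memoryUsed = 0;
        }
        // The incoming page stays in memory instead of going straight to disk.
        // (A page larger than the whole budget is still held; a real system would handle that case.)
        inMemoryPages.add(page);
        memoryUsed += page.length;
    }

    int pagesInMemory() {
        return inMemoryPages.size();
    }

    /** K-way merge of sorted pages using a min-heap of {value, pageIndex, offset}. */
    static int[] mergeSorted(List<int[]> pages) {
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        int total = 0;
        for (int p = 0; p < pages.size(); p++) {
            int[] page = pages.get(p);
            total += page.length;
            if (page.length > 0) {
                heap.add(new int[] {page[0], p, 0});
            }
        }
        int[] merged = new int[total];
        int out = 0;
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            merged[out++] = head[0];
            int[] page = pages.get(head[1]);
            int next = head[2] + 1;
            if (next < page.length) {
                heap.add(new int[] {page[next], head[1], next});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        SpillOnPressureSketch sorter = new SpillOnPressureSketch(6);
        sorter.addSortedPage(new int[] {1, 5, 9});
        sorter.addSortedPage(new int[] {2, 6});
        sorter.addSortedPage(new int[] {3, 4, 8}); // exceeds the budget, triggers one merged spill
        System.out.println(Arrays.toString(sorter.spilledRuns.get(0)));
    }
}
```

With a budget of 6 rows, the first two pages are held in memory; the third
request finds memory unavailable, so the two held pages are merged and spilled
as a single run of 5 rows, and the third page is then kept in memory. Spilling
page by page would instead have produced several smaller runs.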