ShreelekhyaG opened a new pull request, #4287: URL: https://github.com/apache/carbondata/pull/4287
### Why is this PR needed?
1. Performance degradation for incremental updates is more pronounced on partitioned tables.
   - During an update, the prune step lists files from the segment path to find the carbondata files and build the `fileNameToMetaInfoMapping` map. With each incremental update on a partitioned table, the number of invalid files keeps growing, which makes the file listing progressively slower.
2. The cache for invalid segments is not cleared after delete/update.

### What changes were proposed in this PR?
1. Instead of listing files, get the carbon file from the file name and create the `BlockMetaInfo` directly in `createBlockMetaInfo`.

   _**Impact when tested on a single partition with 100 segments:**_
   - Significant improvement observed in the incremental update operation.
   - `select count(*)` improved from 200 secs to 9 secs, because the `select count(*)` flow was listing files for each segment and the map was not reused.
2. Clear invalid/deleted segments from the cache after delete and update.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- Yes
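
For illustration only, the difference between listing a segment directory and resolving already-known file names can be sketched in plain Java with `java.nio.file`. This is not the CarbonData code: the class `BlockMetaLookupSketch`, its methods `listAll` and `lookupKnown`, and the file names are hypothetical, and a file-size map stands in for the real block metadata.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names come from the CarbonData codebase.
class BlockMetaLookupSketch {

  // Listing approach: every entry in the directory is visited, so stale files
  // left behind by earlier updates add to the cost of each call.
  static Map<String, Long> listAll(Path segmentDir) throws IOException {
    List<Path> paths;
    try (Stream<Path> stream = Files.list(segmentDir)) {
      paths = stream.collect(Collectors.toList());
    }
    Map<String, Long> sizeByName = new HashMap<>();
    for (Path p : paths) {
      sizeByName.put(p.getFileName().toString(), Files.size(p));
    }
    return sizeByName;
  }

  // Direct approach: only file names already known to be valid are resolved,
  // so the cost depends on the valid files, not on the directory contents.
  static Map<String, Long> lookupKnown(Path segmentDir, List<String> validNames)
      throws IOException {
    Map<String, Long> sizeByName = new HashMap<>();
    for (String name : validNames) {
      sizeByName.put(name, Files.size(segmentDir.resolve(name)));
    }
    return sizeByName;
  }

  public static void main(String[] args) throws IOException {
    // One valid file plus two stale ones in a temporary "segment" directory.
    Path dir = Files.createTempDirectory("segment");
    Files.write(dir.resolve("part-0.carbondata"), new byte[5]);
    Files.write(dir.resolve("stale-0.carbondata"), new byte[7]);
    Files.write(dir.resolve("stale-1.carbondata"), new byte[9]);

    // Listing touches all three files; the direct lookup touches one.
    System.out.println(listAll(dir).size());
    System.out.println(lookupKnown(dir, Arrays.asList("part-0.carbondata")));
  }
}
```

In this sketch the listing cost grows with every stale file in the directory, while the direct lookup only touches the names passed in, which is the behaviour the first proposed change relies on.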