marchpure opened a new pull request #3793:
URL: https://github.com/apache/carbondata/pull/3793


   ### Why is this PR needed?
    In the CDC flow, the parallelism of processing delta files equals the 
number of executors. This insufficient parallelism limits CDC performance.
    
    ### What changes were proposed in this PR?
    Set the parallelism of processing delta files to the configured value of 
'spark.sql.shuffle.partitions'.
    Notably, this does not increase the number of delta files, because the 
delta files are combined afterwards.
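
    As an illustration only, the change can be sketched in plain Python. 
This is not the CarbonData implementation: `delta_parallelism`, 
`process_deltas`, and the (segment id, row id) update format are hypothetical 
names and shapes chosen for the example, and the combination step is assumed 
to merge partial deltas per segment. The only values taken from outside the 
sketch are the 'spark.sql.shuffle.partitions' key and its Spark default of 200.

```python
from collections import defaultdict

# Spark's documented default for spark.sql.shuffle.partitions.
DEFAULT_SHUFFLE_PARTITIONS = 200


def delta_parallelism(conf):
    """Pick the task count from the conf instead of the executor count.

    `conf` is a plain dict standing in for the Spark configuration
    (hypothetical helper, not a CarbonData or Spark API).
    """
    value = conf.get("spark.sql.shuffle.partitions")
    return int(value) if value is not None else DEFAULT_SHUFFLE_PARTITIONS


def process_deltas(updates, parallelism):
    """Process (segment_id, row_id) updates in `parallelism` tasks, then combine.

    Assumption: combination merges all partial deltas of one segment,
    so the result has one entry per segment regardless of the task count.
    """
    # Spread the updates over the tasks round-robin.
    tasks = [updates[i::parallelism] for i in range(parallelism)]

    # Each task builds its own partial delta per segment.
    partials = []
    for task in tasks:
        per_segment = defaultdict(list)
        for segment_id, row_id in task:
            per_segment[segment_id].append(row_id)
        partials.append(per_segment)

    # Combination step: merge the partial deltas segment by segment.
    combined = defaultdict(list)
    for per_segment in partials:
        for segment_id, row_ids in per_segment.items():
            combined[segment_id].extend(row_ids)
    return {segment_id: sorted(ids) for segment_id, ids in combined.items()}
```

    In this sketch, raising the task count changes how the work is split but 
not how many combined deltas come out, which mirrors the claim above that the 
file count does not grow.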
   
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

