[ https://issues.apache.org/jira/browse/ARROW-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088966#comment-17088966 ]
Mayur Srivastava commented on ARROW-8543: ----------------------------------------- Thank you for fixing the PR! This is my first contribution so I'm not well versed. Thanks, Mayur > [C++] IO: single pass coalescing algorithm > ------------------------------------------ > > Key: ARROW-8543 > URL: https://issues.apache.org/jira/browse/ARROW-8543 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ > Reporter: Mayur Srivastava > Priority: Major > Labels: pull-request-available > > The current coalescing algorithm is a O(N^2) two pass algorithm (where N is > number of ranges) (first implemented in > https://issues.apache.org/jira/browse/ARROW-7995). In the first pass, the > Coalesce() function finds the begin and end of a candidate range that can be > coalesced. In the second, pass the CoalesceUntilLargeEnough() function goes > over the ranges from begin to end and adds coalesced range to the result > (out). > The proposal is to convert the algorithm to an O(N) single pass algorithm > that computes coalesced ranges while making the first pass over the ranges in > the list. This algorithm is also shorter in lines of code and hence > (hopefully) more maintainable in long term. > Correction: Post sorting, the current algorithm is O(N) and the improvement > is O(N). > -- This message was sent by Atlassian Jira (v8.3.4#803005)