sundy-li commented on pull request #9602:
URL: https://github.com/apache/arrow/pull/9602#issuecomment-792216219


   > For example, using a priority queue to keep only the top k values in 
memory.
   
   Yes, but much of that code would duplicate the sort kernel. partial_sort 
already uses a priority queue internally. It may be a good fit for sorting in 
pipelined OLAP systems.
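
To make the priority-queue idea concrete, here is a minimal sketch using only the standard library. `top_k_smallest` is a hypothetical helper written for illustration; it is not the API of the `partial_sort` crate or of the sort kernel in this PR.

```rust
use std::collections::BinaryHeap;

/// Returns the `k` smallest values of `values` in ascending order.
/// Hypothetical helper for illustration only.
fn top_k_smallest(values: &[i64], k: usize) -> Vec<i64> {
    if k == 0 {
        return Vec::new();
    }
    // Max-heap holding the k smallest values seen so far;
    // its root is the largest of those k values.
    let mut heap: BinaryHeap<i64> = BinaryHeap::with_capacity(k);
    for &v in values {
        if heap.len() < k {
            heap.push(v);
        } else if let Some(mut top) = heap.peek_mut() {
            if v < *top {
                // Overwrite the root; the heap is re-sifted when `top` is dropped.
                *top = v;
            }
        }
    }
    heap.into_sorted_vec()
}
```

Only `k` values are held in memory at any time, so calling `top_k_smallest(&[5, 1, 9, 3, 7], 3)` never materializes a fully sorted copy of the input.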
   
   In ClickHouse, the sort pipeline is: PartialSortingTransform (each block in 
each thread) --> MergeSortingTransform (the blocks in each thread merged into 
one block) --> MergingSortedTransform (N blocks from N threads merged into one 
block).
   
   ```
   ┌─explain────────────────────────────────┐
   │ (Expression)                           │
   │ ExpressionTransform                    │
   │   (Limit)                              │
   │   Limit                                │
   │     (MergingSorted)                    │
   │     MergingSortedTransform 16 → 1      │
   │       (MergeSorting)                   │
   │       MergeSortingTransform × 16       │
   │         (PartialSorting)               │
   │         LimitsCheckingTransform × 16   │
   │           PartialSortingTransform × 16 │
   │             (Expression)               │
   │             ExpressionTransform × 16   │
   │               (SettingQuotaAndLimits)  │
   │                 (ReadFromStorage)      │
   │                 NumbersMt × 16 0 → 1   │
   └────────────────────────────────────────┘
   ```
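
The three stages in the plan above can be sketched roughly as follows. This is a simplified, single-threaded stand-in and not ClickHouse's implementation: the function names are hypothetical, blocks are plain `Vec<i64>`, each inner `Vec<Vec<i64>>` stands in for the blocks seen by one thread, and the LIMIT is assumed to be applied at every stage.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Stage 1 (cf. PartialSortingTransform): sort one block on its own
/// and keep at most `limit` rows.
fn partial_sort_block(mut block: Vec<i64>, limit: usize) -> Vec<i64> {
    block.sort_unstable();
    block.truncate(limit);
    block
}

/// K-way merge of already sorted blocks, stopping after `limit` rows.
/// Used here for both stage 2 and stage 3.
fn merge_sorted(blocks: &[Vec<i64>], limit: usize) -> Vec<i64> {
    // Min-heap of (value, block index, offset within that block).
    let mut heap: BinaryHeap<Reverse<(i64, usize, usize)>> = BinaryHeap::new();
    for (b, block) in blocks.iter().enumerate() {
        if let Some(&v) = block.first() {
            heap.push(Reverse((v, b, 0)));
        }
    }
    let mut out = Vec::new();
    while out.len() < limit {
        match heap.pop() {
            Some(Reverse((v, b, i))) => {
                out.push(v);
                // Advance the cursor of the block the row came from.
                if let Some(&next) = blocks[b].get(i + 1) {
                    heap.push(Reverse((next, b, i + 1)));
                }
            }
            None => break,
        }
    }
    out
}

/// Runs all three stages over `streams` and returns the first `limit` rows.
fn sort_with_limit(streams: Vec<Vec<Vec<i64>>>, limit: usize) -> Vec<i64> {
    let per_stream: Vec<Vec<i64>> = streams
        .into_iter()
        .map(|blocks| {
            // Stage 1: sort every block independently.
            let sorted: Vec<Vec<i64>> = blocks
                .into_iter()
                .map(|b| partial_sort_block(b, limit))
                .collect();
            // Stage 2 (cf. MergeSortingTransform): one sorted block per stream.
            merge_sorted(&sorted, limit)
        })
        .collect();
    // Stage 3 (cf. MergingSortedTransform): N sorted blocks into one.
    merge_sorted(&per_stream, limit)
}
```

In a real engine stages 1 and 2 would run in parallel, one instance per thread, and only stage 3 would be a single merge point, which matches the `× 16` and `16 → 1` annotations in the plan.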
   
   
   
   @alamb @jorgecarleitao Thanks for all your reviews. I am also concerned 
that the unsafe code in partial_sort may break `arrow`, because it was only 
just created and has not been used in production yet (BTW, I am new to Rust).
   
   We can keep this PR open for now, until you think it's safe enough or it 
becomes a must-have.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

