[jira] [Created] (PHOENIX-1779) Parallelize fetching of next batch of records for scans corresponding to queries with no order by

Samarth Jain (JIRA) Thu, 26 Mar 2015 12:20:07 -0700

Samarth Jain created PHOENIX-1779:
-------------------------------------

             Summary: Parallelize fetching of next batch of records for scans 
corresponding to queries with no order by 
                 Key: PHOENIX-1779
                 URL: https://issues.apache.org/jira/browse/PHOENIX-1779
             Project: Phoenix
          Issue Type: Improvement
            Reporter: Samarth Jain
            Assignee: Samarth Jain



Today Phoenix parallelizes only the first execution of a scan, i.e. only the 
first batch of records, up to the scan's cache size, is loaded in parallel. 
Loading of subsequent batches in the scanners is essentially serial. This could 
be improved, especially for queries that do not need any kind of merge sort on 
the client, including those with no ORDER BY clause. It could also potentially 
improve the performance of UPSERT SELECT statements that load data from one 
table and insert into another; one such use case is creating immutable indexes 
for tables that already have data. It could also potentially improve the 
performance of our MapReduce solution for bulk loading data by speeding up the 
loading/mapping phase. 
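
To make the idea concrete, here is a minimal sketch of fetching the next batch 
from every scanner in parallel instead of serially. It is not Phoenix code: 
BatchScanner, InMemoryScanner and ParallelBatchFetcher are hypothetical names 
invented for illustration, and the in-memory scanner only stands in for a real 
scanner whose batch size would be the scan's cache size. Because the rows are 
simply appended as batches arrive, it only suits queries that need no merge 
sort on the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a scanner: each call returns the next batch of
// rows, or an empty list once the scanner is exhausted.
interface BatchScanner {
    List<String> nextBatch() throws Exception;
}

// Hypothetical in-memory scanner used only to exercise the sketch.
final class InMemoryScanner implements BatchScanner {
    private final List<String> rows;
    private final int cacheSize; // assumed to play the role of the scan's cache size
    private int pos = 0;

    InMemoryScanner(List<String> rows, int cacheSize) {
        this.rows = rows;
        this.cacheSize = cacheSize;
    }

    @Override
    public List<String> nextBatch() {
        int end = Math.min(pos + cacheSize, rows.size());
        List<String> batch = new ArrayList<>(rows.subList(pos, end));
        pos = end;
        return batch;
    }
}

final class ParallelBatchFetcher {
    // Drains all scanners. In every round, the next batch of each live
    // scanner is requested concurrently; no ordering across scanners is
    // guaranteed beyond the order in which batches are appended here.
    static List<String> fetchAll(List<BatchScanner> scanners, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<String> result = new ArrayList<>();
            List<BatchScanner> live = new ArrayList<>(scanners);
            while (!live.isEmpty()) {
                // Submit one fetch task per live scanner.
                List<Future<List<String>>> pending = new ArrayList<>();
                for (BatchScanner scanner : live) {
                    Callable<List<String>> task = scanner::nextBatch;
                    pending.add(pool.submit(task));
                }
                // Collect the batches; an empty batch retires its scanner.
                List<BatchScanner> stillLive = new ArrayList<>();
                for (int i = 0; i < live.size(); i++) {
                    List<String> batch = pending.get(i).get();
                    if (!batch.isEmpty()) {
                        result.addAll(batch);
                        stillLive.add(live.get(i));
                    }
                }
                live = stillLive;
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

The round-based loop keeps the sketch short, but each round waits for the 
slowest scanner; an actual implementation would more likely prefetch each 
scanner's next batch independently while the client consumes the current one.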



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
