kadirozde commented on pull request #936:
URL: https://github.com/apache/phoenix/pull/936#issuecomment-724872173


   > @kadirozde overall looks like a great improvement. I have added a few 
comments. Some questions:
   > 
   > 1. Is it more beneficial to have paging based on row size rather than 
number of rows, since each row can be arbitrarily large?
   > 2. Server-side pagination will help _reduce_ the chance of the race 
conditions mentioned in the Jira description, but does not aim at eliminating 
them, correct?
   > 3. Though this is aimed at such race conditions related to mutations 
(server-side UPSERT SELECT/DELETE), it seems like it will also affect the 
normal read path for non-Group_By aggregate queries. Is there any negative 
effect/extra slowness during reads due to this pagination, and if yes, do we 
want to make sure that changes only affect the write paths?
   > 
   > Let's also please add some tests for this.
   
   1. I am not sure yet. We can add further constraints later, such as a limit 
on the total number of scanned bytes as you suggested, to improve this feature.
   2. Correct. Paging by itself does not eliminate the race conditions. As an 
additional improvement, the client can wait for all of the page operations to 
complete or fail before returning to the application, which would reduce them 
further. I think we have to enforce the client-side timestamp to make the race 
almost impossible.
   3. I expect this feature to improve overall performance and availability, 
since paging limits memory usage and the time that server resources are held. 
My experience with paging on a real cluster has been very positive. I have not 
seen any negative impact so far, as long as the page size is not very small 
(e.g., fewer than 1000 rows).
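
   To make point 1 concrete, here is a rough sketch of how a page could be cut 
on either a row count or a byte budget, whichever is reached first. This is not 
the code in this PR: the class `PagedScanSketch`, the method `nextPage`, and the 
parameters `maxRows` and `maxBytes` are hypothetical names used only for 
illustration, and rows are modeled as plain byte arrays.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; none of these names come from Phoenix or this PR. */
final class PagedScanSketch {

    /**
     * Collects one page from the iterator, stopping at whichever limit is
     * reached first. The byte budget is checked before each read, so a page
     * always makes progress and may exceed maxBytes by at most one row.
     */
    static List<byte[]> nextPage(Iterator<byte[]> rows, int maxRows, long maxBytes) {
        List<byte[]> page = new ArrayList<>();
        long bytes = 0;
        while (rows.hasNext() && page.size() < maxRows && bytes < maxBytes) {
            byte[] row = rows.next();
            page.add(row);
            bytes += row.length; // assumed size measure: raw length of the row
        }
        return page;
    }
}
```

   With this shape, a single very large row still ends its page right away, 
so the time and memory held per page stay bounded even when row sizes vary a 
lot. The caller keeps calling `nextPage` until it returns an empty page.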


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

