maytasm commented on pull request #11025:
URL: https://github.com/apache/druid/pull/11025#issuecomment-812259294


   > I meant, that might not be what users expect. I think they will want to 
know when they can query the new data without seeing old one. Maybe this can be 
addressed using #10676.
   > 
   > > One problem might be auto compaction as it would try to compact the same 
interval again.
   > 
   > Can you elaborate more on this problem? I'm not sure why it would compact 
the same interval again.
   
   Ah, I see. For a compaction task, the lag in dropping old segments would not 
be a problem for querying. As for querying the new data without seeing the old 
data, I agree with you that it can be addressed using the mechanism in 
https://github.com/apache/druid/pull/10676.
   
   The problem with auto compaction arises when we change the segment 
granularity to a finer granularity (for example, MONTH to DAY) and we do not 
have data in every time chunk of the new granularity (for example, we only have 
one DAY of data in the MONTH interval). In this case, if the old segment is not 
marked unused by the time auto compaction runs in the next cycle, auto 
compaction would see the old segment as needing compaction and try to compact 
the same interval again (auto compaction would see both the new DAY segments 
and the old MONTH segments).
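
   To make the scenario concrete, here is a toy model of it. This is only a 
sketch and not Druid's actual code: the `Segment` class, the `used` flag as 
modeled here, and the `needs_compaction` check are all made up for 
illustration, and the real auto compaction logic may decide differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a segment; not a Druid class."""
    start: date          # inclusive start of the segment interval
    end: date            # exclusive end of the segment interval
    granularity: str     # "MONTH" or "DAY" in this toy model
    used: bool = True    # False once the segment is marked unused


def needs_compaction(segments, start, end, target_granularity):
    """Assumed rule: an interval needs compaction if any used segment
    overlapping it has a granularity other than the target one."""
    for seg in segments:
        overlaps = seg.start < end and start < seg.end
        if seg.used and overlaps and seg.granularity != target_granularity:
            return True
    return False


month_start, month_end = date(2021, 3, 1), date(2021, 4, 1)

# Only one DAY of data exists inside the MONTH interval.
new_day = Segment(date(2021, 3, 15), date(2021, 3, 16), "DAY")

# Case 1: the old MONTH segment has not been marked unused yet.
lagging = [Segment(month_start, month_end, "MONTH"), new_day]

# Case 2: the old MONTH segment has been marked unused.
settled = [Segment(month_start, month_end, "MONTH", used=False), new_day]

print(needs_compaction(lagging, month_start, month_end, "DAY"))  # True
print(needs_compaction(settled, month_start, month_end, "DAY"))  # False
```

   In the first case the next cycle still sees the old MONTH segment alongside 
the new DAY segment, so the same interval is picked up again. In the second 
case only the DAY segment is visible and the interval is left alone.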


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


