maytasm commented on pull request #11025: URL: https://github.com/apache/druid/pull/11025#issuecomment-812259294
> I meant, that might not be what users expect. I think they will want to know when they can query the new data without seeing old one. Maybe this can be addressed using #10676.
>
> > One problem might be auto compaction as it would try to compact the same interval again.
>
> Can you elaborate more on this problem? I'm not sure why it would compact the same interval again.

Ah, I see. For a compaction task, the lag in dropping old segments would not be a problem for querying. As for querying new data without seeing old data, I agree with you that it can be addressed using the mechanism in https://github.com/apache/druid/pull/10676.

The problem with auto compaction arises when we change the segment granularity to a finer one (for example, MONTH to DAY) and we do not have data in every time chunk of the new granularity (for example, we only have one DAY of data in the MONTH interval). In this case, if the old segment is not marked unused by the time auto compaction runs in the next cycle, auto compaction would see both the new DAY segment and the old MONTH segment, treat the old segment as needing compaction, and try to compact the same interval again.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
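To make the auto compaction case from the comment above concrete, here is a minimal Python sketch. It is a toy model, not Druid's actual segment search: the `Segment` class, the `used` flag, and `intervals_needing_compaction` are hypothetical names chosen for illustration, and the only rule modelled is "a used segment whose granularity differs from the target looks like it needs compaction".

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a segment; not Druid's real data model."""
    start: date        # inclusive
    end: date          # exclusive
    granularity: str   # e.g. "MONTH" or "DAY"
    used: bool = True


def intervals_needing_compaction(segments, target_granularity):
    """Return (start, end) of every used segment whose granularity is not the target."""
    return sorted(
        (s.start, s.end)
        for s in segments
        if s.used and s.granularity != target_granularity
    )


# One MONTH segment that only contains data for a single day.
old_month = Segment(date(2021, 3, 1), date(2021, 4, 1), "MONTH")
# Result of compacting it to DAY granularity: a single DAY segment.
new_day = Segment(date(2021, 3, 15), date(2021, 3, 16), "DAY")

# The next cycle runs before the old MONTH segment has been marked unused.
lagging = [old_month, new_day]
# The same state once the old MONTH segment has been marked unused.
settled = [replace(old_month, used=False), new_day]

print(intervals_needing_compaction(lagging, "DAY"))
print(intervals_needing_compaction(settled, "DAY"))
```

In this model, the first call still reports the whole March interval because the old MONTH segment is visible alongside the new DAY segment, while the second call reports nothing once the old segment is marked unused.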
