Hey, I'm not a Druid developer, so it's quite possible I'm missing some considerations here, but at first glance I like your proposal, as it resembles the tsColumn in JDBC lookups ( https://druid.apache.org/docs/latest/development/extensions-core/lookups-cached-global.html#jdbc-lookup ).
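For what it's worth, here's a minimal sketch of the last_update idea you describe below. It uses SQLite purely for illustration (MySQL would just declare the column as `TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP` instead of triggers), and a logical counter in place of a wall-clock timestamp so the behavior is deterministic; all table, column, and function names here are hypothetical simplifications, not actual Druid schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-in for the druid_segments table.
    CREATE TABLE druid_segments (
        id          TEXT PRIMARY KEY,
        payload     TEXT,
        last_update INTEGER NOT NULL DEFAULT 0
    );

    -- Single-row logical clock. A real MySQL/PostgreSQL setup would use an
    -- auto-updated timestamp column instead; a counter keeps this sketch
    -- deterministic.
    CREATE TABLE meta_clock (tick INTEGER NOT NULL);
    INSERT INTO meta_clock VALUES (0);

    -- Emulate automatic maintenance of last_update on INSERT ...
    CREATE TRIGGER seg_insert AFTER INSERT ON druid_segments
    BEGIN
        UPDATE meta_clock SET tick = tick + 1;
        UPDATE druid_segments
           SET last_update = (SELECT tick FROM meta_clock)
         WHERE id = NEW.id;
    END;

    -- ... and on UPDATE of the payload, similar to MySQL's
    -- ON UPDATE CURRENT_TIMESTAMP behavior.
    CREATE TRIGGER seg_update AFTER UPDATE OF payload ON druid_segments
    BEGIN
        UPDATE meta_clock SET tick = tick + 1;
        UPDATE druid_segments
           SET last_update = (SELECT tick FROM meta_clock)
         WHERE id = NEW.id;
    END;
""")

def pull_segments(conn, since):
    """Coordinator-side pull: only rows touched after the watermark are
    scanned and deserialized, instead of the whole table."""
    rows = conn.execute(
        "SELECT id, payload, last_update FROM druid_segments"
        " WHERE last_update > ? ORDER BY last_update", (since,)).fetchall()
    watermark = max([since] + [r[2] for r in rows])
    return rows, watermark

# The first pull (watermark 0) is a full scan; later pulls are incremental.
conn.execute("INSERT INTO druid_segments (id, payload) VALUES ('seg-1', 'p1')")
conn.execute("INSERT INTO druid_segments (id, payload) VALUES ('seg-2', 'p2')")
rows, wm = pull_segments(conn, 0)    # both segments
conn.execute("UPDATE druid_segments SET payload = 'p1-v2' WHERE id = 'seg-1'")
delta, wm = pull_segments(conn, wm)  # only the updated segment
```

With a real timestamp column you'd also want an index on last_update (and some care about clock precision and updates landing in the same tick), but the query shape is the same: filter on the watermark instead of scanning the full table.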
Anyway, just my 2 cents. Thanks!

Itai

On Tue, Apr 6, 2021 at 6:07 AM Benedict Jin <asdf2...@apache.org> wrote:

> Hi all,
>
> Recently, the Coordinator in our company's Druid cluster has hit a
> performance bottleneck when pulling metadata. The main cause is the huge
> amount of metadata, which makes the full-table scan of the metadata
> storage and the deserialization of the metadata very slow. We have reduced
> the size of the full metadata through TTL, compaction, rollup, etc., but
> the effect has not been very significant. Therefore, I want to design a
> scheme for the Coordinator to pull metadata incrementally, that is, each
> time the Coordinator only pulls newly added metadata, so as to reduce both
> the query pressure on the metadata storage and the pressure of
> deserializing metadata. The general idea is to add a last_update column to
> the druid_segments table to record the update time of each record. Then,
> when we query the metadata table, we can add a filter condition on the
> last_update column to avoid a full-table scan. Moreover, both MySQL and
> PostgreSQL, as metadata storage media, support automatic updating of a
> timestamp column, which is somewhat similar to a trigger. So, have you
> encountered this problem before? If so, how did you solve it? In addition,
> do you have any suggestions or comments on the above incremental
> acquisition of metadata? Please let me know, thanks a lot.
>
> Regards,
> Benedict Jin
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> For additional commands, e-mail: dev-h...@druid.apache.org