Oh, that's easier than tombstones: set an is_deleted flag and update the
timestamp (so the row gets pulled again).
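
To make that concrete, here is a minimal sketch in Python, using an in-memory
SQLite table as a stand-in for the metadata store. The column names is_deleted
and last_update follow this thread; the table layout, the integer timestamps,
and the helper names (upsert_segment, soft_delete_segment, poll_incremental)
are assumptions for illustration, not Druid's actual schema or Coordinator
code.

```python
import sqlite3

# Hypothetical, simplified stand-in for the druid_segments table.
# is_deleted and last_update are the columns proposed in this thread.
SCHEMA = """
CREATE TABLE druid_segments (
    id          TEXT PRIMARY KEY,
    payload     TEXT NOT NULL,
    is_deleted  INTEGER NOT NULL DEFAULT 0,
    last_update INTEGER NOT NULL
)
"""


def upsert_segment(conn, seg_id, payload, now):
    """Insert or overwrite a segment row and stamp it with the current time."""
    conn.execute(
        "INSERT OR REPLACE INTO druid_segments "
        "(id, payload, is_deleted, last_update) VALUES (?, ?, 0, ?)",
        (seg_id, payload, now),
    )


def soft_delete_segment(conn, seg_id, now):
    """Flag the row instead of removing it, and bump last_update so the
    next incremental poll picks up the deletion."""
    conn.execute(
        "UPDATE druid_segments SET is_deleted = 1, last_update = ? WHERE id = ?",
        (now, seg_id),
    )


def poll_incremental(conn, state, watermark):
    """Pull only rows changed after `watermark`, apply them to the in-memory
    `state` dict, and return the new watermark."""
    rows = conn.execute(
        "SELECT id, payload, is_deleted, last_update FROM druid_segments "
        "WHERE last_update > ? ORDER BY last_update",
        (watermark,),
    ).fetchall()
    for seg_id, payload, is_deleted, last_update in rows:
        if is_deleted:
            state.pop(seg_id, None)   # deletion seen via the flag
        else:
            state[seg_id] = payload   # new or updated segment
        watermark = max(watermark, last_update)
    return watermark


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    state, watermark = {}, 0

    upsert_segment(conn, "seg_a", "payload_a", now=1)
    upsert_segment(conn, "seg_b", "payload_b", now=2)
    watermark = poll_incremental(conn, state, watermark)
    print(sorted(state), watermark)   # ['seg_a', 'seg_b'] 2

    soft_delete_segment(conn, "seg_a", now=3)
    watermark = poll_incremental(conn, state, watermark)
    print(sorted(state), watermark)   # ['seg_b'] 3
```

The strict `last_update > ?` comparison assumes a timestamp never shows up
again after it has been polled. A real implementation would probably need to
re-read a small overlap window and apply rows idempotently, and would want an
index on last_update so the filter actually avoids the full scan.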

On Tue, Apr 6, 2021 at 10:48 AM Tijo Thomas <tijothoma...@gmail.com> wrote:

> Abhishek,
> Good point.  Do we need one more column to store whether it's deleted or not?
>
> On Tue, Apr 6, 2021 at 4:32 PM Abhishek Agarwal <abhishek.agar...@imply.io> wrote:
>
> > If an entry is deleted from the metadata, how is the coordinator going to
> > update its own state?
> >
> > On Tue, Apr 6, 2021 at 3:38 PM Itai Yaffe <itai.ya...@gmail.com> wrote:
> >
> > > Hey,
> > > I'm not a Druid developer, so it's quite possible I'm missing many
> > > considerations here, but at first glance I like your proposal, as it
> > > resembles the *tsColumn* in JDBC lookups
> > > (https://druid.apache.org/docs/latest/development/extensions-core/lookups-cached-global.html#jdbc-lookup).
> > >
> > > Anyway, just my 2 cents.
> > >
> > > Thanks!
> > >           Itai
> > >
> > > On Tue, Apr 6, 2021 at 6:07 AM Benedict Jin <asdf2...@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Recently, the Coordinator in our company's Druid cluster has hit a
> > > > performance bottleneck when pulling metadata. The main reason is the
> > > > huge amount of metadata, which makes scanning the full table in the
> > > > metadata storage and deserializing the metadata very slow. The size
> > > > of the full metadata has been reduced through TTL, Compaction,
> > > > Rollup, and so on, but the effect is not very significant.
> > > >
> > > > Therefore, I want to design a scheme for the Coordinator to pull
> > > > metadata incrementally, that is, each time the Coordinator only
> > > > pulls newly added metadata, so as to reduce the query pressure on
> > > > the metadata storage and the cost of deserializing metadata.
> > > >
> > > > The general idea is to add a column last_update to the
> > > > druid_segments table to record the update time of each record. Then,
> > > > when we query the metadata table, we can add filter conditions on
> > > > the last_update column to avoid full table scans. Moreover, whether
> > > > the metadata storage is MySQL or PostgreSQL, it can support
> > > > automatic update of the timestamp field, which is somewhat similar
> > > > to how triggers behave.
> > > >
> > > > So, have you encountered this problem before? If so, how did you
> > > > solve it? In addition, do you have any suggestions or comments on
> > > > the incremental metadata pull described above? Please let me know,
> > > > thanks a lot.
> > > >
> > > > Regards,
> > > > Benedict Jin
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > > > For additional commands, e-mail: dev-h...@druid.apache.org
> > > >
> > > >
> > >
> >
>
>
> --
> Thanks & Regards
> Tijo Thomas
>
