It's not that extreme; the deadline is several weeks. However, a big segment
can stay around for a long time if there are not enough updates for the
merge policy to pick it up.

On Tue, Nov 28, 2023, 17:14 Dongyu Xu <dongyu...@hotmail.com> wrote:

> What is the expected grace period for a data-deletion request to take
> effect?
>
> I'm not an expert on the policy, but I think something like "I need my
> data to be gone in the next 2 seconds" is unreasonable.
>
> Tony X
>
> ------------------------------
> *From:* Robert Muir <rcm...@gmail.com>
> *Sent:* Tuesday, November 28, 2023 11:52 AM
> *To:* dev@lucene.apache.org <dev@lucene.apache.org>
> *Subject:* Re: GDPR compliance
>
> I don't think there's any problem with GDPR, and I don't think users
> should be running unnecessary "optimize". GDPR just says data should
> be erased without "undue" delay. Waiting for a merge to nuke the
> deleted docs isn't "undue"; there is a good reason for it.
>
> On Tue, Nov 28, 2023 at 2:40 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> >
> > Hi Folks,
> > At LinkedIn we need to comply with GDPR for a large part of our data,
> > and an important part of that is being sure we have completely deleted
> > the data a user asked us to delete within a certain period of time.
> > The approach we have come up with so far is to:
> > 1. Record the segment creation time somewhere (not decided yet; maybe
> >    index commit userinfo, maybe some other place outside of Lucene).
> > 2. Create a new merge policy which delegates most operations to a
> >    normal MP, like TieredMergePolicy, and then adds extra single-segment
> >    merges (merging 1 segment into 1 segment, basically only applying
> >    deletions) if it finds any segment that is about to violate the GDPR
> >    time frame.
> >
> > So here are my questions:
> > 1. Is there a better/existing way to do this?
> > 2. I would like to contribute such a merge policy directly to Lucene,
> >    since I think GDPR is more or less a common concern. Do people feel
> >    it's necessary or not?
> > 3. It would also be nice if IndexWriter could store the segment creation
> >    time in the index directly (maybe write it to SegmentInfo?). I can
> >    try to do that, but would like to ask whether there are any
> >    objections?
> >
> > Best
> > Patrick
>
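
For illustration only, the age check behind step 2 of the original proposal
could be sketched roughly as below. This is a plain-Java sketch with no Lucene
dependency: SegmentAge, DeletionDeadlineCheck and segmentsNeedingRewrite are
hypothetical names, not Lucene APIs, and the deadline and safety margin are
assumed values. It only decides which segments are due; a real merge policy
would still have to turn that list into single-segment merges.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: decides which segments are due for a deletes-only rewrite. */
final class DeletionDeadlineCheck {

    /** Hypothetical stand-in for per-segment metadata; not a Lucene class. */
    static final class SegmentAge {
        final String name;
        final Instant createdAt;
        final int deletedDocs;

        SegmentAge(String name, Instant createdAt, int deletedDocs) {
            this.name = name;
            this.createdAt = createdAt;
            this.deletedDocs = deletedDocs;
        }
    }

    private DeletionDeadlineCheck() {}

    /**
     * Returns the names of segments that still carry deleted docs and whose age
     * is at least (deadline - safetyMargin). Segment creation time is used as a
     * conservative bound: a delete can only land in a segment after that
     * segment was created.
     */
    static List<String> segmentsNeedingRewrite(
            List<SegmentAge> segments, Instant now, Duration deadline, Duration safetyMargin) {
        // Segments created at or before this instant are too close to the deadline.
        Instant cutoff = now.minus(deadline).plus(safetyMargin);
        List<String> due = new ArrayList<>();
        for (SegmentAge segment : segments) {
            boolean hasDeletes = segment.deletedDocs > 0;
            boolean tooOld = !segment.createdAt.isAfter(cutoff);
            if (hasDeletes && tooOld) {
                due.add(segment.name);
            }
        }
        return due;
    }
}
```

With an assumed 30-day deadline and a 5-day margin, a segment created 27 days
ago that holds deleted docs would be selected, while a segment with no deletes
or one created 2 days ago would be left to the normal merge policy.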
