On Thu, 10 Sep 2020 at 13:46, Daniel Gruno <[email protected]> wrote:
>
> On 10/09/2020 14.44, sebb wrote:
> > On Thu, 10 Sep 2020 at 13:23, Daniel Gruno <[email protected]> wrote:
> >>
> >> On 10/09/2020 14.15, sebb wrote:
> >>> On Thu, 10 Sep 2020 at 12:32, Daniel Gruno <[email protected]> wrote:
> >>>>
> >>>> On 10/09/2020 13.25, sebb wrote:
> >>>>> Migration to Foal will be a huge job for some installations.
> >>>>>
> >>>>> Whilst hopefully all snags will have been ironed out of any conversion
> >>>>> tool before it is deployed in earnest, it's possible that some edge
> >>>>> cases will cause issues, and will need subsequent adjustment.
> >>>>
> >>>> Short of ironing out a standard for DKIM_ID, the migration tests I've
> >>>> done have gone relatively well. There were IIRC a few snags, most
> >>>> related to the ES 7.8.1 lib, but once I got migration started, it worked
> >>>> as intended and everything on the new ES server was compatible. If we
> >>>> could somehow get a migration test running on travis or such, that would
> >>>> be ideal - but that is quite tricky - we'd have to maybe dockerize two
> >>>> containers - one with old pony, one with foal, and then test migrating
> >>>> across and checking that each document is obtainable.
> >>>
> >>> What tests are planned for checking migration?
> >>>
> >>>>>
> >>>>> To this end, I think it will be essential to know which records have
> >>>>> been migrated, and which version of the software was used to do so (as
> >>>>> well as the date).
> >>>>>
> >>>>> It may be worth including version and timestamp info in the direct
> >>>>> archive and imports as well.
> >>>>
> >>>> Do you mean adding a key/value to the migrated doc with a migration
> >>>> note? That wouldn't be a bad idea, if nothing else, to keep score of
> >>>> what was migrated and what's new.
> >>>
> >>> Something like that.
> >>>
> >>> I think the data needs to be flexible and allow for multiple notes.
> >>> It won't always be sufficient to record the last change to the data.
> >>
> >> Yes, one wondrous thing about ES is a text field can be either text or an
> >> array of texts, so you can have one note or multiple notes, and it'll
> >> just work. I'm thinking of just having a "notes" field where we can put
> >> entries.
> >
> > Does that automatically append new entries, or does the user have to
> > amend the record to ensure previous entries are not lost?
>
> What I do right now is fetch the doc, ensure 'notes' is a list, then
> append new notes to it and save the entire doc.

i.e. care must be taken not to lose existing info.
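
To make that concrete, here is a minimal sketch of the fetch/normalise/append step as I understand it. It works on a plain dict; the actual fetch from and save to ES are left out, and the helper name append_note is made up for illustration, not something in the code base.

```python
def append_note(doc: dict, note: str) -> dict:
    """Return a copy of doc with note appended to its 'notes' list.

    Hypothetical helper: 'doc' stands in for a document already fetched
    from the index; saving the result back is the caller's job.
    """
    existing = doc.get("notes")
    if existing is None:
        notes = []                 # no notes recorded yet
    elif isinstance(existing, str):
        notes = [existing]         # a lone text value becomes a one-item list
    else:
        notes = list(existing)     # copy so the caller's list is not mutated
    notes.append(note)
    updated = dict(doc)
    updated["notes"] = notes
    return updated


doc = {"subject": "hello", "notes": "imported"}
print(append_note(doc, "migrated")["notes"])
```

The earlier note survives whether it was stored as a single text or as a list. Two writers updating the same doc at the same time could still overwrite each other, so the save step probably needs some form of conflict check.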

> >
> > It would probably still be useful to have some fixed attributes such as
> > -archived-at
> > -imported-at
>
> That would be for archiver.py and import-mbox.py?

Yes, probably also need
-migrated-at
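
A rough sketch of what I have in mind for the fixed attributes. The attribute names are the ones above; storing a timestamp plus software version under each one is my assumption, as is the helper name stamp, so treat the layout as a suggestion only.

```python
from datetime import datetime, timezone

# Attribute names as proposed in this thread; the value layout is assumed.
FIXED_ATTRIBUTES = {
    "archived": "archived-at",
    "imported": "imported-at",
    "migrated": "migrated-at",
}


def stamp(doc: dict, event: str, version: str, when: datetime = None) -> dict:
    """Return a copy of doc recording when, and with which version, event happened."""
    if event not in FIXED_ATTRIBUTES:
        raise ValueError(f"unknown event: {event}")
    if when is None:
        when = datetime.now(timezone.utc)
    updated = dict(doc)
    updated[FIXED_ATTRIBUTES[event]] = {
        "timestamp": when.isoformat(),
        "version": version,
    }
    return updated


# Example with a fixed date and a placeholder version string.
migrated = stamp({"subject": "hello"}, "migrated", "0.1",
                 datetime(2020, 9, 10, tzinfo=timezone.utc))
print(migrated["migrated-at"])
```

Each attribute holds only the latest event of its kind, so it complements the free-form notes list rather than replacing it.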

> >
> >>>
> >>>>>
> >>>>> One possible application would be to back-fill attachments which were
> >>>>> originally ignored.
> >>>>
> >>>> This could be run as a background re-indexer perhaps? That grabs the
> >>>> source document, re-parses attachments, and if it contained more than
> >>>> originally thought, add them and update the email document.
> >>>
> >>> Yes, and marks the document somehow so it does not need to be scanned 
> >>> again.
> >>>
> >>> This is where the change context comes in.
> >>> If we knew which documents were created with which version of
> >>> software, it would be possible to know which ones did not need
> >>> processing.
> >>>
> >>>>>
> >>>>> S.
> >>>>>
> >>>>
> >>
>
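
On the re-indexing point, a sketch of how a recorded version could be used to skip documents that do not need scanning. The field name "software-version" and both function names are hypothetical, and it assumes plain dotted numeric versions.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted numeric version such as '0.10.2' into (0, 10, 2)."""
    return tuple(int(part) for part in text.split("."))


def needs_reprocessing(doc: dict, fixed_in: str,
                       field: str = "software-version") -> bool:
    """Return True if doc was written by software older than fixed_in.

    Documents with no recorded version are treated as needing a rescan,
    since there is no way to tell how they were created.
    """
    recorded = doc.get(field)
    if recorded is None:
        return True
    return parse_version(recorded) < parse_version(fixed_in)


print(needs_reprocessing({"software-version": "0.9.2"}, "0.10.0"))
```

Comparing parsed tuples rather than raw strings matters here: as text, "0.9.2" would sort after "0.10.0".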
