On 25/08/2020 21.24, sebb wrote:
On Tue, 25 Aug 2020 at 20:11, Daniel Gruno <[email protected]> wrote:
On 25/08/2020 20.54, sebb wrote:
On Tue, 25 Aug 2020 at 19:42, Daniel Gruno <[email protected]> wrote:
On 25/08/2020 20.35, sebb wrote:
On Tue, 25 Aug 2020 at 19:23, Daniel Gruno <[email protected]> wrote:
On 25/08/2020 20.15, sebb wrote:
AFAICT this will generate different hashes for the same message if
they are loaded from a different source.
Yeah, it will - at present, that is on purpose. We can look at doing
something like using Sean's DKIM parser for this, and only hashing the
output from that, with the x-archived-list-id added in from the command
line --lid argument if different from the canonical list id.
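A minimal sketch of the difference being discussed, assuming a content-based hash built from a fixed subset of headers plus the body and list id. This is not Sean's DKIM parser or the actual Foal generator: the names `HASHED_HEADERS`, `source_hash`, and `content_hash` are hypothetical, and the header subset is chosen for illustration only.

```python
import email
import email.policy
import hashlib

# Hypothetical header subset; a real DKIM-style generator may choose others.
HASHED_HEADERS = ("message-id", "from", "to", "subject")


def source_hash(raw: bytes) -> str:
    """Hash the raw source bytes: changes whenever the source copy changes."""
    return hashlib.sha256(raw).hexdigest()


def content_hash(raw: bytes, list_id: str) -> str:
    """Hash only selected headers, the list id, and the body."""
    msg = email.message_from_bytes(raw, policy=email.policy.default)
    digest = hashlib.sha256()
    for name in HASHED_HEADERS:
        # Collapse whitespace so header folding does not affect the hash.
        value = " ".join(str(msg.get(name, "")).split())
        digest.update(f"{name}:{value}\n".encode("utf-8"))
    digest.update(f"x-archived-list-id:{list_id}\n".encode("utf-8"))
    # Multipart messages return None here; this sketch only hashes
    # single-part bodies.
    body = msg.get_payload(decode=True) or b""
    digest.update(body.strip())
    return digest.hexdigest()


original = (
    b"Message-ID: <1@example.org>\n"
    b"From: a@example.org\n"
    b"Subject: hi\n"
    b"\n"
    b"body\n"
)
# The same message as seen from another source, with an extra trace header.
relayed = b"Received: from relay.example.org\n" + original

same_content = content_hash(original, "dev.example.org") == content_hash(
    relayed, "dev.example.org"
)
same_source = source_hash(original) == source_hash(relayed)
```

Here `same_source` is false while `same_content` is true, which is the trade-off above: hashing the source keeps distinct copies apart, while hashing normalized content lets actual duplicates be recognized.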
Whilst it should ensure that distinct messages don't clash, it won't
weed out actual duplicates.
Right, aware of that. In most cases, if you are reloading, you are doing
so with a fresh DB, and it won't matter much. In cases where you are
"cascading" mbox files, it would make duplicates, but for now that's
only a question of disk space. Having duplicate source files won't
cause malfunctions, just a few more bytes used and alternative sources.
This has implications for the API and the UI.
If there are multiple matches for a Permalink, in general one cannot
say which is correct, so all will have to be returned and displayed.
I'm pondering how to address this. Currently, the prototype will return
the first hit it finds that matches. This should really be fine, as they
are all valid sources, so returning one or the other would not matter
for the end-user.
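The two lookup behaviours under discussion can be sketched as follows. This is an in-memory illustration only: the class `SourceIndex` and its methods are hypothetical and do not reflect the prototype's actual storage or API.

```python
from collections import defaultdict


class SourceIndex:
    """Toy index mapping a Permalink to every stored source that matches it."""

    def __init__(self):
        self._by_permalink = defaultdict(list)

    def add(self, permalink: str, doc_id: str, source: str) -> None:
        self._by_permalink[permalink].append((doc_id, source))

    def first(self, permalink: str):
        """Return the first hit, or None if nothing matches."""
        hits = self._by_permalink.get(permalink)
        return hits[0] if hits else None

    def all(self, permalink: str):
        """Return every hit, so the caller can display all candidates."""
        return list(self._by_permalink.get(permalink, []))


index = SourceIndex()
index.add("abc123", "doc-1", "copy loaded from mbox A")
index.add("abc123", "doc-2", "copy loaded from mbox B")

one = index.first("abc123")
every = index.all("abc123")
```

Returning `first` is only safe if every hit really is the same message. If the Permalink is not unique enough, `all` is the only variant that cannot hide a different email behind the link.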
This assumes that the Permalink is sufficiently unique.
That is not true for some of the current designs.
This would be the case only if you lost your database and decided to
re-image everything from scratch using Foal with an older generator
instead of the original Pony Mail, and two or more emails had collisions.
I would strongly recommend against doing this unless you have no other
choice or do not care about older permalinks that much.
Foal is not meant as a drop-in replacement for the current Pony Mail. If
you lose your old database and want complete assuredness against this,
you should re-image using the old version first, and then migrate
across. There will be differences in both the archiver and the UI that
are not fully backwards compatible, as the 'old ways' are buggy here
and there.
The migrator will, once it's done, migrate everything over verbatim, so
any overrides you had in the old system will apply to the new one as
well, and you won't see multiple choices for old emails, only newly
archived ones done with the foal archiver or importer.
If Foal is to support non-unique generators, it must use their
Permalinks as the database Id, or it must support multiple matches.
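As a rough sketch of the first option, assuming the Permalink itself is the database Id, a store can at least tell a true duplicate apart from a collision between distinct messages. `PermalinkStore` and its return values are hypothetical names for illustration, not part of Foal.

```python
import hashlib


class PermalinkStore:
    """Toy store keyed by Permalink, one document per key."""

    def __init__(self):
        self._docs = {}

    def put(self, permalink: str, source: bytes) -> str:
        digest = hashlib.sha256(source).hexdigest()
        existing = self._docs.get(permalink)
        if existing is None:
            self._docs[permalink] = (digest, source)
            return "stored"
        if existing[0] == digest:
            # Same Permalink and same bytes: an actual duplicate.
            return "duplicate"
        # Same Permalink but different bytes: the generator is not unique.
        return "collision"


store = PermalinkStore()
results = [
    store.put("abc123", b"first message"),
    store.put("abc123", b"first message"),
    store.put("abc123", b"a different message"),
]
```

With a single key per Permalink, the third message has nowhere to go, which is why a non-unique generator forces either this layout with a collision policy or support for multiple matches.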
I'm strongly in favor of ripping them out of the system altogether, and
only supporting full and DKIM for future operations. I haven't quite
gotten around to it yet :)