On Tue, Dec 22, 2020 at 06:28:08AM +0000, Eric Wong wrote:
> Eric Wong <e...@80x24.org> wrote:
> >
> > There's scripts/ssoma-replay which was v1-only and dependent on
> > ssoma. I've been meaning to convert into something that reads
> > NNTP so it's not locked into public-inbox. Maybe it could be
> > part of `lei', too, for piping to arbitrary commands, dunno...

I wrote grok-pi-piper a while back for the purpose of piping from git
to patchwork.kernel.org. It's not complete yet, because we currently do
not handle situations with rewritten history, but it's been working
well enough. I have a write-up here:

https://people.kernel.org/monsieuricon/subscribing-to-lore-lists-with-grokmirror

What is the sanest way to recognize and handle history rewrites? Right
now, we just keep track of the latest tip hash. On each subsequent run,
we just iterate all commits between the recorded hash and the newest
tip.

My current thoughts are:

- in addition to the latest tip hash, keep track of author, authordate
  and message-id of the last processed message
- if we no longer find the tracked hash in the repo, use
  author+authordate to find the new hash of the latest message we
  processed, and verify with message-id
- if we cannot find the exact match (i.e. our latest processed message
  is gone from history), find the first commit that happens before our
  recorded authordate and use that as the "latest processed" jump-off
  point

This should do the right thing in most situations except for when the
message that was deleted from history was sent with a bogus Date:
header with a date in the future. In this case, we can miss valid
messages in the queue.

Any suggestions on how this can be improved?

-K
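
For illustration, the three-step fallback above could be sketched
roughly as follows. This is not grok-pi-piper code: the Commit record
and find_resume_index() are hypothetical names, and the sketch assumes
the commit metadata has already been read out of the repo into a list
ordered oldest first.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Commit:
    """Hypothetical record of the fields tracked per message commit."""
    hash: str
    author: str
    authordate: int  # seconds since the epoch
    message_id: str


def find_resume_index(history: List[Commit], last: Commit) -> int:
    """Return the index in `history` (oldest first) to treat as the
    "latest processed" commit, or -1 if everything must be replayed."""
    # Step 1: the tracked hash is still in the repo, nothing was rewritten.
    for i, c in enumerate(history):
        if c.hash == last.hash:
            return i

    # Step 2: history was rewritten. Look for the same message under a
    # new hash via author+authordate, and verify with message-id.
    for i, c in enumerate(history):
        if ((c.author, c.authordate) == (last.author, last.authordate)
                and c.message_id == last.message_id):
            return i

    # Step 3: the message is gone. Walk back from the tip and use the
    # first commit dated before the recorded authordate as the
    # jump-off point.
    for i in range(len(history) - 1, -1, -1):
        if history[i].authordate < last.authordate:
            return i

    return -1
```

Everything after the returned index would then be piped as usual. Step
3 keeps the weakness described above: if the deleted message carried a
Date: in the future, the walk stops at or near the tip and anything
still queued behind it is skipped.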