Re: duplicates due to X-TUID headers

2024-03-15 Thread Peter P.
* Bence Ferdinandy [2024-03-15 08:41]:
> On 15 Mar 2024 at 8:16:53, Evgeniy Berdnikov wrote:
> 
> > On Fri, Mar 15, 2024 at 07:42:32AM +0100, Peter P. wrote:
> > > Do you have any recommendation how I could de-duplicate messages with
> > > missing message-ids?
> > 
> > You had better remove them all.
> > 
> > for f in /path/to/mailbox/* ; do
> >   # delete every message file that has no Message-ID line
> >   if ! grep -qis '^Message-id:' "$f" ; then
> >     rm -f "$f"
> >   fi
> > done
> > --
> > Eugene Berdnikov
> 
> I think you can tell mail-deduplicate not to use Message-IDs when
> checking for duplicates:
> 
> https://github.com/kdeldycke/mail-deduplicate

Thank you Oswald, Evgeniy, Bence,

I will look at your recommendations and report back in a while.

best, Peter



___
isync-devel mailing list
isync-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/isync-devel


Re: duplicates due to X-TUID headers

2024-03-15 Thread Bence Ferdinandy

On 15 Mar 2024 at 8:16:53, Evgeniy Berdnikov wrote:
> On Fri, Mar 15, 2024 at 07:42:32AM +0100, Peter P. wrote:
> > Do you have any recommendation how I could de-duplicate messages with
> > missing message-ids?
>
> You had better remove them all.
>
> for f in /path/to/mailbox/* ; do
>   # delete every message file that has no Message-ID line
>   if ! grep -qis '^Message-id:' "$f" ; then
>     rm -f "$f"
>   fi
> done
> --
> Eugene Berdnikov

I think you can tell mail-deduplicate not to use Message-IDs when checking
for duplicates:

https://github.com/kdeldycke/mail-deduplicate
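
To illustrate what comparing messages without a Message-ID can look like,
here is a minimal sketch that builds a comparison key from a few other
header fields. It is not mail-deduplicate's actual algorithm or option set.
The function name header_key and the choice of Date, From and Subject are
assumptions made for illustration, as are LF line endings and the presence
of sha256sum (GNU coreutils). Folded continuation lines of long headers are
ignored.

```shell
# Print a digest built only from the Date, From and Subject header lines
# of one message file (hypothetical helper, not part of any tool).
header_key() {
  # sed stops at the first empty line, so only the header block is read;
  # sort makes the key independent of the order of the three fields.
  sed '/^$/q' "$1" \
    | grep -iE '^(Date|From|Subject):' \
    | sort \
    | sha256sum \
    | cut -d' ' -f1
}
```

Two files for which header_key prints the same digest are candidates for
being copies of one message; they still deserve a look before anything is
removed, since unrelated messages can share all three fields.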




Re: duplicates due to X-TUID headers

2024-03-15 Thread Evgeniy Berdnikov
On Fri, Mar 15, 2024 at 07:42:32AM +0100, Peter P. wrote:
> Do you have any recommendation how I could de-duplicate messages with
> missing message-ids?

 You had better remove them all.
 
 for f in /path/to/mailbox/* ; do
   # delete every message file that has no Message-ID line
   if ! grep -qis '^Message-id:' "$f" ; then
     rm -f "$f"
   fi
 done
-- 
 Eugene Berdnikov
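
Since the loop above deletes files on sight, a dry run that only lists the
affected messages may be a safer first step. The sketch below is one way to
do that. The function name list_missing_msgid is hypothetical, and it
assumes one message per file with LF line endings. Unlike a grep over the
whole file, it looks at the header block only, so a body line that happens
to start with "Message-ID:" does not count.

```shell
# Print the path of every regular file in directory $1 whose header block
# (everything before the first empty line) has no Message-ID field.
# Nothing is deleted.
list_missing_msgid() {
  for f in "$1"/* ; do
    [ -f "$f" ] || continue
    # sed quits at the first empty line, so the body is never searched
    if ! sed '/^$/q' "$f" | grep -qi '^Message-ID:' ; then
      printf '%s\n' "$f"
    fi
  done
}
```

For a Maildir it would be called once per subdirectory, for example
list_missing_msgid /path/to/mailbox/cur, and the output reviewed before any
rm is run.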




Re: duplicates due to X-TUID headers

2024-03-14 Thread Oswald Buddenhagen via isync-devel

On Fri, Mar 15, 2024 at 07:42:32AM +0100, Peter P. wrote:
> I re-checked and the duplicated messages are in fact lacking a message-id
> header. I don't know why and how they got removed.
>
> Do you have any recommendation how I could de-duplicate messages with
> missing message-ids?


in that case you can follow that guide you found - eliminate the x-tuids
and then do full-text (?) de-duplication. i didn't investigate the topic
myself, so i have no more specific advice.
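
A rough sketch of that approach, for illustration only: hash each message
with its X-TUID header line left out, then report files whose digest has
already been seen. The function names msg_digest and list_duplicates are
hypothetical. The sketch assumes one message per file, LF line endings, the
header spelled exactly "X-TUID:", file names without tab characters, and
sha256sum (GNU coreutils) being available. The messages themselves are not
modified.

```shell
# Print a SHA-256 digest of one message, ignoring X-TUID lines in the
# header block. Body lines are always kept, even if they start with
# "X-TUID:".
msg_digest() {
  awk 'body || !/^X-TUID:/ { print } /^$/ { body = 1 }' "$1" \
    | sha256sum | cut -d' ' -f1
}

# Print the second and later files in directory $1 that share a digest
# with an earlier file. Nothing is deleted.
list_duplicates() {
  for f in "$1"/* ; do
    [ -f "$f" ] || continue
    printf '%s\t%s\n' "$(msg_digest "$f")" "$f"
  done | awk -F '\t' 'seen[$1]++ { print $2 }'
}
```

This only catches copies that are byte-identical apart from X-TUID. Copies
that differ in any other header, such as Received, would not be reported.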




Re: duplicates due to X-TUID headers

2024-03-14 Thread Peter P.
* Oswald Buddenhagen via isync-devel [2024-03-13 12:57]:
> On Wed, Mar 13, 2024 at 12:40:21PM +0100, Peter P. wrote:
> > These duplicates are not detected by (neo)mutt's ~= filter since some
> > of these messages contain an X-TUID header, others do not.
> > 
> that's just wrong. mutt goes by the message-id header and nothing else.
Thanks for correcting me, Oswald!

I re-checked and the duplicated messages are in fact lacking a message-id
header. I don't know why and how they got removed.

Do you have any recommendation how I could de-duplicate messages with
missing message-ids?

Much appreciated,
Peter




Re: duplicates due to X-TUID headers

2024-03-13 Thread Oswald Buddenhagen via isync-devel

On Wed, Mar 13, 2024 at 12:40:21PM +0100, Peter P. wrote:
> These duplicates are not detected by (neo)mutt's ~= filter since some
> of these messages contain an X-TUID header, others do not.

that's just wrong. mutt goes by the message-id header and nothing else.

> I am interested how/why these headers got introduced, and how I can
> avoid this in the future.

these are generated by isync to be able to provide a transactional
guarantee (being able to reliably resume interrupted runs).

> Furthermore, some webpages suggest some sed magic[1] to remove these
> X-TUID headers to be able to remove duplicates.

that page uses a different de-duplication method which presumably relies
on textual identity.

> Will mbsync have any issues when these headers are removed, and
> duplicates are removed on the near side?

you can strip them out from your maildir if you feel like it, but it
will just wear down your ssd for no discernible gain.

