On Tue, Jun 3, 2025 at 2:43 AM Jason McIntyre <[email protected]> wrote:
>
> On Mon, Jun 02, 2025 at 01:42:50PM -0300, K R wrote:
> > Hi Jason,
> >
> > On Thu, May 29, 2025 at 11:08 AM Jason McIntyre <[email protected]> wrote:
> > >
> > > On Wed, May 28, 2025 at 05:22:55PM -0300, K R wrote:
> > > > >Synopsis:      fortune(6): fortunes2 file has duplicate entries
> > > > >Category:      system games
> > > > >Environment:
> > > >         System      : OpenBSD 7.7
> > > >         Details     : OpenBSD 7.7 (GENERIC) #0: Sun May  4 11:10:16 MDT 2025
> > > >
> > > > [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC
> > > >
> > > >         Architecture: OpenBSD.amd64
> > > >         Machine     : amd64
> > > > >Description:
> > > >
> > > >         There are 100+ entries in the fortunes2 file that are already
> > > >         present in the fortunes file.
> > > >
> > > > >How-To-Repeat:
> > > >
> > > >         cd /tmp
> > > >         cp /usr/share/games/fortune/{fortunes,fortunes2} .
> > > >         split -a 4 -p '^%$' fortunes fortunes.
> > > >         split -a 4 -p '^%$' fortunes2 fortunes2.
> > > >         sha256 fortunes.* > SHA256.fortunes
> > > >         sha256 fortunes2.* > SHA256.fortunes2
> > > >         # compare the two SHA256 files...
> > > >
> > > > >Fix:
> > > >         diff below removes the duplicate entries from fortunes2.
> > > >
> > > > Thanks,
> > > > --Kor
> > >
> > > hi.
> > >
> > > this methodology is too smart for me! if another obsd dev wants to
> > > confirm it's sound, i'd be happy to remove dups (or said dev could
> > > kindly take care of it themselves ;)
> >
> > Sorry for the delay.  You're right, the methodology can be simplified.
> >
> > Please find attached a simple Python script that detects duplicate
> > entries in fortune files.  It uses sets to do that.  Given two files,
> > file1 and file2, it detects entries in file2 already present in file1.
> > It warns to stderr and writes file2 to stdout with the duplicate
> > entries removed.
> >
> > Looking into other fortune(6) files, it's not only fortunes2 that has
> > duplicates:
> >
> > /usr/share/games/fortune/fortunes2: 104 dups, from fortunes
> > /usr/share/games/fortune/fortunes2-o: 73 dups, from fortunes-o
> > /usr/share/games/fortune/limerick: 10 dups, from fortunes
> > /usr/share/games/fortune/zippy: 8 dups, from fortunes
> >
> > I hope it helps.
> >
> > Thanks,
> > --Kor
> >
> > >
> > > jmc
> > >
>
> hi. thanks for the update. however it's not that i need a simpler way to
> check for dups - since i don't have the technical skill to verify that
> these tests are correct, and since i don't want to spend a ton of hours
> manually checking, what i need is a technical ok from an obsd dev that the
> methods are correct.
>
> in hindsight, i probably should have left this thread alone to sink or
> swim.
>
> one more thought - although there are obviously dups, it can be argued
> that this is correct. for example, why would you remove dups from limerick
> that are in fortunes? someone might want only limericks, and would not
> benefit from having them removed.

You have a point.

> so any such diff would have to take that into account. as far as i can
> see, the only dup removal diff that makes sense is to remove dups from
> fortunes and fortunes2.

Makes sense.  Please remember to also take into account their
"offensive" counterparts, fortunes-o and fortunes2-o -- they also
contain dups.
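
For anyone who wants to check the counts without the attachment, the
set-based approach could be sketched roughly as follows.  This is an
illustration, not the attached script: the names read_entries and
remove_dups are made up here, and it assumes that entries are separated
by lines containing only "%" and that two entries count as duplicates
only when their text is identical.

```python
import sys


def read_entries(text):
    """Split fortune-file text into entries (assumed separator: a line with only '%')."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    entries, current = [], []
    for line in lines:
        if line == "%":
            if current:  # ignore empty entries
                entries.append("\n".join(current) + "\n")
            current = []
        else:
            current.append(line)
    if current:  # last entry without a trailing '%'
        entries.append("\n".join(current) + "\n")
    return entries


def remove_dups(text1, text2):
    """Return (kept, dups): entries of text2 absent from text1, and those present."""
    seen = set(read_entries(text1))
    kept, dups = [], []
    for entry in read_entries(text2):
        (dups if entry in seen else kept).append(entry)
    return kept, dups


# Only act when exactly two file names are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    # latin-1 maps every byte to one character, so the data round-trips unchanged.
    with open(sys.argv[1], encoding="latin-1") as f1, \
         open(sys.argv[2], encoding="latin-1") as f2:
        kept, dups = remove_dups(f1.read(), f2.read())
    for entry in dups:
        print("duplicate entry:", entry.split("\n")[0], file=sys.stderr)
    out = "".join(entry + "%\n" for entry in kept)
    sys.stdout.buffer.write(out.encode("latin-1"))
```

Saved under a hypothetical name such as dedup.py, it would be run as
"python3 dedup.py fortunes fortunes2 > fortunes2.new", and likewise for
fortunes-o and fortunes2-o.  Note that it writes a "%" line after every
kept entry, which may differ from the original file's trailing
separator.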

Thanks again,
--Kor

>
> jmc
>
