Re: Comparing text strings

2021-04-18 Thread dn via Python-list
On 14/04/2021 04.05, Mats Wichmann wrote:
> On 4/12/21 5:11 PM, Rich Shepard wrote:
>> I'm running Slackware64-14.2 and keep a list of installed packages.
>> When a
>> package is upgraded I want to remove the earlier version, and I've not
>> before written a script like this. Could there be a module or tool that
>> already exists to do this? If not, which string function would be best
>> suited to the task?
>>
>> Here's an example:
>> atftp-0.7.2-x86_64-2_SBo.tgz
>> atftp-0.7.4-x86_64-1_SBo.tgz

The 'trick' here is to understand how the distro handles versioning (and
multiple architectures, etc) and then to split the long name into
components before comparison, simplified example:

only relevant comparison if x86_64 == x86_64, then
atftp ? atftp == same, and thus
0.7.2 ? 0.7.4 => version update
(perhaps)


>> and there are others like this. I want the python3 script to remove the
>> first one. Tools like 'find' or 'sort -u' won't work because
>> while the
>> file name is the same the version or build numbers differ.
> 
> Yes, you've identified why this is hard: package versioning takes many
> forms.  As suggested elsewhere, for Linux distribution packages, the
> only reliable approach is to lean on the distro's packaging
> infrastructure in some way, because those version strings (plus package
> metadata which may have "replaces" or "obsoletes" or some similar
> information) all have a defined meaning to *it* - it's the intended
> audience.
> 
> Don't know if Slack exposes this information in some way, it may be hard
> to make a reliable script if not. I know Debian actually does what
> you're looking for as a feature of the packaging system (apt-get
> autoclean), and the Fedora/RedHat universe does not, so I've also looked
> for what you're looking for :)


Not a sand-box I've played in. However, dnf is at least partly written
in Python (it may still employ the older rpm, even yum, code).

Maybe the OP could learn from, or even piggy-back off, the existing code?
(which may be at https://github.com/rpm-software-management/dnf)
-- 
Regards,
=dn
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparing text strings

2021-04-18 Thread Peter Pearson
On Sun, 18 Apr 2021 06:38:16 GMT, Gilmeh Serda wrote:
> On Mon, 12 Apr 2021 16:11:21 -0700, Rich Shepard wrote:
>
>> All suggestions welcome.
>
> Assuming you want to know which is the oldest version and that the same 
> scheme is used all the time, could this work?
>
> >>> s1='atftp-0.7.2-x86_64-2_SBo.tgz'
> >>> s2='atftp-0.7.4-x86_64-1_SBo.tgz'
> >>> s1>s2
> False
> >>> s2>s1
> True
[snip]

However, beware:

>>> s2='atftp-0.7.4-x86_64-1_SBo.tgz'
>>> s3='atftp-0.7.10-x86_64-1_SBo.tgz'
>>> s2>s3
True
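
The comparison above goes wrong because strings compare character by
character, so "1" in "10" is weighed against "4". A minimal sketch of one
way around that, assuming it is acceptable to treat every run of digits as
an integer (natural_key is a made-up helper name, not something from the
standard library):

```python
import re

def natural_key(name):
    """Split a string into text and digit runs; digit runs become ints."""
    # re.split with a capturing group keeps the digit runs, and they always
    # land at odd indexes, so text is only ever compared with text.
    parts = re.split(r'(\d+)', name)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

s2 = 'atftp-0.7.4-x86_64-1_SBo.tgz'
s3 = 'atftp-0.7.10-x86_64-1_SBo.tgz'

print(s2 > s3)                            # plain string comparison
print(natural_key(s2) > natural_key(s3))  # digit runs compared as numbers
```

This only addresses the 0.7.4 versus 0.7.10 case; it knows nothing about
suffixes such as rc2 or about the distro's own versioning rules.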

-- 
To email me, substitute nowhere->runbox, invalid->com.


Re: Comparing text strings

2021-04-13 Thread Rich Shepard

On Tue, 13 Apr 2021, jak wrote:


> If I understand your problem correctly, the problem would be dealing with
> numbers as such in file names. This is just a rough sketch, but it might
> help you. This example splits each filename into strings and numbers,
> stores them in a tuple, appends the tuple to a list, and then sorts the list:


jak,

Yes, it would be the version and build numbers that differ when two files
have the same initial string (application name).

Thanks very much,

Rich




Re: Comparing text strings

2021-04-13 Thread jak

Il 13/04/2021 01:11, Rich Shepard ha scritto:

> I'm running Slackware64-14.2 and keep a list of installed packages. When a
> package is upgraded I want to remove the earlier version, and I've not
> before written a script like this. Could there be a module or tool that
> already exists to do this? If not, which string function would be best
> suited to the task?
>
> Here's an example:
> atftp-0.7.2-x86_64-2_SBo.tgz
> atftp-0.7.4-x86_64-1_SBo.tgz
>
> and there are others like this. I want the python3 script to remove the
> first one. Tools like 'find' or 'sort -u' won't work because while the
> file name is the same the version or build numbers differ.
>
> All suggestions welcome.
>
> Rich



If I understand your problem correctly, the problem would be dealing
with numbers as such in file names. This is just a rough sketch, but it
might help you. This example splits each filename into strings and
numbers, stores them in a tuple, appends the tuple to a list, and then
sorts the list:


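files = ['atftp-0.7.4-x86_64-2_SBo.tgz', 'atftp-0.7.2-x86_64-1_SBo.tgz']
digit = None    # True while the current run of characters is digits
da = ''         # accumulator for the current run
tfn = tuple()   # pieces of the current filename
ltafn = list()  # one tuple per filename
for f in files:
    for c in f:
        if str(c).isdigit():
            if not digit:
                # a text run has just ended: store it as a string
                if len(da) > 0:
                    tfn += (da,)
                    da = ''
                digit = True
                da += c
            else:
                da += c
        else:
            if digit:
                # a digit run has just ended: store it as an int
                if len(da) > 0:
                    tfn += (int(da),)
                    da = ''
                digit = False
                da += c
            else:
                da += c
    # flush whatever is left at the end of the filename
    if len(da) > 0:
        if da.isdigit():
            tfn += (int(da),)
        else:
            tfn += (da,)
        da = ''
    ltafn += [tfn, ]
    tfn = ()

# tuples compare piece by piece, so the numbers compare as numbers
ltafn.sort()

for t in files:
    print(t)
print()
for t in ltafn:
    nn = ''
    for f in t:
        nn += str(f)
    print(nn)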


Re: Comparing text strings

2021-04-13 Thread Grant Edwards
On 2021-04-12, 2qdxy4rzwzuui...@potatochowder.com 
<2qdxy4rzwzuui...@potatochowder.com> wrote:

> I don't know whether or how Slackware handles "compound" package names
> (e.g., python-flask), but at some point, you're going to have to pull
> apart (aka---gasp--parse), the package names to come up with the name
> itself and the version (and the architecture, etc.), compare the name
> parts to find the "duplicates," and then compare the versions (and maybe
> the build number) to eliminate the lesser ones.
>
> You'll also have to watch for the transitions from, say, 0.9 to 0.10, or
> from 1.0rc2 to 1.0, which may not sort simply.

Indeed. Sorting version numbers is very dark magic, and you'll need to
know way more than you want to about the version numbering conventions
in your distro. There may be all sorts of exceptions and special cases
(like handling an rcNN version-number suffix as mentioned above).

--
Grant





Re: Comparing text strings

2021-04-13 Thread Mats Wichmann

On 4/12/21 5:11 PM, Rich Shepard wrote:

> I'm running Slackware64-14.2 and keep a list of installed packages. When a
> package is upgraded I want to remove the earlier version, and I've not
> before written a script like this. Could there be a module or tool that
> already exists to do this? If not, which string function would be best
> suited to the task?
>
> Here's an example:
> atftp-0.7.2-x86_64-2_SBo.tgz
> atftp-0.7.4-x86_64-1_SBo.tgz
>
> and there are others like this. I want the python3 script to remove the
> first one. Tools like 'find' or 'sort -u' won't work because while the
> file name is the same the version or build numbers differ.


Yes, you've identified why this is hard: package versioning takes many 
forms.  As suggested elsewhere, for Linux distribution packages, the 
only reliable approach is to lean on the distro's packaging 
infrastructure in some way, because those version strings (plus package 
metadata which may have "replaces" or "obsoletes" or some similar 
information) all have a defined meaning to *it* - it's the intended 
audience.


Don't know if Slack exposes this information in some way, it may be hard 
to make a reliable script if not. I know Debian actually does what 
you're looking for as a feature of the packaging system (apt-get 
autoclean), and the Fedora/RedHat universe does not, so I've also looked 
for what you're looking for :)




Re: Comparing text strings

2021-04-13 Thread Rich Shepard

On Tue, 13 Apr 2021, Cameron Simpson wrote:


>> The problem is not that simple. Sometimes the package maintainer upgrades
>> the package for the same version number so there could be abc-1.0_1_SBo.tgz
>> and abc-1.0_2_SBo.tgz. The more involved route will be taken.
>
> If that _1, _2 thing is like RedHat's -1, -2 suffixes for later releases
> of the same source package version, you could just include that in the
> version string. Looks like it counts up.


Cameron,

There are two variables: the application/tool version number and the build
number. They're both embedded within the filename string. I don't know in
advance the build numbers if it's that variable that's changed.

The list of installed files is (currently) less than 500 lines so a
character-by-character comparison when two rows begin with the same string
will not take long.

Thanks again,

Rich


Re: Comparing text strings

2021-04-12 Thread Cameron Simpson
On 12Apr2021 19:11, Rich Shepard  wrote:
>On Tue, 13 Apr 2021, Cameron Simpson wrote:
>>Alternatively, and now that I think about it, more simply: _if_ the
>>package files can be sorted by version, then all you need to do is read a
>>sorted listing and note the latest file for a particular package. If you
>>need another one, it should be newer and you can remove the "known"
>>package file, and update your record to the new one.
>
>The problem is not that simple. Sometimes the package maintainer upgrades
>the package for the same version number so there could be abc-1.0_1_SBo.tgz
>and abc-1.0_2_SBo.tgz. The more involved route will be taken.

If that _1, _2 thing is like RedHat's -1, -2 suffixes for later releases 
of the same source package version, you could just include that in the 
version string. Looks like it counts up.

Cheers,
Cameron Simpson 


Re: Comparing text strings

2021-04-12 Thread Rich Shepard

On Tue, 13 Apr 2021, Cameron Simpson wrote:


> I do not know if there are preexisting modules/tools for this, but I
> recommend looking at slackware's package management tool - they usually
> have some kind of "clean" operation to purge "old" package install files.
> Sometimes that purges all the install files, not just the obsolete ones,
> so take care.


Cameron,

slackpkg clean removes all non-core distribution files. That's how I FUBAR'd
my system a couple of months ago.


> If you're writing a script, what you want to do is read the names and
> extract the version information from them, and also the "package name" -
> the bit which identifies the package and does not change with an upgrade.


Yes, that's the approach I would take.


> I would then make a dict mapping package names to a list of versions
> and/or the full "...tgz" names above. (Actually, use a
> defaultdict(list), it will avoid a lot of tedious mucking about.)


Okay. That didn't occur to me.


> Alternatively, and now that I think about it, more simply: _if_ the
> package files can be sorted by version, then all you need to do is read a
> sorted listing and note the latest file for a particular package. If you
> need another one, it should be newer and you can remove the "known"
> package file, and update your record to the new one.


The problem is not that simple. Sometimes the package maintainer upgrades
the package for the same version number so there could be abc-1.0_1_SBo.tgz
and abc-1.0_2_SBo.tgz. The more involved route will be taken.

Thanks!

Rich




Re: Comparing text strings

2021-04-12 Thread Chris Angelico
On Tue, Apr 13, 2021 at 9:54 AM Cameron Simpson  wrote:
> Note that this depends on sorting by version. A lexical sort (eg
> "ls|sort") will look good until a package version crosses a boundary
> like this:
>
> 1.9.1
> 1.10.0
>
> A lexical sort will put those the other way around because "9" > "1".
> Wrongness will ensue.
>

GNU sort has a -V option to sort by version numbers. I don't know how
well that'd handle other tags though.

ChrisA


Re: Comparing text strings

2021-04-12 Thread Cameron Simpson
On 12Apr2021 16:11, Rich Shepard  wrote:
>I'm running Slackware64-14.2 and keep a list of installed packages. When a
>package is upgraded I want to remove the earlier version, and I've not
>before written a script like this. Could there be a module or tool that
>already exists to do this? If not, which string function would be best
>suited to the task?
>
>Here's an example:
>atftp-0.7.2-x86_64-2_SBo.tgz
>atftp-0.7.4-x86_64-1_SBo.tgz
>
>and there are others like this. I want the python3 script to remove the
>first one. Tools like 'find' or 'sort -u' won't work because while the
>file name is the same the version or build numbers differ.

I do not know if there are preexisting modules/tools for this, but I 
recommend looking at slackware's package management tool - they usually 
have some kind of "clean" operation to purge "old" package install 
files. Sometimes that purges all the install files, not just the 
obsolete ones, so take care.

If you're writing a script, what you want to do is read the names and 
extract the version information from them, and also the "package name" - 
the bit which identifies the package and does not change with an 
upgrade.

I would then make a dict mapping package names to a list of versions 
and/or the full "...tgz" names above. (Actually, use a 
defaultdict(list), it will avoid a lot of tedious mucking about.)
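
A rough sketch of that idea, assuming the file names always end in
-version-arch-build.tgz so that splitting from the right on the last three
hyphens leaves the package name intact (even a hyphenated one). The helper
names parse, numeric_key and superseded are made up for illustration, the
python-flask entry is an invented example, and the sketch only returns
names; it deletes nothing:

```python
from collections import defaultdict
import re

def parse(filename):
    """Return (name, version, arch, build) for 'name-version-arch-build.tgz'."""
    stem = filename.rsplit('.', 1)[0]          # drop the extension
    name, version, arch, build = stem.rsplit('-', 3)
    return name, version, arch, build

def numeric_key(text):
    """Compare digit runs as integers, everything else as text."""
    parts = re.split(r'(\d+)', text)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

def superseded(filenames):
    """Return every file that is not the newest for its (name, arch)."""
    groups = defaultdict(list)
    for fn in filenames:
        name, version, arch, build = parse(fn)
        key = (numeric_key(version), numeric_key(build), fn)
        groups[(name, arch)].append(key)
    old = []
    for entries in groups.values():
        entries.sort()                         # newest ends up last
        old.extend(fn for _, _, fn in entries[:-1])
    return old

files = [
    'atftp-0.7.2-x86_64-2_SBo.tgz',
    'atftp-0.7.4-x86_64-1_SBo.tgz',
    'atftp-0.7.10-x86_64-1_SBo.tgz',
    'python-flask-1.1.2-x86_64-1_SBo.tgz',
]
print(superseded(files))
```

The build field is part of the sort key, so two files with the same version
but different build numbers are still ordered. Pre-release suffixes are not
handled here.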

Alternatively, and now that I think about it, more simply: _if_ the 
package files can be sorted by version, then all you need to do is read 
a sorted listing and note the latest file for a particular package. If 
you need another one, it should be newer and you can remove the "known" 
package file, and update your record to the new one.

Note that this depends on sorting by version. A lexical sort (eg 
"ls|sort") will look good until a package version crosses a boundary 
like this:

1.9.1
1.10.0

A lexical sort will put those the other way around because "9" > "1".  
Wrongness will ensue.

Cheers,
Cameron Simpson 


Re: Comparing text strings

2021-04-12 Thread 2QdxY4RzWzUUiLuE
On 2021-04-12 at 16:11:21 -0700,
Rich Shepard  wrote:

> I'm running Slackware64-14.2 and keep a list of installed packages. When a
> package is upgraded I want to remove the earlier version, and I've not
> before written a script like this. Could there be a module or tool that
> already exists to do this? If not, which string function would be best
> suited to the task?
> 
> Here's an example:
> atftp-0.7.2-x86_64-2_SBo.tgz
> atftp-0.7.4-x86_64-1_SBo.tgz
> 
> and there are others like this. I want the python3 script to remove the
> first one. Tools like 'find' or 'sort -u' won't work because while the
> file name is the same the version or build numbers differ.
> 
> All suggestions welcome.

The trick to this is to define "one" in the phrase "first one."  What
makes a package name a first one, or a second one?  Is it the prefix
before the first hyphen?  Always?  Does Slackware define all the
possible bits, pieces, and permutations of package names, versions,
builds, etc.?

I don't know whether or how Slackware handles "compound" package names
(e.g., python-flask), but at some point, you're going to have to pull
apart (aka---gasp--parse), the package names to come up with the name
itself and the version (and the architecture, etc.), compare the name
parts to find the "duplicates," and then compare the versions (and maybe
the build number) to eliminate the lesser ones.

You'll also have to watch for the transitions from, say, 0.9 to 0.10, or
from 1.0rc2 to 1.0, which may not sort simply.

So to answer your question, a string function like split is a good
start.  IMO, if you think about the edge cases early, you'll have a
better chance at not being bitten by them.
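
To make those two edge cases concrete, here is a small sketch of a version
key. It assumes a made-up convention, not anything Slackware documents:
dotted integers, optionally followed by a, b or rc plus a number, with any
such pre-release sorting before the plain release. version_key is a
hypothetical helper name.

```python
import re

# Dotted integers, then an optional pre-release tag such as rc2.
_VERSION = re.compile(r'(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?')

def version_key(version):
    """Build a sortable key; raise ValueError on anything unexpected."""
    m = _VERSION.fullmatch(version)
    if m is None:
        raise ValueError(f'unrecognised version: {version!r}')
    release = tuple(int(n) for n in m.group(1).split('.'))
    if m.group(2) is None:
        stage = (1,)                              # final release
    else:
        stage = (0, m.group(2), int(m.group(3)))  # pre-release sorts first
    return release, stage

versions = ['1.0', '0.10', '1.0rc2', '0.9']
print(sorted(versions, key=version_key))
```

Raising on an unrecognised version is deliberate: it surfaces the edge
cases early instead of silently sorting them into the wrong place.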


Comparing text strings

2021-04-12 Thread Rich Shepard

I'm running Slackware64-14.2 and keep a list of installed packages. When a
package is upgraded I want to remove the earlier version, and I've not
before written a script like this. Could there be a module or tool that
already exists to do this? If not, which string function would be best
suited to the task?

Here's an example:
atftp-0.7.2-x86_64-2_SBo.tgz
atftp-0.7.4-x86_64-1_SBo.tgz

and there are others like this. I want the python3 script to remove the
first one. Tools like 'find' or 'sort -u' won't work because while the
file name is the same the version or build numbers differ.

All suggestions welcome.

Rich
