On 2022-06-08, 2qdxy4rzwzuui...@potatochowder.com <2qdxy4rzwzuui...@potatochowder.com> wrote:
> On 2022-06-09 at 04:15:46 +1000,
> Chris Angelico <ros...@gmail.com> wrote:
>
>> On Thu, 9 Jun 2022 at 04:14, <2qdxy4rzwzuui...@potatochowder.com> wrote:
>> >
>> > On 2022-06-09 at 03:18:56 +1000,
>> > Chris Angelico <ros...@gmail.com> wrote:
>> >
>> > > On Thu, 9 Jun 2022 at 03:15, <2qdxy4rzwzuui...@potatochowder.com> wrote:
>> > > >
>> > > > On 2022-06-08 at 08:07:40 -0000,
>> > > > De ongekruisigde <ongekruisi...@news.eternal-september.org> wrote:
>> > > >
>> > > > > Depending on the problem a regular expression may be the much simpler
>> > > > > solution. I love them for e.g. text parsing and use them all the time.
>> > > > > Unrivaled when e.g. parts of text have to be extracted, e.g. from lines
>> > > > > like these:
>> > > > >
>> > > > > root:x:0:0:System administrator:/root:/run/current-system/sw/bin/bash
>> > > > > dhcpcd:x:995:991::/var/empty:/run/current-system/sw/bin/nologin
>> > > > > nm-iodine:x:996:57::/var/empty:/run/current-system/sw/bin/nologin
>> > > > > avahi:x:997:996:avahi-daemon privilege separation user:/var/empty:/run/current-system/sw/bin/nologin
>> > > > > sshd:x:998:993:SSH privilege separation user:/var/empty:/run/current-system/sw/bin/nologin
>> > > > > geoclue:x:999:998:Geoinformation service:/var/lib/geoclue:/run/current-system/sw/bin/nologin
>> > > > >
>> > > > > Compare a regexp solution like this:
>> > > > >
>> > > > > >>> g = re.search(r'([^:]*):([^:]*):(\d+):(\d+):([^:]*):([^:]*):(.*)$', s)
>> > > > > >>> print(g.groups())
>> > > > > ('geoclue', 'x', '999', '998', 'Geoinformation service',
>> > > > > '/var/lib/geoclue', '/run/current-system/sw/bin/nologin')
>> > > > >
>> > > > > to the code one would require to process it manually, with all the edge
>> > > > > cases. The regexp surely reads much simpler (?).
>> > > >
>> > > > Uh...
>> > > >
>> > > > >>> import pwd # https://docs.python.org/3/library/pwd.html
>> > > > >>> [x for x in pwd.getpwall() if x[0] == 'geoclue']
>> > > > [pwd.struct_passwd(pw_name='geoclue', pw_passwd='x', pw_uid=992,
>> > > > pw_gid=992, pw_gecos='Geoinformation service',
>> > > > pw_dir='/var/lib/geoclue', pw_shell='/sbin/nologin')]
>> > >
>> > > That's great if the lines are specifically coming from your system's
>> > > own /etc/passwd, but not so much if you're trying to compare passwd
>> > > files from different systems, where you simply have the files
>> > > themselves.
>> >
>> > In addition to pwent to get specific entries from the local password
>> > database, POSIX has fpwent to get a specific entry from a stream that
>> > looks like /etc/passwd. So even POSIX agrees that if you think you have
>> > to process this data manually, you're doing it wrong. Python exposes
>> > neither function directly (at least not in the pwd module or the os
>> > module; I didn't dig around or check PyPI).
>>
>> So...... we can go find some other way of calling fpwent, or we can
>> just parse the file ourselves. It's a very VERY simple format.
>
> If you insist:
>
> >>> s = 'nm-iodine:x:996:57::/var/empty:/run/current-system/sw/bin/nologin'
> >>> print(s.split(':'))
> ['nm-iodine', 'x', '996', '57', '', '/var/empty',
> '/run/current-system/sw/bin/nologin']
>
> Hesitantly, because this is the Python mailing list, I claim (a) ':' is
> simpler than r'([^:]*):([^:]*):(\d+):(\d+):([^:]*):([^:]*):(.*)$', and
> (b) string.split covers pretty much the same edge cases as re.search.
Ah, but you don't check that fields 2 and 3 (0-based) are numeric! But
agreed, it's not the best of examples.

-- 
<StevenK> You're rewriting parts of Quake in *Python*?
<knghtbrd> MUAHAHAHA
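P.S. To make the point about fields 2 and 3 concrete, here is a rough sketch that runs the quoted regex, a bare split, and a split with an explicit numeric check over the same input. The names PASSWD_RE, by_regex, by_split and by_split_checked, and the malformed `bad` line, are mine and purely illustrative. The checked variant only approximates what the regex accepts; it is not an exact equivalent.

```python
import re

# The regex quoted in the thread, compiled once (PASSWD_RE is my name for it).
PASSWD_RE = re.compile(r'([^:]*):([^:]*):(\d+):(\d+):([^:]*):([^:]*):(.*)$')

# A well-formed line taken from the thread.
good = 'nm-iodine:x:996:57::/var/empty:/run/current-system/sw/bin/nologin'
# Hypothetical malformed line: the uid field (index 2) is not a number.
bad = 'nm-iodine:x:abc:57::/var/empty:/run/current-system/sw/bin/nologin'


def by_regex(line):
    """Return the seven fields as a tuple, or None if the regex does not match."""
    m = PASSWD_RE.search(line)
    return m.groups() if m else None


def by_split(line):
    """Return whatever str.split(':') yields, as a tuple, with no validation."""
    return tuple(line.split(':'))


def by_split_checked(line):
    """Split into at most seven fields and require fields 2 and 3 to be decimal.

    maxsplit=6 lets the last field keep any further colons, roughly like
    the trailing (.*) in the regex.
    """
    fields = line.split(':', 6)
    if len(fields) != 7:
        return None
    if not (fields[2].isdecimal() and fields[3].isdecimal()):
        return None
    return tuple(fields)


if __name__ == '__main__':
    for line in (good, bad):
        print(by_regex(line))
        print(by_split(line))
        print(by_split_checked(line))
```

On the well-formed line all three return the same seven fields. On the malformed one the bare split happily hands back 'abc' as the uid, while the regex and the checked split both return None.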