Re: [gentoo-dev] dev-python/ package naming policy?
On Sat, 2023-01-28 at 17:38 +0100, Michał Górny wrote: > Hi, everyone. > > TL;DR: I'd like to propose naming dev-python/* packages following PyPI > names whenever possible, case-preserving, with modifications only when > necessary to match PN rules. > The "relaxed" version is now official: https://projects.gentoo.org/python/guide/package-maintenance.html#package-name-policy -- Best regards, Michał Górny
Re: [gentoo-dev] dev-python/ package naming policy?
On Mon, 2023-01-30 at 16:11 +0500, Anna (cybertailor) Vyalkova wrote: > On 2023-01-30 12:00, Michał Górny wrote: > > However, there's a can of worms around the corner -- should we also > > allow normalizing "-" and "_" across different packages (see dev- > > python/sphinx*)? > > PyPI treats "-" and "_" separators as the same, so I'd not use > underscores for in-repo consistency. I suppose that's PEP 503. It speaks of name normalization: | The name should be lowercased with all runs of the characters ., -, | or _ replaced with a single - character. [1] Technically, a policy that would require only "normalized" name match would let us improve consistency when upstreams fail to do so. Unfortunately, while common tools search case-insensitively, they are sensitive to these characters (and I'm not convinced of changing that). [1] https://peps.python.org/pep-0503/#normalized-names -- Best regards, Michał Górny
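For reference, the normalization rule quoted above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original mail; the sample names are taken from the mismatch list elsewhere in this thread, plus one made-up spelling ("FLASK--BABEL") that is only there to show that runs of separators collapse.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name following the PEP 503 rule:
    lowercase, with runs of '.', '-' and '_' replaced by a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Spellings that differ only in case or separator collapse to one name.
samples = ["Flask-Babel", "flask_babel", "flask.babel", "FLASK--BABEL"]
normalized = {normalize(n) for n in samples}
print(normalized)

# A PN from the tree and its PyPI spelling compare equal once normalized.
print(normalize("charset_normalizer") == normalize("charset-normalizer"))
```

A policy that only required a "normalized" match would accept any of these spellings as the same package.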
Re: [gentoo-dev] dev-python/ package naming policy?
Andrew Ammerlaan writes:

> On 28/01/2023 19:02, Ulrich Mueller wrote:
>>> On Sat, 28 Jan 2023, Michał Górny wrote:
>>> However, it's been pointed out that this makes it hard for people to
>>> find packages they're looking for.
>> I don't understand this argument. Why would all-lowercase make finding a
>> package harder?
>
> Here's an example, on pypi we have packages:
> - git-python
> - python-git
> - GitPython
> - git-py
>
> Each of these is a different package. The package you usually want is
> GitPython, but if we would name it gitpython or git-python, things would get
> very confusing very quickly. In fact, this package was renamed precisely to
> avoid this confusion in [1]. This is not the only case where there are very
> similarly named packages on pypi. By having a 1 to 1 mapping between names in
> pypi and names in ::gentoo we avoid this confusion.

AFAIK, but I cannot find a source confirming this, PyPI project names are
case-insensitive, so it should be okay to map to all lowercase.

> Best regards,
> Andrew
>
> [1]
> https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0dec450a90c7490f11df7e69cd9c6709c099285c

-- 
Arsen Arsenović
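Whether mapping to lowercase can merge two distinct projects is something one can check mechanically. The sketch below is an illustration only: it groups names by their PEP 503 normalized form and reports groups with more than one member. The helper name `find_collisions` is made up for this example, and the extra "gitpython" entry is a hypothetical lowercase spelling, not a claim about what exists on PyPI.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """PEP 503 normalization: lowercase, separator runs become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_collisions(names):
    """Return {normalized name: [original names]} for every group of
    names that become indistinguishable after normalization."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


# The four names from the example above stay distinct after normalization.
pypi_names = ["git-python", "python-git", "GitPython", "git-py"]
print(find_collisions(pypi_names))

# A hypothetical all-lowercase spelling collides only with GitPython itself.
print(find_collisions(pypi_names + ["gitpython"]))
```

This only shows technical uniqueness. It says nothing about how easily a person confuses "gitpython" with "git-python", which is the concern raised in the quoted mail.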
Re: [gentoo-dev] dev-python/ package naming policy?
On 2023-01-30 12:00, Michał Górny wrote: > However, there's a can of worms around the corner -- should we also > allow normalizing "-" and "_" across different packages (see dev- > python/sphinx*)? PyPI treats "-" and "_" separators as the same, so I'd not use underscores for in-repo consistency.
Re: [gentoo-dev] dev-python/ package naming policy?
On Sat, 2023-01-28 at 17:38 +0100, Michał Górny wrote:
> To improve consistency and make packages easier to find, I'd like to
> propose going forward that when packages are published on PyPI, we use
> their official PyPI names. This also means preserving the case for
> the few packages that use CamelCase names and similar.
>
> Some modifications will be necessary. For example, it is legal for PyPI
> package names to include dot (".") — we normally translate that to a
> hyphen ("-"). We may also have use cases for creating multiple Gentoo
> packages from the same PyPI package (see e.g. dev-python/ensurepip-*).
> Then, there are of course Python packages that aren't published on PyPI.
>
> Still, I think as a general rule of thumb this would make sense. WDYT?
>

To add a data point, the "Flask-Babel" package has been renamed to "flask-babel" upstream today. Unfortunately, minor changes to names are not that uncommon (pkgcheck regularly catches them via "mismatched" remote-ids). This also means that now this one package is inconsistent with the rest of capitalized "Flask" packages.

In the end, I'm still not sure whether this policy really makes sense. Perhaps it should be relaxed to allow case mismatches, if only to allow us to retain in-tree consistency when upstreams fail to be consistent.

However, there's a can of worms around the corner -- should we also allow normalizing "-" and "_" across different packages (see dev-python/sphinx*)?

Now you see why we didn't have a policy for this before.

-- 
Best regards,
Michał Górny
Re: [gentoo-dev] dev-python/ package naming policy?
On Sun, Jan 29, 2023 at 02:15:19AM +0300, Torokhov Sergey wrote:
> Similar names on PyPI are a real problem for users trying to find associated packages. It could also be a security issue for them, with malicious packages named like popular packages.
>
> So in ::guru I try to keep the package naming even if it's too CamelCase.
>
> As for replacing dot (".") with hyphen ("-") I have PyPi package "FoBiS.py" that is packaged in ::guru just as "FoBiS" as I wasn't sure is it worth to store ".py" suffix while github repo of this project is just "FoBiS". So there could be a problem if package named "fobis" will appear in PyPi.
>
> 28.01.2023, 19:38, "Michał Górny":
> > Hi, everyone.
> >
> > TL;DR: I'd like to propose naming dev-python/* packages following PyPI names whenever possible, case-preserving, with modifications only when necessary to match PN rules.
> >
> > So far the naming in dev-python/* hasn't been exactly consistent. Myself I've been mostly following "whatever's the easiest" policy which generally meant following GitHub project names whenever we fetched from there.
> >
> > This mostly made sense so far, as I've been thinking of dev-python/ primarily in terms of dependencies of other packages. However, it's been pointed out that this makes it hard for people to find packages they're looking for.
> >
> > The vast majority of packages in dev-python/ are also published on PyPI [1]. They can afterwards be installed using tools such as pip, or specified as dependencies of other projects — using their PyPI names in every case.
> >
> > On top of that, it is not unknown for multiple packages with very similar names to coexist, say "foo", "pyfoo" and "python-foo". When GH project names come into the picture, this can get even more ambiguous. Don't even get me started about developers pushing duplicate packages because they didn't find the existing instance.
> >
> > To improve consistency and make packages easier to find, I'd like to propose going forward that when packages are published on PyPI, we use their official PyPI names. This also means preserving the case for the few packages that use CamelCase names and similar.
> >
> > Some modifications will be necessary. For example, it is legal for PyPI package names to include dot (".") — we normally translate that to a hyphen ("-"). We may also have use cases for creating multiple Gentoo packages from the same PyPI package (see e.g. dev-python/ensurepip-*). Then, there are of course Python packages that aren't published on PyPI.
> >
> > Still, I think as a general rule of thumb this would make sense. WDYT?
> >
> > [1] https://pypi.org/
> >
> > --
> > Best regards,
> > Michał Górny

Can you send plaintext mail to gentoo-dev? HTML makes it very hard to read your mails in certain clients.
Re: [gentoo-dev] dev-python/ package naming policy?
On Sat, Jan 28, 2023 at 10:23:45PM +0100, Ulrich Mueller wrote: > > On Sat, 28 Jan 2023, Andrew Ammerlaan wrote: > > > Each of these is a different package. The package you usually want is > > GitPython, but if we would name it gitpython or git-python, things > > would get very confusing very quickly. In fact, this package was > > renamed precisely to avoid this confusion in [1]. This is not the only > > case where there are very similarly named packages on pypi. By having > > a 1 to 1 mapping between names in pypi and names in ::gentoo we avoid > > this confusion. > > Looking at mgorny's list, you cannot have an 1 to 1 mapping anyway, > because that would result in invalid PN names. Should imperfection get in the way of bettering the mapping?
Re: [gentoo-dev] dev-python/ package naming policy?
On Sat, Jan 28, 2023 at 10:15:02PM +0500, Anna (cybertailor) Vyalkova wrote: > I'd prefer if PyPI names are guidelines, not a strict policy. I don't > like CamelCase and separators other than dash ("-") :P > > Also I don't like when packages are named "dev-python/python-foo" > instead of just "dev-python/foo". So, two simply aesthetic opinions. I'm not sure it's appropriate to suggest one's aesthetic preference as default when there's no further benefit.
Re: [gentoo-dev] dev-python/ package naming policy?
On Sun, 2023-01-29 at 02:15 +0300, Torokhov Sergey wrote: > As for replacing dot (".") with hyphen ("-") I have PyPi package > "FoBiS.py" that is packaged in ::guru just as "FoBiS" as I wasn't sure > is it worth to store ".py" suffix while github repo of this project is > just "FoBiS". So there could be a problem if package named "fobis" > will appear in PyPi. Thanks for this example. This is actually a perfect case that makes you really, really think about dropping ".py" and a perfect explanation why we should keep it, even if it makes the package name look "unnatural". -- Best regards, Michał Górny
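To make the "keep the suffix" point concrete, here is a rough sketch of a case-preserving PyPI-to-PN mapping that only translates dots to hyphens, as described in the original proposal. The function name `pypi_to_pn` is invented for this illustration, and the character check is a simplified assumption about PN rules (alphanumerics, "+", "_" and "-", not starting with "-" or "+"); it does not check the rule about names ending in something that looks like a version.

```python
import re

# Simplified assumption about valid PN characters; the version-suffix
# restriction on package names is deliberately not checked here.
_PN_CHARS = re.compile(r"[A-Za-z0-9_][A-Za-z0-9+_-]*")


def pypi_to_pn(pypi_name: str) -> str:
    """Map a PyPI project name to a candidate Gentoo PN, preserving case
    and translating only '.' to '-'."""
    pn = pypi_name.replace(".", "-")
    if not _PN_CHARS.fullmatch(pn):
        raise ValueError(f"{pypi_name!r} needs manual handling")
    return pn


# The suffix is kept, so a separate "fobis" project could still be
# packaged later without a name clash.
print(pypi_to_pn("FoBiS.py"))
print(pypi_to_pn("jaraco.classes"))
print(pypi_to_pn("GitPython"))
```

Under this sketch "FoBiS.py" becomes "FoBiS-py", which stays distinct from a plain "FoBiS" or "fobis".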
Re: [gentoo-dev] dev-python/ package naming policy?
Similar names on PyPI are a real problem for users trying to find associated packages. It could also be a security issue for them, with malicious packages named like popular packages.

So in ::guru I try to keep the package naming even if it's too CamelCase.

As for replacing dot (".") with hyphen ("-") I have PyPi package "FoBiS.py" that is packaged in ::guru just as "FoBiS" as I wasn't sure is it worth to store ".py" suffix while github repo of this project is just "FoBiS". So there could be a problem if package named "fobis" will appear in PyPi.

28.01.2023, 19:38, "Michał Górny":
> Hi, everyone.
>
> TL;DR: I'd like to propose naming dev-python/* packages following PyPI names whenever possible, case-preserving, with modifications only when necessary to match PN rules.
>
> So far the naming in dev-python/* hasn't been exactly consistent. Myself I've been mostly following "whatever's the easiest" policy which generally meant following GitHub project names whenever we fetched from there.
>
> This mostly made sense so far, as I've been thinking of dev-python/ primarily in terms of dependencies of other packages. However, it's been pointed out that this makes it hard for people to find packages they're looking for.
>
> The vast majority of packages in dev-python/ are also published on PyPI [1]. They can afterwards be installed using tools such as pip, or specified as dependencies of other projects — using their PyPI names in every case.
>
> On top of that, it is not unknown for multiple packages with very similar names to coexist, say "foo", "pyfoo" and "python-foo". When GH project names come into the picture, this can get even more ambiguous. Don't even get me started about developers pushing duplicate packages because they didn't find the existing instance.
>
> To improve consistency and make packages easier to find, I'd like to propose going forward that when packages are published on PyPI, we use their official PyPI names. This also means preserving the case for the few packages that use CamelCase names and similar.
>
> Some modifications will be necessary. For example, it is legal for PyPI package names to include dot (".") — we normally translate that to a hyphen ("-"). We may also have use cases for creating multiple Gentoo packages from the same PyPI package (see e.g. dev-python/ensurepip-*). Then, there are of course Python packages that aren't published on PyPI.
>
> Still, I think as a general rule of thumb this would make sense. WDYT?
>
> [1] https://pypi.org/
>
> --
> Best regards,
> Michał Górny
Re: [gentoo-dev] dev-python/ package naming policy?
On 28/01/2023 17.38, Michał Górny wrote:
> To improve consistency and make packages easier to find, I'd like to
> propose going forward that when packages are published on PyPI, we use
> their official PyPI names. This also means preserving the case for
> the few packages that use CamelCase names and similar.

Consistency is generally a good thing. So +1

FTR, I think this should probably be applied in general in such situations, and not just for the Python ecosystem.

- Flow
Re: [gentoo-dev] dev-python/ package naming policy?
> On Sat, 28 Jan 2023, Andrew Ammerlaan wrote: > Each of these is a different package. The package you usually want is > GitPython, but if we would name it gitpython or git-python, things > would get very confusing very quickly. In fact, this package was > renamed precisely to avoid this confusion in [1]. This is not the only > case where there are very similarly named packages on pypi. By having > a 1 to 1 mapping between names in pypi and names in ::gentoo we avoid > this confusion. Looking at mgorny's list, you cannot have an 1 to 1 mapping anyway, because that would result in invalid PN names.
Re: [gentoo-dev] dev-python/ package naming policy?
On Sat, 2023-01-28 at 22:15 +0500, Anna (cybertailor) Vyalkova wrote: > I'd prefer if PyPI names are guidelines, not a strict policy. I don't > like CamelCase and separators other than dash ("-") :P > > Also I don't like when packages are named "dev-python/python-foo" > instead of just "dev-python/foo". > So instead you claim "foo" and block adding actual "foo" later? -- Best regards, Michał Górny
Re: [gentoo-dev] dev-python/ package naming policy?
On 2023-01-28 19:02, Ulrich Mueller wrote: > > On Sat, 28 Jan 2023, Michał Górny wrote: > >> However, it's been pointed out that this makes it hard for people to > >> find packages they're looking for. > > I don't understand this argument. Why would all-lowercase make finding a > package harder? It doesn't. `eix` search is case-insensitive.
Re: [gentoo-dev] dev-python/ package naming policy?
On 28/01/2023 19:02, Ulrich Mueller wrote:
>> On Sat, 28 Jan 2023, Michał Górny wrote:
>> However, it's been pointed out that this makes it hard for people to
>> find packages they're looking for.
> I don't understand this argument. Why would all-lowercase make finding a
> package harder?

Here's an example, on pypi we have packages:
- git-python
- python-git
- GitPython
- git-py

Each of these is a different package. The package you usually want is GitPython, but if we would name it gitpython or git-python, things would get very confusing very quickly. In fact, this package was renamed precisely to avoid this confusion in [1]. This is not the only case where there are very similarly named packages on pypi. By having a 1 to 1 mapping between names in pypi and names in ::gentoo we avoid this confusion.

Best regards,
Andrew

[1] https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0dec450a90c7490f11df7e69cd9c6709c099285c
Re: [gentoo-dev] dev-python/ package naming policy?
> On Sat, 28 Jan 2023, Michał Górny wrote: > Based on existing remote-id entries, the following package names are > mismatched (PN on left, PyPI name on right). Note that some of the IDs > could be wrong, particularly because PyPI "autocorrects" - vs _. Are there any rules by which upstream use of upper vs lower case can be predicted? On first glance they look completely random, which is exactly the reason why we have an all-lowercase policy for PN. >> However, it's been pointed out that this makes it hard for people to >> find packages they're looking for. I don't understand this argument. Why would all-lowercase make finding a package harder?
Re: [gentoo-dev] dev-python/ package naming policy?
On Sat, Jan 28, 2023 at 05:38:05PM +0100, Michał Górny wrote:
> Hi, everyone.
>
> TL;DR: I'd like to propose naming dev-python/* packages following PyPI
> names whenever possible, case-preserving, with modifications only when
> necessary to match PN rules.
>
>
> So far the naming in dev-python/* hasn't been exactly consistent.
> Myself I've been mostly following "whatever's the easiest" policy which
> generally meant following GitHub project names whenever we fetched from
> there.
>
> This mostly made sense so far, as I've been thinking of dev-python/
> primarily in terms of dependencies of other packages. However, it's
> been pointed out that this makes it hard for people to find packages
> they're looking for.
>
> The vast majority of packages in dev-python/ are also published on PyPI
> [1]. They can afterwards be installed using tools such as pip, or
> specified as dependencies of other projects — using their PyPI names
> in every case.
>
> On top of that, it is not unknown for multiple packages with very
> similar names to coexist, say "foo", "pyfoo" and "python-foo". When GH
> project names come into the picture, this can get even more ambiguous.
> Don't even get me started about developers pushing duplicate packages
> because they didn't find the existing instance.
>
>
> To improve consistency and make packages easier to find, I'd like to
> propose going forward that when packages are published on PyPI, we use
> their official PyPI names. This also means preserving the case for
> the few packages that use CamelCase names and similar.
>
> Some modifications will be necessary. For example, it is legal for PyPI
> package names to include dot (".") — we normally translate that to a
> hyphen ("-"). We may also have use cases for creating multiple Gentoo
> packages from the same PyPI package (see e.g. dev-python/ensurepip-*).
> Then, there are of course Python packages that aren't published on PyPI.
>
> Still, I think as a general rule of thumb this would make sense. WDYT?

Just to say I'm all for it. As much as I don't like some of the pypi^H^H^H^HPyPi^HI names and mismatches from the "typical" style used in the tree, it's a small price to pay for consistency within this large group of packages.

>
>
> [1] https://pypi.org/

-- 
ionen
Re: [gentoo-dev] dev-python/ package naming policy?
I'd prefer if PyPI names are guidelines, not a strict policy. I don't like CamelCase and separators other than dash ("-") :P Also I don't like when packages are named "dev-python/python-foo" instead of just "dev-python/foo".
Re: [gentoo-dev] dev-python/ package naming policy?
On Sat, 2023-01-28 at 17:38 +0100, Michał Górny wrote:
> TL;DR: I'd like to propose naming dev-python/* packages following PyPI
> names whenever possible, case-preserving, with modifications only when
> necessary to match PN rules.

Based on existing remote-id entries, the following package names are mismatched (PN on left, PyPI name on right). Note that some of the IDs could be wrong, particularly because PyPI "autocorrects" - vs _.

aiohttp-cors           | aiohttp_cors
anyqt                  | AnyQt
automat                | Automat
aws-xray-sdk-python    | aws-xray-sdk
blake3-py              | blake3
boolean-py             | boolean.py
bottleneck             | Bottleneck
cachecontrol           | CacheControl
cangjie                | CangJie
cerberus               | Cerberus
certifi                | certifi-system-store
chameleon              | Chameleon
charset_normalizer     | charset-normalizer
cheetah3               | Cheetah3
cherrypy               | CherryPy
cjkwrap                | CJKwrap
cli_helpers            | cli-helpers
collective-checkdocs   | collective.checkdocs
configupdater          | ConfigUpdater
cx_Freeze              | cx-Freeze
cython                 | Cython
deprecated             | Deprecated
discogs-client         | python3-discogs-client
django                 | Django
django_polymorphic     | django-polymorphic
dogpile-cache          | dogpile.cache
easyprocess            | EasyProcess
editorconfig-core-py   | EditorConfig
elasticsearch-py       | elasticsearch7
ensurepip-pip          | pip
ensurepip-setuptools   | setuptools
ensurepip-wheels       | pip
et_xmlfile             | et-xmlfile
eyeD3                  | eyed3
flask-api              | Flask-API
flask-babel            | Flask-Babel
flask-compress         | Flask-Compress
flask-cors             | Flask-Cors
flask-debug            | Flask-Debug
flask-gravatar         | Flask-Gravatar
flask-htmlmin          | Flask-HTMLmin
flask-login            | Flask-Login
flask                  | Flask
flask-migrate          | Flask-Migrate
flask-paranoid         | Flask-Paranoid
flask-script           | Flask-Script
flask-sphinx-themes    | Flask-Sphinx-Themes
flit_core              | flit-core
flit_scm               | flit-scm
flufl-lock             | flufl.lock
genshi                 | Genshi
github3                | github3.py
gmpy                   | gmpy2
google-reauth-python   | google-reauth
hcloud-python          | hcloud
imapclient             | IMAPClient
importlib_metadata     | importlib-metadata
importlib_resources    | importlib-resources
indexed_gzip           | indexed-gzip
jack-client            | JACK-Client
jaraco-classes         | jaraco.classes
jaraco-collections     | jaraco.collections
jaraco-context         | jaraco.context
jaraco-envs            | jaraco.envs
jaraco-functools       | jaraco.functools
jaraco-itertools       | jaraco.itertools
jaraco-logging         | jaraco.logging
jaraco-path            | jaraco.path
jaraco-stream          | jaraco.stream
jaraco-test            | jaraco.test
jaraco-text            | jaraco.text
jinja                  | Jinja2
js2py                  | Js2Py
jschema_to_python      | jschema-to-python
jupyter_client         | jupyter-client
jupyter_console        | jupyter-console
jupyter_core           | jupyter-core
jupyter_events         | jupyter-events
jupyter_kernel_test    | jupyter-kernel-test
jupyterlab_pygments    | jupyterlab-pygments
jupyterlab_server      | jupyterlab-server
jupyter_packaging      | jupyter-packaging
jupyter_server_mathjax | jupyter-server-mathjax
jupyter_server         | jupyter-server
keyrings-alt           | keyrings.alt
keystoneauth           | keystoneauth1
libcloud
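The rows in a list like this fall into a few kinds of mismatch, which matters for the "strict" versus "relaxed" question discussed in the thread. The following sketch sorts a handful of rows copied from the list into case-only, separator-only and genuinely different names. The `classify` helper and its category labels are made up for illustration; this is not how pkgcheck or any Gentoo tool works.

```python
import re


def normalize(name: str) -> str:
    """PEP 503 normalization: lowercase, separator runs become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def classify(pn: str, pypi_name: str) -> str:
    """Describe how a Gentoo PN differs from the PyPI project name."""
    if pn == pypi_name:
        return "exact"
    if pn.lower() == pypi_name.lower():
        return "case"        # differs only in letter case
    if normalize(pn) == normalize(pypi_name):
        return "separator"   # equal under PEP 503, not under case folding alone
    return "different"       # a genuinely different name


# A few (PN, PyPI name) rows copied from the list above.
rows = [
    ("flask-babel", "Flask-Babel"),
    ("eyeD3", "eyed3"),
    ("charset_normalizer", "charset-normalizer"),
    ("jaraco-classes", "jaraco.classes"),
    ("cx_Freeze", "cx-Freeze"),
    ("gmpy", "gmpy2"),
    ("discogs-client", "python3-discogs-client"),
]
for pn, pypi_name in rows:
    print(f"{pn:22} | {pypi_name:24} {classify(pn, pypi_name)}")
```

Only the "different" rows would still need renaming under a policy that accepts any name matching after normalization.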
[gentoo-dev] dev-python/ package naming policy?
Hi, everyone.

TL;DR: I'd like to propose naming dev-python/* packages following PyPI names whenever possible, case-preserving, with modifications only when necessary to match PN rules.

So far the naming in dev-python/* hasn't been exactly consistent. Myself I've been mostly following "whatever's the easiest" policy which generally meant following GitHub project names whenever we fetched from there.

This mostly made sense so far, as I've been thinking of dev-python/ primarily in terms of dependencies of other packages. However, it's been pointed out that this makes it hard for people to find packages they're looking for.

The vast majority of packages in dev-python/ are also published on PyPI [1]. They can afterwards be installed using tools such as pip, or specified as dependencies of other projects — using their PyPI names in every case.

On top of that, it is not unknown for multiple packages with very similar names to coexist, say "foo", "pyfoo" and "python-foo". When GH project names come into the picture, this can get even more ambiguous. Don't even get me started about developers pushing duplicate packages because they didn't find the existing instance.

To improve consistency and make packages easier to find, I'd like to propose going forward that when packages are published on PyPI, we use their official PyPI names. This also means preserving the case for the few packages that use CamelCase names and similar.

Some modifications will be necessary. For example, it is legal for PyPI package names to include dot (".") — we normally translate that to a hyphen ("-"). We may also have use cases for creating multiple Gentoo packages from the same PyPI package (see e.g. dev-python/ensurepip-*). Then, there are of course Python packages that aren't published on PyPI.

Still, I think as a general rule of thumb this would make sense. WDYT?

[1] https://pypi.org/

-- 
Best regards,
Michał Górny