Re: [Distutils] PyPI is a sick sick hoarder
On Sat, May 16, 2015 at 11:36 PM, Justin Cappos jcap...@nyu.edu wrote:

I am no expert, but I don't understand why backtracking algorithms would be faster than SAT, since they both potentially need to walk over the full set of possible solutions. It is hard to reason about the cost because the worst case is in theory growing exponentially in both cases. This is talked about a bit in this thread: https://github.com/pypa/pip/issues/988 Each algorithm could be computationally more efficient. Basically, *if there are no conflicts* backtracking will certainly win. If there are a huge number of conflicts a SAT solver will certainly win. It's not clear where the tipping point is between the two schemes. However, a better question is: does the computational difference matter? If one is a microsecond faster than the other, I don't think anyone cares. However, from the OPIUM paper (listed off of that thread), it is clear that SAT solver resolution can be slow without optimizations to make them work more like backtracking resolvers. From my experience, backtracking resolvers are also slow when the conflict rate is high.

Pure SAT is fast enough in practice in my experience (concretely: solving thousands of rules takes ~1 second). It becomes more complicated as you need to optimize the solution, especially when you have already installed packages. This is unfortunately not as well discussed in the literature. Pseudo-boolean SAT for optimization was argued to be too slow by the 0install people, but OTOH, this seems to be what's used in conda, so who knows :)

If your SAT solver is in pure Python, you can choose a direction of the search which is more meaningful. I believe this is what 0install does from reading http://0install.net/solver.html, and what we have in our own SAT solver code. I unfortunately cannot look at the 0install code myself as it is under the GPL and am working on a BSD solver implementation. I also do not know how they handle updates and already installed packages.
This only considers computation cost though. Other factors can become more expensive than computation. For example, SAT solvers need all the rules to consider. So a SAT solution needs to effectively download the full dependency graph before starting. A backtracking dependency resolver can just download packages or dependency information as it considers them. The bandwidth cost for SAT solvers should be higher.

With a reasonable representation, I think you can make it small enough. To give an idea, our index @ Enthought containing around 20k packages takes ~340 kB compressed w/ bz2 if you only keep the data required for dependency handling (name, version and runtime dependencies), and that's using JSON, an inefficient encoding, so I suspect encoding all of PyPI may be only a few MB to fetch, which is generally faster than doing tens of HTTP requests. The libsolv people worked on a binary representation that may also be worth looking at.

P.S. If you'd like to talk off list, possibly over Skype, I'd be happy to talk more with you and/or Robert about minutiae that others may not care about.

Sure, I would be happy to. As I mentioned before, we have some code around a SAT-based solver, but it is not ready yet, which is why we kept it private (https://github.com/enthought/sat-solver). It handles well (== both speed- and quality-wise) the case where nothing is installed, but behaves poorly when packages are already installed, and does not handle the update case yet. The code is also very prototype-ish, but it is not too complicated to experiment with.

David

___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
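[Editorial note: David's ~340 kB figure is easy to sanity-check. Below is a minimal sketch with a *synthetic* 20k-package index — invented names and dependencies, not the real Enthought data — keeping only the name, version and runtime dependencies, serialized as JSON and compressed with bz2 as described above.]

```python
import bz2
import json
import random

# Build a synthetic stand-in for a package index: one entry per package,
# carrying only what dependency resolution needs.
random.seed(0)
index = []
for i in range(20000):
    index.append({
        "name": "pkg%d" % i,
        "version": "%d.%d.%d" % (random.randrange(10),
                                 random.randrange(10),
                                 random.randrange(10)),
        # up to 4 runtime dependencies on other synthetic packages
        "depends": ["pkg%d" % random.randrange(20000)
                    for _ in range(random.randrange(5))],
    })

raw = json.dumps(index).encode("utf-8")
compressed = bz2.compress(raw)
print("raw: %d kB, bz2: %d kB" % (len(raw) // 1024, len(compressed) // 1024))
```

The exact ratio depends on how compressible real package names are, but the order of magnitude — hundreds of kB for tens of thousands of packages — matches the claim in the message above.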
[Distutils] Making pip and PyPI work with conda packages
I've just started monitoring this SIG to get a sense of the issues and status of things. I've also just started working for Continuum Analytics. Continuum has a great desire to make 'pip' work with conda packages. Obviously, we'd love for users to choose the Anaconda Python distribution but many will not for a variety of reasons (many good reasons). However, we would like for users of other distros still to be able to benefit from our creation of binary packages for many platforms in the conda format.

As has been discussed in recent threads on dependency solving, the way conda provides metadata apart from entire packages makes much of that work easier. But even aside from that, there are simply a large number of well-tested packages (not only for Python, it is true, so that's possibly a wrinkle in the task) we have generated in conda format.

It is true that right now, a user can in principle type:

% pip install conda
% conda install some_conda_package

But that creates two separate systems for tracking what's installed and what dependencies are resolved; and many users will not want to convert completely to conda after that step. What would be better as a user experience would be to let users do this:

% pip install --upgrade pip
% pip install some_conda_package

Whether that second command ultimately downloads code from pypi.python.org or from repo.continuum.io probably matters less from a user experience perspective. Continuum is very happy to upload all of our conda packages to PyPI if this would improve this user experience. Obviously, the idea here would be that the user would be able to type 'pip list' and friends afterward, and have knowledge of what was installed, even as conda packages. I'm hoping members of the SIG can help me understand both the technical and social obstacles that need to be overcome before this can happen. Yours, David... -- The dead increasingly dominate and strangle both the living and the not-yet born. 
Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
Re: [Distutils] PyPI is a sick sick hoarder
On May 16, 2015 11:22 AM, David Cournapeau courn...@gmail.com wrote:

On Sat, May 16, 2015 at 11:36 PM, Justin Cappos jcap...@nyu.edu wrote:

I am no expert, but I don't understand why backtracking algorithms would be faster than SAT, since they both potentially need to walk over the full set of possible solutions. It is hard to reason about the cost because the worst case is in theory growing exponentially in both cases. This is talked about a bit in this thread: https://github.com/pypa/pip/issues/988 Each algorithm could be computationally more efficient. Basically, *if there are no conflicts* backtracking will certainly win. If there are a huge number of conflicts a SAT solver will certainly win. It's not clear where the tipping point is between the two schemes. However, a better question is: does the computational difference matter? If one is a microsecond faster than the other, I don't think anyone cares. However, from the OPIUM paper (listed off of that thread), it is clear that SAT solver resolution can be slow without optimizations to make them work more like backtracking resolvers. From my experience, backtracking resolvers are also slow when the conflict rate is high.

Pure SAT is fast enough in practice in my experience (concretely: solving thousands of rules takes ~1 second). It becomes more complicated as you need to optimize the solution, especially when you have already installed packages. This is unfortunately not as well discussed in the literature. Pseudo-boolean SAT for optimization was argued to be too slow by the 0install people, but OTOH, this seems to be what's used in conda, so who knows :)

Where "optimizing" means something like "find a solution with the newest possible releases of the required packages", not execution speed.

If your SAT solver is in pure Python, you can choose a direction of the search which is more meaningful.
I believe this is what 0install does from reading http://0install.net/solver.html, and what we have in our own SAT solver code. I unfortunately cannot look at the 0install code myself as it is under the GPL and am working on a BSD solver implementation. I also do not know how they handle updates and already installed packages.

This only considers computation cost though. Other factors can become more expensive than computation. For example, SAT solvers need all the rules to consider. So a SAT solution needs to effectively download the full dependency graph before starting. A backtracking dependency resolver can just download packages or dependency information as it considers them. The bandwidth cost for SAT solvers should be higher.

With a reasonable representation, I think you can make it small enough. To give an idea, our index @ Enthought containing around 20k packages takes ~340 kB compressed w/ bz2 if you only keep the data required for dependency handling (name, version and runtime dependencies), and that's using JSON, an inefficient encoding, so I suspect encoding all of PyPI may be only a few MB to fetch, which is generally faster than doing tens of HTTP requests. The libsolv people worked on a binary representation that may also be worth looking at.

P.S. If you'd like to talk off list, possibly over Skype, I'd be happy to talk more with you and/or Robert about minutiae that others may not care about.

Sure, I would be happy to. As I mentioned before, we have some code around a SAT-based solver, but it is not ready yet, which is why we kept it private (https://github.com/enthought/sat-solver). It handles well (== both speed- and quality-wise) the case where nothing is installed, but behaves poorly when packages are already installed, and does not handle the update case yet. The code is also very prototype-ish, but it is not too complicated to experiment with.
David
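[Editorial note: for readers unfamiliar with how dependency resolution maps onto SAT at all, here is a toy encoding with an invented two-package index. The clause shapes are the standard ones — one boolean variable per (package, version) pair, an at-most-one clause per package, an implication clause per dependency — but the brute-force enumeration below merely stands in for a real SAT solver and obviously does not scale.]

```python
from itertools import product

# Toy index: two packages, two versions each.
versions = {"A": ["1.0", "2.0"], "B": ["1.0", "2.0"]}
# A 2.0 depends on exactly B 2.0; A 1.0 accepts either B.
deps = {("A", "2.0"): [("B", "2.0")],
        ("A", "1.0"): [("B", "1.0"), ("B", "2.0")]}

variables = [(p, v) for p, vs in versions.items() for v in vs]
clauses = []  # each clause: list of (variable, wanted_truth_value)

# Requirement: some version of A must be installed.
clauses.append([(("A", v), True) for v in versions["A"]])
# At most one version of each package may be installed.
for p, vs in versions.items():
    for i in range(len(vs)):
        for j in range(i + 1, len(vs)):
            clauses.append([((p, vs[i]), False), ((p, vs[j]), False)])
# Dependencies: installing pkg==ver implies one of its allowed deps.
for pv, choices in deps.items():
    clauses.append([(pv, False)] + [(c, True) for c in choices])

def satisfied(assignment):
    return all(any(assignment[var] == want for var, want in cl)
               for cl in clauses)

# Brute force over all assignments (a real solver searches smartly).
solutions = [dict(zip(variables, bits))
             for bits in product([False, True], repeat=len(variables))
             if satisfied(dict(zip(variables, bits)))]
# The "optimize the solution" step: prefer newest installed versions
# (naive lexicographic comparison, fine for these version strings).
best = max(solutions, key=lambda a: sorted(v for (p, v), on in a.items() if on))
print(sorted(pv for pv, on in best.items() if on))
```

A pseudo-boolean solver folds that last "prefer newest" step into the solving itself as an objective function, which is the optimization David and Daniel are discussing above.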
Re: [Distutils] Making pip and PyPI work with conda packages
On 16 May 2015 at 20:04, David Mertz dme...@continuum.io wrote:

What would be better as a user experience would be to let users do this:

% pip install --upgrade pip
% pip install some_conda_package

Whether that second command ultimately downloads code from pypi.python.org or from repo.continuum.io probably matters less from a user experience perspective. Continuum is very happy to upload all of our conda packages to PyPI if this would improve this user experience. Obviously, the idea here would be that the user would be able to type 'pip list' and friends afterward, and have knowledge of what was installed, even as conda packages. I'm hoping members of the SIG can help me understand both the technical and social obstacles that need to be overcome before this can happen.

My immediate thought is, what obstacles stand in the way of a conda to wheel conversion utility? With such a utility, a wholesale conversion of conda packages to wheels, along with hosting those wheels somewhere (binstar? PyPI isn't immediately possible as only package owners can upload files), would essentially give this capability. There presumably are issues with this approach (maybe technical, more likely social) but it seems to me that understanding *why* this approach doesn't work would be a good first step towards identifying an actual solution. Paul
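[Editorial note: as a data point on the technical side of Paul's question, the first half of such a converter is almost trivial, since a conda package is a plain tar.bz2 archive with its metadata in info/index.json. A sketch, with error handling omitted:]

```python
import json
import tarfile

def read_conda_metadata(path):
    """Return the name/version/depends metadata from a conda package.

    Conda packages are ordinary tar.bz2 archives; the package metadata
    lives in the archive member info/index.json. Mapping it onto a
    wheel's METADATA / Requires-Dist entries would be the next step of
    a converter.
    """
    with tarfile.open(path, "r:bz2") as tf:
        member = tf.extractfile("info/index.json")
        return json.load(member)
```

The hard part is the second half: mapping conda's dependency names, which (as noted elsewhere in the thread) include non-Python libraries with no PyPI equivalent, onto wheel dependency metadata.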
Re: [Distutils] Making pip and PyPI work with conda packages
On Sat, May 16, 2015 at 12:04 PM, David Mertz dme...@continuum.io wrote:

Continuum has a great desire to make 'pip' work with conda packages. Obviously, we'd love for users to choose the Anaconda Python distribution but many will not for a variety of reasons (many good reasons).

Hmm -- this strikes me as very, very tricky -- and of course, tied in to the other thread I've been spending a bunch of time on...

However, we would like for users of other distros still to be able to benefit from our creation of binary packages for many platforms in the conda format.

Frankly, if you want your efforts at building binaries to get used outside of Anaconda, then you should be building wheels in the first place. While conda does more than pip + wheel can do -- I suppose you _could_ use wheels for the things it can support.

But on to the technical issues: conda python packages depend on other conda packages, and some of those packages are not python packages at all. The common use case here are non-python dynamic libs -- exactly the use case I've been going on in the other thread about... And conda installs those dynamic libs in a conda environment -- outside of the python environment. So you can't really use a conda package without a conda environment, and an installer that understands that environment (I think conda install does some lib path re-naming, yes?), i.e. conda itself. So I think that's kind of a dead end.

So what about the idea of a conda-package-to-wheel converter? conda packages and wheels have a bit in common -- IIUC, they are both basically a zip of all the files you need installed. But again the problem is those dependencies on third party dynamic libs. So for that to work -- pip+wheel would have to grow a way to deal with installing, managing and using dynamic libs. See the other thread for the nightmare there... 
And while I'd love to see this happen, perhaps an easier route would be for conda_build to grow a "static" flag that will statically link stuff and get to something already supported by pip, wheel, and PyPI. -Chris

It is true that right now, a user can in principle type: % pip install conda % conda install some_conda_package But that creates two separate systems for tracking what's installed and what dependencies are resolved;

Indeed -- which is why some folks are working on making it easier to use conda for everything. Converting a wheel to a conda package is probably easier than the other way around.

Funny -- just moments ago I wrote that it didn't seem that anyone other than me was interested in extending pip+wheel to support this kind of thing -- I guess I was wrong! Great to see you and Continuum thinking about this. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On 16 May 2015 at 20:04, Chris Barker chris.bar...@noaa.gov wrote:

I was referring to the SetDllDirectory API. I don't think that gets picked up by other processes. From: https://msdn.microsoft.com/en-us/library/windows/desktop/ms686203%28v=vs.85%29.aspx It looks like you can add a path, at run time, that gets searched for DLLs before the rest of the system locations. And this does not affect any other applications. But you'd need to make sure this got run before any of the affected packages were loaded -- which is probably what David meant by needing to control the python binary.

Ah, sorry - I misunderstood you. This might work, but as you say, the DLL path change would need to run before any imports needed it. Which basically means it needs to be part of the Python interpreter startup. It *could* be run as normal user code - you just have to ensure you run it before any imports that need shared libraries. But that seems very fragile to me. I'm not sure it's viable as a generic solution. Paul
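[Editorial note: for concreteness, the API under discussion is callable from Python without modifying the interpreter, e.g. via ctypes. A sketch — Windows-only by nature, and, per David's remark later in the thread, SetDllDirectory holds only a single extra directory at a time:]

```python
import ctypes
import sys

def add_dll_search_path(path):
    """Point this process's extra DLL search directory at *path*.

    Wraps the SetDllDirectoryW API discussed above; it affects only the
    current process, not other applications, and only one directory can
    be set at a time (the limitation David mentions for inherited
    virtual environments). Returns False (no-op) on non-Windows
    platforms. As Paul notes, this must run before any extension module
    that needs the DLLs is imported.
    """
    if not sys.platform.startswith("win"):
        return False
    return bool(ctypes.windll.kernel32.SetDllDirectoryW(path))
```

On Python 3.8+ the stdlib grew `os.add_dll_directory()`, which supports multiple directories, but in the 2015 context of this thread the raw API call was the only option.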
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On Fri, May 15, 2015 at 11:35 PM, David Cournapeau courn...@gmail.com wrote: On Sat, May 16, 2015 at 4:56 AM, Chris Barker chris.bar...@noaa.gov wrote:

But in short -- I'm pretty sure there is a way, on all systems, to have a standard way to build extension modules, combined with a standard way to install shared libs, so that a lib can be shared among multiple packages. So the question remains:

There is actually no way to do that on Windows without modifying the interpreter somehow.

Darn.

This was somehow discussed a bit at PyCon when talking about windows packaging: 1. the simple way to share DLLs across extensions is to put them in the %PATH%, but that's horrible.

Yes -- that has to be off the table, period.

2. there are ways to put DLLs in a shared directory *not* in the %PATH% since at least Windows XP SP2 and above, through the SetDllDirectory API. With 2., you still have the issue of DLL hell,

Could you clarify a bit -- I thought that this could, at least, put a dir on the search path that was specific to that python context. So it would require cooperation among all the packages being used at once, but not get tangled up with the rest of the system. But maybe I'm wrong here -- I have no idea what the heck I'm doing with this!

which may be resolved through naming and activation contexts.

I guess that's what I mean by the above.

I had a brief chat with Steve where he mentioned that this may be a solution, but he was not 100% sure, IIRC. The main drawback of this solution is that it won't work when inheriting virtual environments (as you can only set a single directory).

No relative paths here? Or a path that can be set at run time? Or maybe I'm missing what inheriting virtual environments means...

FWIW, we are about to deploy 2. @ Enthought (where we control the python interpreter, so it is much easier for us).

It'll be great to see how that works out, then. 
I take it this means that for Canopy, you've decided that statically linking everything is NOT the way to go. Which is a good data point to have. Thanks for the update. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Distutils] PyPI is a sick sick hoarder
On 17 May 2015 at 00:36, Justin Cappos jcap...@nyu.edu wrote: This only considers computation cost though. Other factors can become more expensive than computation. For example, SAT solvers need all the rules to consider. So a SAT solution needs to effectively download the full dependency graph before starting. A backtracking dependency resolver can just download packages or dependency information as it considers them. This is the defining consideration for pip at this point: a SAT solver requires publication of static dependency metadata on PyPI, which is dependent on both the Warehouse migration *and* the completion and acceptance of PEP 426. Propagation out to PyPI caching proxies and mirrors like devpi and the pulp-python plugin will then take even longer. A backtracking resolver doesn't have those gating dependencies, as it can tolerate the current dynamic metadata model. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On Sat, May 16, 2015 at 4:13 AM, Paul Moore p.f.mo...@gmail.com wrote:

Though it's a lot harder to provide a build environment than just the lib to link to... I'm going to have to think more about that...

It seems to me that the end user doesn't really have a problem here (pip install matplotlib works fine for me using the existing wheel).

Sure -- but that's because Matthew Brett has done a lot of work to make that happen.

It's the package maintainers (who have to build the binaries) that have the issue because everyone ends up doing the same work over and over, building dependencies.

Exactly -- it would be nice if the ecosystem made that easier.

So rather than trying to address the hard problem of dynamic linking, maybe a simpler solution is to set up a PyPI-like hosting solution for static libraries of C dependencies? It could be as simple as a github project that contained a directory for each dependency,

I started that here: https://github.com/PythonCHB/mac-builds but haven't kept it up. And Matthew Brett has done most of the work here: https://github.com/MacPython -- not sure how he's sharing the static libs, but it could be done.

With a setuptools build plugin you could even just specify your libraries in setup.py, and have the plugin download the lib files automatically at build time.

Actually, that's a pretty cool idea! You'd need a place to host them -- GitHub is no longer hosting downloads, are they? Though you could probably use GitHub Pages... (or something else)

People add libraries to the archive simply by posting pull requests. Maybe the project maintainer maintains the actual binaries by running the builds separately and publishing them separately, or maybe PRs include binaries or you use a CI system to build them.

Something like this is being done by a bunch of folks for conda/binstar: https://github.com/ioos/conda-recipes is just one example. 
PS The above is described as if it's single-platform, mostly because I only tend to think about these issues from a Windows POV, but it shouldn't be hard to extend it to multi-platform.

Indeed -- the MacWheels projects are, of course, single platform, but could be extended. Though at the end of the day, there isn't much to share between building libs on different platforms (unless you are using a cross-platform build tool -- why I was trying out gattai for my stuff). The conda stuff is multi-platform, though, in fact, you have to write a separate build script for each platform -- it doesn't really provide anything to help with that part.

But while these efforts are moving towards removing the need for every package maintainer to build the deps -- we are now duplicating the effort of trying to remove duplication of effort :-) -- but maybe just waiting for something to gain momentum and rise to the top is the answer. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Distutils] PyPI is a sick sick hoarder
I am no expert, but I don't understand why backtracking algorithms would be faster than SAT, since they both potentially need to walk over the full set of possible solutions. It is hard to reason about the cost because the worst case is in theory growing exponentially in both cases. This is talked about a bit in this thread: https://github.com/pypa/pip/issues/988

Each algorithm could be computationally more efficient. Basically, *if there are no conflicts* backtracking will certainly win. If there are a huge number of conflicts a SAT solver will certainly win. It's not clear where the tipping point is between the two schemes. However, a better question is: does the computational difference matter? If one is a microsecond faster than the other, I don't think anyone cares. However, from the OPIUM paper (listed off of that thread), it is clear that SAT solver resolution can be slow without optimizations to make them work more like backtracking resolvers. From my experience, backtracking resolvers are also slow when the conflict rate is high.

This only considers computation cost though. Other factors can become more expensive than computation. For example, SAT solvers need all the rules to consider. So a SAT solution needs to effectively download the full dependency graph before starting. A backtracking dependency resolver can just download packages or dependency information as it considers them. The bandwidth cost for SAT solvers should be higher.

Thanks, Justin

P.S. If you'd like to talk off list, possibly over Skype, I'd be happy to talk more with you and/or Robert about minutiae that others may not care about.
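[Editorial note: a minimal sketch of the backtracking approach Justin describes, with a toy invented index. It tries the newest acceptable version of each requirement, recurses, and undoes the choice on conflict; the fetch callback is only invoked for packages actually reached, which is the lazy-download property contrasted with SAT above.]

```python
def resolve(requirements, fetch, pins=None):
    """Toy backtracking resolver.

    requirements is a list of (name, allowed_versions) pairs, where
    allowed_versions is a set of acceptable version strings, or None
    for "anything". fetch(name) returns (version, requirements) pairs
    newest-first, standing in for a metadata download that happens only
    for packages actually considered.
    """
    pins = dict(pins or {})
    if not requirements:
        return pins
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in pins:
        if allowed is None or pins[name] in allowed:
            return resolve(rest, fetch, pins)
        return None  # conflicts with an earlier choice: backtrack
    for version, deps in fetch(name):
        if allowed is not None and version not in allowed:
            continue
        solution = resolve(rest + deps, fetch, dict(pins, **{name: version}))
        if solution is not None:
            return solution
    return None  # no version works: the caller backtracks

# A tiny in-memory index; in real life fetch() would hit the network.
INDEX = {
    "app": [("2.0", [("lib", {"2.0"})]), ("1.0", [("lib", None)])],
    "lib": [("2.0", []), ("1.0", [])],
}
fetched = []
def fetch(name):
    fetched.append(name)  # record which metadata was actually downloaded
    return INDEX[name]

print(resolve([("app", None)], fetch))  # -> {'app': '2.0', 'lib': '2.0'}
```

Asking additionally for lib pinned to 1.0 forces the solver to backtrack off app 2.0 (whose dependency demands lib 2.0) and settle for app 1.0 -- the conflict-driven cost Justin mentions, paid one retraction at a time instead of up front.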
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On Sat, May 16, 2015 at 10:12 AM, Nick Coghlan ncogh...@gmail.com wrote:

Maybe, but it's a problem to be solved, and the Linux distros more or less solve it for us, but OS-X and Windows have no such system built in (OS-X does have Brew and macports)

Windows 10 has Chocolatey and OneGet: * https://chocolatey.org/ * http://blogs.msdn.com/b/garretts/archive/2015/01/27/oneget-and-the-windows-10-preview.aspx

Cool -- though I don't think we want the official python to depend on a third party system, and OneGet won't be available for most users for a LONG time... The fact that OS-X users have to choose between fink, macports, homebrew or roll-your-own is a MAJOR source of pain for supporting the OS-X community. More than one way to do it is not the goal.

conda and nix then fill the niche for language independent packaging at the user level rather than the system level.

Yup -- conda is, indeed, pretty cool. I think there is a bit of fuzz here -- cPython, at least, uses the operating system provided C/C++ dynamic linking system -- it's not a totally independent thing.

I'm specifically referring to the *declaration* of dependencies here.

Sure -- that's my point about the current missing link -- setuptools, pip, etc. can only declare python-package-level dependencies, not binary-level dependencies. My idea is to bundle up a shared lib in a python package -- then, if you declare a dependency on that package, you've handled the dep issue. The trick is that a particular binary wheel depends on that other binary wheel -- rather than the whole package depending on it. (That is, on Linux, it would have no dependency; on OS-X it would -- but then only the wheel built for a non-macports build, etc.) I think we could hack around this by monkey-patching the wheel after it is built, so may be worth playing with to see how it works before proposing any changes to the ecosystem.

And if you are using something like conda you don't need pip or wheels anyway! 
Correct, just as if you're relying solely on Linux system packages, you don't need pip or wheels. Aside from the fact that conda is cross-platform, the main difference between the conda community and a Linux distro is in the *kind* of software we're likely to have already done the integration work for. sure. but the cross-platform thing is BIG -- we NEED pip and wheel because rpm, or deb, or ... are all platform and distro dependent -- we want a way for package maintainers to support a broad audience without having to deal with 12 different package systems. The key to understanding the difference in the respective roles of pip and conda is realising that there are *two* basic distribution scenarios that we want to be able to cover (I go into this in more detail in https://www.python.org/dev/peps/pep-0426/#development-distribution-and-deployment-of-python-software ): hmm -- sure, they are different, but is it impossible to support both with one system? * software developer/publisher - software integrator/service operator (or data analyst) * software developer/publisher - software integrator - service operator (or data analyst) ... On the consumption side, though, the nature of the PyPA tooling as a platform-independent software publication toolchain means that if you want to consume the PyPA formats directly, you need to be prepared to do your own integration work. Exactly! and while Linux system admins can do their own system integration work, everyday users (and many Windows sys admins) can't, and we shouldn't expect them to. And, in fact, the PyPA tooling does support the more casual user much of the time -- for example, I'm in the third quarter of a Python certification class -- Intro, Web development, Advanced topics -- and only half way through the third class have I run into any problems with sticking with the PyPA tools. 
(except for pychecker -- not being on PyPI :-( ) Many public web service developers are entirely happy with that deal, but most system administrators and data analysts trying to deal with components written in multiple programming languages aren't.

Exactly -- but it's not because the audience is different in their role -- it's because different users need different python packages. The PyPA tools support pure-python great -- and compiled extensions without deps pretty well -- but there is a bit of a gap with extensions that require other deps. It's a 90% (95%) solution... It'd be nice to get it to a 99% solution.

Where it really gets ugly is where you need stuff that has nothing to do with python -- say a Julia run-time, or ... Anaconda is there to support that: their philosophy is that if you are trying to do full-on data analysis with python, you are likely to need stuff strictly beyond the python ecosystem -- your own Fortran code, numpy (which requires LLVM), etc. Maybe they are right -- but there is still a heck of a lot of stuff that you can do and stay
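[Editorial note: Chris's "bundle a shared lib in a Python package" idea can be sketched concretely. The carrier-package layout and all names here are invented for illustration; nothing like this is standardized.]

```python
import ctypes
import os
import sys

# Sketch: a pure "binary carrier" package (hypothetical name:
# libfoo_binaries) whose only job is to ship a shared library next to
# its __init__.py. Extension packages that need libfoo would declare an
# ordinary install dependency on libfoo_binaries -- turning the
# binary-level dependency into a python-package-level one, as Chris
# suggests -- and load the library at import time.

def bundled_filename(name):
    """Platform-specific filename for a bundled shared library."""
    if sys.platform.startswith("win"):
        return name + ".dll"
    if sys.platform == "darwin":
        return "lib" + name + ".dylib"
    return "lib" + name + ".so"

def load_bundled(package, name):
    """Load a shared library shipped inside *package*'s directory.

    A consuming extension would call this before using the library's
    symbols, e.g. load_bundled(libfoo_binaries, "foo").
    """
    directory = os.path.dirname(package.__file__)
    return ctypes.CDLL(os.path.join(directory, bundled_filename(name)))
```

The per-wheel (rather than per-package) dependency Chris wants -- Linux wheels omitting the carrier, OS-X wheels requiring it -- is exactly what the metadata of 2015 could not express, hence his monkey-patching suggestion.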
Re: [Distutils] PyPI is a sick sick hoarder
On Sun, May 17, 2015 at 12:40 AM, Daniel Holth dho...@gmail.com wrote: On May 16, 2015 11:22 AM, David Cournapeau courn...@gmail.com wrote: On Sat, May 16, 2015 at 11:36 PM, Justin Cappos jcap...@nyu.edu wrote:

I am no expert, but I don't understand why backtracking algorithms would be faster than SAT, since they both potentially need to walk over the full set of possible solutions. It is hard to reason about the cost because the worst case is in theory growing exponentially in both cases. This is talked about a bit in this thread: https://github.com/pypa/pip/issues/988 Each algorithm could be computationally more efficient. Basically, *if there are no conflicts* backtracking will certainly win. If there are a huge number of conflicts a SAT solver will certainly win. It's not clear where the tipping point is between the two schemes. However, a better question is: does the computational difference matter? If one is a microsecond faster than the other, I don't think anyone cares. However, from the OPIUM paper (listed off of that thread), it is clear that SAT solver resolution can be slow without optimizations to make them work more like backtracking resolvers. From my experience, backtracking resolvers are also slow when the conflict rate is high.

Pure SAT is fast enough in practice in my experience (concretely: solving thousands of rules takes ~1 second). It becomes more complicated as you need to optimize the solution, especially when you have already installed packages. This is unfortunately not as well discussed in the literature. Pseudo-boolean SAT for optimization was argued to be too slow by the 0install people, but OTOH, this seems to be what's used in conda, so who knows :)

Where "optimizing" means something like "find a solution with the newest possible releases of the required packages", not execution speed.

Indeed, it was not obvious in this context :) Though in theory, optimization is more general. It could be optimizing w.r.t.
a cost function taking into account #packages, download size, minimal number of changes, etc... This is where you want a pseudo-boolean SAT, which is what conda uses I think. 0install, composer and I believe libsolv took a different route, and use heuristics to find a reasonably good solution by picking the next candidate. This requires access to the internals of the SAT solver though (not a problem if you have a Python implementation).

David

If your SAT solver is in pure Python, you can choose a direction of the search which is more meaningful. I believe this is what 0install does from reading http://0install.net/solver.html, and what we have in our own SAT solver code. I unfortunately cannot look at the 0install code myself as it is under the GPL and am working on a BSD solver implementation. I also do not know how they handle updates and already installed packages.

This only considers computation cost though. Other factors can become more expensive than computation. For example, SAT solvers need all the rules to consider. So a SAT solution needs to effectively download the full dependency graph before starting. A backtracking dependency resolver can just download packages or dependency information as it considers them. The bandwidth cost for SAT solvers should be higher.

With a reasonable representation, I think you can make it small enough. To give an idea, our index @ Enthought containing around 20k packages takes ~340 kB compressed w/ bz2 if you only keep the data required for dependency handling (name, version and runtime dependencies), and that's using JSON, an inefficient encoding, so I suspect encoding all of PyPI may be only a few MB to fetch, which is generally faster than doing tens of HTTP requests. The libsolv people worked on a binary representation that may also be worth looking at.

P.S. If you'd like to talk off list, possibly over Skype, I'd be happy to talk more with you and/or Robert about minutiae that others may not care about. 
Sure, I would be happy to. As I mentioned before, we have some code around a SAT-based solver, but it is not ready yet, which is why we kept it private (https://github.com/enthought/sat-solver). It handles the case where nothing is installed well (both speed- and quality-wise), but behaves poorly when packages are already installed, and does not handle the update case yet. The code is also very prototype-ish, but it is not too complicated to experiment with. David ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
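To make the backtracking side of this discussion concrete, here is a minimal sketch of a backtracking resolver over a toy in-memory index. The index contents and structure are invented for illustration; this is not pip's or Enthought's code, just the newest-first, backtrack-on-conflict strategy described above.

```python
# Toy index: package -> {version: {dependency_name: set(allowed_versions)}}
# All package names and versions here are hypothetical.
INDEX = {
    "a": {2: {"c": {1, 2}}, 1: {"c": {1}}},
    "b": {1: {"c": {2}}},
    "c": {2: {}, 1: {}},
}

def resolve(requirements, chosen=None):
    """Pick one version per package satisfying all constraints, newest first.

    requirements is a list of (name, allowed_versions) pairs; returns a
    {name: version} solution, or None if no consistent assignment exists.
    """
    chosen = dict(chosen or {})
    if not requirements:
        return chosen
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in chosen:
        # Already pinned: succeed only if the pin satisfies this constraint,
        # otherwise report failure so the caller backtracks.
        return resolve(rest, chosen) if chosen[name] in allowed else None
    for version in sorted(set(INDEX[name]) & allowed, reverse=True):
        chosen[name] = version
        new_reqs = rest + [(d, a) for d, a in INDEX[name][version].items()]
        solution = resolve(new_reqs, chosen)
        if solution is not None:
            return solution
        del chosen[name]  # backtrack: try the next-newest candidate

    return None

print(resolve([("a", {1, 2}), ("b", {1})]))  # -> {'a': 2, 'b': 1, 'c': 2}
```

With no conflicts the first (newest) candidate at each step succeeds, which is why backtracking "certainly wins" in that case; with many conflicts the search degenerates into exploring the version space, which is where SAT solvers pull ahead.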
Re: [Distutils] PyPI is a sick sick hoarder
On May 16, 2015, at 1:24 PM, Nick Coghlan ncogh...@gmail.com wrote: On 17 May 2015 at 00:36, Justin Cappos jcap...@nyu.edu wrote: This only considers computation cost though. Other factors can become more expensive than computation. For example, SAT solvers need all the rules to consider. So a SAT solution needs to effectively download the full dependency graph before starting. A backtracking dependency resolver can just download packages or dependency information as it considers them. This is the defining consideration for pip at this point: a SAT solver requires publication of static dependency metadata on PyPI, which is dependent on both the Warehouse migration *and* the completion and acceptance of PEP 426. Propagation out to PyPI caching proxies and mirrors like devpi and the pulp-python plugin will then take even longer. A backtracking resolver doesn't have those gating dependencies, as it can tolerate the current dynamic metadata model. Even when we have Warehouse and PEP 426, that only gives us that data going forward; the 400k files that currently exist on PyPI still won’t have static metadata. We could parse it out for Wheels, but not for anything else. For the foreseeable future, any solution will need to be able to handle iteratively finding constraints. Though I think a SAT solver can do it if it can handle incremental solving, or just by re-doing the SAT problem each time we discover a new constraint. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA signature.asc Description: Message signed with OpenPGP using GPGMail ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
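The "re-do the solve each time we discover a new constraint" approach Donald describes can be sketched roughly as follows. The solve step here is a deliberately naive newest-wins pick (no conflict handling), and the metadata "downloads" are simulated with an in-memory dict; all names and data are hypothetical.

```python
# Hypothetical remote metadata: (name, version) -> {dep: allowed_versions}.
# In the real scenario each entry would require an HTTP fetch.
METADATA = {
    ("app", 1): {"lib": {1, 2}},
    ("lib", 2): {"base": {1}},
    ("lib", 1): {},
    ("base", 1): {},
}
AVAILABLE = {"app": {1}, "lib": {1, 2}, "base": {1}}

def solve(roots, known):
    """Newest-wins pick using only the constraints discovered so far."""
    pins, queue = {}, list(roots.items())
    while queue:
        name, allowed = queue.pop()
        pins[name] = max(AVAILABLE[name] & allowed)
        for dep, dep_allowed in known.get((name, pins[name]), {}).items():
            queue.append((dep, dep_allowed))
    return pins

def resolve_iteratively(roots):
    known = {}
    while True:
        pins = solve(roots, known)
        # "Download" metadata for anything pinned but not yet inspected.
        new = {pv: METADATA[pv] for pv in pins.items() if pv not in known}
        if not new:
            return pins  # fixpoint: no new constraints were discovered
        known.update(new)

print(resolve_iteratively({"app": {1}}))
```

Each pass may pin new packages, whose freshly fetched metadata adds rules, so the whole problem is re-solved until a pass discovers nothing new; an incremental SAT solver would keep learned state between passes instead of starting over.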
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On 16 May 2015 at 19:40, Chris Barker chris.bar...@noaa.gov wrote: With 2., you still have the issue of DLL hell, could you clarify a bit -- I thought that this could, at least, put a dir on the search path that was specific to that python context. So it would require cooperation among all the packages being used at once, but not get tangled up with the rest of the system. but maybe I'm wrong here -- I have no idea what the heck I'm doing with this! Suppose Python adds C:\PythonXY\SharedDLLs to %PATH%. Suppose there's a libpng.dll in there, for matplotlib. Everything works fine. Then I install another non-Python application that uses libpng.dll, and does so by putting libpng.dll alongside the executable (a common way of making DLLs available with Windows applications). Also assume that the application installer adds the application directory to the *start* of PATH. Now, Python extensions will use this 3rd party application's DLL rather than the correct one. If it's ABI-incompatible, the Python extension will crash. If it's ABI compatible, but behaves differently (it could be a different version) there could be inconsistencies or failures. The problem is that while Python can add a DLL directory to PATH, it cannot control what *else* is on PATH, or what has priority. Paul ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On Sat, May 16, 2015 at 11:54 AM, Paul Moore p.f.mo...@gmail.com wrote: could you clarify a bit -- I thought that this could, at least, put a dir on the search path that was specific to that python context. So it would require cooperation among all the packages being used at once, but not get tangled up with the rest of the system. but maybe I'm wrong here -- I have no idea what the heck I'm doing with this! Suppose Python adds C:\PythonXY\SharedDLLs to %PATH%. Suppose there's a libpng.dll in there, for matplotlib. I think we all agree that %PATH% is NOT an option! That is the key source of DLL hell on Windows. I was referring to the SetDllDirectory API. I don't think that gets picked up by other processes. from: https://msdn.microsoft.com/en-us/library/windows/desktop/ms686203%28v=vs.85%29.aspx It looks like you can add a path, at run time, that gets searched for dlls before the rest of the system locations. And this does not affect any other applications. But you'd need to make sure this got run before any of the affected packages were loaded -- which is probably what David meant by needing to control the python binary. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] PyPI and Uploading Documentation
On Fri, 15 May 2015, at 15:48, Donald Stufft wrote: Hey! Hi Donald Ideally I hope people start to use ReadTheDocs instead of PyPI itself. +1. Do you want to use the python.org domain (ex. pypi.python.org/docs) or keep RTD on its own domain? -- Sébastien Douche s...@nmeos.net Twitter: @sdouche http://douche.name ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On 15 May 2015 at 04:01, Chris Barker chris.bar...@noaa.gov wrote: I'm confused -- you don't want a system to be able to install ONE version of a lib that various python packages can all link to? That's really the key use-case for me Are we talking about Python libraries accessed via Python APIs, or linking to external dependencies not written in Python (including linking directly to C libraries shipped with a Python library)? I, at least, am talking about the latter. for a concrete example: libpng, for instance, might be needed by PIL, wxPython, Matplotlib, and who knows what else. At this point, if you want to build a package of any of these, you need to statically link it into each of them, or distribute shared libs with each package -- if you are using them all together (which I do, anyway) you now have three copies of the same lib (but maybe different versions) all linked into your executable. Maybe there is no downside to that (I haven't had a problem yet), but it seems like a bad way to do it! It's the latter I consider to be out of scope for a language specific packaging system Maybe, but it's a problem to be solved, and the Linux distros more or less solve it for us, but OS-X and Windows have no such system built in (OS-X does have Brew and macports) Windows 10 has Chocolatey and OneGet: * https://chocolatey.org/ * http://blogs.msdn.com/b/garretts/archive/2015/01/27/oneget-and-the-windows-10-preview.aspx conda and nix then fill the niche for language independent packaging at the user level rather than the system level. - Python packaging dependencies are designed to describe inter-component dependencies based on the Python import system, not dependencies based on the operating system provided C/C++ dynamic linking system. I think there is a bit of fuzz here -- cPython, at least, uses the operating system provided C/C++ dynamic linking system -- it's not a totally independent thing. I'm specifically referring to the *declaration* of dependencies here. 
While CPython itself will use the dynamic linker to load extension modules found via the import system, the loading of further dynamically linked modules beyond that point is entirely opaque not only to the interpreter runtime at module import time, but also to pip at installation time. If folks are after the latter, than they want a language independent package system, like conda, nix, or the system package manager in a Linux distribution. And I am, indeed, focusing on conda lately for this reason -- but not all my users want to use a whole new system, they just want to pip install and have it work. And if you are using something like conda you don't need pip or wheels anyway! Correct, just as if you're relying solely on Linux system packages, you don't need pip or wheels. Aside from the fact that conda is cross-platform, the main difference between the conda community and a Linux distro is in the *kind* of software we're likely to have already done the integration work for. The key to understanding the difference in the respective roles of pip and conda is realising that there are *two* basic distribution scenarios that we want to be able to cover (I go into this in more detail in https://www.python.org/dev/peps/pep-0426/#development-distribution-and-deployment-of-python-software): * software developer/publisher - software integrator/service operator (or data analyst) * software developer/publisher - software integrator - service operator (or data analyst) Note the second line has 3 groups and 2 distribution arrows, while the first line only has the 2 groups and a single distribution step. pip and the other Python specific tools cover that initial developer/publisher - integrator link for Python projects. 
This means that Python developers only need to learn a single publishing toolchain (the PyPA tooling) to get started, and they'll be able to publish their software in a format that any integrator that supports Python can consume (whether that's for direct consumption in a DIY integration scenario, or to put through a redistributor's integration processes). On the consumption side, though, the nature of the PyPA tooling as a platform-independent software publication toolchain means that if you want to consume the PyPA formats directly, you need to be prepared to do your own integration work. Many public web service developers are entirely happy with that deal, but most system administrators and data analysts trying to deal with components written in multiple programming languages aren't. That latter link, where the person or organisation handling the software integration task is distinct from the person or organisation running an operational service, or carrying out some data analysis, are where the language independent redistributor tools like Chocolatey, Nix, deb, rpm, conda, Docker, etc all come in - they let a redistributor handle the integration task (or at least
Re: [Distutils] Making pip and PyPI work with conda packages
On May 16, 2015, at 3:04 PM, David Mertz dme...@continuum.io wrote: I've just started monitoring this SIG to get a sense of the issues and status of things. I've also just started working for Continuum Analytics. Continuum has a great desire to make 'pip' work with conda packages. Obviously, we'd love for users to choose the Anaconda Python distribution, but many will not for a variety of reasons (many good reasons). However, we would like for users of other distros still to be able to benefit from our creation of binary packages for many platforms in the conda format. As has been discussed in recent threads on dependency solving, the way conda provides metadata apart from entire packages makes much of that work easier. But even aside from that, there are simply a large number of well-tested packages (not only for Python, it is true, so that's possibly a wrinkle in the task) we have generated in conda format. It is true that right now, a user can in principle type: % pip install conda % conda install some_conda_package But that creates two separate systems for tracking what's installed and what dependencies are resolved; and many users will not want to convert completely to conda after that step. What would be better as a user experience would be to let users do this: % pip install --upgrade pip % pip install some_conda_package Whether that second command ultimately downloads code from pypi.python.org or from repo.continuum.io is probably less important from a user experience perspective. Continuum is very happy to upload all of our conda packages to PyPI if this would improve this user experience. Obviously, the idea here would be that the user would be able to type 'pip list' and friends afterward, and have knowledge of what was installed, even as conda packages. I'm hoping members of the SIG can help me understand both the technical and social obstacles that need to be overcome before this can happen. 
As Paul mentioned, I’m not sure I see a major benefit to being able to ``pip install`` a conda package that doesn’t come with a lot of footguns, since any conda package either won’t be able to depend on things like Python or random C libraries, or we’re going to have to just ignore those dependencies, or what have you. I think a far more workable solution is one that translates a conda package to a Wheel. Practically speaking, the only real benefit that conda packages have over pip is the one benefit that simply teaching pip to install conda packages won’t provide - namely, that it supports things which aren’t Python packages. However I don’t think it’s likely that we’re going to be able to install R or erlang or whatever into a virtual environment (for instance), but maybe I’m wrong. There are a few other benefits, but none of that is inherent in the two different approaches; it’s just things that conda has that pip is planning on getting. It just hasn’t gotten them yet because either we have to convince people to publish our new formats (e.g. we can’t go out and create a wheel repo of common packages) or because we haven’t gotten to it yet, because dealing with the crushing legacy of PyPI’s ~400k packages is a significant slowdown factor. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] PyPI is a sick sick hoarder
On 16 May 2015 at 11:52, Robert Collins robe...@robertcollins.net wrote: On 16 May 2015 at 13:45, Donald Stufft don...@stufft.io wrote: On May 15, 2015, at 9:22 PM, Robert Collins robe...@robertcollins.net wrote: On 16 May 2015 at 11:08, Marcus Smith qwc...@gmail.com wrote: Why not start with pip at least being a simple fail-on-conflict resolver (vs the 1st found wins resolver it is now)... You'd backtrack for the sake of re-walking when new constraints are found, but not for the purpose of solving conflicts. I know you're motivated to solve OpenStack build issues, but many of the issues I've seen in the pip tracker, I think, would be solved without the backtracking resolver you're trying to build. Well, I'm scratching the itch I have. If it's too hard to get something decent, sure, I might back off in my goals, but I see no point aiming for something less than all the other language specific packaging systems out there have. So what makes the other language specific packaging systems different? As far as I know all of them have complete archives (e.g. they are like PyPI where they have a lot of versions, not like Linux Distros). What can we learn from how they solved this? NB: I have by no means finished the low-hanging heuristics and space-trimming stuff :). I have some simple things in mind and am sure I'll end up with something 'good enough' for day to day use. The thing I'm worried about is the long term health of the approach. Longer term, I think it makes sense to have the notion of active and obsolete versions baked into PyPI's API and the web UI. This wouldn't be baked into the package metadata itself (unlike the proposed Obsoleted-By field for project renaming), but rather be a dynamic reflection of whether or not *new* users should be looking at the affected version, and whether or not it should be considered as a candidate for dependency resolution when not specifically requested. 
(This could also replace the current hidden versions feature, which only hides things from the web UI, without having any impact on the information published to automated tools through the programmatic API) Tools that list outdated packages could also be simplified a bit, as their first pass could just be to check the obsolescence markers on installed packages, with the second pass being to check for newer versions of those packages. While the bare minimum would be to let project maintainers set the obsolescence flag directly, we could also potentially offer projects some automated obsolescence schemes, such as:

* single active released version, anything older is marked as obsolete whenever a new (non pre-release) version is uploaded
* semantic versioning, with a given maximum number of active released X versions (e.g. 2), but only the most recent (according to PEP 440) released version with a given X.* is active, everything else is obsolete
* CPython-style and date-based versioning, with a given maximum number of active released X.Y versions (e.g. 2), but only the most recent (according to PEP 440) released version with a given X.Y.* is active, everything else is obsolete

Pre-release versions could also be automatically flagged as obsolete by PyPI as soon as a newer version for the same release (including the final release itself) was uploaded for the given package. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
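The second scheme Nick sketches (semantic versioning with a bounded number of active major versions) could look roughly like this. Version parsing is simplified to dotted integers rather than full PEP 440 handling, and the function name and return shape are invented for illustration:

```python
def classify(releases, max_active_majors=2):
    """Mark the newest release in each of the N most recent major versions
    as active; everything else is obsolete (a sketch, not PyPI code)."""
    parsed = sorted((tuple(int(p) for p in r.split(".")) for r in releases),
                    reverse=True)
    newest_per_major = {}
    for version in parsed:
        # parsed is sorted newest-first, so the first hit per major wins.
        newest_per_major.setdefault(version[0], version)
    kept_majors = sorted(newest_per_major, reverse=True)[:max_active_majors]
    active = {newest_per_major[m] for m in kept_majors}
    return {".".join(map(str, v)): ("active" if v in active else "obsolete")
            for v in parsed}

print(classify(["1.0", "1.1", "2.0", "2.1", "3.0"]))
```

With the default of two active majors, only 3.0 and 2.1 end up active here; 2.0 is shadowed by 2.1 within its major, and the 1.* line falls off the end entirely.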
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On 17 May 2015 06:19, Chris Barker chris.bar...@noaa.gov wrote: indeed -- but it does have a bunch of python-specific features -- it was built around the need to combine python with other systems. That makes it an interesting alternative to pip on the package *consumption* side for data analysts, but it isn't currently a good fit for any of pip's other use cases (e.g. one of the scenarios I'm personally most interested in is that pip is now part of the Fedora/RHEL/CentOS build pipeline for Python based RPM packages - we universally recommend using pip install in the %install phase over using setup.py install directly) hmm -- conda generally uses setup.py install in its build scripts. And it doesn't use pip install because it wants to handle the downloading and dependencies itself (in fact, turning OFF setuptools dependency handling is an annoyance..) So I'm not sure why pip is needed here -- would it be THAT much harder to build rpms of python packages if it didn't exist? (I do see why you wouldn't want to use conda to build rpms..) We switched to recommending pip to ensure that the Fedora (et al) build toolchain can be updated to handle newer Python metadata standards just by upgrading pip. For example, it means that system installed packages on modern Fedora installations should (at least in theory) provide full PEP 376 installation metadata with the installer reported as the system package manager. The conda folks (wastefully, in my view) are still attempting to compete directly with pip upstream, instead of delegating to it from their build scripts as an abstraction layer that helps hide the complexity of the Python packaging ecosystem. But while _maybe_ if conda had been around 5 years earlier we could have not bothered with wheel, No, we couldn't, as conda doesn't work as well for system integrators. I'm not proposing that we drop it -- just that we push pip and wheel a bit farther to broaden the supported user-base. 
I can't stop you working on something I consider a deep rabbithole, but why not just recommend the use of conda, and only publish sdists on PyPI? conda needs more users and contributors seeking better integration with the PyPA tooling, and minimising the non-productive competition. The web development folks targeting Linux will generally be in a position to build from source (caching the resulting wheel file, or perhaps an entire container image). Also, assuming Fedora's experiment with language specific repos goes well ( https://fedoraproject.org/wiki/Env_and_Stacks/Projects/LanguageSpecificRepositories), we may see other distros replicating that model of handling the wheel creation task on behalf of their users. It's also worth noting that one of my key intended use cases for metadata extensions is to publish platform specific external dependencies in the upstream project metadata, which would get us one step closer to fully automated repackaging into policy compliant redistributor packages. Binary wheels already work for Python packages that have been developed with cross-platform maintainability and deployability taken into account as key design considerations (including pure Python wheels, where the binary format just serves as an installation accelerator). That category just happens to exclude almost all research and data analysis software, because it excludes the libraries at the bottom of that stack It doesn't quite exclude those -- just makes it harder. And while depending on Fortran, etc, is pretty unique to the data analysis stack, stuff like libpng, libcurl, etc, etc, isn't -- non-system libs are not a rare thing. The rare thing is having two packages which are tightly coupled to the ABI of a given external dependency. That's a generally bad idea because it causes exactly these kinds of problems with independent distribution of prebuilt components. 
The existence of tight ABI coupling between components both gives the scientific Python stack a lot of its power, *and* makes it almost as hard to distribute in binary form as native GUI applications. It's also the case that when you *are* doing your own system integration, wheels are a powerful tool for caching builds; conda does this nicely as well :-) I'm not trying to argue, at all, that binary wheels are useless, just that they could be a bit more useful. A PEP 426 metadata extension proposal for describing external binary dependencies would certainly be a welcome addition. That's going to be a common need for automated repackaging tools, even if we never find a practical way to take advantage of it upstream. Ah -- here is a key point -- because of that, we DO support binary packages on PyPI -- but only for Windows and OS-X. I'm just suggesting we find a way to extend that to packages that require a non-system non-python dependency. At the point you're managing arbitrary external binary dependencies, you've lost all the constraints that let us get away with doing
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On Sat, May 16, 2015 at 4:56 AM, Chris Barker chris.bar...@noaa.gov wrote: On Fri, May 15, 2015 at 1:49 AM, Paul Moore p.f.mo...@gmail.com wrote: On 14 May 2015 at 19:01, Chris Barker chris.bar...@noaa.gov wrote: Ah -- here is the issue -- but I think we HAVE pretty much got what we need here -- at least for Windows and OS-X. It depends what you mean by curated, but it seems we have a (de facto?) policy for PyPI: binary wheels should be compatible with the python.org builds. So while each package wheel is supplied by the package maintainer one way or another, rather than by a central entity, it is more or less curated -- or at least standardized. And if you are going to put a binary wheel up, you need to make sure it matches -- and that is less than trivial for packages that require a third party dependency -- but building the lib statically and then linking it in is not inherently easier than doing a dynamic link. I think the issue is that, if we have 5 different packages that depend on (say) libpng, and we're using dynamic builds, then how do those packages declare that they need access to libpng.dll? This is the missing link -- it is a binary build dependency, not a package dependency -- so not so much that matplotlib-1.4.3 depends on libpng.x.y, but that: matplotlib-1.4.3-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl depends on: libpng-x.y (all those binary parts will come from the platform) That's what's missing now. And on Windows, where does the user put libpng.dll so that it gets picked up? 
Well, here is the rub -- Windows dll hell really is hell -- but I think if it goes into the python dll search path (sorry, not on a Windows box where I can really check this out right now), it can work -- I know we have an in-house product that has multiple python modules sharing a single dll somehow And how does a non-expert user do this (put it in $DIRECTORY, update your PATH, blah blah blah doesn't work for the average user)? That's why we may need to update the tooling to handle this -- I'm not totally sure if the current wheel format can support this on Windows -- though it can on OS-X. In particular, on Windows, note that the shared DLL must either be in the directory where the executable is located (which is fun when you have virtualenvs, embedded interpreters, etc), or on PATH (which has other implications - suppose I have an incompatible version of libpng.dll, from mingw, say, somewhere earlier on PATH). That would be dll hell, yes. The problem isn't so much defining a standard ABI that shared DLLs need - as you say, that's a more or less solved problem on Windows - it's managing how those shared DLLs are made available to Python extensions. And *that* is what Unix package managers do for you, and Windows doesn't have a good solution for (other than bundle all the dependent DLLs with the app, or suffer DLL hell). exactly -- but if we consider the python install to be the app, rather than an individual python bundle, then we _may_ be OK. PS For a fun exercise, it might be interesting to try breaking conda - Windows really is simply broken [1] in this regard -- so I'm quite sure you could break conda -- but it does seem to do a pretty good job of not being broken easily by common uses -- I can't say I know enough about Windows dll finding or conda to know how... 
Oh, and conda is actually broken in this regard on OS-X at this point -- if you compile your own extension in an anaconda environment, it will find a shared lib at compile time that it won't find at run time. -- the conda install process fixes these, but that's a pain when under development -- i.e. you don't want to have to actually install the package with conda to run a test each time you re-build the dll.. (or even change a bit of python code...) But in short -- I'm pretty sure there is a way, on all systems, to have a standard way to build extension modules, combined with a standard way to install shared libs, so that a lib can be shared among multiple packages. So the question remains: There is actually no way to do that on windows without modifying the interpreter somehow. This was somehow discussed a bit at PyCon when talking about windows packaging: 1. the simple way to share DLLs across extensions is to put them in the %PATH%, but that's horrible. 2. there are ways to put DLLs in a shared directory *not* in the %PATH% since at least windows XP SP2 and above, through the SetDllDirectory API. With 2., you still have the issue of DLL hell, which may be resolved through naming and activation contexts. I had a brief chat with Steve where he mentioned that this may be a solution, but he was not 100 % sure IIRC. The main drawback of this solution is that it won't work when inheriting virtual environments (as you can only set a single directory). FWIW, we are
Re: [Distutils] PyPI and Uploading Documentation
On 16 May 2015 at 04:34, Donald Stufft don...@stufft.io wrote: So I can’t speak for ReadTheDocs, but I believe that they are considering and/or are planning on offering arbitrary HTML uploads similarly to how you can upload documentation to PyPI. I don’t know if this will actually happen and what it would look like but I know they are thinking about it. I've never tried it with ReadTheDocs, but in theory the .. raw:: html docutils directive allows arbitrary HTML content to be embedded in a reStructuredText page. Regardless, Can ReadTheDocs do X? questions are better asked on https://groups.google.com/forum/#!forum/read-the-docs, while both GitHub and Atlassian (via BitBucket) offer free static HTML hosting. In relation to the original question, +1 for attempting to phase out PyPI's documentation hosting capability in favour of delegating to RTFD or third party static HTML hosting. One possible option to explore that minimises disruption for existing users might be to stop offering it to *new* projects, while allowing existing projects to continue uploading new versions of their documentation. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
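For reference, the docutils directive Nick mentions looks like this in a reStructuredText source (the HTML content is invented for illustration; docutils passes it through to the rendered page untouched):

```rst
.. raw:: html

   <p>Arbitrary <strong>HTML</strong> embedded in a reStructuredText page.</p>
```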
Re: [Distutils] PyPI and Uploading Documentation
On May 16, 2015 4:55 AM, Nick Coghlan ncogh...@gmail.com wrote: On 16 May 2015 at 04:34, Donald Stufft don...@stufft.io wrote: So I can’t speak for ReadTheDocs, but I believe that they are considering and/or are planning on offering arbitrary HTML uploads similarly to how you can upload documentation to PyPI. I don’t know if this will actually happen and what it would look like but I know they are thinking about it. I've never tried it with ReadTheDocs, but in theory the .. raw:: html docutils directive allows arbitrary HTML content to be embedded in a reStructuredText page. Regardless, Can ReadTheDocs do X? questions are better asked on https://groups.google.com/forum/#!forum/read-the-docs, ReadTheDocs is hiring! https://blog.readthedocs.com/read-the-docs-is-hiring/ while both GitHub and Atlassian (via BitBucket) offer free static HTML hosting. I just wrote a tool (pypi:pgs) for serving files over HTTP directly from gh-pages branches that works in conjunction with pypi:ghp-import ( gh-pages; touch .nojekyll). CloudFront DNS can sort of be used to add TLS/SSL to custom domains with GitHub Pages (and probably BitBucket) In relation to the original question, +1 for attempting to phase out PyPI's documentation hosting capability in favour of delegating to RTFD or third party static HTML hosting. One possible option to explore that minimises disruption for existing users might be to stop offering it to *new* projects, while allowing existing projects to continue uploading new versions of their documentation. - [ ] DOC: migration / alternatives guide Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)
On 16 May 2015 at 07:35, David Cournapeau courn...@gmail.com wrote: But in short -- I'm pretty sure there is a way, on all systems, to have a standard way to build extension modules, combined with a standard way to install shared libs, so that a lib can be shared among multiple packages. So the question remains: There is actually no way to do that on windows without modifying the interpreter somehow. This was somehow discussed a bit at PyCon when talking about windows packaging: 1. the simple way to share DLLs across extensions is to put them in the %PATH%, but that's horrible. 2. there are ways to put DLLs in a shared directory *not* in the %PATH% since at least windows XP SP2 and above, through the SetDllDirectory API. With 2., you still have the issue of DLL hell, which may be resolved through naming and activation contexts. I had a brief chat with Steve where he mentioned that this may be a solution, but he was not 100 % sure IIRC. The main drawback of this solution is that it won't work when inheriting virtual environments (as you can only set a single directory). FWIW, we are about to deploy 2. @ Enthought (where we control the python interpreter, so it is much easier for us). This is indeed precisely the issue. In general, Python code can run with the executable being in many different places - there are the standard installs, virtualenvs, and embedding scenarios to consider. So putting DLLs alongside the executable, which is often how Windows applications deal with this issue, is not a valid option (that's an option David missed out above, but that's fine as it doesn't work :-)) Putting DLLs on %PATH% *does* cause problems, and pretty severe ones. 
People who use ports of Unix tools, such as myself, hit this a lot - at one point I got so frustrated with various incompatible versions of libintl showing up on my PATH, all with the same name, that I went on a spree of rebuilding all of the GNU tools without libintl support, just to avoid the issue (and older versions of openssl were just as bad with libeay, etc). So, as David says, you pretty much have to use SetDllDirectory and similar features to get a viable location for shared DLLs. I guess it *may* be possible to call those APIs from a Python extension that you load *before* using any shared DLLs, but that seems like a very fragile solution. It would also be possible for Python 3.6+ to add a new shared-DLLs location for such things, which the core interpreter includes (either via SetDllDirectory or by the same mechanism that adds C:\PythonXY\DLLs to the search path at the moment). But that wouldn't help older versions. So while I encourage Chris' enthusiasm in looking for a solution to this issue, I'm not sure it's as easy as he's hoping. Paul
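[For illustration only: the SetDllDirectory approach David and Paul describe can be reached from pure Python via ctypes, without a compiled extension. This is a hedged sketch — the helper name and calling pattern are invented for the example, and the call is a no-op off Windows:

```python
import ctypes
import os
import sys

def add_dll_search_path(path):
    """Ask Windows to include *path* when resolving DLL dependencies.

    Wraps the Win32 SetDllDirectoryW API. Returns True if the call was
    made, False on non-Windows platforms (where there is nothing to do).
    Note the limitation discussed above: SetDllDirectory accepts only a
    single directory, so stacked/inherited virtual environments cannot
    each contribute their own shared-DLL directory this way.
    """
    if not sys.platform.startswith("win"):
        return False
    kernel32 = ctypes.windll.kernel32
    # ctypes converts a Python str to LPCWSTR for the ...W entry point.
    if not kernel32.SetDllDirectoryW(os.path.abspath(path)):
        raise ctypes.WinError()
    return True
```

The fragility Paul points out still applies: this must run before any extension module that needs the shared DLLs is imported, which is hard to guarantee in general.]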
Re: [Distutils] PyPI and Uploading Documentation
Ok, so unless someone comes out against this in the near future, here are my plans: 1. Implement the ability to delete documentation. 2. Implement the ability to add a (simple) redirect where we would essentially just send /project/(.*) to $REDIRECT_BASE/$1. 3. Implement the ability to point the documentation URL to something that isn't pythonhosted.org. 4. Send an email out to all projects that are currently utilizing the hosted documentation, telling them that it is going away, and give them links to RTD, GitHub Pages, and whatever Bitbucket calls their service. 5. Disable documentation uploads to PyPI with an error message that tells people the service has been discontinued. In addition to the above steps, we'll maintain any documentation that doesn't get deleted (and the above redirects) indefinitely. Serving static, read-only documentation (other than deletes) is something that we can do without much trouble or cost. I think that this will cover all of the things that people in this thread have brought up, as well as providing a sane migration path to go from pythonhosted.org documentation to wherever they choose to place their docs in the future. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
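[Step 2 above is just a prefix rewrite. A minimal sketch of the mapping, assuming per-project settings — the function name, parameters, and example URLs are placeholders, not anything from PyPI's codebase:

```python
import re

def doc_redirect(path, project, redirect_base):
    """Map /<project>/(.*) to <redirect_base>/$1, as in step 2 above.

    Returns the target URL, or None if the request path is not under
    the given project's documentation prefix.
    """
    m = re.match(r"^/%s/(.*)$" % re.escape(project), path)
    if m is None:
        return None
    return "%s/%s" % (redirect_base.rstrip("/"), m.group(1))
```

So a request for /mypkg/en/latest/index.html with a redirect base of https://mypkg.readthedocs.org would be sent to https://mypkg.readthedocs.org/en/latest/index.html, preserving deep links into existing docs.]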
Re: [Distutils] Making pip and PyPI work with conda packages
On Sat, May 16, 2015 at 4:16 PM, Donald Stufft don...@stufft.io wrote: On Sat, May 16, 2015 at 3:03 PM, Donald Stufft don...@stufft.io wrote: There are a few other benefits, but that’s not anything inherent in the two different approaches, it’s just things that conda has that pip is planning on getting. Huh? I'm confused -- didn't we just have a big thread about how pip+wheel probably ISN'T going to handle shared libs -- that those are exactly what conda packages do provide -- aside from R and Erlang, anyway :-) but it's not the packages in this case that we need -- it's the environment -- and I can't see how pip is going to provide a conda environment…. I never said pip was going to provide an environment, I said the main benefit conda has over pip, which pip will most likely not get in any reasonable time frame, is that it handles things which are not Python packages. well, I got a bit distracted by Erlang and R -- i.e. things that have nothing to do with Python packages. libxml, on the other hand, is a lib that one might want to use with a Python package -- so a bit more apropos here. But my confusion was about: things that conda has that pip is planning on getting -- what are those things? Any of the stuff that conda has that's really useful, like handling shared libs, pip is NOT getting -- yes? A shared library is not a Python package so I’m not sure what this message is even saying? ``pip install lxml-from-conda`` is just going to flat out break because pip won’t install the libxml2 shared library. exactly -- if you're going to install a shared lib, you need somewhere to put it -- and that's what a conda environment provides. Trying not to go around in circles, but Python _could_ provide a standard place in which to put shared libs -- and then pip _could_ provide a way to manage them. That would require dealing with that whole binary API problem, so we probably won't do it. 
I'm not sure what the point of contention is here: I think it would be useful to have a way to manage shared libs solely for Python packages to use -- and it would be useful for that to be part of the standard Python ecosystem. Others may not think it would be useful enough to be worth the pain in the neck it would be. And that's what the nifty conda packages Continuum (and others) have built could provide -- those shared libs that are built in a way compatible with a Python binary. After all, pure Python packages are no problem, and compiled Python packages without any dependencies are little problem. The hard part is those darn third-party libs. conda also provides a way to manage all sorts of other stuff that has nothing to do with Python, but I'm guessing that's not what Continuum would like to contribute to PyPI. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Distutils] Making pip and PyPI work with conda packages
On May 16, 2015, at 8:50 PM, Chris Barker chris.bar...@noaa.gov wrote: On Sat, May 16, 2015 at 4:16 PM, Donald Stufft don...@stufft.io mailto:don...@stufft.io wrote: On Sat, May 16, 2015 at 3:03 PM, Donald Stufft don...@stufft.io mailto:don...@stufft.io wrote: There are a few other benefits, but that’s not anything inherent in the two different approaches, it’s just things that conda has that pip is planning on getting. Huh? I'm confused -- didn't we just have a big thread about how pip+wheel probably ISN'T going to handle shared libs -- that those are exactly what conda packages do provide -- aside from R and Erlang, anyway :-) but it's not the packages in this case that we need -- it's the environment -- and I can't see how pip is going to provide a conda environment…. I never said pip was going to provide an environment, I said the main benefit conda has over pip, which pip will most likely not get in any reasonable time frame, is that it handles things which are not Python packages. well, I got a bit distracted by Erlang and R -- i.e. things that have nothing to do with Python packages. libxml, on the other hand, is a lib that one might want to use with a Python package -- so a bit more apropos here. But my confusion was about: things that conda has that pip is planning on getting -- what are those things? Any of the stuff that conda has that's really useful, like handling shared libs, pip is NOT getting -- yes? The ability to resolve dependencies with static metadata is the major one that comes to my mind that’s specific to pip. The ability to have better build systems besides distutils/setuptools is a more ecosystem-level one, but that’s something we’ll get too. As far as shared libs… beyond what’s already possible (sticking a shared lib inside of a Python project and having libraries load that .dll explicitly), it’s not currently on the road map and may never be. 
I hesitate to say never because it’s obviously a problem that needs to be solved, and if the Python ecosystem solves it (specific to shared libraries, not whole runtimes or other languages or what have you) then that would be a useful thing. I think we have lower-hanging fruit that we need to deal with before something like that could even be on the radar, though (if we ever put it on the radar). A shared library is not a Python package so I’m not sure what this message is even saying? ``pip install lxml-from-conda`` is just going to flat out break because pip won’t install the libxml2 shared library. exactly -- if you're going to install a shared lib, you need somewhere to put it -- and that's what a conda environment provides. Trying not to go around in circles, but Python _could_ provide a standard place in which to put shared libs -- and then pip _could_ provide a way to manage them. That would require dealing with that whole binary API problem, so we probably won't do it. I'm not sure what the point of contention is here: I think it would be useful to have a way to manage shared libs solely for Python packages to use -- and it would be useful for that to be part of the standard Python ecosystem. Others may not think it would be useful enough to be worth the pain in the neck it would be. And that's what the nifty conda packages Continuum (and others) have built could provide -- those shared libs that are built in a way compatible with a Python binary. After all, pure Python packages are no problem, and compiled Python packages without any dependencies are little problem. The hard part is those darn third-party libs. conda also provides a way to manage all sorts of other stuff that has nothing to do with Python, but I'm guessing that's not what Continuum would like to contribute to PyPI…. I guess I’m confused what the benefit of making pip able to install a conda package would be. 
If Python adds someplace for shared libs to go, then we could just add shared-lib support to Wheels; it’s just another file type, so that’s not a big deal. The hardest part is dealing with ABI compatibility. However, given the current state of things, what’s the benefit of being able to do ``pip install conda-lxml``? Either it’s going to flat out break, or you’re going to have to do ``conda install libxml2`` first, and if you’re doing ``conda install libxml2`` first then why not just do ``conda install lxml``? I view conda the same way I view apt-get, yum, Chocolatey, etc. It provides an environment, and you can install a Python package into that environment, but pip shouldn’t know how to install a .deb or a .rpm or a conda package, because those packages rely on specifics of that environment in ways that Python packages can’t. --- Donald Stufft
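[As a concrete illustration of the workaround Donald mentions above — sticking a shared lib inside a Python project and having the code load it explicitly — a package can resolve the library relative to its own location instead of the platform's search path. The helper name and the `_libs` layout are invented for this sketch:

```python
import ctypes
import os

def load_bundled_lib(basename, package_dir):
    """Load a shared library shipped inside a package directory.

    Looks for <package_dir>/_libs/<basename>.<ext> and loads it by
    absolute path via ctypes, bypassing %PATH% / LD_LIBRARY_PATH
    entirely. In a real package, package_dir would typically be
    os.path.dirname(__file__).
    """
    libs_dir = os.path.join(os.path.abspath(package_dir), "_libs")
    for ext in (".dll", ".so", ".dylib"):
        candidate = os.path.join(libs_dir, basename + ext)
        if os.path.exists(candidate):
            return ctypes.CDLL(candidate)
    raise OSError("no bundled library named %r in %s" % (basename, libs_dir))
```

This sidesteps the search-path problems discussed earlier, at the cost of every package bundling (and rebuilding) its own copy of the library — exactly the duplication a standard shared-libs location would avoid.]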
Re: [Distutils] PyPI and Uploading Documentation
Donald Stufft don...@stufft.io writes: Ok, so unless someone comes out against this in the near future here are my plans: 1. Implement the ability to delete documentation. +1. 2. Implement the ability to add a (simple) redirect where we would essentially just send /project/(.*) to $REDIRECT_BASE/$1. 3. Implement the ability to point the documentation URL to something that isn't pythonhosted.org Both of these turn PyPI into a vector for arbitrary content, including (for example) illegal, misleading, or malicious content. Automatic redirects actively expose the visitor to any malicious or mistaken links set by the project owner. If you want to allow the documentation to be at some arbitrary location of the project owner's choice, then an explicit static link, which the visitor must click on (similar to the project home page link), is best. -- "I find the whole business of religion profoundly interesting. But it does mystify me that otherwise intelligent people take it seriously." —Douglas Adams Ben Finney
Re: [Distutils] PyPI and Uploading Documentation
On May 16, 2015, at 9:31 PM, Ben Finney ben+pyt...@benfinney.id.au wrote: Donald Stufft don...@stufft.io writes: Ok, so unless someone comes out against this in the near future here are my plans: 1. Implement the ability to delete documentation. +1. 2. Implement the ability to add a (simple) redirect where we would essentially just send /project/(.*) to $REDIRECT_BASE/$1. 3. Implement the ability to point the documentation URL to something that isn't pythonhosted.org Both of these turn PyPI into a vector for arbitrary content, including (for example) illegal, misleading, or malicious content. Automatic redirects actively expose the visitor to any malicious or mistaken links set by the project owner. If you want to allow the documentation to be at some arbitrary location of the project owner's choice, then an explicit static link, which the visitor must click on (similar to the project home page link) is best. To be clear, the documentation isn’t hosted on PyPI; it’s hosted on pythonhosted.org, and we already allow people to upload arbitrary content to that domain, which can include JS-based redirects. --- Donald Stufft
Re: [Distutils] Making pip and PyPI work with conda packages
On May 16, 2015, at 7:09 PM, Chris Barker chris.bar...@noaa.gov wrote: On Sat, May 16, 2015 at 3:03 PM, Donald Stufft don...@stufft.io mailto:don...@stufft.io wrote: There are a few other benefits, but that’s not anything inherent in the two different approaches, it’s just things that conda has that pip is planning on getting. Huh? I'm confused -- didn't we just have a big thread about how pip+wheel probably ISN'T going to handle shared libs -- that those are exactly what conda packages do provide -- aside from R and Erlang, anyway :-) but it's not the packages in this case that we need -- it's the environment -- and I can't see how pip is going to provide a conda environment…. I never said pip was going to provide an environment, I said the main benefit conda has over pip, which pip will most likely not get in any reasonable time frame, is that it handles things which are not Python packages. A shared library is not a Python package so I’m not sure what this message is even saying? ``pip install lxml-from-conda`` is just going to flat out break because pip won’t install the libxml2 shared library. --- Donald Stufft
Re: [Distutils] Making pip and PyPI work with conda packages
On Sat, May 16, 2015 at 3:03 PM, Donald Stufft don...@stufft.io wrote: There are a few other benefits, but that’s not anything inherent in the two different approaches, it’s just things that conda has that pip is planning on getting. Huh? I'm confused -- didn't we just have a big thread about how pip+wheel probably ISN'T going to handle shared libs -- that those are exactly what conda packages do provide -- aside from R and Erlang, anyway :-) but it's not the packages in this case that we need -- it's the environment -- and I can't see how pip is going to provide a conda environment. -Chris