Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Sat, Jan 9, 2016 at 8:58 PM, Matthew Brett wrote: > On Sat, Jan 9, 2016 at 8:49 PM, Nathaniel Smith wrote: >> On Jan 9, 2016 10:09, "Matthew Brett" wrote: >>> >>> Hi Sandro, >>> >>> On Sat, Jan 9, 2016 at 4:44 AM, Sandro Tosi wrote: >>> >> I wrote a page on using pip with Debian / Ubuntu here : >>> >> https://matthew-brett.github.io/pydagogue/installing_on_debian.html >>> > >>> > Speaking with my numpy Debian maintainer hat on, I would really >>> > appreciate it if you don't suggest using pip to install packages in >>> > Debian, or at least not as the only solution. >>> >>> I'm very happy to accept alternative suggestions or PRs. >>> >>> I know what you mean, but I can't yet see how to write a page that >>> would be good for explaining the benefits / tradeoffs of using deb >>> packages vs mainly or only pip packages vs a mix of the two. Do you >>> have any thoughts? >> >> Why not replace all the "sudo pip" calls with "pip --user"? The trade-offs >> between Debian-installed packages and pip --user installed packages are >> subtle, and both are good options. Personally I'd recommend that anyone >> actively developing Python code skip straight to pip for most things, >> since you'll eventually end up there anyway, but this is definitely >> debatable and situation-dependent. On the other hand, "sudo pip" >> specifically is something I'd never recommend, and indeed it has the potential >> to totally break your system. > > Sure, but I don't think the page is suggesting doing ``sudo pip`` for > anything other than upgrading pip and virtualenv(wrapper) - and I > don't think that is likely to break the system. It could... a quick glance suggests that currently installing virtualenvwrapper like that will also pull in some random PyPI snapshot of stevedore, which will shadow the distro-packaged version. And stevedore is used by tons of different Debian packages, including large parts of OpenStack... But more to the point, the target audience for your page is hardly equipped to perform that kind of analysis, never mind in the general case of using 'sudo pip' for arbitrary Python packages, and your very first example is one that demonstrates bad habits... So personally I'd avoid mentioning the possibility of 'sudo pip', or better yet explicitly warn against it. -n -- Nathaniel J. Smith -- http://vorpus.org
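[For readers wanting to check their own machine for this kind of shadowing, a small illustrative snippet (not from the thread; stevedore is just the example at hand, and loader details vary by Python version):

    import pkgutil

    loader = pkgutil.find_loader('stevedore')
    if loader is None:
        print('stevedore not importable')
    else:
        # A path under /usr/local/lib/... means a pip-installed copy is
        # shadowing the /usr/lib/python2.7/dist-packages copy apt installed.
        print(loader.get_filename())
]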
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Sat, Jan 9, 2016 at 8:49 PM, Nathaniel Smith wrote: > On Jan 9, 2016 10:09, "Matthew Brett" wrote: >> >> Hi Sandro, >> >> On Sat, Jan 9, 2016 at 4:44 AM, Sandro Tosi wrote: >> >> I wrote a page on using pip with Debian / Ubuntu here : >> >> https://matthew-brett.github.io/pydagogue/installing_on_debian.html >> > >> > Speaking with my numpy Debian maintainer hat on, I would really >> > appreciate it if you don't suggest using pip to install packages in >> > Debian, or at least not as the only solution. >> >> I'm very happy to accept alternative suggestions or PRs. >> >> I know what you mean, but I can't yet see how to write a page that >> would be good for explaining the benefits / tradeoffs of using deb >> packages vs mainly or only pip packages vs a mix of the two. Do you >> have any thoughts? > > Why not replace all the "sudo pip" calls with "pip --user"? The trade-offs > between Debian-installed packages and pip --user installed packages are > subtle, and both are good options. Personally I'd recommend that anyone > actively developing Python code skip straight to pip for most things, > since you'll eventually end up there anyway, but this is definitely > debatable and situation-dependent. On the other hand, "sudo pip" > specifically is something I'd never recommend, and indeed it has the potential > to totally break your system. Sure, but I don't think the page is suggesting doing ``sudo pip`` for anything other than upgrading pip and virtualenv(wrapper) - and I don't think that is likely to break the system. Matthew
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Jan 9, 2016 10:09, "Matthew Brett" wrote: > > Hi Sandro, > > On Sat, Jan 9, 2016 at 4:44 AM, Sandro Tosi wrote: > >> I wrote a page on using pip with Debian / Ubuntu here : > >> https://matthew-brett.github.io/pydagogue/installing_on_debian.html > > > > Speaking with my numpy Debian maintainer hat on, I would really > > appreciate it if you don't suggest using pip to install packages in > > Debian, or at least not as the only solution. > > I'm very happy to accept alternative suggestions or PRs. > > I know what you mean, but I can't yet see how to write a page that > would be good for explaining the benefits / tradeoffs of using deb > packages vs mainly or only pip packages vs a mix of the two. Do you > have any thoughts? Why not replace all the "sudo pip" calls with "pip --user"? The trade-offs between Debian-installed packages and pip --user installed packages are subtle, and both are good options. Personally I'd recommend that anyone actively developing Python code skip straight to pip for most things, since you'll eventually end up there anyway, but this is definitely debatable and situation-dependent. On the other hand, "sudo pip" specifically is something I'd never recommend, and indeed it has the potential to totally break your system. -n
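[To make the difference concrete, a small illustrative check (not from the thread) showing where --user installs land and why they take precedence without touching anything the system package manager owns:

    # Where "pip install --user" puts packages, versus the system
    # site-packages directories that "sudo pip" would write into.
    import site

    print(site.getusersitepackages())  # e.g. ~/.local/lib/python2.7/site-packages
    print(site.getsitepackages())      # system directories owned by dpkg/apt
    # The user directory sorts earlier on sys.path, so --user installs
    # shadow system packages for this user only, and apt's files are
    # never overwritten.

Note that site.getsitepackages() is unavailable inside some virtualenvs, so treat this as a sketch for a stock system Python.]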
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Sat, Jan 9, 2016 at 6:57 PM, Sandro Tosi wrote: > On Sat, Jan 9, 2016 at 6:08 PM, Matthew Brett wrote: >> Hi Sandro, >> >> On Sat, Jan 9, 2016 at 4:44 AM, Sandro Tosi wrote: I wrote a page on using pip with Debian / Ubuntu here : https://matthew-brett.github.io/pydagogue/installing_on_debian.html >>> >>> Speaking with my numpy Debian maintainer hat on, I would really >>> appreciate it if you don't suggest using pip to install packages in >>> Debian, or at least not as the only solution. >> >> I'm very happy to accept alternative suggestions or PRs. >> >> I know what you mean, but I can't yet see how to write a page that >> would be good for explaining the benefits / tradeoffs of using deb >> packages vs mainly or only pip packages vs a mix of the two. Do you >> have any thoughts? > > You can start by making it extremely clear that this is not the Debian-supported > way to install Python modules on a Debian system; that if a user > uses pip to do it, it's very likely other applications or modules > will fail; and that if they have any problem with anything Python-related, > they are on their own, as they "broke" their system on purpose. Thanks > for considering. I updated the page with more on the reasons to prefer Debian packages over installing with pip: https://matthew-brett.github.io/pydagogue/installing_on_debian.html Is that enough to get the message across? Cheers, Matthew
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Sat, Jan 9, 2016 at 6:08 PM, Matthew Brett wrote: > Hi Sandro, > > On Sat, Jan 9, 2016 at 4:44 AM, Sandro Tosi wrote: >>> I wrote a page on using pip with Debian / Ubuntu here : >>> https://matthew-brett.github.io/pydagogue/installing_on_debian.html >> >> Speaking with my numpy Debian maintainer hat on, I would really >> appreciate it if you don't suggest using pip to install packages in >> Debian, or at least not as the only solution. > > I'm very happy to accept alternative suggestions or PRs. > > I know what you mean, but I can't yet see how to write a page that > would be good for explaining the benefits / tradeoffs of using deb > packages vs mainly or only pip packages vs a mix of the two. Do you > have any thoughts? You can start by making it extremely clear that this is not the Debian-supported way to install Python modules on a Debian system; that if a user uses pip to do it, it's very likely other applications or modules will fail; and that if they have any problem with anything Python-related, they are on their own, as they "broke" their system on purpose. Thanks for considering. -- Sandro "morph" Tosi My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi G+: https://plus.google.com/u/0/+SandroTosi
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
> Maybe a better approach would be to look at what libraries are used by an up-to-date default Anaconda install (on the assumption that this is the best tested configuration) That's not a bad idea. I also have a couple of other ideas about how to filter this, based on Debian popularity-contest data and the package graph. I will report back when I have more info. -Robert On Sat, Jan 9, 2016 at 3:04 PM, Nathaniel Smith wrote: > On Sat, Jan 9, 2016 at 3:52 AM, Robert McGibbon > wrote: > > Hi all, > > > > I went ahead and tried to collect a list of all of the libraries that > could > > be considered to constitute the "base" system for linux-64. The strategy > I > > used was to leverage the work done by the folks at Continuum by > > searching through their pre-compiled binaries from > > https://repo.continuum.io/pkgs/free/linux-64/ to find shared libraries > that > > were depended on (according to ldd) but were not accounted for by the > > declared dependencies that each package made known to the conda package > > manager. > > > > The full list of these system libraries, sorted from > > most commonly depended on to rarest, is below. There are 158 of them. > [...] > > So it's not perfect. But it might be a useful starting place. > > Unfortunately, yeah, it looks like there are a lot of false positives in > here :-(. For example your list contains liblzma and libsqlite, but > both of these are shipped as dependencies of Python itself. So > probably someone just forgot to declare the dependency explicitly, but > got away with it because the libraries were pulled in anyway. > > Maybe a better approach would be to look at what libraries are used by > an up-to-date default Anaconda install (on the assumption that this > is the best tested configuration), and then erase from the list all > libraries that are shipped by this configuration (ignoring declared > dependencies since those seem to be unreliable)? It's better to be > conservative here, since the end goal is to come up with a list of > external libraries that we're confident have actually been tested for > compatibility by lots and lots of different users. > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org
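[A rough sketch of that filtering step, for illustration (the Anaconda prefix is a made-up path, and the candidate names are taken from the list later in this digest):

    import os

    def shipped_lib_stems(prefix='/opt/anaconda'):
        # Collect the stems ('libsqlite3') of every shared library the
        # Anaconda install itself ships, so they can be erased from the
        # candidate list of "system" libraries.
        stems = set()
        for root, _, files in os.walk(prefix):
            for name in files:
                if '.so' in name:
                    stems.add(name.split('.so')[0])
        return stems

    candidates = ['libsqlite3.so.0', 'liblzma.so.5', 'libX11.so.6']
    shipped = shipped_lib_stems()
    print([c for c in candidates if c.split('.so')[0] not in shipped])
]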
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
On Sat, Jan 9, 2016 at 3:52 AM, Robert McGibbon wrote: > Hi all, > > I went ahead and tried to collect a list of all of the libraries that could > be considered to constitute the "base" system for linux-64. The strategy I > used was to leverage the work done by the folks at Continuum by > searching through their pre-compiled binaries from > https://repo.continuum.io/pkgs/free/linux-64/ to find shared libraries that > were depended on (according to ldd) but were not accounted for by the > declared dependencies that each package made known to the conda package > manager. > > The full list of these system libraries, sorted from > most commonly depended on to rarest, is below. There are 158 of them. [...] > So it's not perfect. But it might be a useful starting place. Unfortunately, yeah, it looks like there are a lot of false positives in here :-(. For example your list contains liblzma and libsqlite, but both of these are shipped as dependencies of Python itself. So probably someone just forgot to declare the dependency explicitly, but got away with it because the libraries were pulled in anyway. Maybe a better approach would be to look at what libraries are used by an up-to-date default Anaconda install (on the assumption that this is the best tested configuration), and then erase from the list all libraries that are shipped by this configuration (ignoring declared dependencies since those seem to be unreliable)? It's better to be conservative here, since the end goal is to come up with a list of external libraries that we're confident have actually been tested for compatibility by lots and lots of different users. -n -- Nathaniel J. Smith -- http://vorpus.org
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Jan 9, 2016 04:12, "Julian Taylor" wrote: > > On 09.01.2016 04:38, Nathaniel Smith wrote: > > On Fri, Jan 8, 2016 at 7:17 PM, Nathan Goldbaum wrote: > >> Doesn't building on CentOS 5 also mean using a quite old version of gcc? > > > > Yes. IIRC CentOS 5 ships with gcc 4.4, and you can bump that up to gcc > > 4.8 by using the Redhat Developer Toolset release (which is gcc + > > special backport libraries to let it generate RHEL5/CentOS5-compatible > > binaries). (I might have one or both of those version numbers slightly > > wrong.) > > > >> I've never tested this, but I've seen claims on the anaconda mailing list of > >> ~25% slowdowns compared to building from source or using system packages, > >> which was attributed to building using an older gcc that doesn't optimize as > >> well as newer versions. > > > > I'd be very surprised if that were a 25% slowdown in general, as > > opposed to a 25% slowdown on some particular inner loop that happened > > to neatly match some new feature in a new gcc (e.g. something where > > the new autovectorizer kicked in). But yeah, in general this is just > > an inevitable trade-off when it comes to distributing binaries: you're > > always going to pay some penalty for achieving broad compatibility as > > compared to artisanally hand-tuned binaries specialized for your > > machine's exact OS version, processor, etc. Not much to be done, > > really. At some point the baseline for compatibility will switch to > > "compile everything on CentOS 6", and that will be better but it will > > still be worse than binaries that target CentOS 7, and so on and so > > forth. > > > > I have over the years put in one gcc-specific optimization after > another, so yes, using an ancient version will make many parts significantly > slower. That is not really a problem, though: updating a compiler is easy > even without Red Hat's devtoolset. > > At least as far as numpy is concerned, Linux binaries should not be a > very big problem. The only dependency where the version matters is glibc, > which has updated the interfaces we use (in a backward-compatible way) > many times. > But if we use an old enough baseline glibc (e.g. CentOS 5 or Ubuntu > 10.04) we are fine, at reasonable performance cost: basically only a > slower memcpy. Are you saying that it's easy to use, say, gcc 5.3's C compiler to produce binaries that will run on an out-of-the-box CentOS 5 install? I assumed that there'd be issues with things like new symbol versions in libgcc, not just glibc, but if not then that would be great... > Scipy, on the other hand, is a larger problem, as it contains C++ code. > Linux systems are now transitioning to C++11, which is binary > incompatible in parts with the old standard. A lot of testing is > necessary to check whether we are affected. > How does Anaconda deal with C++11? IIUC the situation with the C++ stdlib changes in gcc 5 is that old binaries will continue to work on new systems. The only thing that breaks is that if two libraries want to pass objects of the affected types back and forth (e.g. std::string), then either they both need to be compiled with the old ABI or they both need to be compiled with the new ABI. (And when using a new compiler it's still possible to choose the old ABI with a #define; old compilers of course only support the old ABI.)
See: http://developerblog.redhat.com/2015/02/05/gcc5-and-the-c11-abi/ So the answer is that most Python packages don't care, because even the ones written in C++ don't generally talk C++ across package boundaries, and for the ones that do care, the people making the binary packages will have to coordinate to use the same ABI. And for local builds on modern systems that link against binary packages built using the old ABI, people might have to use -D_GLIBCXX_USE_CXX11_ABI=0. -n
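[For concreteness, that last flag would show up in a Python package build roughly like this; a minimal sketch only, with placeholder module and source names:

    # setup.py fragment: force the old libstdc++ ABI so a locally built
    # C++ extension can exchange std::string etc. with binary packages
    # that were compiled against the pre-gcc-5 ABI.
    from distutils.core import setup, Extension

    ext = Extension(
        'mymod',                       # hypothetical extension module
        sources=['mymod.cpp'],
        extra_compile_args=['-D_GLIBCXX_USE_CXX11_ABI=0'],
    )
    setup(name='mymod', version='0.1', ext_modules=[ext])
]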
Re: [Numpy-discussion] Windows build/distribute plan & MingwPy funding
On Mon, Jan 4, 2016 at 7:38 PM, Matthew Brett wrote: > Hi, > > On Mon, Jan 4, 2016 at 5:35 PM, Erik Bray > wrote: > > On Sat, Jan 2, 2016 at 3:20 AM, Ralf Gommers > wrote: > >> Hi all, > >> > >> You probably know that building Numpy, Scipy and the rest of the Scipy > Stack > >> on Windows is problematic. And that there are plans to adopt the static > >> MinGW-w64 based toolchain that Carl Kleffner has done a lot of work on > for > >> the last two years to fix that situation. > >> > >> The good news is: this has become a lot more concrete just now, with > this > >> proposal for funding: > http://mingwpy.github.io/proposal_december2015.html > >> > >> Funding for phases 1 and 2 is already confirmed; the phase 3 part has > been > >> submitted to the PSF. Phase 1 (of $1000) is funded by donations made to > >> Numpy and Scipy (through NumFOCUS), and phase 2 (of $4000) by NumFOCUS > >> directly. So a big thank you to everyone who made a donation to Numpy, > Scipy > >> and NumFOCUS! > More good news: the PSF has approved phase 3! > > >> I hope that that proposal gives a clear idea of the work that's going > to be > >> done over the next months. Note that http://mingwpy.github.io > contains a > >> lot more background info, descriptions of technical issues, etc. > >> > >> Feedback & ideas very welcome of course! > >> > >> Cheers, > >> Ralf > > > > Hi Ralf, > > > > I've seen you drop hints about this recently, and am interested to > > follow this work. I've been hired as part of the OpenDreamKit project > > to work, in large part, on developing a sensible toolchain for > > building and distributing Sage on Windows. I know you've been > > following the thread on that too. Although the primary goal there is > > "whatever works", I'm personally inclined to focus on the mingwpy / > > mingw-w64 approach, due in large part to my past success with the > > MinGW 32-bit toolchain. > That's good to hear, Erik. > (I personally have a desire to improve > > support for building with MSVC as well, but that's a less important > > goal as far as the funding is concerned.) > > > > So anyways, please keep me in the loop about this, as I will also be > > putting effort into this over the next year. Has there been > > any discussion about setting up a mailing list specifically for this > > project? > We'll definitely keep you in the loop. We try to do as much of the discussion as possible on the mingwpy mailing list and on https://github.com/mingwpy. If there's significant progress then I guess that'll be announced on this list as well. Cheers, Ralf > Yes, it exists already, but not well advertised : > https://groups.google.com/forum/#!forum/mingwpy > > It would be great to share work. > > Cheers, > > Matthew
Re: [Numpy-discussion] Should I use pip install numpy in linux?
Hi Sandro, On Sat, Jan 9, 2016 at 4:44 AM, Sandro Tosi wrote: >> I wrote a page on using pip with Debian / Ubuntu here : >> https://matthew-brett.github.io/pydagogue/installing_on_debian.html > > Speaking with my numpy Debian maintainer hat on, I would really > appreciate it if you don't suggest using pip to install packages in > Debian, or at least not as the only solution. I'm very happy to accept alternative suggestions or PRs. I know what you mean, but I can't yet see how to write a page that would be good for explaining the benefits / tradeoffs of using deb packages vs mainly or only pip packages vs a mix of the two. Do you have any thoughts? Cheers, Matthew
Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
On 1/9/2016 4:58 AM, Stefan Otte wrote: > Hey, > > one of my new year's resolutions is to get my pull requests > accepted (or closed). So here we go... > > Here is the updated pull request: > https://github.com/numpy/numpy/pull/5057 Here is the docstring: > https://github.com/sotte/numpy/commit/3d4c5d19a8f15b35df50d945b9c8853b683f7ab6#diff-2270128d50ff15badd1aba4021c50a8cR358 > > The new `block` function is very similar to Matlab's `[A, B; C, > D]`. > > Pros: - it's very useful (in my experience) - less friction for > people coming from Matlab - it's conceptually simple - the > implementation is simple - it's documented - it's tested > > Cons: - the implementation is not super efficient. Temporary > copies are created. However, bmat also does that. > > Feedback is very welcome! > > > Best, Stefan > > Without commenting on the implementation, I would find this function *very* useful and convenient in my own work. -gyro
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Sat, Jan 9, 2016 at 12:12 PM, Julian Taylor < jtaylor.deb...@googlemail.com> wrote: > On 09.01.2016 04:38, Nathaniel Smith wrote: > > On Fri, Jan 8, 2016 at 7:17 PM, Nathan Goldbaum > wrote: > >> Doesn't building on CentOS 5 also mean using a quite old version of gcc? > > > > Yes. IIRC CentOS 5 ships with gcc 4.4, and you can bump that up to gcc > > 4.8 by using the Redhat Developer Toolset release (which is gcc + > > special backport libraries to let it generate RHEL5/CentOS5-compatible > > binaries). (I might have one or both of those version numbers slightly > > wrong.) > > > >> I've never tested this, but I've seen claims on the anaconda mailing > list of > >> ~25% slowdowns compared to building from source or using system > packages, > >> which was attributed to building using an older gcc that doesn't > optimize as > >> well as newer versions. > > > > I'd be very surprised if that were a 25% slowdown in general, as > > opposed to a 25% slowdown on some particular inner loop that happened > > to neatly match some new feature in a new gcc (e.g. something where > > the new autovectorizer kicked in). But yeah, in general this is just > > an inevitable trade-off when it comes to distributing binaries: you're > > always going to pay some penalty for achieving broad compatibility as > > compared to artisanally hand-tuned binaries specialized for your > > machine's exact OS version, processor, etc. Not much to be done, > > really. At some point the baseline for compatibility will switch to > > "compile everything on CentOS 6", and that will be better but it will > > still be worse than binaries that target CentOS 7, and so on and so > > forth. > > > > I have over the years put in one gcc-specific optimization after > another, so yes, using an ancient version will make many parts significantly > slower. That is not really a problem, though: updating a compiler is easy > even without Red Hat's devtoolset. > > At least as far as numpy is concerned, Linux binaries should not be a > very big problem. The only dependency where the version matters is glibc, > which has updated the interfaces we use (in a backward-compatible way) > many times. > But if we use an old enough baseline glibc (e.g. CentOS 5 or Ubuntu > 10.04) we are fine, at reasonable performance cost: basically only a > slower memcpy. > > Scipy, on the other hand, is a larger problem, as it contains C++ code. > Linux systems are now transitioning to C++11, which is binary > incompatible in parts with the old standard. A lot of testing is > necessary to check whether we are affected. > How does Anaconda deal with C++11? > For Canopy packages, we use the RH devtoolset w/ gcc 4.8.X, and statically link the C++ stdlib. It has worked so far for the few packages requiring C++11 and gcc > 4.4 (llvm/llvmlite/dynd), but that's not a solution I am a fan of myself, as the implications are not always very clear. David
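[Expressed as build configuration, the static-stdlib approach David describes looks roughly like this; a sketch only, with invented package and source names, and the caveats he mentions still apply:

    # setup.py fragment: statically link libstdc++/libgcc so the built
    # extension does not depend on the target system's C++ runtime.
    from distutils.core import setup, Extension

    ext = Extension(
        'fastmod',                     # hypothetical extension module
        sources=['fastmod.cpp'],
        extra_link_args=['-static-libstdc++', '-static-libgcc'],
    )
    setup(name='fastmod', version='0.1', ext_modules=[ext])
]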
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
On Sat, Jan 9, 2016 at 12:20 PM, Julian Taylor < jtaylor.deb...@googlemail.com> wrote: > On 09.01.2016 12:52, Robert McGibbon wrote: > > Hi all, > > > > I went ahead and tried to collect a list of all of the libraries that > > could be considered to constitute the "base" system for linux-64. The > > strategy I used was to leverage the work done by the folks at > > Continuum by searching through their pre-compiled binaries > > from https://repo.continuum.io/pkgs/free/linux-64/ to find shared > > libraries that were depended on (according to ldd) but were not > > accounted for by the declared dependencies that each package made known > > to the conda package manager. > > > > do those packages use ld --as-needed for linking? > there are a lot of libraries in that list that I highly doubt are directly > used by the packages. > It is also a common problem when building packages without using a "clean" build environment, as it is too easy to pick up dependencies accidentally, especially for autotools-based packages (unless one uses pbuilder or similar tools). David
Re: [Numpy-discussion] Should I use pip install numpy in linux?
> I wrote a page on using pip with Debian / Ubuntu here : > https://matthew-brett.github.io/pydagogue/installing_on_debian.html Speaking with my numpy Debian maintainer hat on, I would really appreciate it if you don't suggest using pip to install packages in Debian, or at least not as the only solution. -- Sandro "morph" Tosi My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi G+: https://plus.google.com/u/0/+SandroTosi
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
> do those packages use ld --as-needed for linking? Is it possible to check this? I mean, there are over 7000 packages that I checked, and I don't know how they were all built. It's entirely possible for many of the libraries to be unused. A reasonably common pattern might be that packages use ctypes or dlopen to dynamically load shared libraries that are actually just optional (and catch the error and recover gracefully if the library can't be loaded). -Robert On Sat, Jan 9, 2016 at 4:20 AM, Julian Taylor wrote: > On 09.01.2016 12:52, Robert McGibbon wrote: > > Hi all, > > > > I went ahead and tried to collect a list of all of the libraries that > > could be considered to constitute the "base" system for linux-64. The > > strategy I used was to leverage the work done by the folks at > > Continuum by searching through their pre-compiled binaries > > from https://repo.continuum.io/pkgs/free/linux-64/ to find shared > > libraries that were depended on (according to ldd) but were not > > accounted for by the declared dependencies that each package made known > > to the conda package manager. > > > > do those packages use ld --as-needed for linking? > there are a lot of libraries in that list that I highly doubt are directly > used by the packages.
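[The optional-dependency pattern Robert describes looks like this in practice; a generic sketch, with libsndfile picked arbitrarily from the list rather than as a claim about any specific package:

    import ctypes

    try:
        _lib = ctypes.CDLL('libsndfile.so.1')
        HAVE_SNDFILE = True
    except OSError:
        # Library absent: disable the optional feature instead of failing,
        # so ldd never sees a hard dependency on it.
        _lib = None
        HAVE_SNDFILE = False
]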
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
On 09.01.2016 12:52, Robert McGibbon wrote: > Hi all, > > I went ahead and tried to collect a list of all of the libraries that > could be considered to constitute the "base" system for linux-64. The > strategy I used was to leverage the work done by the folks at > Continuum by searching through their pre-compiled binaries > from https://repo.continuum.io/pkgs/free/linux-64/ to find shared > libraries that were depended on (according to ldd) but were not > accounted for by the declared dependencies that each package made known > to the conda package manager. > Do those packages use ld --as-needed for linking? There are a lot of libraries in that list that I highly doubt are directly used by the packages.
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On 09.01.2016 04:38, Nathaniel Smith wrote: > On Fri, Jan 8, 2016 at 7:17 PM, Nathan Goldbaum wrote: >> Doesn't building on CentOS 5 also mean using a quite old version of gcc? > > Yes. IIRC CentOS 5 ships with gcc 4.4, and you can bump that up to gcc > 4.8 by using the Redhat Developer Toolset release (which is gcc + > special backport libraries to let it generate RHEL5/CentOS5-compatible > binaries). (I might have one or both of those version numbers slightly > wrong.) > >> I've never tested this, but I've seen claims on the anaconda mailing list of >> ~25% slowdowns compared to building from source or using system packages, >> which was attributed to building using an older gcc that doesn't optimize as >> well as newer versions. > > I'd be very surprised if that were a 25% slowdown in general, as > opposed to a 25% slowdown on some particular inner loop that happened > to neatly match some new feature in a new gcc (e.g. something where > the new autovectorizer kicked in). But yeah, in general this is just > an inevitable trade-off when it comes to distributing binaries: you're > always going to pay some penalty for achieving broad compatibility as > compared to artisanally hand-tuned binaries specialized for your > machine's exact OS version, processor, etc. Not much to be done, > really. At some point the baseline for compatibility will switch to > "compile everything on CentOS 6", and that will be better but it will > still be worse than binaries that target CentOS 7, and so on and so > forth. > I have over the years put in one gcc-specific optimization after another, so yes, using an ancient version will make many parts significantly slower. That is not really a problem, though: updating a compiler is easy even without Red Hat's devtoolset. At least as far as numpy is concerned, Linux binaries should not be a very big problem. The only dependency where the version matters is glibc, which has updated the interfaces we use (in a backward-compatible way) many times. But if we use an old enough baseline glibc (e.g. CentOS 5 or Ubuntu 10.04) we are fine, at reasonable performance cost: basically only a slower memcpy. Scipy, on the other hand, is a larger problem, as it contains C++ code. Linux systems are now transitioning to C++11, which is binary incompatible in parts with the old standard. A lot of testing is necessary to check whether we are affected. How does Anaconda deal with C++11?
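[One way to verify the glibc floor of a built extension; a diagnostic sketch, not from the thread, requiring binutils, and with the .so path a placeholder:

    import re
    import subprocess

    # List the GLIBC_* version references the binary actually needs;
    # the highest one printed is the minimum glibc it will run against.
    out = subprocess.check_output(['objdump', '-T', 'multiarray.so'])
    print(sorted(set(re.findall(br'GLIBC_[0-9.]+', out))))
]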
Re: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
Hey, one of my new year's resolutions is to get my pull requests accepted (or closed). So here we go... Here is the updated pull request: https://github.com/numpy/numpy/pull/5057 Here is the docstring: https://github.com/sotte/numpy/commit/3d4c5d19a8f15b35df50d945b9c8853b683f7ab6#diff-2270128d50ff15badd1aba4021c50a8cR358 The new `block` function is very similar to Matlab's `[A, B; C, D]`. Pros: - it's very useful (in my experience) - less friction for people coming from Matlab - it's conceptually simple - the implementation is simple - it's documented - it's tested Cons: - the implementation is not super efficient. Temporary copies are created. However, bmat also does that. Feedback is very welcome! Best, Stefan On Sun, May 10, 2015 at 12:33 PM, Stefan Otte wrote: > Hey, > > Just a quick update. I updated the pull request and renamed `stack` into > `block`. Have a look: https://github.com/numpy/numpy/pull/5057 > > I'm sticking with the simple initial implementation because it's simple and > does what you think it does. > > > Cheers, > Stefan > > > > On Fri, Oct 31, 2014 at 2:13 PM Stefan Otte wrote: > >> To make the last point more concrete, the implementation could look >> something like this (note that I didn't test it and that it still >> takes some work): >> >> >> # (assumes the namespace of numpy's matrixlib, where N is the numpy core >> # and concatenate, matrix, _from_string, and sys are already imported) >> def bmat(obj, ldict=None, gdict=None): >> return matrix(stack(obj, ldict, gdict)) >> >> >> def stack(obj, ldict=None, gdict=None): >> # the old bmat code minus the matrix calls >> if isinstance(obj, str): >> if gdict is None: >> # get previous frame >> frame = sys._getframe().f_back >> glob_dict = frame.f_globals >> loc_dict = frame.f_locals >> else: >> glob_dict = gdict >> loc_dict = ldict >> return _from_string(obj, glob_dict, loc_dict) >> >> if isinstance(obj, (tuple, list)): >> # [[A,B],[C,D]] >> arr_rows = [] >> for row in obj: >> if isinstance(row, N.ndarray): # not 2-d >> return concatenate(obj, axis=-1) >> else: >> arr_rows.append(concatenate(row, axis=-1)) >> return concatenate(arr_rows, axis=0) >> >> if isinstance(obj, N.ndarray): >> return obj >> >> >> I basically turned the old `bmat` into `stack` and removed the matrix >> calls. >> >> >> Best, >> Stefan >> >> >> >> On Wed, Oct 29, 2014 at 3:59 PM, Stefan Otte >> wrote: >> > Hey, >> > >> > there are several ways to proceed. >> > >> > - My proposed solution covers the 80% case quite well (at least I use >> > it all the time). I'd convert the doctests into unittests and we're >> > done. >> > >> > - We could slightly change the interface to leave out the surrounding >> > square brackets, i.e. turning `stack([[a, b], [c, d]])` into >> > `stack([a, b], [c, d])` >> > >> > - We could extend it even further, allowing a "filler value" for non- >> > set values and a "shape" argument. This could be done later as well. >> > >> > - `bmat` is not really matrix-specific. We could refactor `bmat` a bit >> > to use the same logic in `stack`. Except for the `matrix` calls, `bmat` and >> > `_from_string` are pretty agnostic to the input. >> > >> > I'm in favor of the first or last approach. The first: because it >> > already works and is quite simple. The last: because the logic and >> > tests of both `bmat` and `stack` would be the same, and the feature to >> > specify a string representation of the block matrix is nice.
>> > >> > >> > Best, >> > Stefan >> > >> > >> > >> > On Tue, Oct 28, 2014 at 7:46 PM, Nathaniel Smith wrote: >> >> On 28 Oct 2014 18:34, "Stefan Otte" wrote: >> >>> >> >>> Hey, >> >>> >> >>> In the last few weeks I tested `np.asarray(np.bmat())` as a `stack` >> >>> function and it works quite well. So the question persists: if `bmat` >> >>> already offers something like `stack`, should we even bother >> >>> implementing `stack`? More code leads to more >> >>> bugs and maintenance work. (However, the current implementation is >> >>> only 5 lines, and using `bmat` would reduce that even more.) >> >> >> >> In the long run we're trying to reduce usage of np.matrix and ideally >> >> deprecate it entirely. So yes, providing ndarray equivalents of matrix >> >> functionality (like bmat) is valuable. >> >> >> >> -n
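[For readers who haven't used Matlab's `[A, B; C, D]` notation: the proposed function assembles a grid of blocks into one array. The same result can already be had today by going through `bmat`; the PR's `block` spelling is shown commented out since it hasn't been merged:

    import numpy as np

    A = np.ones((2, 2)); B = np.zeros((2, 2))
    C = np.zeros((2, 2)); D = 2 * np.ones((2, 2))

    M = np.asarray(np.bmat([[A, B], [C, D]]))  # works with released numpy
    # M = np.block([[A, B], [C, D]])           # proposed spelling from the PR
    print(M.shape)  # (4, 4)
]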
[Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
Hi all, I went ahead and tried to collect a list of all of the libraries that could be considered to constitute the "base" system for linux-64. The strategy I used was to leverage the work done by the folks at Continuum by searching through their pre-compiled binaries from https://repo.continuum.io/pkgs/free/linux-64/ to find shared libraries that were depended on (according to ldd) but were not accounted for by the declared dependencies that each package made known to the conda package manager. The full list of these system libraries, sorted from most commonly depended on to rarest, is below. There are 158 of them. ['linux-vdso.so.1', 'libc.so.6', 'libpthread.so.0', 'libm.so.6', 'libdl.so.2', 'libutil.so.1', 'libgcc_s.so.1', 'libstdc++.so.6', 'libexpat.so.1', 'librt.so.1', 'libpng12.so.0', 'libcrypt.so.1', 'libffi.so.6', 'libresolv.so.2', 'libkeyutils.so.1', 'libcom_err.so.2', 'libp11-kit.so.0', 'libkrb5.so.26', 'libheimntlm.so.0', 'libtasn1.so.6', 'libheimbase.so.1', 'libgssapi.so.3', 'libroken.so.18', 'libhcrypto.so.4', 'libhogweed.so.4', 'libnettle.so.6', 'libhx509.so.5', 'libwind.so.0', 'libgnutls-deb0.so.28', 'libasn1.so.8', 'libgmp.so.10', 'libsasl2.so.2', 'libidn.so.11', 'librtmp.so.1', 'liblber-2.4.so.2', 'libldap_r-2.4.so.2', 'libXdmcp.so.6', 'libX11.so.6', 'libXau.so.6', 'libxcb.so.1', 'libgssapi_krb5.so.2', 'libkrb5.so.3', 'libk5crypto.so.3', 'libkrb5support.so.0', 'libicudata.so.55', 'libicuuc.so.55', 'libhdf5_serial.so.10', 'libcurl-gnutls.so.4', 'libhdf5_serial_hl.so.10', 'libtinfo.so.5', 'libgcrypt.so.20', 'libgpg-error.so.0', 'libnsl.so.1', 'libXext.so.6', 'libncursesw.so.5', 'libpanelw.so.5', 'libXrender.so.1', 'libjbig.so.0', 'libpcre.so.3', 'libglib-2.0.so.0', 'libnvidia-tls.so.352.41', 'libnvidia-glcore.so.352.41', 'libGL.so.1', 'libuuid.so.1', 'libSM.so.6', 'libICE.so.6', 'libgobject-2.0.so.0', 'libgfortran.so.1', 'liblzma.so.5', 'libXt.so.6', 'libgmodule-2.0.so.0', 'libXi.so.6', 'libgstpbutils-1.0.so.0', 'liborc-0.4.so.0', 'libgstreamer-1.0.so.0', 'libgsttag-1.0.so.0', 'libgstvideo-1.0.so.0', 'libxslt.so.1', 'libaudio.so.2', 'libjpeg.so.8', 'libgstaudio-1.0.so.0', 'libgstbase-1.0.so.0', 'libgstapp-1.0.so.0', 'libz.so.1', 'libgthread-2.0.so.0', 'libfreetype.so.6', 'libfontconfig.so.1', 'libdbus-1.so.3', 'libsystemd.so.0', 'libltdl.so.7', 'libGLU.so.1', 'libsqlite3.so.0', 'libpgm-5.1.so.0', 'libgomp.so.1', 'libxcb-render.so.0', 'libxcb-shm.so.0', 'libncurses.so.5', 'libxml2.so.2', 'libXss.so.1', 'libXft.so.2', 'libtk.so', 'libtcl.so', 'libasound.so.2', 'libharfbuzz.so.0', 'libpixman-1.so.0', 'libgio-2.0.so.0', 'libXinerama.so.1', 'libselinux.so.1', 'libXcomposite.so.1', 'libthai.so.0', 'libXdamage.so.1', 'libgdk-x11-2.0.so.0', 'libpangoft2-1.0.so.0', 'libcairo.so.2', 'libpangocairo-1.0.so.0', 'libdatrie.so.1', 'libatk-1.0.so.0', 'libXcursor.so.1', 'libXfixes.so.3', 'libgraphite2.so.3', 'libgdk_pixbuf-2.0.so.0', 'libgtk-x11-2.0.so.0', 'libquadmath.so.0', 'libpango-1.0.so.0', 'libXrandr.so.2', 'libgfortran.so.3', 'libjson-c.so.2', 'libshiboken-python2.7.so.1.1', 'libogg.so.0', 'libvorbis.so.0', 'libatlas.so.3', 'libcurl.so.4', 'libhdf5.so.9', 'libodbcinst.so.1', 'libpcap.so.0.9', 'libnetcdf.so.7', 'libblas.so.3', 'libpulse.so.0', 'libcaca.so.0', 'libgstreamer-0.10.so.0', 'libXxf86vm.so.1', 'libhdf5_hl.so.9', 'libpulse-simple.so.0', 'libasyncns.so.0', 'libwrap.so.0', 'libvorbisenc.so.2', 'libmagic.so.1', 'libssl.so.1.0.0', 'libFLAC.so.8', 'libSDL-1.2.so.0', 'libsndfile.so.1', 'libslang.so.2', 'libglapi.so.0', 'libaio.so.1', 'libgstinterfaces-0.10.so.0',
'libpulsecommon-6.0.so', 'libjpeg.so.62', 'libcrypto.so.1.0.0'] This list actually contains a fair number of false positives, so it would need to be pruned manually. If you stare at it a little while, you might see some libraries in there that you recognize and that shouldn't be part of the base system, like libatlas.so.3. This gist https://gist.github.com/rmcgibbo/a13e7623c38ec54fcc93 contains some more detailed data -- for each of the libraries in the list above, it gives the names of the packages that depend on that library. For example, for libatlas.so.3, there is only a single package that depends on it, ["scikit-learn-0.11-np16py27_ce0"]. So, probably a bug. "libgfortran.so.1" is also in the list. It's depended on by ["cvxopt-1.1.6-py27_0", "cvxopt-1.1.7-py27_0", "cvxopt-1.1.7-py34_0", "cvxopt-1.1.7-py35_0", "numpy-1.5.1-py27_1", "numpy-1.5.1-py27_3", "numpy-1.5.1-py27_4", "numpy-1.5.1-py27_ce0", "numpy-1.6.2-py27_1", "numpy-1.6.2-py27_3", "numpy-1.6.2-py27_4", "numpy-1.6.2-py27_ce0", "numpy-1.7.0-py27_0", "numpy-1.7.0b2-py27_ce0", "numpy-1.7.0rc1-py27_0", "numpy-1.7.1-py27_0", "numpy-1.7.1-py27_2", "numpy-1.8.0-py27_0", "numpy-1.8.1-py27_0", "numpy-1.8.1-py34_0", "numpy-1.8.2-py27_0", "numpy-1.8.2-py34_0", "numpy-1.9.0-py27_0", "numpy-1.9.0-py34_0", "numpy-1.9.1-py27_0", "numpy-1.9.1-py34_0", "numpy-1.9.2-py27_0", "numpy-1.9.2-py34_0"]. Note that this list of numpy versions doesn't include the latest ones -- all
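[A condensed sketch of the scan described above; package downloading/unpacking and the declared-dependency bookkeeping are omitted, and paths are placeholders:

    import os
    import re
    import subprocess

    def ldd_deps(path):
        # Names of the shared libraries `path` links against, per ldd.
        try:
            out = subprocess.check_output(['ldd', path])
        except (subprocess.CalledProcessError, OSError):
            return set()  # not a dynamic executable, or ldd unavailable
        return set(re.findall(br'^\s*(\S+\.so[\w.]*)', out, re.M))

    def scan_package(unpacked_dir):
        # Union of ldd-reported dependencies over every .so in the package.
        deps = set()
        for root, _, files in os.walk(unpacked_dir):
            for name in files:
                if '.so' in name:
                    deps |= ldd_deps(os.path.join(root, name))
        return deps
]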
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On 8 January 2016 at 23:27, Chris Barker wrote: > On Fri, Jan 8, 2016 at 1:58 PM, Robert McGibbon wrote: >> >> I'm not sure if this is the right path for numpy or not, > > probably not -- AFAICT, the PyPA folks aren't interested in solving the > problems we have in the scipy community -- we can tweak around the edges, > but we won't get there without a commitment to really solve the issues -- and > if pip did that, it would essentially be conda -- no one wants to > re-implement conda. I think that's a little unfair to the PyPA people. They would like to solve all of these problems; it's just a question of priority and expertise. As always in open source, you have to scratch your own itch, and those guys are working on other things, like the security, stability, and scalability of the infrastructure, and the consistency of pip's version handling and dependency resolution. Linux wheels is a problem that has been discussed on distutils-sig. The reason it hasn't happened is that it's a lower priority than wheels for OSX/Windows because: 1) Most distros already package this stuff, i.e. apt-get numpy. 2) On Linux it's much easier to get the appropriate compilers so that pip can build e.g. numpy. 3) The average Linux user is more capable of solving these problems. 4) Getting binary distribution to work across all Linux distributions is significantly harder than for Windows/OSX because of the myriad different distros/versions. Considering point 2, pip install numpy etc. already works for a lot of Linux users, even if it is slow because of the time taken to compile (3 minutes on my system). Depending on your use case, that problem is partially solved by wheel caching. So if Linux wheels were allowed but didn't always work, that would be a regression for many users. On OSX, binary wheels for numpy are already available and work fine AFAIK. The absence of binary numpy wheels for Windows is not down to PyPA. Considering point 4, the idea of compiling on an old base Linux system has been discussed on distutils-sig before and it seems likely to work. The problem is really about the external non-libc dependencies, though. The reason progress there has stalled is not that the PyPA folks don't want to solve it but rather that they have other priorities and are hoping that people with more expertise in that area will step up to address those problems. Most of the issues stem from the scientific Python community, so ideally someone from the scientific Python community would address how to solve them. Recently Nathaniel brought some suggestions to distutils-sig to address the problem of build-requires, which is a particular pain point. I think that people there appreciated the effort, from someone who understands the needs of hard-to-build packages, to improve the way that pip/PyPI works in that area. There was a lot of confusion from people not understanding each other's needs, but ultimately I thought there was agreement on how to move forward. (Although what happened to that in the end?) The same can happen with other problems like Linux wheels. If you guys here have a clear idea of how to solve the external dependency problem, then I'm sure they'll be receptive. Personally I think the best approach is the pyopenblas approach: internalise the external dependency so that pip can work with it. This is precisely what Anaconda does, and there's actually no need to make substantive changes to the way pip/PyPI/wheel works in order to achieve that.
It just needs someone to package the external dependencies as sdist/wheel (and for PyPI to allow Linux wheels). -- Oscar
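[In miniature, "internalising" a dependency means the package ships the shared library inside itself and loads it before its extension modules, so pip needs nothing from the system. A hypothetical package __init__.py might do the following; names are invented, and this shows only the general shape of the pyopenblas idea, not its actual code:

    import ctypes
    import os

    _here = os.path.dirname(__file__)
    # RTLD_GLOBAL exposes the symbols to extension modules imported later.
    ctypes.CDLL(os.path.join(_here, '.libs', 'libopenblas.so'),
                mode=ctypes.RTLD_GLOBAL)
    # from . import _native  # hypothetical extension needing those symbols
]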
Re: [Numpy-discussion] ENH: Add the function 'expand_view'
On Thu, 2016-01-07 at 22:48 -0500, John Kirkham wrote: > First off, sorry for the long turnaround on responding to these > questions. Below I have tried to respond to everyone's questions and > comments. I have restructured the order of the messages so that my > responses are a little more structured. If anybody has more thoughts > or questions, please let me know. > > From a performance standpoint, `expand_dims` uses `reshape` to add > these extra dimensions. So, it has the possibility of not returning a > view, but a copy, which could take some time to build if the array is > large. By using our method, we guarantee a view will always be > returned; so, this allocation will never be encountered. Actually, reshape can be used here: if stride tricks can do the trick, then reshape is guaranteed not to make a copy. Still, I can't say I feel it is a worthy addition (but others may disagree), especially since I realized we have `expand_dims` already. I am just not sure it would actually be used reasonably often. But how about adding a `dims=1` argument to `expand_dims`, so that your function becomes at least a bit easier: newarr = np.expand_dims(arr, -1, after.ndim) # If manual broadcast is necessary: newarr = np.broadcast_to(arr, before.shape + arr.shape + after.shape) Otherwise I guess we could morph `expand_dims` partially into your function, though it would return a read-only view if broadcasting is active, so I am a bit unsure I like it. In other words, my current personal preference for getting some of this would be to add a simple `dims` argument to `expand_dims` and to add `broadcast_to` to its "See Also" section (it could also go in the `squeeze` "See Also" section, I think). To further help out other people, we could maybe even mention the combination in the examples (i.e. a broadcast_to example?). - Sebastian
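[Note that the `dims` argument to `expand_dims` above is a proposal, not an existing signature; the suggested combination works today with released functions (np.broadcast_to is new in numpy 1.10). In the sketch below, `after` is just a placeholder array supplying extra trailing dimensions:

    import numpy as np

    arr = np.arange(6).reshape(2, 3)
    after = np.empty((5,))

    # Append singleton axes, then broadcast: a read-only view, never a copy.
    newarr = arr.reshape(arr.shape + (1,) * after.ndim)
    newarr = np.broadcast_to(newarr, arr.shape + after.shape)
    print(newarr.shape)  # (2, 3, 5)
]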