[Numpy-discussion] 14ᵗʰ Advanced Scientific Programming in Python in Bilbao Spain, 5–11 September, 2022
ASPP2022: 14ᵗʰ Advanced Scientific Programming in Python
a Summer School by the ASPP faculty and the Faculty of Engineering of the Mondragon University, Bilbao

https://aspp.school

Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only a few scientists have been trained to use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques and best practices which are standard in the industry, but especially tailored to the needs of a programming scientist. Lectures are devised to be interactive and to give the students enough time to acquire direct hands-on experience with the materials. Students will work in pairs throughout the school and will team up to practice the newly learned skills in a real programming project: an entertaining computer game.

We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist.

This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or R is absolutely required. Basic knowledge of Python and of a version control system such as git, subversion, mercurial, or bazaar is assumed. Participants without any prior experience with Python and/or git should work through the proposed introductory material before the course.

We are striving hard to get a pool of students which is international and gender-balanced.
Date & Location
===============
5–11 September, 2022. Bilbao, Spain.

Application
===========
You can apply online: https://aspp.school

Application deadline: 23:59 UTC, Sunday 1 May, 2022. There will be no deadline extension, so be sure to apply on time.

Be sure to read the FAQ before applying: https://aspp.school/wiki/faq

Participation is free, i.e. no fee is charged! Participants, however, should take care of travel, living, and accommodation expenses by themselves. We are in the process of securing some funds for supporting students with accommodation and living costs.

Program
=======
• Version control with git and how to contribute to open source projects with GitHub
• Best practices in data visualization
• Testing and debugging scientific code
• Advanced NumPy
• Organizing, documenting, and distributing scientific code
• Advanced scientific Python: context managers and generators
• Writing parallel applications in Python
• Profiling and speeding up scientific code with Cython and numba
• Programming in teams

Faculty
=======
• Jakob Jordan, Department of Physiology, University of Bern, Switzerland
• Jenni Rinker, Department of Wind Energy, Technical University of Denmark, Roskilde, Denmark
• Lisa Schwetlick, Experimental and Biological Psychology, Universität Potsdam, Germany
• Nicolas Rougier, Inria Bordeaux Sud-Ouest, Institute of Neurodegenerative Diseases, University of Bordeaux, France
• Pamela Hathway, GfK, Nuremberg, Germany
• Pietro Berkes, NAGRA Kudelski, Lausanne, Switzerland
• Rike-Benjamin Schuppner, Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Germany
• Tiziano Zito, Department of Psychology, Humboldt-Universität zu Berlin, Germany
• Zbigniew Jędrzejewski-Szmek, Red Hat Inc., Warsaw, Poland

Organizers
==========
Head of the organization for ASPP and responsible for the scientific program:
• Tiziano Zito, Department of Psychology, Humboldt-Universität zu Berlin, Germany

Organization team in Bilbao:
• Aitor Morales-Gregorio, Theoretical Neuroanatomy, Institute of Neuroscience and Medicine (INM-6), Forschungszentrum Jülich, Germany
• Carlos Cernuda, Data Analysis & Cybersecurity, Faculty of Engineering, Mondragon Unibertsitatea, Bilbao, Spain

Website: https://aspp.school
Contact: info@aspp.school
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
[Numpy-discussion] NEP draft for the future behaviour of scalar promotion
Hi all,

NumPy has awkward behaviour when it comes to promotion with 0-D arrays and Python scalars. This is both a technical challenge (NumPy needs to inspect the values where it shouldn't) and surprising for users.

Roughly speaking, I have made a proposal under these 3 points:

* NumPy scalars and NumPy arrays always behave the same.
* A NumPy array always respects the dtype.
* A Python scalar is "weak", so that uint8_arr + 3 returns a uint8_arr.

The NEP is here:
https://25105-908607-gh.circle-artifacts.com/0/doc/neps/_build/html/nep-0050-scalar-promotion.html

But please refer to the PR, since the above may go away or get outdated:
https://github.com/numpy/numpy/pull/21103

Note that I have not 100% made up my mind on these, because some alternatives exist which may give a somewhat easier transition. Because of this, this is a very early draft (expect large changes/rewrite), but some feedback/input may go a long way to make sure we keep moving on this project.

For those aware of the issues, it probably makes sense to skip ahead to the "Alternatives" section. I do expect that a large refactor/rewrite will be necessary, but need some feedback to keep moving.

I sent out a poll recently:
https://discuss.scientific-python.org/t/poll-future-numpy-behavior-when-mixing-arrays-numpy-scalars-and-python-scalars/202

Just to say, I have not completely ignored it, although (as expected) the results do not give a very simple answer. Many agree with the choices I made, but some also seem to prefer "strong" Python types, or more special handling of NumPy scalars.

Please do not hesitate to give opinions! I am not sure we can find a clear "obviously right" solution, especially since there are tough backwards compatibility choices (even if most users are likely not to notice). So any input is appreciated.
Cheers,

Sebastian
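A minimal sketch of the third point of the proposal, assuming only that NumPy is installed. The name uint8_arr follows the message above; the values are made up for illustration. Adding a small Python integer keeps the array's dtype under the legacy value-based rules as well as under the proposed "weak" scalar rules, so the result should be the same either way.

```python
import numpy as np

# Illustrative data; any uint8 values small enough not to wrap would do.
uint8_arr = np.array([1, 2, 3], dtype=np.uint8)

# The Python int 3 does not force a wider dtype: the result stays uint8.
result = uint8_arr + 3

print(result.dtype)     # uint8
print(result.tolist())  # [4, 5, 6]
```

The cases where the old and new rules differ involve NumPy scalars and 0-D arrays, or Python integers too large for the array's dtype; they are deliberately left out of this sketch.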
[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion
I added a few comments on the PR. The main comments of substance I had boil down to:

- consistency with other programming languages/major frameworks (perhaps a few more "examples of consistency" between the new approach and others may help strengthen the arguments?). I know JAX was mentioned, and their dtype promotion docs are quite nice.
- one thing I struggled with in deciding if some of the "new behaviors" were nicer was the tension between protecting from accidental overflow vs. a more "purist" view that types should be preserved more strictly; the latter would seem consistent with the "principle of least surprise" when moving from a typed language to NumPy work perhaps, though arguably slightly less user-friendly if naively doing some operations with a less formal view of typing (new Python user messing around with NumPy?)

On Mon, 21 Feb 2022 at 16:35, Sebastian Berg wrote:
> Roughly speaking, I have made a proposal under the 3 points:
> * NumPy scalars and NumPy arrays always behave the same.
> * A NumPy array always respects the dtype
> * A Python scalar is "weak" so that uint8_arr + 3 returns a uint8_arr
[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion
On Mon, 21 Feb 2022, at 9:39 PM, Tyler Reddy wrote:
> the latter would seem consistent with the "principle of least surprise" when moving from a typed language to NumPy work perhaps, though arguably slightly less user-friendly if naively doing some operations with a less formal view of typing (new Python user messing around with NumPy?)

fwiw, my rationale here is that many (most?) beginners will eventually become intermediate-to-advanced, at which point purity becomes increasingly important. It is often easier to explain a "pure" principle to a beginner than it is to navigate around magic behaviour as an expert. At scikit-image tutorials we often begin by having the users overflow a uint8 image, then we explain why that's the case and how to work around it. We have also increasingly encountered users surprised/annoyed that scikit-image blew up their uint8 to a float64, using 8x the RAM.

> JAX was mentioned, and their dtype promotion docs are quite nice

My God! They are awesome. Hadn't seen them before. For reference:

https://jax.readthedocs.io/en/latest/design_notes/type_promotion.html

I certainly wouldn't mind if NumPy adopted these wholesale.

Juan.
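The "8x the RAM" figure follows directly from the item sizes of the two dtypes. A small sketch, assuming a hypothetical 512x512 greyscale image (the shape and variable names are illustrative and not taken from scikit-image):

```python
import numpy as np

# Hypothetical 512x512 greyscale image stored as uint8 (1 byte per pixel).
image = np.zeros((512, 512), dtype=np.uint8)

# Converting to float64 stores 8 bytes per pixel.
as_float64 = image.astype(np.float64)
# float32 is a common compromise at 4 bytes per pixel.
as_float32 = image.astype(np.float32)

print(image.nbytes, as_float32.nbytes, as_float64.nbytes)
ratio = as_float64.nbytes // image.nbytes
print(ratio)  # 8
```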
[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion
On Mon, Feb 21, 2022, at 20:56, Juan Nunez-Iglesias wrote:
> > the latter would seem consistent with the "principle of least surprise" when moving from a typed language to NumPy work perhaps, though arguably slightly less user-friendly if naively doing some operations with a less formal view of typing (new Python user messing around with NumPy?)
>
> fwiw, my rationale here is that many (most?) beginners will eventually become intermediate-to-advanced, at which point purity becomes increasingly important. It is often easier to explain a "pure" principle to a beginner than it is to navigate around magic behaviour as an expert. At scikit-image tutorials we often begin by having the users overflow a uint8 image, then we explain why that's the case and how to work around it.

Just to play a bit of devil's advocate here, I'd have to say that most people will not expect

    x[0] + 200

to often yield a number less than 200!

I think uint8s are especially problematic because they overflow so quickly (you won't easily run into the same behavior with uint16 and higher). Of course, there is no way to pretend that NumPy integers are Python integers, but by changing the casting table for uint8 a bit we may be able to avoid many common errors.

Besides, coming from value-based casting, users already have this expectation:

    In [1]: np.uint8(255) + 1
    Out[1]: 256

Currently, NumPy scalars and arrays are treated differently. Arrays have stronger types than scalars, in that users expect:

    In [3]: np.array([253, 254, 255], dtype=np.uint8) + 3
    Out[3]: array([0, 1, 2], dtype=uint8)

So perhaps the real question is: how important is it to us that arrays and scalars behave the same in the new casting scheme? (JAX, from the docs you linked, also makes the scalar vs array distinction.)

> We have also increasingly encountered users surprised/annoyed that scikit-image blew up their uint8 to a float64, using 8x the RAM.

I know this used to be true, but my sense is that it is less and less so, especially now that almost all skimage functions use floatx internally.

Stéfan
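The two cases contrasted in this message can be run side by side. This is a sketch rather than a statement of what any particular NumPy release does: the array result is the same under both sets of rules, while the scalar result depends on which promotion rules the installed NumPy applies, so no output is asserted for it.

```python
import numpy as np

arr = np.array([253, 254, 255], dtype=np.uint8)

# Array + Python int: the uint8 dtype is kept and values wrap modulo 256.
wrapped = arr + 3
print(wrapped)  # [0 1 2]

# NumPy scalar + Python int: the result depends on the promotion rules in
# force. Value-based casting gives 256 in a wider integer type; rules that
# treat scalars like arrays give a uint8 0 (possibly with an overflow warning,
# silenced here).
with np.errstate(over="ignore"):
    scalar_result = np.uint8(255) + 1
print(repr(scalar_result), scalar_result.dtype)
```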
[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion
On Tue, 22 Feb 2022, 6:53 am Stefan van der Walt, wrote:
> Just to play a bit of devil's advocate here, I'd have to say that most people will not expect
>
> x[0] + 200
>
> To often yield a number less than 200!
>
> I think uint8's are especially problematic because they overflow so quickly (you won't easily run into the same behavior with uint16 and higher). Of course, there is no way to pretend that NumPy integers are Python integers, but by changing the casting table for uint8 a bit we may be able to avoid many common errors.

But sometimes you actually want the overflow, for bit operations. Probably not with an int8, I believe that is UB in C.

/David
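One case where unsigned wraparound is wanted is a simple 8-bit additive checksum. The sketch below is an illustration chosen for this note, not something proposed in the thread; the payload bytes are made up.

```python
import numpy as np

# Illustrative payload bytes (made up for this example).
payload = np.frombuffer(b"NumPy scalar promotion", dtype=np.uint8)

# Accumulating in uint8 wraps modulo 256, which is exactly what a simple
# 8-bit additive checksum needs.
checksum = payload.sum(dtype=np.uint8)

# The same value computed with unbounded Python ints and an explicit modulo.
expected = sum(payload.tolist()) % 256

print(int(checksum), expected)
```

Because the accumulator dtype is requested explicitly, this kind of code does not rely on how mixed scalar/array promotion is resolved.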
[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion
On Mon, 21 Feb 2022, at 11:50 PM, Stefan van der Walt wrote:
> Just to play a bit of devil's advocate here, I'd have to say that most people will not expect
>
> x[0] + 200
>
> To often yield a number less than 200!

It's tricky though, because I would expect np.uint8(255) + 1 to be equal to 0. (As does JAX, see below.) I.e., someone, somewhere, is going to be surprised. I don't think we can help that at all. So my argument is that we should prefer the surprising behaviour that is at least consistent in some overarching framework, and the framework itself should be as parsimonious as possible. I'd prefer not to have to write "except for scalars" in a bunch of places in the docs.

> I think uint8's are especially problematic because they overflow so quickly (you won't easily run into the same behavior with uint16 and higher). Of course, there is no way to pretend that NumPy integers are Python integers, but by changing the casting table for uint8 a bit we may be able to avoid many common errors.

See, I kinda hate the idea of special-casing one dtype. Common errors might be a good thing: people can very quickly learn to be careful with uint8s. If we try really hard to hide this reality, people will be surprised *later*, or indeed errors may go unnoticed.

> Besides, coming from value based casting, users already have this expectation:
>
> In [1]: np.uint8(255) + 1
> Out[1]: 256
>
> Currently, NumPy scalars and arrays are treated differently. Arrays have stronger types than scalars, in that users expect:
>
> In [3]: np.array([253, 254, 255], dtype=np.uint8) + 3
> Out[3]: array([0, 1, 2], dtype=uint8)

I think the users that expect *both* of those behaviours are a small set.

> So perhaps the real question is: how important is it to us that arrays and scalars behave the same in the new casting scheme? (JAX, from the docs you linked, also makes the scalar vs array distinction.)

No, as far as I can tell, they distinguish between *Python* scalars and arrays, not between JAX scalars and arrays. They do have a concept of weakly typed arrays, but I don't think that's what you get when you do jnp.uint8(x). Indeed I just checked that jnp.uint8(255) + 1 returns a uint8 scalar with value 0. (Or 0-dimensional array? Not sure how JAX handles scalars; the exact repr returned is DeviceArray(0, dtype=uint8).)

>> We have also increasingly encountered users surprised/annoyed that scikit-image blew up their uint8 to a float64, using 8x the RAM.
>
> I know this used to be true, but my sense is that it is less and less so, especially now that almost all skimage functions use floatx internally.

Greg spent a long time last year making sure that we didn't promote float32 to float64 for this reason. This has reduced some of the burden but not all, and my point is broader: users will not be happy to have uint8 + Python int return an int64 array implicitly.

And to quote from the JAX document, which to me seems to be the nail in the coffin for alternatives:

> The benefit of these semantics are that you can readily express sequences of operations with clean Python code, without having to explicitly cast scalars to the appropriate type. Imagine if rather than writing this:
>
> 3 * (x + 1) ** 2
>
> you had to write this:
>
> np.int32(3) * (x + np.int32(1)) ** np.int32(2)

Juan.
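The quoted expression can be checked in NumPy itself. A small sketch, assuming an int32 array x with made-up values: with Python integer literals the result stays int32, and it matches the fully explicit spelling.

```python
import numpy as np

x = np.arange(4, dtype=np.int32)

# Python ints adapt to the array's dtype, so the expression stays int32.
concise = 3 * (x + 1) ** 2

# The fully explicit spelling from the quoted JAX document.
explicit = np.int32(3) * (x + np.int32(1)) ** np.int32(2)

print(concise.dtype, explicit.dtype)      # int32 int32
print(np.array_equal(concise, explicit))  # True
```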
[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion
On Tue, 22 Feb 2022, at 1:01 AM, Stefan van der Walt wrote:
> it is easier to explain away `x + 1` behaving oddly over `x[0] + 1` behaving oddly

Is it? I find the two equivalent, honestly.

> given that we pretend like NumPy scalars do not exist.

This is the leaky abstraction that I think should be plugged in this revamp.

> This then argues for making explicit to the user that there are scalars involved. I.e., no more:
>
> In [4]: x = np.array([1, 2, 3])
>
> In [5]: x[0]
> Out[5]: 1
>
> But rather
>
> Out[5]: np.int64(1)

Yup. I would be in favour of such a repr change. (And to be clear, it is *only* a repr change, not a behaviour change!) I have indeed run across this a few times, e.g. trying to encode a single value in json only to find that it was a NumPy int64 rather than an int.

>>> The benefit of these semantics are that you can readily express sequences of operations with clean Python code, without having to explicitly cast scalars to the appropriate type. Imagine if rather than writing this:
>>>
>>> 3 * (x + 1) ** 2
>>>
>>> you had to write this:
>>>
>>> np.int32(3) * (x + np.int32(1)) ** np.int32(2)
>
> And how do you write the much more common
>
> x[0] + 1

Is it really much more common than arithmetic combining arrays and literals? I'd say it's much *less* common, especially in "idiomatic" NumPy which tries to avoid Python looping over elements.

> now? It becomes: x[0] + np.int64(1).

I would write it as x[0].astype(np.int64) + 1, and indeed I think I would find that less confusing, reading the code years later, because it would allow me to not even have to think about type promotion.

> The reason we had value inspection was that it gave us a cushy "best of both worlds"; when going with dtype-only casting, you have to give something up.

Yes yes, we agree we are giving something up, we merely disagree about what is better to give up long term for our community. For me, the attractiveness of unified scalar and array semantics, together with unified type promotion, beats the attractiveness of hiding overflow from users, especially since the hiding can only ever be patchy.* I 100% agree with you that it is a tradeoff. But, imho, one worth making.

* e.g. the same user might initially be happy about the result of x[0] + 1 matching their infinite-precision expectation, but then be surprised by

    x[0] + 1
    -> 256

    y[0] = 1
    x[0] + y[0]
    -> 0  # WTH

Juan.
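The JSON problem mentioned in this message is easy to reproduce. A minimal sketch using only the standard json module and NumPy; the dictionary key "value" is made up for illustration.

```python
import json

import numpy as np

x = np.array([1, 2, 3], dtype=np.int64)
element = x[0]

# Indexing returns a NumPy scalar, not a built-in int.
print(type(element).__name__)    # int64
print(isinstance(element, int))  # False

# json cannot serialise the NumPy scalar directly...
try:
    json.dumps(element)
    serialised_directly = True
except TypeError:
    serialised_directly = False

# ...but converting to a built-in int first works.
encoded = json.dumps({"value": element.item()})
print(encoded)  # {"value": 1}
```

Calling .item() (or int(element)) makes the conversion explicit, which is the same spirit as writing x[0].astype(np.int64) + 1 to avoid thinking about promotion.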
[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion
> * e.g. the same user might initially be happy about the result of x[0] + 1 matching their infinite-precision expectation, but then be surprised by
>
> x[0] + 1
> -> 256
>
> y[0] = 1
> x[0] + y[0]
> -> 0 # WTH

I'll go even further: I would say a common situation where people use syntax like x[0] + 1 is in sanity checks/tests. In which case, it's *very bad* to have different behaviour between x[0] + 1 (e.g. when checking code) and x + 1 (e.g. in production code).
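A sketch of the sanity-check concern, with made-up one-element uint8 arrays x and y. The scalar-plus-Python-int line is the one whose result depends on the promotion rules of the installed NumPy, so no output is claimed for it; the other results are the same under both the old and the proposed rules.

```python
import numpy as np

x = np.array([255], dtype=np.uint8)
y = np.array([1], dtype=np.uint8)

with np.errstate(over="ignore"):
    # Two uint8 scalars: the result is uint8 and wraps to 0.
    scalar_sum = x[0] + y[0]
    # Scalar + Python int: 256 or 0 depending on the promotion rules in force.
    scalar_check = x[0] + 1

# What the "production" array code computes: always uint8, wrapping to 0.
array_result = (x + 1)[0]

# A check that cannot drift from the array code: operate on a length-1 slice.
consistent_check = (x[:1] + 1)[0]

print(int(scalar_sum), int(scalar_check), int(array_result), int(consistent_check))
```

Slicing with x[:1] instead of indexing with x[0] keeps the operand an array, so the check exercises the same code path as the full-array expression.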