Re: [Numpy-discussion] Style guide for numpy code?

2019-05-09 Thread Eric Wieser
Joe,

While most of your style suggestions are reasonable, I would actually
recommend the opposite of the first point you make in (a)., especially
if you're trying to write generic reusable code.

> For example, an item count is always an integer, but a distance is always a 
> float.

This is close, but `int` and `float` are implementation details. I
think a more precise way to state this is _"an item count is a
`numbers.Integral`, a distance is a `numbers.Real`"_.

Where this distinction matters is if you start using `decimal.Decimal`
or `fractions.Fraction` for your distances. Both represent real numbers
(`Fraction` is a `numbers.Real`; `Decimal` is only registered as a
`numbers.Number`), but if you mix them with floats, you either lose
precision or get a `TypeError` because the types refuse to mix:
```python
In [11]: Fraction(1, 3) + 1.0
Out[11]: 1.3333333333333333

In [12]: Fraction(1, 3) + 1
Out[12]: Fraction(4, 3)

In [15]: Decimal('0.1') + 0
Out[15]: Decimal('0.1')

In [16]: Decimal('0.1') + 0.
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'
```
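
To make the generic-code point concrete, here is a minimal sketch. The
helpers `midpoint` and `midpoint_float` are hypothetical names of my own,
not anything from NumPy. Writing the constant as the integer `2` lets
`Fraction` and `Decimal` inputs keep their type, whereas the float literal
`2.` either silently converts to float or raises:
```python
from decimal import Decimal
from fractions import Fraction
import numbers


def midpoint(a, b):
    """Hypothetical helper: midpoint that preserves the input number type."""
    # Integer literal: Fraction and Decimal both divide by an int exactly.
    return (a + b) / 2


def midpoint_float(a, b):
    """Same computation, written with a float literal."""
    # Float literal: Fraction is coerced to float, Decimal refuses to mix.
    return (a + b) / 2.


exact = midpoint(Fraction(1, 3), Fraction(2, 3))        # stays a Fraction
lossy = midpoint_float(Fraction(1, 3), Fraction(2, 3))  # becomes a float
dec = midpoint(Decimal('0.1'), Decimal('0.3'))          # stays a Decimal

try:
    midpoint_float(Decimal('0.1'), Decimal('0.3'))
    decimal_mixed_ok = True
except TypeError:
    decimal_mixed_ok = False

# The abstract types describe intent without naming int or float.
count_is_integral = isinstance(3, numbers.Integral)
fraction_is_real = isinstance(exact, numbers.Real)
```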

For an example of this coming up in real-world functions, look at
https://github.com/numpy/numpy/pull/13390

Eric

On Thu, 9 May 2019 at 11:19, Joe Harrington wrote:
>
> I have a handout for my PHZ 3150 Introduction to Numerical Computing course 
> that includes some rules:
>
> (a) All integer-valued floating-point numbers should have decimal points 
> after them. For example, if you have a time of 10 sec, do not use
>
> y = np.e**10 # sec
>
> use
>
> y = np.e**10. # sec
>
> instead.  For example, an item count is always an integer, but a distance is 
> always a float.  A decimal in the range (-1,1) must always have a zero before 
> the decimal point, for readability:
>
> x = 0.23 # Right!
>
> x = .23 # WRONG
>
> The purpose of this one is simply to build the decimal-point habit.  In 
> Python it's less of an issue now, but sometimes code is translated, and 
> integer division is still out there.  For that reason, in other languages, it 
> may be desirable to use a decimal point even for counts, unless integer 
> division is wanted.  Make a comment whenever you intend integer division and 
> the language uses the same symbol (/) for both kinds of division.
>
> (b) Use spaces around binary operations and relations (=<>+-*/). Put a space 
> after “,”. Do not put space around “=” in keyword arguments, or around “ ** ”.
>
> (c) Do not put plt.show() in your homework file! You may put it in a comment 
> if you like, but it is not necessary. Just save the plot. If you say
>
> plt.ion()
>
> plots will automatically show while you are working.
>
> (d) Use:
>
> import matplotlib.pyplot as plt
>
> NOT:
>
> import matplotlib.pylab as plt
>
> (e) Keep lines to 80 characters, max, except in rare cases that are well 
> justified, such as very long strings. If you make comments on the same line 
> as code, keep them short or break them over more than a line:
>
> code = code2   # set code equal to code2
>
> # Longer comment requiring much more space because
> # I'm explaining something complicated.
> code = code2
>
> code = code2   # Another way to do a very long comment,
>                # like this one, which runs over more than
>                # one line.
>
> (f) Keep blocks of similar lines internally lined up on decimals, comments, 
> and = signs.  This makes them easier to read and verify.  There will be some 
> cases when this is impractical.  Use your judgment (you're not a computer, 
> you control the computer!):
>
> x     =   1.     # this is a comment
> y     = 378.2345 # here's another
> fred  = chuck    # note how the decimals, = signs, and
>                  # comments line up nicely...
> alacazamshmazooboloid = 2721 # but not always!
>
> (g) Put the units and sources of all values in comments:
>
> t_planet = 523. # K, Smith and Jones (2016, ApJ 234, 22)
>
> (h) I don't mean to start a religious war, but I emphasize the alignment of 
> similar adjacent code lines to make differences pop out and reduce the 
> likelihood of bugs.  For example, it is much easier to verify the correctness 
> of:
>
> a     = 3 * x + 3 * 8. * short        - 5. * np.exp(np.pi * omega * t)
> a_alt = 3 * x + 3 * 8. * anotshortvar - 5. * np.exp(np.pi * omega * t)
>
> than:
>
> a = 3 * x + 3 * 8. * short - 5. * np.exp(np.pi * omega * t)
> a_altvarname = 3 * x + 3*9*anotshortvar - 5. * np.exp(np.pi * omega * i)
>
> (i) Assign values to meaningful variables, and use them in formulae and 
> functions:
>
> ny = 512
> nx = 512
> image = np.zeros((ny, nx))
> expr1 = ny * 3
> expr2 = nx * 4
>
> Otherwise, later on when you upgrade to 2560x1440 arrays, you won't know 
> which of the 512s are in the x direction and which are in the y direction.  
> Or, the student you (now a senior researcher) assign to code the upgrade 
> won't!  Also, it reduces bugs arising from the order of arguments to 
> functions if the args have meaningful names.  This is not to say that you 
> should assign all numbers to variables.  This is fine:
>
> circ = 2 * np.pi * r
>
>

Re: [Numpy-discussion] My Introduction and Getting Started with Numpy.

2019-05-09 Thread Ngoran Clare-Joyce F.
Hello Ralf,

Thank you for the resources, they were very helpful.
I am done setting up the environment and I'm looking forward to making
contributions.

Thank you,
Joyce.

On Mon, May 6, 2019 at 8:57 PM Ralf Gommers wrote:

> Hi Ngoran, welcome!
>
>
> On Sun, May 5, 2019 at 10:53 AM Joyce Tirnyuy wrote:
>
>> Hi All,
>>
>> I am Ngoran Clare-Joyce, an Electrical Engineer from Cameroon. I use
>> Python and JavaScript for Software Development. Over the past year, I have
>> gained insight into Machine Learning and Data Science Algorithms. I have
>> used the NumPy, SciPy, pandas, PyTorch, and scikit-learn libraries.
>>
>> I have realized that to take my career to the next level, I need to
>> contribute to open source as a way to gain skills, experience and proper
>> understanding of how these libraries work.
>>
>
> Excellent, we can always use help:)
>
>
>> Please, I will appreciate help on how to get started, set up my
>> development environment, some important documentation, and beginners issues
>> so I can start contributing to Numpy.
>>
>
> This is the most recent version of our documentation:
> https://www.numpy.org/devdocs/index.html. It has a link "NumPy Developer
> Guide" which walks you through setting up your development environment.
> There's also "Building and Extending the Documentation" which will help if
> you want to work on improving the documentation. For some beginner issues,
> please have a look at the ones labelled "easy" on GitHub:
> https://github.com/numpy/numpy/issues?q=is%3Aopen+is%3Aissue+label%3A%22difficulty%3A+Easy%22
>
> Cheers,
> Ralf
>
>
>>
>> Thanks,
>> Ngoran Clare-Joyce.
>>
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>


[Numpy-discussion] ANN: SciPy 1.3.0rc2 -- please test

2019-05-09 Thread Tyler Reddy

Hi all,

On behalf of the SciPy development team I'm pleased to announce
the release candidate SciPy 1.3.0rc2. Please help us test this pre-release.

The primary motivation for the second release candidate is to update
wheels to use a more recent OpenBLAS with fixes for SkylakeX AVX
kernel problems.

Sources and binary wheels can be found at:
https://pypi.org/project/scipy/
and at:
https://github.com/scipy/scipy/releases/tag/v1.3.0rc2

One of a few ways to install the release candidate with pip:

pip install scipy==1.3.0rc2

==========================
SciPy 1.3.0 Release Notes
==========================

Note: SciPy 1.3.0 is not released yet!

SciPy 1.3.0 is the culmination of 5 months of hard work. It contains
many new features, numerous bug-fixes, improved test coverage and better
documentation. There have been some API changes
in this release, which are documented below. All users are encouraged to
upgrade to this release, as there are a large number of bug-fixes and
optimizations. Before upgrading, we recommend that users check that
their own code does not use deprecated SciPy functionality (to do so,
run your code with ``python -Wd`` and check for ``DeprecationWarning`` s).
Our development attention will now shift to bug-fix releases on the
1.3.x branch, and on adding new features on the master branch.
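
As a rough in-process counterpart to ``python -Wd``, the standard
``warnings`` module can record ``DeprecationWarning`` s programmatically.
This is only a sketch: ``old_api`` is a hypothetical stand-in for whatever
deprecated call your own code makes, not a SciPy function, and the
``"always"`` filter used here is slightly stricter than the ``"default"``
action that ``-Wd`` selects.

```python
import warnings


def old_api():
    """Hypothetical deprecated function standing in for real library code."""
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# Record every warning raised inside the block instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_api()

# Keep only the deprecation warnings so they can be inspected or counted.
deprecations = [w for w in caught
                if issubclass(w.category, DeprecationWarning)]
for w in deprecations:
    print(f"{w.filename}:{w.lineno}: {w.message}")
```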

This release requires Python 3.5+ and NumPy 1.13.3 or greater.

For running on PyPy, PyPy3 6.0+ and NumPy 1.15.0 are required.

Highlights of this release
--------------------------

- Three new ``stats`` functions, a rewrite of ``pearsonr``, and an exact
  computation of the Kolmogorov-Smirnov two-sample test
- A new Cython API for bounded scalar-function root-finders in
  `scipy.optimize`
- Substantial ``CSR`` and ``CSC`` sparse matrix indexing performance
  improvements
- Added support for interpolation of rotations with continuous angular
  rate and acceleration in ``RotationSpline``


New features
============

`scipy.interpolate` improvements
--------------------------------

A new class ``CubicHermiteSpline`` is introduced. It is a piecewise-cubic
interpolator which matches observed values and first derivatives. Existing
cubic interpolators ``CubicSpline``, ``PchipInterpolator`` and
``Akima1DInterpolator`` were made subclasses of ``CubicHermiteSpline``.
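
A minimal usage sketch of ``CubicHermiteSpline``, assuming SciPy 1.3.0 or
later and NumPy are installed. The sample function ``x**3`` is chosen only
for illustration: because it is itself a cubic, matching its values and
first derivatives at the knots should reproduce it between them.

```python
import numpy as np
from scipy.interpolate import CubicHermiteSpline

# Sample f(x) = x**3 and its derivative f'(x) = 3*x**2 at a few knots.
x = np.array([0.0, 1.0, 2.0])
y = x**3
dydx = 3 * x**2

# Build the interpolator from observed values and first derivatives.
spline = CubicHermiteSpline(x, y, dydx)

# Evaluate the spline and its first derivative between the knots.
value_mid = spline(1.5)
slope_mid = spline.derivative()(1.5)
```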

`scipy.io` improvements
-----------------------

For the Attribute-Relation File Format (ARFF) `scipy.io.arff.loadarff`
now supports relational attributes.

`scipy.io.mmread` can now parse Matrix Market format files with empty lines.

`scipy.linalg` improvements
---------------------------

Added wrappers for ``?syconv`` routines, which convert a symmetric matrix
given by a triangular matrix factorization into two matrices and vice versa.

`scipy.linalg.clarkson_woodruff_transform` now uses an algorithm that
leverages sparsity. This may provide a 60-90 percent speedup for dense
input matrices. Truly sparse input matrices should also benefit from the
improved sketch algorithm, which now correctly runs in ``O(nnz(A))`` time.

Added new functions to calculate symmetric Fiedler matrices and
Fiedler companion matrices, named `scipy.linalg.fiedler` and
`scipy.linalg.fiedler_companion`, respectively. These may be used
for root finding.
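
A small sketch of the root-finding use, assuming SciPy 1.3.0 or later and
NumPy are installed. The polynomial is an arbitrary example with known
roots; the eigenvalues of its Fiedler companion matrix should coincide with
those roots.

```python
import numpy as np
from scipy.linalg import fiedler_companion

# Coefficients of (x - 1)(x - 2)(x - 3) = x**3 - 6x**2 + 11x - 6,
# highest power first.
coeffs = np.array([1.0, -6.0, 11.0, -6.0])

companion = fiedler_companion(coeffs)

# The eigenvalues of the companion matrix are the polynomial's roots.
roots = np.sort(np.linalg.eigvals(companion).real)
```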

`scipy.ndimage` improvements
----------------------------

Gaussian filter performance may improve by an order of magnitude in
some cases, thanks to removal of a dependence on ``np.polynomial``. This
may impact `scipy.ndimage.gaussian_filter`, for example.

`scipy.optimize` improvements
-----------------------------

The `scipy.optimize.brute` minimizer obtained a new keyword ``workers``,
which can be used to parallelize computation.

A Cython API for bounded scalar-function root-finders in `scipy.optimize`
is available in a new module `scipy.optimize.cython_optimize` via
``cimport``. This API may be used with ``nogil`` and ``prange`` to loop
over an array of function arguments to solve for an array of roots more
quickly than with pure Python.

``'interior-point'`` is now the default method for ``linprog``, and
``'interior-point'`` now uses SuiteSparse for sparse problems when the
required scikits (scikit-umfpack and scikit-sparse) are available.
On benchmark problems (gh-10026), execution time reductions by factors of
2-3 were typical. Also, a new ``method='revised simplex'`` has been added.
It is not as fast or robust as ``method='interior-point'``, but it is a
faster, more robust, and equally accurate substitute for the legacy
``method='simplex'``.

``differential_evolution`` can now use a ``Bounds`` class to specify the
bounds for the optimizing argument of a function.

`scipy.optimize.dual_annealing` has performance improvements related to
vectorisation of some internal code.

`scipy.signal` improvements
---------------------------

Two additional methods of discretization are now supported by
`scipy.signal.cont2discrete`: ``impulse`` and ``foh``.


Re: [Numpy-discussion] Style guide for numpy code?

2019-05-09 Thread Joe Harrington
I have a handout for my PHZ 3150 Introduction to Numerical Computing 
course that includes some rules:


(a) All integer-valued floating-point numbers should have decimal points 
after them. For example, if you have a time of 10 sec, do not use

y = np.e**10 # sec

use

y = np.e**10. # sec

instead.  For example, an item count is always an integer, but a 
distance is always a float.  A decimal in the range (-1,1) must always 
have a zero before the decimal point, for readability:


x = 0.23 # Right!

x = .23 # WRONG

The purpose of this one is simply to build the decimal-point habit.  In 
Python it's less of an issue now, but sometimes code is translated, and 
integer division is still out there.  For that reason, in other 
languages, it may be desirable to use a decimal point even for counts, 
unless integer division is wanted.  Make a comment whenever you intend 
integer division and the language uses the same symbol (/) for both 
kinds of division.
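
In Python 3 the two kinds of division have separate operators, which is 
why it is less of an issue there. A small sketch (the variable names are 
made up for illustration):

```python
n_items = 7    # an item count: an integer
n_bins = 2

share = n_items / n_bins       # true division, always a float in Python 3
whole = n_items // n_bins      # integer (floor) division, intended here
left_over = n_items % n_bins   # remainder after integer division
```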


(b) Use spaces around binary operations and relations (=<>+-*/). Put a 
space after “,”. Do not put space around “=” in keyword arguments, or 
around “ ** ”.

(c) Do not put plt.show() in your homework file! You may put it in a 
comment if you like, but it is not necessary. Just save the plot. If you say

plt.ion()

plots will automatically show while you are working.

(d) Use:

import matplotlib.pyplot as plt

NOT:

import matplotlib.pylab as plt

(e) Keep lines to 80 characters, max, except in rare cases that are well 
justified, such as very long strings. If you make comments on the same 
line as code, keep them short or break them over more than a line:

code = code2   # set code equal to code2

# Longer comment requiring much more space because
# I'm explaining something complicated.
code = code2

code = code2   # Another way to do a very long comment,
               # like this one, which runs over more than
               # one line.

(f) Keep blocks of similar lines internally lined up on decimals, 
comments, and = signs.  This makes them easier to read and verify.  
There will be some cases when this is impractical.  Use your judgment 
(you're not a computer, you control the computer!):


x     =   1.     # this is a comment
y     = 378.2345 # here's another
fred  = chuck    # note how the decimals, = signs, and
                 # comments line up nicely...
alacazamshmazooboloid = 2721 # but not always!

(g) Put the units and sources of all values in comments:

t_planet = 523.     # K, Smith and Jones (2016, ApJ 234, 22)

(h) I don't mean to start a religious war, but I emphasize the alignment 
of similar adjacent code lines to make differences pop out and reduce 
the likelihood of bugs.  For example, it is much easier to verify the 
correctness of:


a     = 3 * x + 3 * 8. * short        - 5. * np.exp(np.pi * omega * t)
a_alt = 3 * x + 3 * 8. * anotshortvar - 5. * np.exp(np.pi * omega * t)

than:

a = 3 * x + 3 * 8. * short - 5. * np.exp(np.pi * omega * t)
a_altvarname = 3 * x + 3*9*anotshortvar - 5. * np.exp(np.pi * omega * i)

(i) Assign values to meaningful variables, and use them in formulae and 
functions:


ny = 512
nx = 512
image = np.zeros((ny, nx))
expr1 = ny * 3
expr2 = nx * 4

Otherwise, later on when you upgrade to 2560x1440 arrays, you won't know 
which of the 512s are in the x direction and which are in the y 
direction.  Or, the student you (now a senior researcher) assign to code 
the upgrade won't!  Also, it reduces bugs arising from the order of 
arguments to functions if the args have meaningful names.  This is not 
to say that you should assign all numbers to variables.  This is fine:


circ = 2 * np.pi * r

(j) All functions assigned for grading must have full docstrings in 
numpy's format, as well as internal comments.  Utility functions not 
requested in the assignment and that the user will never see can have 
reduced docstrings if the functions are simple and obvious, but at least 
give the one-line summary.
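
A minimal sketch of what a docstring in numpy's format might look like. 
The function `circumference` is a made-up example, not taken from the 
handout, and a graded function would likely need more sections than 
shown here:

```python
import math


def circumference(r):
    """
    Compute the circumference of a circle.

    Parameters
    ----------
    r : float
        Radius of the circle, in any length unit.

    Returns
    -------
    circ : float
        Circumference, in the same unit as `r`.

    Examples
    --------
    >>> round(circumference(1.), 5)
    6.28319
    """
    # Same formula as the "this is fine" example in rule (i).
    return 2 * math.pi * r
```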


(k) If you modify an existing function, you must either make a Git entry 
or, if it is not under revision control, include a Revision History 
section in your docstring and record your name, the date, the version 
number, your email, and the nature of the change you made.


(l) Choose variable names that are meaningful and consistent in style.  
Document your style either at the head of a module or in a separate text 
file for the project.  For example, if you use CamelCaps with initial 
capital, say that.  If you reserve initial capitals for classes, say 
that.  If you use underscores for variable subscripts and camelCaps for 
the base variables, say that.  If you accept some other style and build 
on that, say that.  There are too many good reasons to have such styles 
for only one to be the community standard.  If certain kinds of values 
should get the same variable or base variable, such as fundamental 
constants or things like amplitudes, say that.


(j) It's best if variables that will appear in formulae ar

Re: [Numpy-discussion] Style guide for numpy code?

2019-05-09 Thread Chris Barker - NOAA Federal
Oops,

Somehow that got sent before I was done. (Like my use of the passive voice
there?)

Here is a complete message:

Do any of you know of a style guide for computational / numpy code?

I don't mean code that will go into numpy itself, but rather, users' code
that uses numpy (and scipy, and...)

I know about (am a proponent of) PEP8, but it doesn’t address the unique
needs of scientific programming.

This is mostly about variable names. In scientific code, we often want:

- variable names that match the math notation, so single-character names,
maybe upper or lower case to mean different things (in ocean wave
mechanics, often “h” is the water depth, and “H” is the wave height)

- to distinguish between scalar, vector, and matrix values; often UpperCase
means an array or matrix, for instance.

But despite (or because of) these unique needs, a style guide would be
really helpful.

Anyone have one? Or even any notes on what you do yourself?

Thanks,
-CHB




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov