On Tue, Feb 14, 2017 at 5:35 PM, Gustav Larsson <lars...@cs.uchicago.edu> wrote:
>> 1. For object arrays, I would default to calling format on each element
>> (your "map principle") rather than raising an error.
>
> I'm glad you brought this up as a possibility. It might be possible, but
> there are some issues that would need to be resolved. First of all, {} and
> {:} always work and give the same result they currently do. So, this only
> affects the situation where the format spec is non-empty. I think there are
> two main issues:
>
> Heterogeneity: Let's say we have x = np.array([12.3, True, 'string',
> Foo(10)], dtype=np.object). Then, presumably {:.1f} should cause a
> ValueError since the string does not support format type 'f'. This could
> create a lot of ValueError land mines for the user.

Things will absolutely break if you try to do complex operations on
heterogeneously typed arrays. I would put the onus on the user in such a
case.

> For x[:2] however it should work and produce something like [12.3 1.0].
> Note, the "map principle" still can't be strictly true. Let's say we have
> an array with type object and mostly string-like elements. Then {:5s} will
> still not produce exactly {:5s} element-wise, because the string
> representations need to be repr-based inside the array (otherwise it could
> break for newlines and things like that and produce spaces that make the
> boundary between elements ambiguous). This brings me to the next issue.

Indeed, this will be a departure from the behavior without a format string,
which just uses repr. In my mind, this is the strongest argument against
using the map principle here, because there is a discontinuous shift between
providing and not providing a format string.

> Str vs. repr: If we have a homogeneous object-array with types Foo and Foo
> implements __format__, it would be great if this worked. However, one issue
> is that Foo.__format__ might return things like newline (or spaces), which
> would break (or confuse) the printed output (unless it is made incredibly
> smart to support "vertical alignment"). This issue is essentially the same
> as for strings in general, which is why they use repr instead. I can think
> of two solutions: 1) Try to sanitize (or repr-ify) the string returned by
> __format__ somehow; 2) Put the responsibility on the user and simply let
> the rendering break if Foo.__format__ does not play well.

I wouldn't do anything fancy here to worry about line breaks. It's basically
impossible to get this right for edge cases, so I would certainly put the
responsibility on the user.

On another note, about Python 2 vs 3: I would definitely take the approach
of copying the Python 3 behavior on all versions of NumPy (when feasible) and
not being too concerned about compatibility with format on Python 2. The
future is Python 3.
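
To make the cases discussed above concrete, here is a minimal sketch of the
"map principle" in plain Python. The helper format_elementwise and the class
Foo are hypothetical stand-ins written for illustration only; this is not
NumPy's implementation or a proposed API, and it uses dtype=object rather
than np.object so that it runs on current NumPy releases.

```python
import numpy as np


def format_elementwise(arr, spec):
    """Hypothetical helper: call format(elem, spec) on each element."""
    return "[" + " ".join(format(elem, spec) for elem in arr) + "]"


class Foo:
    """Hypothetical user type that implements __format__."""

    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        return "Foo(" + format(self.value, spec) + ")"


x = np.array([12.3, True, "string", Foo(10)], dtype=object)

# The first two elements both accept the 'f' format type.
homogeneous = format_elementwise(x[:2], ".1f")

# The full array hits the str element, which rejects 'f'.
try:
    format_elementwise(x, ".1f")
    mixed_ok = True
except ValueError:
    mixed_ok = False

# A homogeneous array of Foo works as long as Foo.__format__ behaves.
foo_only = format_elementwise(np.array([Foo(1), Foo(2)], dtype=object), ".1f")

# Str vs. repr: a str-based format keeps the raw newline, repr escapes it.
raw = format("a\nb", "5s")
escaped = repr("a\nb")
```

Here the sliced array formats cleanly, the mixed array raises ValueError at
the string element, and the raw newline in the last case is exactly the kind
of output that would break the printed layout if it is left to the user.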
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion