Re: [Numpy-discussion] String type again.

Charles R Harris Fri, 18 Jul 2014 10:39:37 -0700

On Fri, Jul 18, 2014 at 10:59 AM, Nathaniel Smith <[email protected]> wrote:


> On Fri, Jul 18, 2014 at 5:54 PM, Chris Barker <[email protected]> wrote:
> >
> > This is why I see no downside to latin-1 -- if you don't use the > 127
> > code points, it's the same thing -- if you do, you get some extra handy
> > characters. The only difference is that a proper ascii type would not let
> > you store anything above 127 at all -- why restrict ourselves?
>
> IMO the extra characters aren't the most compelling argument for
> latin1 over ascii. Latin1 gives the nice assurance that if some jerk
> *does* give me an "ascii" file that somewhere has some byte with the
> 8th bit set, then I can still load the data and fix things by hand.
> This is trickier if numpy just refuses to touch the data, blowing up
> with an exception when I try. In general it's easy to create numpy
> arrays containing arbitrary bitpatterns, so it's nice to have some
> strategy for what to do with them.
>
>
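To make the quoted point concrete, here is a minimal sketch in plain Python
(no numpy involved). The byte string `raw` is a made-up example of an "ascii"
record with one stray high byte; it is not taken from any real file.

```python
# Hypothetical "ascii" record containing one byte with the 8th bit set.
raw = b"temp\xe9rature"

# Strict ASCII decoding refuses the data outright.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Latin-1 maps each byte value 0-255 to the code point with the same number,
# so decoding can never fail and encoding back restores the original bytes.
text = raw.decode("latin-1")
roundtrip = text.encode("latin-1")

# Once loaded, the offending character can be repaired by hand.
fixed = text.replace("\xe9", "e")
```

Here the ascii decode raises, while the latin-1 decode succeeds and
round-trips to the same bytes, which is what makes the "load it, then fix it
by hand" workflow possible.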
Just to throw in one more complication, the buffer protocol has no format
code for a fixed-encoding string type. In Python 3, 'c', 's', and 'p' are all
treated as bytes; in Python 2, as strings.
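
A small sketch of the Python 3 behaviour, using the standard struct module,
whose format codes the buffer protocol's format strings build on. The field
values are arbitrary, chosen only for illustration.

```python
import struct

# 'c' is a single char, '3s' a 3-byte string, '4p' a 4-byte Pascal string
# (one length byte followed by up to 3 bytes of data).
fmt = "c3s4p"

# In Python 3 all three codes require bytes on the way in ...
packed = struct.pack(fmt, b"a", b"bcd", b"efg")

# ... and hand back bytes on the way out; no text encoding is implied.
c, s, p = struct.unpack(fmt, packed)
kinds = [type(v).__name__ for v in (c, s, p)]
```

Every entry in `kinds` is "bytes", so nothing in the format string itself can
say "this field is latin-1 text" or "this field is ascii text".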

Chuck

_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
