18.07.2014 18:10, Julian Taylor wrote:
[clip]
> We break code either way. Either we break applications using S as
> string type, but now it becomes bytes in python3. Or we break
> applications treating S as byte type and we change it to string in
> python3.
>
> Unfortunately we missed the opportunity when adding python3 support
> to fix the exact same bytes/text boundary issue which is the
> main reason why python3 exists in the first place. We should have
> made porting to numpy3 an intentionally(!) backward incompatible
> change just like python itself did.
>
> Now we are stuck with deciding which option breaks less. On the
> one hand, that S is bytes in python3 is somewhat established by now
> and lots of workarounds are already in place. On the other hand, I
> think code that relies on S being bytes is in the minority and
> python3 usage is probably still insignificant in this area.
> Unfortunately getting actual numbers and not wild guesses on this
> is probably not easy.

One way to try this out is to change the meaning of 'S' and see how
badly e.g. pandas or matplotlib break on py3 as a consequence.

Another approach would be to add a new 1-byte unicode dtype under a
type code different from 'S'. The automatic ASCII encoding in the
constructor and in assignment on Py3 could then be deprecated, which
would make 'S' a strict bytes dtype.

This is not perfect either, since array(['foo']) on Py2 should, for
backward compatibility, continue returning dtype='S'. Moreover,
existing code does not use the new type code.

-- 
Pauli Virtanen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
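
For reference, the Py3 behaviour of 'S' discussed above can be checked with a few lines. This is only a minimal sketch, assuming Python 3 and a NumPy version in which 'S' still auto-encodes str input as ASCII; the variable names are illustrative and not taken from any existing code.

```python
import numpy as np

# On Python 3, str input gives a unicode dtype and bytes input gives 'S'.
text_arr = np.array(['foo'])
bytes_arr = np.array([b'foo'])
print(text_arr.dtype.kind, bytes_arr.dtype.kind)

# The automatic ASCII encoding: assigning a str into an 'S' array
# silently encodes it to bytes.
buf = np.zeros(1, dtype='S3')
buf[0] = 'foo'
print(repr(buf[0]))

# Non-ASCII text cannot go through that implicit encoding.
try:
    buf[0] = 'f\u00f6\u00f6'
    non_ascii_accepted = True
except UnicodeError:
    non_ascii_accepted = False
print(non_ascii_accepted)
```

If the implicit encoding were deprecated as suggested, the `buf[0] = 'foo'` assignment would be the line that starts warning, while the bytes-only paths would be unaffected.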