Travis E. Oliphant wrote:
> Currently that means that they are "unicode" strings of basic size UCS2
> or UCS4 depending on the platform. It is this duality that has some
> people concerned. For all other data-types, NumPy allows the user to
> explicitly request a bit-width for the data-type.
Why is that a desirable property? Also: why does NumPy have support for Unicode arrays in the first place?

> Before embarking on this journey, however, we are seeking advice from
> individuals wiser to the way of Unicode on this list.

My initial reaction is: use whatever Python uses in "NumPy Unicode". Upon closer inspection, it is not at all clear what operations are supported on a Unicode array, or how these operations relate to the Python Unicode type.

In any case, I think NumPy should have only a single "Unicode array" type (please do explain why having zero of them is insufficient). If the purpose of the type is to interoperate with a Python unicode object, it should use the same width as Python does (as this will allow for memcpy). If the purpose is to support arbitrary Unicode characters, it should use 4 bytes per character (as two bytes are insufficient to represent arbitrary Unicode characters). If the purpose is something else, please explain what the purpose is.

Regards,
Martin
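
P.S. The claim that two bytes are insufficient can be checked from Python itself. The following is only a minimal sketch, assuming a current Python 3 interpreter (where str literals accept \U escapes); the example character and the variable names are chosen purely for illustration and have nothing to do with NumPy's own API.

```python
# Illustration only: a code point above U+FFFF does not fit in one 16-bit unit.
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the Basic Multilingual Plane

utf16 = ch.encode("utf-16-le")  # little-endian, no byte-order mark
utf32 = ch.encode("utf-32-le")

units16 = len(utf16) // 2  # number of 16-bit code units needed
units32 = len(utf32) // 4  # number of 32-bit code units needed

print(hex(ord(ch)), units16, units32)
```

With a 2-byte element size, such a character has to be stored as a surrogate pair spanning two array elements, whereas a 4-byte element size holds it in one.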