On Tue, Jan 25, 2011 at 5:43 PM, M.-A. Lemburg <m...@egenix.com> wrote:
> I also don't see how this could save a lot of memory. As an example
> take a French text with say 10mio code points. This would end up
> appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB),
> one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB, depending
> on how many accents are used). That's a saving of -10MB compared to
> today's implementation :-)
If I am reading the PEP right, which I may not be as I am no expert on Unicode, the new implementation would actually give a 10MB saving, since the wchar field is optional and only the str (Latin-1) and utf8 fields would need to be stored. If I am right, the PEP would need to clarify how it decides not to store one field or another.
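To make the arithmetic concrete, here is a rough sketch of the two scenarios. It assumes that "today's implementation" means the UCS2 buffer plus a cached UTF-8 copy, which is the only baseline I can find that makes both the -10MB and the +10MB figures come out; that baseline is my assumption, not something stated in the thread. The names (encoded_sizes, today, all_three, without_wchar) are made up for illustration.

```python
MB = 1_000_000
code_points = 10 * MB          # the 10 million code point French text

# Sizes quoted in the thread.
ucs2 = 2 * code_points         # 20MB: two bytes per code point
latin1 = 1 * code_points       # 10MB: one byte per code point
utf8 = 15 * MB                 # thread's estimate, depends on accent density

# ASSUMPTION: today's baseline is UCS2 plus a cached UTF-8 copy.
today = ucs2 + utf8
all_three = ucs2 + latin1 + utf8   # all three copies present
without_wchar = latin1 + utf8      # optional wchar field never filled in

saving_all_three = (today - all_three) // MB
saving_without_wchar = (today - without_wchar) // MB


def encoded_sizes(text):
    """Return the byte length of text in each of the three encodings."""
    return {
        "ucs2": len(text.encode("utf-16-le")),  # 2 bytes per BMP code point
        "latin1": len(text.encode("latin-1")),
        "utf8": len(text.encode("utf-8")),
    }


# Hypothetical sample: each accented Latin-1 letter costs 2 bytes in UTF-8.
sample = "Les élèves français étudient à l'école."
sizes = encoded_sizes(sample)
accents = sum(1 for ch in sample if ord(ch) > 127)
```

Under that assumed baseline, saving_all_three is the negative figure from the quoted message and saving_without_wchar is the positive one, so the disagreement comes down entirely to whether the wchar copy is ever materialised.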