On 27/05/2021 05:54, Cameron Simpson wrote:
On 26May2021 12:11, Jon Ribbens wrote:
On 2021-05-26, Alan Gauld wrote:
I confess I had just assumed the unicode strings were stored
in native unicode UTF8 format.
If you do that then indexing and slicing strings becomes very slow.
True, bu
On Thu, May 27, 2021 at 1:56 PM Cameron Simpson wrote:
>
> On 26May2021 12:11, Jon Ribbens wrote:
> >On 2021-05-26, Alan Gauld wrote:
> >> I confess I had just assumed the unicode strings were stored
> >> in native unicode UTF8 format.
> >
> >If you do that then indexing and slicing strings beco
On 26May2021 12:11, Jon Ribbens wrote:
>On 2021-05-26, Alan Gauld wrote:
>> I confess I had just assumed the unicode strings were stored
>> in native unicode UTF8 format.
>
>If you do that then indexing and slicing strings becomes very slow.
True, but that isn't necessarily a show stopper. My im
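
To illustrate why indexing gets slow when text is kept as UTF-8, here is a minimal sketch of fetching the n-th code point straight from encoded bytes. The helper names `utf8_seq_len` and `utf8_char_at` are made up for this example, and the input is assumed to be valid UTF-8. The point is only that the lookup has to walk every preceding character, so it costs O(n) rather than O(1).

```python
def utf8_seq_len(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1          # 0xxxxxxx: ASCII
    if lead >> 5 == 0b110:
        return 2          # 110xxxxx
    if lead >> 4 == 0b1110:
        return 3          # 1110xxxx
    if lead >> 3 == 0b11110:
        return 4          # 11110xxx
    raise ValueError("not a UTF-8 lead byte")


def utf8_char_at(data: bytes, index: int) -> str:
    """Return the index-th code point by scanning from the start (O(index))."""
    if index < 0:
        raise IndexError("negative indices are not handled in this sketch")
    pos = 0
    for _ in range(index):
        if pos >= len(data):
            raise IndexError("index out of range")
        pos += utf8_seq_len(data[pos])   # skip one whole character
    if pos >= len(data):
        raise IndexError("index out of range")
    return data[pos:pos + utf8_seq_len(data[pos])].decode("utf-8")


text = "a\u00e9\u20ac\U0001F600z"       # 1-, 2-, 3- and 4-byte characters
data = text.encode("utf-8")
print([utf8_char_at(data, i) for i in range(len(text))])
```

With a fixed width per character, the same lookup is a single multiplication and a memory read, which is the trade-off being discussed here.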
On 26/05/2021 22:15, Tim Chase wrote:
> If you don't decode it upon reading it in, it should still be 100MB
> because it's a stream of encoded bytes.
I usually convert them to utf8.
> You don't specify what you then do with this humongous string,
Mainly I search for regex patterns which can
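
One way to keep a large input as encoded bytes while still searching it is to use a bytes pattern with the `re` module. This is only a sketch under assumptions: the function name `find_in_encoded` is hypothetical, the data is assumed to be UTF-8, and the pattern is assumed to be literal text or simple ASCII constructs. A bytes pattern matches byte by byte, so `.` and character classes do not see multi-byte characters as single units.

```python
import re


def find_in_encoded(data: bytes, pattern: str) -> list:
    """Search UTF-8 encoded bytes without decoding the whole input first."""
    compiled = re.compile(pattern.encode("utf-8"))   # bytes pattern
    # Decode only the (small) matches, not the whole buffer.
    return [m.group().decode("utf-8") for m in compiled.finditer(data)]


data = ("x" * 1000 + " caf\u00e9 " + "y" * 1000).encode("utf-8")
print(find_in_encoded(data, "caf\u00e9"))
```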
On 2021-05-26 18:43, Alan Gauld via Python-list wrote:
> On 26/05/2021 14:09, Tim Chase wrote:
>>> If so, doesn't that introduce a pretty big storage overhead for
>>> large strings?
>>
>> Yes. Though such large strings tend to be more rare, largely
>> because they become unwieldy for other reas
On 26/05/2021 14:09, Tim Chase wrote:
>> If so, doesn't that introduce a pretty big storage overhead for
>> large strings?
>
> Yes. Though such large strings tend to be more rare, largely because
> they become unwieldy for other reasons.
I do have some scripts that work on large strings - mainl
On 5/26/2021 12:07 PM, Chris Angelico wrote:
On Thu, May 27, 2021 at 1:59 AM Jon Ribbens via Python-list
wrote:
On 2021-05-26, Alan Gauld wrote:
On 25/05/2021 23:23, Terry Reedy wrote:
In CPython's Flexible String Representation all characters in a string
are stored with the same number of
On Thu, May 27, 2021 at 1:59 AM Jon Ribbens via Python-list
wrote:
>
> On 2021-05-26, Alan Gauld wrote:
> > On 25/05/2021 23:23, Terry Reedy wrote:
> >> In CPython's Flexible String Representation all characters in a string
> >> are stored with the same number of bytes, depending on the largest
>
On 2021-05-26, Alan Gauld wrote:
> On 25/05/2021 23:23, Terry Reedy wrote:
>> In CPython's Flexible String Representation all characters in a string
>> are stored with the same number of bytes, depending on the largest
>> codepoint.
>
> I'm learning lots of new things in this thread!
>
> Does th
On 2021-05-26 08:18, Alan Gauld via Python-list wrote:
> Does that mean that if I give Python a UTF8 string that is mostly
> single byte characters but contains one 4-byte character that
> Python will store the string as all 4-byte characters?
As best I understand it, yes: the cost of each "chara
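
The per-character cost can be checked empirically. The sketch below assumes CPython 3.3 or later (other implementations may store strings differently), and the helper name `bytes_per_char` is invented for this example. It compares `sys.getsizeof` for two lengths of the same text, so the fixed object header cancels out and only the per-character storage remains.

```python
import sys


def bytes_per_char(sample: str, n: int = 1000) -> float:
    """Estimate storage per character; the object header cancels in the difference."""
    small = sys.getsizeof(sample * n)
    large = sys.getsizeof(sample * (2 * n))
    return (large - small) / (len(sample) * n)


samples = {
    "ASCII only": "a",
    "Latin-1 (e-acute)": "\u00e9",
    "BMP (euro sign)": "\u20ac",
    "astral (emoji)": "\U0001F600",
    "99 ASCII + 1 emoji": "a" * 99 + "\U0001F600",
}

for label, sample in samples.items():
    print(f"{label:20} {bytes_per_char(sample):.1f} bytes per character")
```

On CPython the last row should come out the same as the emoji-only row: a single wide character raises the width used for every character in that string.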
On Wed, May 26, 2021 at 10:04 PM Alan Gauld via Python-list
wrote:
>
> On 25/05/2021 23:23, Terry Reedy wrote:
>
> > In CPython's Flexible String Representation all characters in a string
> > are stored with the same number of bytes, depending on the largest
> > codepoint.
>
> I'm learning lots of
On 25/05/2021 23:23, Terry Reedy wrote:
> In CPython's Flexible String Representation all characters in a string
> are stored with the same number of bytes, depending on the largest
> codepoint.
I'm learning lots of new things in this thread!
Does that mean that if I give Python a UTF8 string
12 matches