Agreed.
Nevertheless, the performance impact I observed with the OOP way of doing
things cannot be dismissed: when you ask an object for its text
representation and that text contains a non-Latin1 character, you pay a
significant penalty. In the long term, this looks like a problem for Pharo.
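
For reference, a minimal workspace sketch to reproduce the penalty, assuming a pre-Spur image. The sample strings and the repeat count are arbitrary choices of mine, and the timings will vary by image and VM, so I give no figures here.

```smalltalk
"Compare streaming a Latin1-only string with one that ends in a wide character.
 Assumption: pre-Spur image, where the Byte -> WideString conversion uses become:."
| euro latin1Ms wideMs |
euro := Character value: 8364. "EURO SIGN, outside Latin1"

"Baseline: the stream's collection stays a ByteString."
latin1Ms := Time millisecondsToRun: [
	100 timesRepeat: [
		String streamContents: [ :s | s nextPutAll: 'Price: 10 EUR' ] ] ].

"Each run writes one wide character, forcing the collection to be converted."
wideMs := Time millisecondsToRun: [
	100 timesRepeat: [
		String streamContents: [ :s | s nextPutAll: 'Price: 10 '; nextPut: euro ] ] ].

{ latin1Ms. wideMs } "inspect or print this to compare the two timings"
```

The second block differs from the first only by the single nextPut: of the euro character, so any large gap between the two numbers should come from the conversion itself.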

Hilaire


On 05/01/2016 13:20, Sven Van Caekenberghe wrote:
> String with ByteString and WideString subclasses has been a 
> standard feature of Squeak/Pharo for a long time. The transparent automatic 
> conversion between the two is a feature, not a limitation.
>
> In itself, there is nothing wrong with it.
>
> Yes, other representations of Strings are possible, but it is far from certain 
> that they would be faster overall. The current implementation favours Latin1 
> (and thus ASCII), because that is so common. In my work image I count them as 
> follows:
>
> ByteString allInstances size. "301498"
>
> WideString allInstances size. "136"
>
> That is less than 0.05%.
>
>> On 05 Jan 2016, at 13:08, Hilaire <hila...@drgeo.eu> wrote:
>>
>> On 04/01/2016 11:05, Henrik Johansen wrote:
>>> In the fallback code for WriteStream >> #nextPut:, at:put: is called, so 
>>> yes, streaming a wide char causes the stream's collection to be converted 
>>> from Byte to WideString.
>>> Conversion is done using become, which currently triggers a full heap scan 
>>> for references, and is thus very slow.
>>> One could add a fast-path along the lines of #pastEndPut: (which has 
>>> already broken any assumption that a reference to the collection will 
>>> reflect all writes for the lifetime of stream, for the same performance 
>>> problems one would face using #become:); if collection is a ByteString and 
>>> anObject is a wide character, replace collection with a WideString, and 
>>> *then* call at:put:
>>> But it is not a very nice thing to add to a generic streaming class, nor 
>>> is it very attractive at this point in time, considering that making 
>>> become: a fast operation is one of the problems solved by Spur.
>> So, wait for Spur and see?
>> So that it is not forgotten, I recorded it here; the case should be kept
>> open so it can be checked again later:
>> https://pharo.fogbugz.com/f/cases/17315/Slow-object-printOn-with-EURO-symbol
>>
>> It is possible to work around this problem, but this sort of annoyance
>> with Pharo's internal encoding arises regularly, so I am not sure what to
>> think about the state of Pharo's internal encoding. Nowadays, isn't
>> everything supposed to be UTF-8?
>>
>> Thanks
>>
>> Hilaire
>>
>> -- 
>> Dr. Geo
>> http://drgeo.eu
>>
>>
>>
>
>


-- 
Dr. Geo
http://drgeo.eu


