> On 05 Jan 2016, at 14:26, Hilaire <hila...@drgeo.eu> wrote:
>
> Agree.
> Nevertheless, the performance impact I got from the OOP way of doing things
> can't be wiped out: when you ask an object for its text representation
> and it uses a non-Latin1 character, you get a significant penalty.
> In the long term, it looks like a problem for Pharo.

No, that is not entirely correct. You can use any Unicode character anywhere, transparently, and the performance penalty is low. Pharo fully supports Unicode everywhere (provided you use the right font).

The problem occurs only when a tool takes a collection of thousands of these objects and wants to convert them all at once, but separately, to strings. Then the cumulative performance penalty becomes quite noticeable, true.

The problem can also be restated: is it really necessary for a tool to convert thousands of items to strings, even if only tens are shown on screen at the same time? I believe that FastTable tries to do better here.

> Hilaire
>
>
> On 05/01/2016 13:20, Sven Van Caekenberghe wrote:
>> String with ByteString and WideString subclasses has been a
>> standard feature of Squeak/Pharo for a long time. The transparent automatic
>> conversion between the two is a feature, not a limitation.
>>
>> In itself, there is nothing wrong with it.
>>
>> Yes, other representations of Strings are possible, but it is far from sure
>> that they would be faster overall. The current implementation favours Latin1
>> (and thus ASCII), because that is so common. In my work image I count them
>> as follows:
>>
>> ByteString allInstances size. "301498"
>>
>> WideString allInstances size. "136"
>>
>> That is less than 0.05%.
>>
>>> On 05 Jan 2016, at 13:08, Hilaire <hila...@drgeo.eu> wrote:
>>>
>>> On 04/01/2016 11:05, Henrik Johansen wrote:
>>>> In the fallback code for WriteStream >> #nextPut:, at:put: is called, so
>>>> yes, streaming a wide char causes the stream's collection to be converted
>>>> from ByteString to WideString.
>>>> Conversion is done using become:, which currently triggers a full heap scan
>>>> for references, and is thus very slow.
>>>> One could add a fast-path along the lines of #pastEndPut: (which has
>>>> already broken any assumption that a reference to the collection will
>>>> reflect all writes for the lifetime of the stream, for the same performance
>>>> problems one would face using #become:); if collection is a ByteString and
>>>> anObject is a wide character, replace collection with a WideString, and
>>>> *then* call at:put:
>>>> But it is not a very nice thing to add to a generic streaming class, nor
>>>> is it very attractive at this point in time, considering that making
>>>> become: a fast operation is one of the problems solved by Spur.
>>> So wait and see for Spur?
>>> So as not to forget about it, it is recorded here, and it should be kept
>>> open for a later check:
>>> https://pharo.fogbugz.com/f/cases/17315/Slow-object-printOn-with-EURO-symbol
>>>
>>> It is possible to work around this problem, but this sort of annoyance
>>> with Pharo's internal encoding arises regularly, so I am not sure what to
>>> think about the state of Pharo regarding internal encoding. Nowadays,
>>> isn't everything supposed to be UTF-8?
>>>
>>> Thanks
>>>
>>> Hilaire
>>>
>>> --
>>> Dr. Geo
>>> http://drgeo.eu
>>>
>>
>
> --
> Dr. Geo
> http://drgeo.eu
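
For anyone who wants to see the conversion discussed in this thread, here is a minimal workspace sketch. It assumes a stock Pharo image; the variable names are illustrative only, and it is not taken from the Pharo sources or from the fast-path Henrik describes. It only shows that storing a wide character converts a ByteString, and that a stream started on a WideString has nothing to convert.

```smalltalk
"Sketch only; assumes a stock Pharo image. Variable names are illustrative."
| euro bytes stream |
euro := Character value: 16r20AC.          "EURO SIGN, outside Latin1"

"Storing a wide character into a ByteString converts it in place."
bytes := String new: 3 withAll: $a.
bytes class.                                "ByteString"
bytes at: 1 put: euro.
bytes class.                                "WideString"

"A stream started on a WideString never needs the conversion."
stream := WriteStream on: (WideString new: 16).
stream nextPutAll: 'Price: '; nextPut: euro.
stream contents class.                      "WideString"
```

Evaluate the expressions one by one with "Print it" to see the class change. Whether pre-allocating a WideString is worthwhile depends on how often wide characters actually occur; as the instance counts above suggest, in most images they are rare.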