On 8 Apr 2018, at 12:41, David Chisnall <[email protected]> wrote:
>
> On 8 Apr 2018, at 10:55, Richard Frith-Macdonald
> <[email protected]> wrote:
>>
>>> On 6 Apr 2018, at 11:00, David Chisnall <[email protected]> wrote:
>>>
>>> It would probably help catch more bugs if we made use of NSString’s
>>> class-cluster nature more in -base. I have just fixed a bug in GSString
>>> where we were checking one object matched a particular class before
>>> dereferencing the _flags ivar of the other. I caught this because the
>>> other was a GSTinyString, which is almost never a valid pointer.
>>
>> Possibly, but performance *is* an issue here. The NSString code was
>> rewritten some years ago (moving away from the use of class cluster
>> features) as a result of extensive profiling of real-world applications
>> which were running too slow, precisely because NSString methods are very
>> heavily used in real apps. At the time something like 20% of the CPU was
>> wasted in method dispatch overheads (the -characterAtIndex: method is one of
>> the cluster primitives and a major culprit) but there were also performance
>> issues due to buffer allocation and copying of internal representations.
>> The changes made a substantial improvement in general performance as well as
>> causing multiple orders of magnitude improvement in a few pathological
>> cases.
>
> I agree that we should be improving performance for critical code, but
> unfortunately it appears that we have done so at the expense of correctness
> in a number of places. As per my other email,
> -rangeOfComposedCharacterSequenceAtIndex: appears to give the wrong results
> in almost every nontrivial case, and is unfortunately one of the primitive
> methods for a lot of things.
>
> I also note that a lot of the NSString method implementations are not well
> optimised. In a number of places, -characterAtIndex: is called repeatedly,
> when -getCharacters:range: is normally significantly more efficient. The ICU
> UText interface provides something very similar to -getCharacters:range: as
> its primitive method (a callback that fills a buffer with UTF-16 characters)
> and has some carefully optimised routines.
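
To illustrate the point about -characterAtIndex: versus -getCharacters:range:, here is a rough sketch in plain C rather than Objective-C. Everything in it is hypothetical (AbstractString, char_at, get_chars, make_string, the counting helpers and the chunk size of 64); none of it is GNUstep or ICU code. Function pointers stand in for dynamic method dispatch, and dispatch_count records how many indirect calls each strategy makes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a class-cluster string: every access goes
 * through a function pointer, mimicking a dynamically dispatched method. */
typedef struct AbstractString AbstractString;
struct AbstractString {
    const uint16_t *storage;   /* UTF-16 code units */
    size_t length;
    size_t dispatch_count;     /* number of indirect calls made so far */
    uint16_t (*char_at)(AbstractString *s, size_t index);
    void (*get_chars)(AbstractString *s, uint16_t *buf,
                      size_t start, size_t count);
};

/* Analogue of -characterAtIndex: one dispatch per code unit. */
static uint16_t char_at_impl(AbstractString *s, size_t index) {
    s->dispatch_count++;
    return s->storage[index];
}

/* Analogue of -getCharacters:range: one dispatch per filled buffer. */
static void get_chars_impl(AbstractString *s, uint16_t *buf,
                           size_t start, size_t count) {
    s->dispatch_count++;
    for (size_t i = 0; i < count; i++) {
        buf[i] = s->storage[start + i];
    }
}

static AbstractString make_string(const uint16_t *storage, size_t length) {
    AbstractString s = { storage, length, 0, char_at_impl, get_chars_impl };
    return s;
}

/* Counts occurrences of needle using one indirect call per code unit. */
static size_t count_per_char(AbstractString *s, uint16_t needle) {
    size_t hits = 0;
    for (size_t i = 0; i < s->length; i++) {
        if (s->char_at(s, i) == needle) {
            hits++;
        }
    }
    return hits;
}

#define CHUNK 64  /* arbitrary buffer size chosen for the sketch */

/* Same result, but fetches up to CHUNK code units per indirect call and
 * scans the local buffer directly. */
static size_t count_buffered(AbstractString *s, uint16_t needle) {
    uint16_t buf[CHUNK];
    size_t hits = 0;
    for (size_t start = 0; start < s->length; start += CHUNK) {
        size_t n = s->length - start;
        if (n > CHUNK) {
            n = CHUNK;
        }
        s->get_chars(s, buf, start, n);
        for (size_t i = 0; i < n; i++) {
            if (buf[i] == needle) {
                hits++;
            }
        }
    }
    return hits;
}
```

For a string of N code units, the first loop makes N indirect calls and the second makes roughly N/64, which is the kind of saving I would expect from switching to the bulk accessor. The real numbers depend on the concrete subclass and on how expensive its buffer fill is.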

I’ve pushed my WIP changes to the newabi branch - review is welcome! This
branch disables the GSString implementation of
-rangeOfComposedCharacterSequenceAtIndex: and falls back to the NSString one
(which is also wrong, but now consistently wrong).

David

_______________________________________________
Gnustep-dev mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/gnustep-dev
