John I got this too from reading the class documentation an hour ago. To smoke these out, a person could comment out the undesirable calls in a wx header, perhaps one that was temporarily moved into a place at a higher priority in the INCLUDE file search space.
Then "make -i" perhaps on a subset of multi-threaded source files (object file make targets), or the whole shebang for maximum pain. On 5/3/19 10:06 AM, John Beard wrote: > Hi Jeff, > > I think it is the index access operator that performs this caching, to allow > you to access > the n'th code point any number of times while only iterating the string only > once. > > However, you can still use the iterator access safely. It is only index based > access that > is cached and thread-unsafe. > > This is what the wxString documention recommends. Furthermore, in any Unicode > string, > regardless of encoding (8, 16, 32), index access is almost entirely useless > anyway, as > code units/points are only indirectly related to glyphs and/or perceived > characters > anyway. If you need to parse a Unicode string, you must iterate from the > start. There is > no way around it. > > If we're crashing due to cross thread access by index, the bug is probably > that we access > the string by index at all. If this was accessed by iterator, cross thread, > and the string > is not changed, it's fine. If the string is changed in another thread, cached > iterators > are invalid (same as if you change an C++ container in a single thread. The > standard tells > you what iterators are invalidated for each operation on a container). > > I may have got the wrong end of the wxStick here (I can't check it for myself > right now), > but as far as I can tell, this is fixable by just never caching indices, as > if we were > looking at a C-style char array, and using iterators instead. > > We should probably also turn off the unsafe string conversions by defining > wxNO_UNSAFE_WXSTRING_CONV, if it is not already define. > > Cheers, > > John > > On 3 May 2019 16:35:30 CEST, Jeff Young <j...@rokeby.ie> wrote: > > Yes, we know exactly why it crashes: in order to speed up iterator access > each > iterator keeps a pointer into the last location accessed (so that i-1 and > i+1 can be > fast). These pointers are kept in a linked-list. Adding and removing > pointers from > this list is not thread-protected. > > Note that wxWidgets will add/remove a pointer even for something > seemingly innocuous > like an Empty() check. So doing mutex locks on our side for non-const > iterator access > is not sufficient. > > The worst part is that since two threads collide on the same string only > rarely, we > don’t even know how many of these bugs we have. We’ve fixed 3 or 4 of > them (by adding > our own mutex checking on any access), but are there 0 or 10 more? > Haven’t a clue. > >>> It is between sad and breath taking. > > Indeed. > > Cheers, > Jeff. > >> On 3 May 2019, at 15:16, Dick Hollenbeck <d...@softplc.com >> <mailto:d...@softplc.com>> wrote: >> >> Thanks Jeff. >> >> On 5/3/19 4:22 AM, Jeff Young wrote: >>> Hi Dick, >>> >>>>> h) What is the list of deficiencies with current string usage? >>> >>> I only have one issue with the current use of wxString, but it’s a big >>> one: it crashes >>> (unpredictably) when used multi-threaded in UTF8 mode. >> >> The fact that it is onely *One* issue is an important data point. >> >> Since you know it is crashing in this class, you must know approximately >> where, and >> under >> what kind of read/write activity. Of course, if read activity triggers >> a lazy >> (deferred) >> transformation, then this distinction can get blurred. But more >> information on source >> file locations would be very helpful to me. >> >> Another important data point you brought is that the wx library >> designers are advising >> against using wxString for core application. It will take a couple of >> hours to even >> contemplate that, it is basically staggering to me. It is between sad >> and breath >> taking. >> Sounds like they designed themselves into a corner and are now >> acknowledging that what >> they designed is more of an API commitment that they want to disavow >> than a real >> solution. >> >> I can see where that can happen. Superior designs come from experience. >> Experience >> comes >> with usage and time, neither of which are always available up front. >> >> >> >> >> >>> >>> This design document makes for fascinating >>> reading: https://wiki.wxwidgets.org/Development:_UTF-8_Support. It >>> appears that the >>> current wxString is at least in part modelled on QtString. >>> >>> There’s also a bunch of interesting info >>> here: https://docs.wxwidgets.org/trunk/overview_string.html, which I >>> believe is more >>> up-to-date than the previous link. In particular, there’s the mention >>> that wxString >>> handles extra-BMP characters transparently when compiled in UTF8 mode >>> (currently >>> used by >>> Kicad), but does NOT when compiled in default mode (in which case the >>> app must handle >>> surrogate pairs). This of course directly leads to your point (d): >>> >>>>>> d) What does the set of characters that don't fall into UCS2 >>>>>> actually look like? >>>>>> How big >>>>>> is this set, really? (UTF16 is bigger than UCS2 and picks up the >>>>>> difference.) >>> >>> Do we really need to handle extra-BMP characters? >>> >>> An even more recent version of the second document >>> (https://docs.wxwidgets.org/trunk/classwx_string.html) finally makes an >>> oblique >>> reference >>> to the multi-threading issue by starting with this (rather unhelpful) >>> suggestion: >>> >>> Note >>> While the use of wxString >>> <https://docs.wxwidgets.org/trunk/classwx_string.html> is >>> unavoidable in wxWidgets program, you are encouraged to use the >>> standard string >>> classes |std::string| or |std::wstring| in your applications and >>> convert them to and >>> from wxString <https://docs.wxwidgets.org/trunk/classwx_string.html> >>> only when >>> interacting with wxWidgets. >>> >>> >>> Cheers, >>> Jeff. >>> >>> >>>> On 3 May 2019, at 02:03, Dick Hollenbeck <d...@softplc.com >>>> <mailto:d...@softplc.com> <mailto:d...@softplc.com>> wrote: >>>> >>>> On 5/2/19 5:32 PM, Dick Hollenbeck wrote: >>>>> On 4/30/19 4:36 AM, Jeff Young wrote: >>>>>> We had talked earlier about throwing the wxWidgets UTF8 compile >>>>>> switch to get rid of >>>>>> our wxString re-entrancy problems. However, I noticed that the 6.0 >>>>>> work >>>>>> packages doc >>>>>> includes an item for std::string-ization of the BOARD. (While a lot >>>>>> more work, this >>>>>> is a better solution because it also increases our >>>>>> gui-toolkit-choice flexibility.) >>>>>> >>>>>> I’d like to propose that we use std::wstring for that. UTF8 should >>>>>> *only* be an >>>>>> encoding format (similar to s-expr). It should never be used >>>>>> internally. That’s >>>>>> what >>>>>> unicode wchar_t’s are for. >>>>>> >>>>>> And I’d like to propose that we extend std::wstring-ization to >>>>>> SCH_ITEM and >>>>>> LIB_ITEM. >>>>>> (Then we can get rid of a bunch of our ugly mutex hacks.) >>>>> >>>>> >>>>> I've been looking at this for a few months now. I think it is so >>>>> important, that a >>>>> sub-committee should be formed, and if that committee takes as long >>>>> as 4 months >>>>> to come to >>>>> a recommendation, this would not be too long. This issue is simply >>>>> too critical. >>>>> >>>>> I would like to volunteer to be on that committee. For the entire >>>>> list to >>>>> participate in >>>>> this simply does not make sense to me. I would welcome the >>>>> opportunity to study >>>>> this with >>>>> a team of 5-6 players. More than that probably leads to anxiety. >>>>> Then, given the >>>>> recommendations, the list would of course have an opportunity to >>>>> raise questions >>>>> and take >>>>> shots, before a strategy is formulated, and before anything is >>>>> implemented. >>>>> >>>>> Again, approximately: >>>>> >>>>> committee recommendations -> list approval -> strategy formulation >>>>> -> implementation >>>>> >>>>> >>>>> Up to now I have looked at many libraries and have [way *too* much] >>>>> experience in >>>>> multiple >>>>> languages on multiple platforms, so I think I can be valuable >>>>> contributor. >>>>> >>>>> The final work product initially would simply be a list of >>>>> recommendations, that >>>>> quickly >>>>> transforms to a strategy thereafter. This is an enormous >>>>> undertaking, so I suggest >>>>> against racing to a solution. It could look a lot easier than it >>>>> will ultimately >>>>> be, as >>>>> is typical in software development. But the return on investment >>>>> needs to be >>>>> near optimal >>>>> in the end. >>>>> >>>>> Some questions to answer are: >>>>> >>>>> a) How did wxString get to its current state? Is is merely a >>>>> conglomeration of after >>>>> thought, or is is anywhere near optimal. >>>>> >>>>> b) Why so many forms of it? Can one form be chosen for all platforms? >>>>> >>>>> c) How does wxString it compare to QtString? >>>>> >>>>> d) What does the set of characters that don't fall into UCS2 actually >>>>> look like? >>>>> How big >>>>> is this set, really? (UTF16 is bigger than UCS2 and picks up the >>>>> difference.) >>>>> >>>>> e) For data files, I think UTF8 is fine. So the change is for RAM >>>>> manipulation of >>>>> strings. Aren't we talking about a RAM resident string that bridges >>>>> into the GUI >>>>> seamlessly? >>>>> >>>>> f) What does new C++ language support offer? >>>>> >>>>> g) What do C++ language designers suggest? >>>> >>>> h) What is the list of deficiencies with current string usage? >>>> >>>> >>>>> >>>>> >>>>> etc. >>>>> >>>>> But this is best continued in a smaller group, as said. >>>>> >>>>> >>>>> The other thing that I bring to this is vast familiarity with KiCad's >>>>> internal >>>>> workings, >>>>> string use cases, and goals. >>>>> >>>>> Let me know if I can help. >>>>> >>>>> Regards, >>>>> >>>>> Dick >>>>> >>>>> >>>>> _______________________________________________ >>>>> Mailing list: https://launchpad.net/~kicad-developers >>>>> Post to : kicad-developers@lists.launchpad.net >>>>> <mailto:kicad-developers@lists.launchpad.net> >>>>> <mailto:kicad-developers@lists.launchpad.net> >>>>> Unsubscribe : https://launchpad.net/~kicad-developers >>>>> More help : https://help.launchpad.net/ListHelp >>>>> >>>> >>>> >>>> _______________________________________________ >>>> Mailing list: https://launchpad.net/~kicad-developers >>>> Post to : kicad-developers@lists.launchpad.net >>>> <mailto:kicad-developers@lists.launchpad.net> >>>> <mailto:kicad-developers@lists.launchpad.net> >>>> Unsubscribe : https://launchpad.net/~kicad-developers >>>> More help : https://help.launchpad.net/ListHelp > _______________________________________________ Mailing list: https://launchpad.net/~kicad-developers Post to : kicad-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~kicad-developers More help : https://help.launchpad.net/ListHelp