Could that damage have already been done by prior concurrent access? Maybe a subsequent read only operation fails because your linked list is already hay wired...
Walking a linked list at a later point in time than when it was damaged can do this.. On 5/3/19 10:18 AM, Jeff Young wrote: > I don’t believe that’s the case. Neither of the two crashes that I tracked > down involved > direct index access (or any change to the string). One was calling > foo.IsEmpty(), and the > other foo.Length(). Both use const iterators under the hood. > > When wxUSE_UNICODE_UTF8 is off, wxWidgets gets around parsing the string by > just not doing it: > > Remarks > Note that while the behaviour of wxString > <https://docs.wxwidgets.org/trunk/classwx_string.html> when > |wxUSE_UNICODE_WCHAR==1| resembles > UCS-2 encoding, it's not completely correct to refer to wxString > <https://docs.wxwidgets.org/trunk/classwx_string.html> as UCS-2 encoded > since you can > encode code points outside the /BMP/ in a wxString > <https://docs.wxwidgets.org/trunk/classwx_string.html> as two code units > (i.e. as a > surrogate pair; as already mentioned however wxString > <https://docs.wxwidgets.org/trunk/classwx_string.html> will "see" them as > two > different code points) > > >> On 3 May 2019, at 16:06, John Beard <john.j.be...@gmail.com >> <mailto:john.j.be...@gmail.com>> wrote: >> >> Hi Jeff, >> >> I think it is the index access operator that performs this caching, to allow >> you to >> access the n'th code point any number of times while only iterating the >> string only once. >> >> However, you can still use the iterator access safely. It is only index >> based access >> that is cached and thread-unsafe. >> >> This is what the wxString documention recommends. Furthermore, in any >> Unicode string, >> regardless of encoding (8, 16, 32), index access is almost entirely useless >> anyway, as >> code units/points are only indirectly related to glyphs and/or perceived >> characters >> anyway. If you need to parse a Unicode string, you must iterate from the >> start. There is >> no way around it. >> >> If we're crashing due to cross thread access by index, the bug is probably >> that we >> access the string by index at all. If this was accessed by iterator, cross >> thread, and >> the string is not changed, it's fine. If the string is changed in another >> thread, cached >> iterators are invalid (same as if you change an C++ container in a single >> thread. The >> standard tells you what iterators are invalidated for each operation on a >> container). >> >> I may have got the wrong end of the wxStick here (I can't check it for >> myself right >> now), but as far as I can tell, this is fixable by just never caching >> indices, as if we >> were looking at a C-style char array, and using iterators instead. >> >> We should probably also turn off the unsafe string conversions by defining >> wxNO_UNSAFE_WXSTRING_CONV, if it is not already define. >> >> Cheers, >> >> John >> >> On 3 May 2019 16:35:30 CEST, Jeff Young <j...@rokeby.ie >> <mailto:j...@rokeby.ie>> wrote: >> >> Yes, we know exactly why it crashes: in order to speed up iterator >> access each >> iterator keeps a pointer into the last location accessed (so that i-1 >> and i+1 can be >> fast). These pointers are kept in a linked-list. Adding and removing >> pointers from >> this list is not thread-protected. >> >> Note that wxWidgets will add/remove a pointer even for something >> seemingly innocuous >> like an Empty() check. So doing mutex locks on our side for non-const >> iterator >> access is not sufficient. >> >> The worst part is that since two threads collide on the same string only >> rarely, we >> don’t even know how many of these bugs we have. We’ve fixed 3 or 4 of >> them (by >> adding our own mutex checking on any access), but are there 0 or 10 >> more? Haven’t a >> clue. >> >>>> It is between sad and breath taking. >> >> Indeed. >> >> Cheers, >> Jeff. >> >>> On 3 May 2019, at 15:16, Dick Hollenbeck <d...@softplc.com >>> <mailto:d...@softplc.com>> wrote: >>> >>> Thanks Jeff. >>> >>> On 5/3/19 4:22 AM, Jeff Young wrote: >>>> Hi Dick, >>>> >>>>>> h) What is the list of deficiencies with current string usage? >>>> >>>> I only have one issue with the current use of wxString, but it’s a big >>>> one: it crashes >>>> (unpredictably) when used multi-threaded in UTF8 mode. >>> >>> The fact that it is onely *One* issue is an important data point. >>> >>> Since you know it is crashing in this class, you must know >>> approximately where, and >>> under >>> what kind of read/write activity. Of course, if read activity triggers >>> a lazy >>> (deferred) >>> transformation, then this distinction can get blurred. But more >>> information on source >>> file locations would be very helpful to me. >>> >>> Another important data point you brought is that the wx library >>> designers are advising >>> against using wxString for core application. It will take a couple of >>> hours to even >>> contemplate that, it is basically staggering to me. It is between sad >>> and breath >>> taking. >>> Sounds like they designed themselves into a corner and are now >>> acknowledging that what >>> they designed is more of an API commitment that they want to disavow >>> than a real >>> solution. >>> >>> I can see where that can happen. Superior designs come from experience. >>> Experience comes >>> with usage and time, neither of which are always available up front. >>> >>> >>> >>> >>> >>>> >>>> This design document makes for fascinating >>>> reading: https://wiki.wxwidgets.org/Development:_UTF-8_Support. It >>>> appears that the >>>> current wxString is at least in part modelled on QtString. >>>> >>>> There’s also a bunch of interesting info >>>> here: https://docs.wxwidgets.org/trunk/overview_string.html, which I >>>> believe is more >>>> up-to-date than the previous link. In particular, there’s the mention >>>> that wxString >>>> handles extra-BMP characters transparently when compiled in UTF8 mode >>>> (currently >>>> used by >>>> Kicad), but does NOT when compiled in default mode (in which case the >>>> app must handle >>>> surrogate pairs). This of course directly leads to your point (d): >>>> >>>>>>> d) What does the set of characters that don't fall into UCS2 >>>>>>> actually look >>>>>>> like? How big >>>>>>> is this set, really? (UTF16 is bigger than UCS2 and picks up the >>>>>>> difference.) >>>> >>>> Do we really need to handle extra-BMP characters? >>>> >>>> An even more recent version of the second document >>>> (https://docs.wxwidgets.org/trunk/classwx_string.html) finally makes >>>> an oblique >>>> reference >>>> to the multi-threading issue by starting with this (rather unhelpful) >>>> suggestion: >>>> >>>> Note >>>> While the use of wxString >>>> <https://docs.wxwidgets.org/trunk/classwx_string.html> is >>>> unavoidable in wxWidgets program, you are encouraged to use the >>>> standard string >>>> classes |std::string| or |std::wstring| in your applications and >>>> convert them >>>> to and >>>> from wxString >>>> <https://docs.wxwidgets.org/trunk/classwx_string.html> only when >>>> interacting with wxWidgets. >>>> >>>> >>>> Cheers, >>>> Jeff. >>>> >>>> >>>>> On 3 May 2019, at 02:03, Dick Hollenbeck <d...@softplc.com >>>>> <mailto:d...@softplc.com> <mailto:d...@softplc.com>> wrote: >>>>> >>>>> On 5/2/19 5:32 PM, Dick Hollenbeck wrote: >>>>>> On 4/30/19 4:36 AM, Jeff Young wrote: >>>>>>> We had talked earlier about throwing the wxWidgets UTF8 compile >>>>>>> switch to get >>>>>>> rid of >>>>>>> our wxString re-entrancy problems. However, I noticed that the 6.0 >>>>>>> work >>>>>>> packages doc >>>>>>> includes an item for std::string-ization of the BOARD. (While a >>>>>>> lot more work, >>>>>>> this >>>>>>> is a better solution because it also increases our >>>>>>> gui-toolkit-choice flexibility.) >>>>>>> >>>>>>> I’d like to propose that we use std::wstring for that. UTF8 should >>>>>>> *only* be an >>>>>>> encoding format (similar to s-expr). It should never be used >>>>>>> internally. >>>>>>> That’s what >>>>>>> unicode wchar_t’s are for. >>>>>>> >>>>>>> And I’d like to propose that we extend std::wstring-ization to >>>>>>> SCH_ITEM and >>>>>>> LIB_ITEM. >>>>>>> (Then we can get rid of a bunch of our ugly mutex hacks.) >>>>>> >>>>>> >>>>>> I've been looking at this for a few months now. I think it is so >>>>>> important, that a >>>>>> sub-committee should be formed, and if that committee takes as long >>>>>> as 4 months >>>>>> to come to >>>>>> a recommendation, this would not be too long. This issue is simply >>>>>> too critical. >>>>>> >>>>>> I would like to volunteer to be on that committee. For the entire >>>>>> list to >>>>>> participate in >>>>>> this simply does not make sense to me. I would welcome the >>>>>> opportunity to study >>>>>> this with >>>>>> a team of 5-6 players. More than that probably leads to anxiety. >>>>>> Then, given the >>>>>> recommendations, the list would of course have an opportunity to >>>>>> raise questions >>>>>> and take >>>>>> shots, before a strategy is formulated, and before anything is >>>>>> implemented. >>>>>> >>>>>> Again, approximately: >>>>>> >>>>>> committee recommendations -> list approval -> strategy formulation >>>>>> -> >>>>>> implementation >>>>>> >>>>>> >>>>>> Up to now I have looked at many libraries and have [way *too* much] >>>>>> experience >>>>>> in multiple >>>>>> languages on multiple platforms, so I think I can be valuable >>>>>> contributor. >>>>>> >>>>>> The final work product initially would simply be a list of >>>>>> recommendations, that >>>>>> quickly >>>>>> transforms to a strategy thereafter. This is an enormous >>>>>> undertaking, so I suggest >>>>>> against racing to a solution. It could look a lot easier than it >>>>>> will >>>>>> ultimately be, as >>>>>> is typical in software development. But the return on investment >>>>>> needs to be >>>>>> near optimal >>>>>> in the end. >>>>>> >>>>>> Some questions to answer are: >>>>>> >>>>>> a) How did wxString get to its current state? Is is merely a >>>>>> conglomeration of >>>>>> after >>>>>> thought, or is is anywhere near optimal. >>>>>> >>>>>> b) Why so many forms of it? Can one form be chosen for all >>>>>> platforms? >>>>>> >>>>>> c) How does wxString it compare to QtString? >>>>>> >>>>>> d) What does the set of characters that don't fall into UCS2 >>>>>> actually look like? >>>>>> How big >>>>>> is this set, really? (UTF16 is bigger than UCS2 and picks up the >>>>>> difference.) >>>>>> >>>>>> e) For data files, I think UTF8 is fine. So the change is for RAM >>>>>> manipulation of >>>>>> strings. Aren't we talking about a RAM resident string that bridges >>>>>> into the GUI >>>>>> seamlessly? >>>>>> >>>>>> f) What does new C++ language support offer? >>>>>> >>>>>> g) What do C++ language designers suggest? >>>>> >>>>> h) What is the list of deficiencies with current string usage? >>>>> >>>>> >>>>>> >>>>>> >>>>>> etc. >>>>>> >>>>>> But this is best continued in a smaller group, as said. >>>>>> >>>>>> >>>>>> The other thing that I bring to this is vast familiarity with >>>>>> KiCad's internal >>>>>> workings, >>>>>> string use cases, and goals. >>>>>> >>>>>> Let me know if I can help. >>>>>> >>>>>> Regards, >>>>>> >>>>>> Dick >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Mailing list: https://launchpad.net/~kicad-developers >>>>>> Post to : kicad-developers@lists.launchpad.net >>>>>> <mailto:kicad-developers@lists.launchpad.net> >>>>>> <mailto:kicad-developers@lists.launchpad.net> >>>>>> Unsubscribe : https://launchpad.net/~kicad-developers >>>>>> More help : https://help.launchpad.net/ListHelp >>>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Mailing list: https://launchpad.net/~kicad-developers >>>>> Post to : kicad-developers@lists.launchpad.net >>>>> <mailto:kicad-developers@lists.launchpad.net> >>>>> <mailto:kicad-developers@lists.launchpad.net> >>>>> Unsubscribe : https://launchpad.net/~kicad-developers >>>>> More help : https://help.launchpad.net/ListHelp >> > _______________________________________________ Mailing list: https://launchpad.net/~kicad-developers Post to : kicad-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~kicad-developers More help : https://help.launchpad.net/ListHelp