String access is a factor in the performance of the new real-time connectivity algorithm in eeschema, since all connectivity is established by parsing labels and pin names. I have not done benchmarks comparing various options for string storage, but we would need to watch that space too if we change how strings work.
-Jon

On Tue, Apr 30, 2019 at 8:41 PM John Beard <john.j.be...@gmail.com> wrote:

> On 30/04/2019 16:01, Jeff Young wrote:
> > Primarily for performance reasons.
>
> WRT performance, I did a few benchmarks for reference (on Linux).
>
> Loading this large CIAA PCB[1] allocates, out of a peak usage of 467MB
> of heap with a 0.01% threshold:
>
> * 9.6MB of std::basic_string<wchar_t>::_M_assign
>   * 9.4MB of this is from wxString operator= assignments
> * ~600kB of std::basic_string<wchar_t>::_M_construct (wxString ctor)
>
> So I'm not sure memory usage is a major factor to worry about (strings
> allocate storage on the heap, so we should see basically all the
> interesting things in the heap profile). UTF-8 could be as little as 1/4
> the size of UTF-32 (all strings are ASCII), but even then, it's a few MB
> saved.
>
> Now, in terms of performance, opening Pcbnew with no file gives:
>
> #4  3.36% __gconv_transform_utf8_internal
> #5  2.51% __mbsrtowcs_l
> #6  2.50% wxMBConv::ToWChar
> #8  2.07% std::basic_string<wchar_t>::_M_assign
> #9  1.88% wxMBConvStrictUTF8::ToWChar
> #14 1.27% EscapeString (kicad function)
> #17 0.85% __GI___strlen_sse2
> #18 0.85% wxUniChar::From8bit
> #19 0.84% wxUniChar::operator==
>
> And plenty more string-y things in the top 50 or so lines. So it seems
> the biggest cost for strings is converting them from UTF-8 to wchar_t
> strings in WX (this is probably not the same on Windows). But it's not
> really a stunning cost.
>
> However, when loading the CIAA board, there are basically no string
> operations above 0.5%, and only a handful even above 0.25%. When doing
> DRC, strings don't break 0.1%: nearly all the significant work is
> looking things up in std::maps and geometry.
>
> So string performance doesn't seem to be *that* critical, as it's
> quickly drowned out under real workloads. It looks to me (and I'm happy
> to be corrected, I'm not a perf expert) like string operations in KiCad
> are not much of a bottleneck.
>
> > Because characters are different lengths, you have to scan the string
> > to find the n’th character.
>
> Even with UTF-32, you can only do an O(1) lookup of the n'th *code
> point* or *code unit* (the same in UTF-32, not in UTF-8), not the n'th
> *encoded character*.
>
> That's true even if you normalise the strings first. Not all code points
> map one-to-one to an encoded character (it can be one-to-none,
> one-to-one, many-to-one). And that's even without considering grapheme
> clustering.
>
> Cheers,
>
> John
>
> PS / OT: If we had to optimise one thing,
> PolygonTriangulation::Vertex::inTriangle is the single hungriest
> function, chewing 6.19% of all CPU time, double that of each of the next
> 3: __gnu_cxx::__exchange_and_add (2.76%), PolygonTriangulation::isEar
> (2.73%) and even malloc (2.27%).
>
> Other than that fairly mundane 6%-er, there are no eye-popping
> performance hogs simply on loading a PCB. Which is nice.
>
> [1]:
> https://github.com/ciaa/Hardware/blob/master/PCB/ACC/CIAA_ACC/ciaa_acc.kicad_pcb
>
> _______________________________________________
> Mailing list: https://launchpad.net/~kicad-developers
> Post to     : kicad-developers@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~kicad-developers
> More help   : https://help.launchpad.net/ListHelp
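
To make the code unit / code point distinction above concrete, here is a minimal sketch in plain C++. The helper names utf8CodePointCount and utf8CodePointOffset are hypothetical (they are not KiCad or wxWidgets functions), and the code assumes well-formed UTF-8 input. It only illustrates why finding the n'th code point in UTF-8 needs a scan, while a std::u32string can index it directly.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count code points in a well-formed UTF-8 string.
// Continuation bytes have the bit pattern 10xxxxxx, so every byte that
// does NOT match that pattern starts a new code point.
std::size_t utf8CodePointCount( const std::string& s )
{
    std::size_t count = 0;

    for( unsigned char c : s )
    {
        if( ( c & 0xC0 ) != 0x80 )
            ++count;
    }

    return count;
}

// Hypothetical helper: byte offset of the n'th code point (0-based), or
// std::string::npos if the string has fewer than n + 1 code points.
// This is a linear scan; with UTF-32 the same lookup is just s[n].
std::size_t utf8CodePointOffset( const std::string& s, std::size_t n )
{
    std::size_t seen = 0;

    for( std::size_t i = 0; i < s.size(); ++i )
    {
        if( ( static_cast<unsigned char>( s[i] ) & 0xC0 ) != 0x80 )
        {
            if( seen == n )
                return i;

            ++seen;
        }
    }

    return std::string::npos;
}
```

For the string "e\xCC\x81z" ('e', U+0301 COMBINING ACUTE ACCENT, 'z'), the UTF-8 form has 4 code units but 3 code points, and the UTF-32 form U"e\u0301z" has 3 code units. A reader still sees only two characters, which is the many-to-one case mentioned above: neither encoding gives O(1) access to the n'th displayed character.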