On Sat, Oct 28, 2023 at 05:42:50PM +0100, Gavin Smith wrote:
> I managed to disable a lot of the new XS code and get the test suite
> to pass.  I had to leave the XS translation module active due to the
> coupling that now exists between it and the XS parser.
Also, I doubt that any slowdown could come from doing in C the code that
was previously done in Perl in Parsetexi.pm.  To me, having this code in
XS is both more logical and (probably) faster.

> As you can see, my attempt at disabling the new modules reverses most
> of, but not all, of the slowdown.

I think that you can also comment out rebuild_document when none of the
XS is overriding the Perl code, but I have not tested it.

One thing that comes to mind is that I removed simple_parser:
https://git.savannah.gnu.org/cgit/texinfo.git/commit/?id=4a3d02c0fc1932350d925fb957e0758a5290436c
It could explain some of the increase in the time used by gdt.

> I'm still trying to find causes for the remaining slowdown.  I profiled
> with NYTProf and think that build_document is one possibility, as it
> may do more than build_texinfo_tree did.

I do not think so.  The only additional thing it does (storing
identifiers_target) corresponds to the fact that
set_labels_identifiers_target is now done in C instead of in Perl code
as it was previously, but I doubt that it requires much more time.
However, even if it does not do more, doing it twice could be a reason
for the slowdown if the time spent in build_texinfo_tree and in the
other passing of parser results to Perl code is significant.

> For the glibc manual, it is called 2412 times (at least once per
> parser object).  As you know, there is a new parser for every @def*
> command in the Texinfo sources, so per-parser overhead can be
> significant.

I do not get it.  If you are speaking about the translation happening
in complete_indices, calls of gdt_tree -> gdt ->
replace_convert_substrings do not require a new parser; the current
parser is reused.  There is still a parsing and a storing of a document
that is later on removed, plus substitutions in the tree, but this
should still be faster than the same code in Perl.
> I see there are also changes to index sorting, but haven't
> investigated them enough to understand if this would have a
> performance impact.

Hopefully this should have a positive impact, by caching some regexp
results.

> It was important to be able to disable these new modules in order to
> see this remaining slowdown.  I still argue for making it easy to
> cleanly disable these new modules unless or until they do not slow
> down the program as much.

It seems like this could be relatively easy: add a variable that is
tested when loading the XS code, and that's it, unless I missed
something.

> If the promised benefits of the new development never materialised,
> it would mean that the post-7.1 development of texi2any was not worth
> pursuing.

I would be very surprised if there was no speedup of the HTML
converter.  Right now it is very slow; with the main loop in C it
should be much faster.

> This is from my perspective of somebody who is not familiar with the
> new code and doesn't understand how it all works.  I've spent hours
> trying to work this out over the last few days because I view it as a
> threat to the future development of the program.

The slowdown is not that big.  That being said, I agree that it would
be nice to understand why it is slower with XS for
structure/transformations than with Perl.

> If the Perl object for the parse tree is built twice, this is a
> definite problem, and something that needs to be remedied before the
> new XS code can be considered to be in a finished state.

To me it is not in a finished state until the HTML converter main loop
is fully in C when there is no user customization.

-- 
Pat
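P.S.  A minimal sketch of the kind of disable switch I have in mind:
test an environment variable before attempting to load the XS module,
and fall back to the pure-Perl implementation otherwise.  The variable
name TEXINFO_XS_DISABLE and the module names below are hypothetical,
for illustration only, not the actual texi2any implementation.

```perl
use strict;
use warnings;

# Sketch only: decide which parser backend to use.  If the user sets
# TEXINFO_XS_DISABLE, skip the XS module entirely; otherwise try to
# load it and fall back to pure Perl if loading fails.
sub parser_backend {
    return 'perl' if $ENV{TEXINFO_XS_DISABLE};
    # Hypothetical XS module name; eval catches a failed require.
    my $have_xs = eval { require Texinfo::XS::Parsetexi; 1 };
    return $have_xs ? 'xs' : 'perl';
}

print parser_backend(), "\n";
```

The point is that a single early test gates all XS loading, so the
pure-Perl code path stays available for timing comparisons like the
ones in this thread.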
