D24982: Small improvements in some XML files
This revision was automatically updated to reflect the committed changes. Closed by commit R216:02a3fb1d7c39: Small improvements in some XML files (authored by nibags, committed by cullmann). REPOSITORY R216 Syntax Highlighting CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D24982?vs=70781&id=75298 REVISION DETAIL https://phabricator.kde.org/D24982 AFFECTED FILES autotests/folding/highlight.y.fold data/syntax/apparmor.xml data/syntax/coffee.xml data/syntax/fortran-fixed.xml data/syntax/fortran-free.xml data/syntax/html.xml data/syntax/logcat.xml data/syntax/mustache.xml data/syntax/rust.xml data/syntax/selinux-cil.xml data/syntax/selinux-fc.xml data/syntax/selinux.xml data/syntax/xml.xml data/syntax/yacc.xml data/syntax/yaml.xml To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, rrosch, LeGast00n, cblack, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
cullmann accepted this revision. cullmann added a comment. This revision is now accepted and ready to land. I think we can go with this as is. Given we have tests for a lot of stuff, regressions shouldn't be that likely. I will merge this, before we let that rot even more. Thanks for the work on this, btw.! REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, rrosch, LeGast00n, cblack, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
nibags updated this revision to Diff 70781. nibags added a comment. - Resolve conflicts REPOSITORY R216 Syntax Highlighting CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D24982?vs=70779&id=70781 REVISION DETAIL https://phabricator.kde.org/D24982 AFFECTED FILES autotests/folding/highlight.y.fold data/syntax/apparmor.xml data/syntax/coffee.xml data/syntax/fortran-fixed.xml data/syntax/fortran-free.xml data/syntax/html.xml data/syntax/logcat.xml data/syntax/mustache.xml data/syntax/rust.xml data/syntax/selinux-cil.xml data/syntax/selinux-fc.xml data/syntax/selinux.xml data/syntax/xml.xml data/syntax/yacc.xml data/syntax/yaml.xml To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
nibags updated this revision to Diff 70779. nibags added a comment. - Resolve merge conflicts. REPOSITORY R216 Syntax Highlighting CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D24982?vs=68824&id=70779 REVISION DETAIL https://phabricator.kde.org/D24982 AFFECTED FILES autotests/folding/highlight.y.fold data/syntax/apparmor.xml data/syntax/coffee.xml data/syntax/fortran-fixed.xml data/syntax/fortran-free.xml data/syntax/html.xml data/syntax/logcat.xml data/syntax/mustache.xml data/syntax/rust.xml data/syntax/selinux-cil.xml data/syntax/selinux-fc.xml data/syntax/selinux.xml data/syntax/xml.xml data/syntax/yacc.xml data/syntax/yaml.xml To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
nibags added a comment. Captures can also be used without dynamic rules, for example in the regular expression: `("+)[^"]*\1` In these cases, a `capture="true"` attribute can be used to enable captures, but in dynamic rules the detection will be automatic. I could do a benchmark the test units with the disabled captures to check the improvement in performance, and see if these changes are worth making. I don't know if you want to include this diff. If you prefer, I can update this diff, remove the inclusion of `?:` in regex and leave only the other improvements. REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
dhaumann added a comment. I think we should decide what to do with this patch, as over time it will get merge conflicts. In general, I hoped we can have some sort of auto detection to disable captures in a smart way. If this is not possible, maybe we should accept this patch? Did you measure speed improvements? REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
cullmann added a comment. Could we perhaps just do some minimal checking like "no dynamic stuff at all" => turn off all captures? REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
dhaumann added a comment. Hm right... too bad. I was hoping to find an automated way to detect this. Since relying on the user to optimize the RegExps will always be suboptimal. @cullmann Do you have any ideas? REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
nibags added a comment. No, the dynamic flag is used to insert the captures already stored in `%N` (the captures are stored in the RegExpr rules and then "inserted" in the rules with the dynamic flag) REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
dhaumann added a comment. > [...] > One option would be to add a **capture** or **dontCapture** attribute to enable or disable captures for RegExpr rules. Also, captures could be enabled or disabled in all RegExpr rules using the group, adding an element for that. But don't we have this already with the `dynamic` flag? Or am I mixing things? REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
nibags added a comment. I had also thought about using `QRegularExpression::DontCaptureOption`, which is equivalent to using `(?:...)`, but I wasn't sure how much the real improvement in performance is. However, disabling captures avoids allocating unnecessary QString for each capture. One option would be to add a **capture** or **dontCapture** attribute to enable or disable captures for RegExpr rules. Also, captures could be enabled or disabled in all RegExpr rules using the group, adding an element for that. REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
dhaumann added a comment. I wonder if the `?:` optimizations make sense. QRegularExpression has the option `QRegularExpression::DontCaptureOption` to not capture anything. Looking into our code we have: 561 bool RegExpr::doLoad(QXmlStreamReader& reader) 562 { 563 m_regexp.setPattern(reader.attributes().value(QStringLiteral("String")).toString()); // here we set the pattern -> OK 564 565 const auto isMinimal = Xml::attrToBool(reader.attributes().value(QStringLiteral("minimal"))); 566 const auto isCaseInsensitive = Xml::attrToBool(reader.attributes().value(QStringLiteral("insensitive"))); 567 m_regexp.setPatternOptions( // if (m_dynamic == false), we could add the 568(isMinimal ? QRegularExpression::InvertedGreedinessOption : QRegularExpression::NoPatternOption) | // flag QRegularExpression::DontCaptureOption 569(isCaseInsensitive ? QRegularExpression::CaseInsensitiveOption : QRegularExpression::NoPatternOption)); 570 571// optimize the pattern for the non-dynamic case, we use them OFTEN 572m_dynamic = Xml::attrToBool(reader.attributes().value(QStringLiteral("dynamic"))); 573if (!m_dynamic) { 574m_regexp.optimize(); 575} 576// [...] In other words: The current patch adds many `?:` which also makes the RegExps harder to read. So: Do we really get a performance gain here? Wouldn't it be possible to get an even better performance gain by using the flag `DontCaptureOption`? Currently, I am not yet convinced, can you give this a try? :-) REPOSITORY R216 Syntax Highlighting REVISION DETAIL https://phabricator.kde.org/D24982 To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
nibags updated this revision to Diff 68824. nibags added a comment. - Increase version of javascript.xml REPOSITORY R216 Syntax Highlighting CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D24982?vs=68823&id=68824 BRANCH improve-some-xml REVISION DETAIL https://phabricator.kde.org/D24982 AFFECTED FILES autotests/folding/highlight.y.fold data/syntax/apparmor.xml data/syntax/coffee.xml data/syntax/fortran-fixed.xml data/syntax/fortran-free.xml data/syntax/html.xml data/syntax/javascript.xml data/syntax/logcat.xml data/syntax/mustache.xml data/syntax/rust.xml data/syntax/selinux-cil.xml data/syntax/selinux-fc.xml data/syntax/selinux.xml data/syntax/xml.xml data/syntax/yacc.xml data/syntax/yaml.xml To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann
D24982: Small improvements in some XML files
nibags created this revision. nibags added reviewers: Framework: Syntax Highlighting, dhaumann, cullmann. Herald added projects: Kate, Frameworks. Herald added subscribers: kde-frameworks-devel, kwrite-devel. nibags requested review of this revision. REVISION SUMMARY **Changes:** - Replace some unnecessary RegExpr rules with other rules. - Use non-capturing groups in some regular expressions. - Add `##Modelines` in comments. - Add folding for multi-line comments, add attribute `region` in element ``. - Add `column` atributte in some rules. TEST PLAN make test REPOSITORY R216 Syntax Highlighting BRANCH improve-some-xml REVISION DETAIL https://phabricator.kde.org/D24982 AFFECTED FILES autotests/folding/highlight.y.fold data/syntax/apparmor.xml data/syntax/coffee.xml data/syntax/fortran-fixed.xml data/syntax/fortran-free.xml data/syntax/html.xml data/syntax/javascript.xml data/syntax/logcat.xml data/syntax/mustache.xml data/syntax/rust.xml data/syntax/selinux-cil.xml data/syntax/selinux-fc.xml data/syntax/selinux.xml data/syntax/xml.xml data/syntax/yacc.xml data/syntax/yaml.xml To: nibags, #framework_syntax_highlighting, dhaumann, cullmann Cc: kwrite-devel, kde-frameworks-devel, LeGast00n, GB_2, domson, michaelh, ngraham, bruns, demsking, cullmann, sars, dhaumann