https://bugs.kde.org/show_bug.cgi?id=455995

            Bug ID: 455995
           Summary: Regex replace: evaluate assertions before replacing
           Product: kate
           Version: Git
          Platform: openSUSE RPMs
                OS: Linux
            Status: REPORTED
          Severity: minor
          Priority: NOR
         Component: search
          Assignee: kwrite-bugs-n...@kde.org
          Reporter: groszdaniel...@gmail.com
  Target Milestone: ---

SUMMARY
When using regular expressions in the Edit / Replace... feature of Kate or
KWrite, the Replace All feature behaves in an unexpected way when using
assertions at the beginning of the regex.

The most typical example is replacing "^ " with "": I'd expect this to remove
(at most) one space from the beginning of every line, but it actually removes
all initial spaces from each line.

Other examples: If I replace "(?<= )." with "", I'd expect it to remove the
first character after each space, but it actually removes everything after the
first space in each line. Likewise, if I replace "\b\w" with "", I'd expect it
to remove the first character of each word, but it actually removes each word
character.

My guess as to why this happens is that in each replacement step, Kate first
performs a replacement, then moves the cursor to the end of the replacement
text (which is empty in our examples), and then performs the next search
beginning from there.
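
A minimal Python sketch of this guessed behaviour follows. It is not Kate's
actual code: the function name replace_all_rescanning is hypothetical, the
replacement is treated as literal text, and Python's re module stands in for
Kate's regex engine.

```python
import re


def replace_all_rescanning(pattern, replacement, text):
    """Replace one match at a time, then search again in the modified text.

    Hypothetical model of the suspected behaviour: the next search starts at
    the end of the text that was just inserted, so assertions such as ^, \\b
    or a lookbehind are re-evaluated against the already edited buffer.
    """
    regex = re.compile(pattern, re.MULTILINE)
    pos = 0
    while pos <= len(text):
        match = regex.search(text, pos)
        if match is None:
            break
        # Apply the replacement immediately.
        text = text[:match.start()] + replacement + text[match.end():]
        # Continue from the end of the inserted replacement text.
        pos = match.start() + len(replacement)
        if match.start() == match.end():
            # Step past an empty match so the loop always terminates.
            pos += 1
    return text
```

With an empty replacement, the cursor stays where the match began, so "^ "
matches again on the same line until no leading space is left. On the
five-line file from the steps to reproduce, this sketch returns the observed
result rather than the expected one.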

Instead, when using Replace All, it should first find all instances to replace,
and then perform all the replacements (or perhaps do something more efficient,
but equivalent in effect). This is what other regex replacement engines seem to
do, such as those of sed and JavaScript (at least in effect; I don't know how
they are implemented).
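
The "find everything first, then replace" approach can be sketched in Python
as below. This is only an illustration under assumptions: the function name
replace_all_two_pass is hypothetical, the replacement is treated as literal
text (no backreferences), and it says nothing about how Kate, sed or
JavaScript are implemented internally.

```python
import re


def replace_all_two_pass(pattern, replacement, text):
    """Collect every match in the original text, then apply all replacements.

    Assertions are evaluated only against the unmodified text, so an earlier
    replacement cannot create a new match.
    """
    regex = re.compile(pattern, re.MULTILINE)
    # Pass 1: find all non-overlapping matches in the original text.
    matches = list(regex.finditer(text))
    # Pass 2: replace from last to first so that earlier offsets stay valid.
    for match in reversed(matches):
        text = text[:match.start()] + replacement + text[match.end():]
    return text
```

On the five-line file from the steps to reproduce, this removes exactly one
leading space per line, which is the expected result. For a literal
replacement it should agree with Python's own re.sub.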

When using the Replace button, rather than Replace All, it should probably take
into account the result of previous replacements, but not the last replacement,
when finding the next occurrence of the search string.

The issue doesn't occur with Kate's Search & Replace plugin, since it finds all
occurrences first.

STEPS TO REPRODUCE
1. Create a file with this content:
a
␣b
␣␣c
␣␣␣d
␣␣␣␣e
2. Edit / Replace...
3. Mode: Regular expression
4. Find: ^␣
5. Leave Replace: empty
6. Replace All

OBSERVED RESULT
a
b
c
d
e

EXPECTED RESULT
a
b
␣c
␣␣d
␣␣␣e

SOFTWARE/OS VERSIONS
Kate 21.04.x, 22.04.2, git master (da4b519d2), KWrite 22.04.2
Operating System: openSUSE Tumbleweed 20220625
KDE Plasma Version: 5.25.1
KDE Frameworks Version: 5.95.0
Qt Version: 5.15.2

ADDITIONAL INFORMATION
Bug 142598 (reported on 2007-03-06, closed as fixed on 2007-08-31) probably had
the same cause. I don't know if it was ever actually fixed, but if so, it broke
again at some point. After I reopened that bug in June 2021, I was told it
wasn't reproducible, and to open a new report for a new bug.
