print ("small string");
print (
"This is a very long string");
and I need to format it as so:
print ("small string\n");
print (
"This is a very long string\n");
Ideally, I would like to do this in one command and I would also like to
understand the regex itself. So, given the above, here is what I
understand of the regex pattern:
%s/print\s*(\s*"[^"]*\(\\n\)\@<!\ze"/&\\n/g
% - on every line of the file
s - substitute
/ - delimiter
print\s*(\s*" - my phrase to match: "print", then zero or more
whitespace characters, then a literal paren, then zero or more
whitespace characters up to the opening quote
[^"]* - then everything that is not a quote (zero or more)
Doing well up through here...
( - The beginning of the group ???
\\n - literal \n
) - End group ????
\@<! - Nothing, requires no match behind ???
You've got the understanding right (though those parens are "\("
and "\)" with backslashes). Those four lines in concert assert
that a literal "\n" doesn't come before the current point.
Without the grouping, it would only assert that the previous atom
(in this case, the "n") didn't appear here, so you'd have
problems with things like
print("terminal n")
because it sees the trailing "n" and so skips the
substitution. By grouping them, you assert "when you get to
this point [just before the closing quote] and there isn't a
literal backslash-n behind you, then we match".
Here you've skipped over the "\ze", which means "when doing the
replacement, treat it as though the thing we're substituting
ended here, even though there's more stuff we're looking for
(namely, the double-quote that's next)".
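If it helps to poke at the grouping outside of vim, here is a
rough Python analogue. It is only a sketch: Python spells the
lookbehind "(?<!...)" and I've used a lookahead '(?=")' to stand
in for '\ze"', and the names "grouped" and "ungrouped" are just
mine for illustration.

```python
import re

# Analogue of Vim's \(\\n\)\@<! : just before the closing quote,
# assert that the two characters backslash + n are NOT behind us.
grouped = re.compile(r'print\s*\(\s*"[^"]*(?<!\\n)(?=")')

# Assertion on the single character "n" only, roughly what you get
# when the lookbehind applies to the last atom instead of the group.
ungrouped = re.compile(r'print\s*\(\s*"[^"]*(?<!n)(?=")')

plain = 'print("terminal n")'        # ends in a plain letter n
done = r'print("already done\n")'    # already ends in backslash + n

print(bool(grouped.search(plain)))    # string still needs its \n
print(bool(grouped.search(done)))     # string already has its \n
print(bool(ungrouped.search(plain)))  # trailing "n" blocks the match
```

The grouped pattern matches the first string and skips the
second; the ungrouped one wrongly skips the first as well.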
" - my ending quote to match in the pattern print ("")
correct
/& - ???
This is standard substitution: the slash is the break between
the search and its replacement. The ampersand is "the whole
previous match". In this case, it's slightly tweaked because of
the "\ze" that we used: the matched text runs up to (but does
not include) the closing double-quote. So "&" drops in
everything from "print" through the end of the internal string
(sans closing quote).
\\n - literal \n
correct...appending the literal \n you want.
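For comparison, the same replacement can be sketched in Python,
assuming "\g<0>" as the counterpart of vim's "&" and a lookahead
in place of "\ze". The helper name add_newline is made up for
the example; none of this is vim syntax.

```python
import re

# (?=") requires the closing quote to follow but leaves it out of
# the match, much as \ze" does in the Vim pattern.
pattern = re.compile(r'print\s*\(\s*"[^"]*(?<!\\n)(?=")')

def add_newline(line: str) -> str:
    # \g<0> re-inserts the whole match (Vim's &); the doubled
    # backslash then appends a literal backslash followed by n.
    return pattern.sub(r'\g<0>\\n', line)

once = add_newline('print ("small string");')
twice = add_newline(once)  # second pass finds nothing left to change
print(once)
print(twice)
```

Running the substitution a second time leaves the line alone,
which is the whole point of the lookbehind.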
/ - delimiter
g - each occurrence on the line
Then we have the spanning multiple lines option:
\_ [^"]*
that's
\_[
not
\_ [
\_ - match text over multiple lines (Is this like another regex
engine, like the one sed uses?)
It's a vim thing:
:help /\_
should drop you in the fray. It prefixes (infixes?) a number of
atoms so that they also match a line break, so for your change,
you'd likely want to do something like change the \s atoms to
\_s to include newlines.
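To see why the line break matters, here is a loose Python
illustration. Note the assumption: in Python, "\s" and '[^"]'
already match newlines (so they behave like vim's "\_s" and
'\_[^"]'), and applying the pattern one line at a time is only
my stand-in for vim's plain "\s", which never crosses a line.

```python
import re

pattern = re.compile(r'print\s*\(\s*"[^"]*(?<!\\n)(?=")')
replacement = r'\g<0>\\n'

source = (
    'print ("small string");\n'
    'print (\n'
    '"This is a very long string");\n'
)

# One line at a time: the split statement never matches, because
# "print (" and the quoted string sit on different lines.
per_line = "\n".join(
    pattern.sub(replacement, line) for line in source.split("\n")
)

# Whole text at once: \s is free to cross the line break.
whole = pattern.sub(replacement, source)

print(per_line)
print(whole)
```

Only the whole-text pass fixes the statement that is split over
two lines.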
Does this make sense? The area I am having difficulty with is /& and how
the grouping is working.
Hopefully this sheds some light on matters and helps you tweak
your own regexps in the future. If you have any questions, feel
free to ask.
-tim