On Mon, Dec 28, 2015 at 12:20 PM, Bram Moolenaar <b...@moolenaar.net> wrote:
>
> Brett Stahlman wrote:
>
>> > > Given a file containing the following 2 lines...
>> > > 1a3
>> > > 123xyz
>> > >
>> > > ...try the following tests, and note the unexpected results.
>> > >
>> > > Case 1.1:
>> > > call cursor(1, 1)
>> > > echo searchpos('\%(\([a-z]\)\|\_.\)\{-}xyz', 'pcW')
>> > > => [1, 1, 2]
>> > >
>> > > Case 1.2:
>> > > call cursor(1, 2)
>> > > echo searchpos('\%(\([a-z]\)\|\_.\)\{-}xyz', 'pcW')
>> > > => [2, 1, 1]
>> > > Question: Why does the \_. not permit earlier match at cursor pos (1, 2)?
>> > > Note: Clearly, submatch should be 2, not 1, but this error is simply a
>> > > consequence of the first error: since match doesn't begin on 1st line,
>> > > the "a" at cursor pos can't be captured.
>> >
>> > This is because of the 'c' flag in 'cpoptions'. The Vi-compatible way
>> > of searching is to start at the first column and skip over the match.
>> > Then take the first match after the start position.
>>
>> If this is how it works, then I would have assumed it would have skipped the
>> match it returned for Case 1.1 (at starting position 1,1). But perhaps not
>> skipping the match at column 1 had something to do with (from help on 'cpo')
>> "...but not further than the start of the next line"? If so, the help text
>> isn't very clear in this case. It seems to be describing search
>> "continuation", and my tests were for an isolated search beginning at an
>> arbitrary buffer position. Also, the term "next line" is a bit misleading:
>> in this case, it seems to refer to what would have been the next line of a
>> *previous* search. But I guess the Vi designers didn't want to complicate
>> the implementation by maintaining the state needed to differentiate between
>> a subsequent search for the same pattern without intervening cursor movement
>> and a new search...
>>
>> >
>> > > Case 1.3:
>> > > call cursor(1, 3)
>> > > echo searchpos('\%(\([a-z]\)\|\_.\)\{-}xyz', 'pcW')
>> > > => [2, 1, 1]
>> > > Note: Why isn't a match found at cursor pos (1, 3)?
>> > >
>> > > Repeat these tests with a \zs in the pattern, and note how the capture
>> > > is matched unconditionally...
>> > >
>> > > Case 2.1:
>> > > call cursor(1, 1)
>> > > echo searchpos('\%(\([a-z]\)\|\_.\)\{-}\zsxyz', 'pcW')
>> > > => [2, 4, 2]
>> > >
>> > > Case 2.2:
>> > > call cursor(1, 2)
>> > > echo searchpos('\%(\([a-z]\)\|\_.\)\{-}\zsxyz', 'pcW')
>> > > => [2, 4, 2]
>> > >
>> > > Case 2.3:
>> > > call cursor(1, 3)
>> > > echo searchpos('\%(\([a-z]\)\|\_.\)\{-}\zsxyz', 'pcW')
>> > > => [2, 4, 2]
>> > > Note: Submatch should be 1, not 2, here. It's as though the \zs forces the
>> > > capture to match unconditionally.
>> > >
>> > > Points to note... Originally, I thought the error had to do with the 'p'
>> > > flag, but that appears not to be the case: the submatch errors are simply a
>> > > consequence of the incorrectly determined start locations. Also, it appears
>> > > the results would have been the same with * as they were with \{-}.
>> > > Finally, the unexpected behavior is not limited to \_., but is seen even
>> > > when (e.g.) explicit \n is used.
>> >
>> > After removing 'c' from 'cpoptions', does it work as you expect?
>>
>> Not as I expected, but the first 3 tests, at least, work as I now expect.
>>
>> Case 3.3, however, makes no sense to me now. It returns...
>> => [2, 4, 2]
>> ...even though there's nothing to match the [a-z]. If I change the "1a3" to
>> "123", it returns...
>> => [2, 4, 1]
>> ...which tells me that the parens were capturing the "a" *before* the start
>> position, in spite of the 'W' flag prohibiting wrap. This tells me that the
>> search must be starting before the cursor position, most likely at the start
>> of the cursor line.
>> I would not have expected that a forward search with no
>> lookbehind of any sort could find anything prior to the starting cursor
>> position. But I guess it's not really finding a match prior to the cursor
>> position - just checking to see what needs to be skipped? But with &cpo no
>> longer containing 'c', and the 'c' flag passed to searchpos(), why would it
>> even need this sort of "skip-over" test prior to cursor position?
>
> The search always starts in the first column. Then when a match is
> found and it's before the cursor, another search is done at the next
> position.
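
[Illustration, not part of the original exchange.] The effect described above, where a match attempt begun before the cursor lets the capture group pick up the "a", can be reproduced in a rough Python model. This is only an analogy: it uses Python's `re` engine rather than Vim's, and the translation of the pattern (`\%(` to `(?:`, `\_.` to `.` with DOTALL, `\{-}` to `*?`) and the helper name `attempt` are assumptions made for the sketch.

```python
import re

# The two-line buffer from the test cases, joined with a newline.
TEXT = "1a3\n123xyz"

# Approximate Python rendering of '\%(\([a-z]\)\|\_.\)\{-}xyz' (an assumption,
# not Vim's engine): a lazy repetition of "captured letter OR any character".
PATTERN = re.compile(r"(?:([a-z])|.)*?xyz", re.DOTALL)


def attempt(start):
    """Try a match anchored at offset `start`; return (start, captured letter)."""
    m = PATTERN.match(TEXT, start)
    if m is None:
        return None
    return (m.start(), m.group(1))


# Attempt anchored at the first column: the capture group sees the "a".
from_line_start = attempt(0)

# Attempt anchored at offset 2 (the "3", i.e. cursor column 3): no letter
# lies between the start and "xyz", so the group never participates.
from_cursor = attempt(2)

print(from_line_start, from_cursor)
```

In this model the group participates only when the attempt is anchored before the "a", which is consistent with the submatch value of 2 being reported even when the cursor is already past that character.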

Interesting. So IIUC, that could result in a lot of redundant searches, when
the pattern appears multiple times on the same line prior to start position:
e.g., with the following text and cursor position...

123 123 123 123 <cursor> 123 123 123

A search for "123" would have to try and discard 4 matches before finding one
to return; a subsequent search from the new location would have to discard 5
matches, a subsequent search would discard 6 matches, and so on... Although
this could be inefficient in certain pathological, long-line scenarios, the
bigger issue is the effect it has on the returned 'submatch' value when the
'p' flag is used.

> Vi compatible is to continue after the matched pattern. When
> removing 'c' from 'cpo' it searches from the next column.
>
> With the \zs the search in the first column returns a position after the
> start position, thus it's a match. Without the \zs the column would be
> the first column.
>
> I can see this is not what you expect or what you want. We can add
> another flag to actually start at the search start position.

I guess that makes sense; either that, or perhaps alter the existing
implementation to ensure that a capture can't capture anything before the
starting location (unless the capture occurs in a look-behind context).

Thanks,
Brett S.

>
> --
> Computers are not intelligent. They only think they are.
>
> /// Bram Moolenaar -- b...@moolenaar.net -- http://www.Moolenaar.net \\\
> /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
> \\\ an exciting new programming language -- http://www.Zimbu.org ///
> \\\ help me help AIDS victims -- http://ICCF-Holland.org ///
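
[Illustration, not part of the original exchange.] The discard counts in the "123" example can be checked with a small Python model of the strategy Bram describes: start at the first column, and whenever a match begins before the wanted position, retry from the next column. The function name `search_from_line_start` and its return shape are hypothetical; this is a sketch of the described behaviour, not Vim's actual implementation.

```python
import re


def search_from_line_start(pattern, line, min_start):
    """Scan `line` from column 0 and return (match_start, discarded).

    `min_start` is the first offset at which a match may be accepted.
    `discarded` counts matches that were found and then rejected because
    they began before `min_start`.
    """
    regex = re.compile(pattern)
    discarded = 0
    pos = 0
    while True:
        m = regex.search(line, pos)
        if m is None:
            return None, discarded
        if m.start() >= min_start:
            return m.start(), discarded
        discarded += 1
        # Retry one column past the rejected match's start.
        pos = m.start() + 1


# Seven occurrences at offsets 0, 4, 8, 12, 16, 20, 24; the cursor sits
# at offset 16, just before the fifth occurrence.
LINE = "123 123 123 123 123 123 123"

first = search_from_line_start("123", LINE, 16)   # match at the cursor
second = search_from_line_start("123", LINE, 17)  # next search, past offset 16
third = search_from_line_start("123", LINE, 21)   # and the one after that
print(first, second, third)
```

Under these assumptions the three searches reject 4, 5, and 6 earlier matches respectively, so the work per search grows with the number of occurrences to the left of the cursor on the same line.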