Re: manpage searches "^\s+keyword\s" vs. ???
On 2019-01-30 11:40, Andrey Repin wrote:
>> I've always used "^\s+keyword\s" as a way to search for some keyword
>> starting a section.
> Welcome to the club.
>> On linux it still works, but on cygwin it doesn't like '\s' as symbol for
>> white space.
>> Any idea why there might be a difference?
>> I note an option that could do similar in less -- '&pattern' turns OFF
>> single special characters, I tried that on linux and it turned off the '\s'
>> matching space. That's nice..um how about other way?
>> Well didn't know if there might be some other op to go the other way, but
>> didn't see anything. any ideas?
> I've been puzzled by this since… forever, it seems.
> This is something in less, but all the `man less` says is "regular expression
> library provided by your system".
> I guess this is down to compilation options at this point.

The full class [[:space:]] works as expected.
Probably the config options pick the BSD POSIX ERE library, which lacks the
character class escape shortcuts, rather than allowing the Glib, ICU, or PCRE
ERE library, which has them.

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.

--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Re: manpage searches "^\s+keyword\s" vs. ???
On Wed, Jan 30, 2019 at 11:09 AM Eric Blake wrote:
> Not so much compilation options of man and less, but rather the code
> used in Cygwin itself for handling regex.

The configuration of less supports many different regex libraries. I
downloaded the source, ran "./configure --with-regex=pcre", and built a nice
version of less that fully supports \b and the various other Perl regex
extensions.

The output of Cygwin's standard "less --version" indicates it was compiled
with POSIX regex, while Linux suppliers seem to all use GNU regex (which also
supports various perl-isms these days).

I think it would be nice to tweak the less package to be compiled with PCRE
regex.

..wayne..
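To check which regex library a given less binary was built with, as described above, the first line of `less --version` can be parsed. The following is only a sketch: the function names are hypothetical, and it assumes the version line names the library in parentheses, as in "less 530 (POSIX regular expressions)". The exact wording may differ between less releases.

```python
import re
import subprocess


def regex_flavor(version_line):
    """Return the regex library named in a `less --version` first line, or None.

    Assumes a line shaped like 'less 530 (POSIX regular expressions)';
    the wording may differ between less releases.
    """
    match = re.search(r"\(([^()]*?)\s+regular expressions\)", version_line)
    return match.group(1) if match else None


def installed_less_flavor():
    """Run `less --version` and report the regex library of the installed binary."""
    result = subprocess.run(
        ["less", "--version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return regex_flavor(lines[0]) if lines else None
```

Calling `installed_less_flavor()` on a build that reports POSIX regex suggests that `\s` may not be available there, so `[[:space:]]` is the safer spelling.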
Re: manpage searches "^\s+keyword\s" vs. ???
On 1/30/19 1:09 PM, Eric Blake wrote:
> \s is a non-standard regex extension - glibc provides it, Cygwin has not
> (at least, historically). POSIX provides [[:space:]] as a portable
> alternative (although not all libc have implemented all of POSIX yet),
> but it is annoyingly long to type.
>
> Similarly, BSD regex (which is where Cygwin derives its regex from)
> supports the non-standard regex extension [[:<:]] as a word boundary,
> while glibc has the same feature but spelled \<. I also seem to recall
> a patch in the past to teach Cygwin to respect \< by expanding it to
> [[:<:]] before calling into the BSD-derived code (although I couldn't
> actually find one in a quick search); a similar patch to expand \s into
> [[:space:]] would be a reasonable idea.

Found it:

https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/regex/regcomp.c;h=180f599c#l425

and indeed, Cygwin fakes \< and \> but NOT \s or \b (for those, you'd
have to submit a patch to that spot in regcomp.c).

>
>> I guess this is down to compilation options at this point.
>
> Not so much compilation options of man and less, but rather the code
> used in Cygwin itself for handling regex.

Also a good read:

https://stackoverflow.com/questions/9792702/does-bash-support-word-boundary-regular-expressions

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
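The expansion suggested above (turning \s into [[:space:]] before the BSD-derived code sees the pattern) can be sketched at the pattern-string level. This is not the regcomp.c patch itself, only an illustration in Python. The function names are hypothetical, and the sketch assumes POSIX bracket-expression rules, where a backslash inside [...] is a literal character, so bracket expressions are copied through unchanged. It does not handle \b or any other escape.

```python
def expand_space_escape(pattern):
    """Rewrite each \\s outside bracket expressions as [[:space:]]."""
    out = []
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # Consume the escape as a pair so that \\s (an escaped
            # backslash followed by a plain 's') is left alone.
            nxt = pattern[i + 1]
            out.append("[[:space:]]" if nxt == "s" else ch + nxt)
            i += 2
        elif ch == "[":
            # Copy a whole bracket expression through untouched.
            end = _bracket_end(pattern, i)
            out.append(pattern[i:end])
            i = end
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def _bracket_end(pattern, start):
    """Return the index just past the bracket expression opening at start."""
    i = start + 1
    n = len(pattern)
    if i < n and pattern[i] == "^":
        i += 1
    if i < n and pattern[i] == "]":  # a leading ']' is a literal member
        i += 1
    while i < n:
        if pattern[i] == "[" and i + 1 < n and pattern[i + 1] in ":.=":
            # Skip over [:class:], [.coll.] and [=equiv=] elements.
            close = pattern.find(pattern[i + 1] + "]", i + 2)
            if close == -1:
                return n
            i = close + 2
        elif pattern[i] == "]":
            return i + 1
        else:
            i += 1
    return n  # unterminated: copy the rest unchanged
```

For the pattern from this thread, `expand_space_escape(r"^\s+keyword\s")` yields `^[[:space:]]+keyword[[:space:]]`, which is the portable spelling that also works in Cygwin's less.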
Re: manpage searches "^\s+keyword\s" vs. ???
On Jan 30 13:09, Eric Blake wrote:
> On 1/30/19 12:40 PM, Andrey Repin wrote:
> >
> > I've been puzzled by this since… forever, it seems.
> > This is something in less, but all the `man less` says is "regular
> > expression library provided by your system".
>
> \s is a non-standard regex extension - glibc provides it, Cygwin has not
> (at least, historically). POSIX provides [[:space:]] as a portable
> alternative (although not all libc have implemented all of POSIX yet),
> but it is annoyingly long to type.
>
> Similarly, BSD regex (which is where Cygwin derives its regex from)
> supports the non-standard regex extension [[:<:]] as a word boundary,
> while glibc has the same feature but spelled \<. I also seem to recall
> a patch in the past to teach Cygwin to respect \< by expanding it to
> [[:<:]] before calling into the BSD-derived code (although I couldn't
> actually find one in a quick search); a similar patch to expand \s into
> [[:space:]] would be a reasonable idea.
>
> > I guess this is down to compilation options at this point.
>
> Not so much compilation options of man and less, but rather the code
> used in Cygwin itself for handling regex.

It's FreeBSD code, since we can't use glibc code for licensing reasons.

As usual: Patches welcome! (Even a complete replacement wouldn't hurt
as long as licensing is no issue.)

Corinna

--
Corinna Vinschen
Cygwin Maintainer
Re: manpage searches "^\s+keyword\s" vs. ???
On 1/30/19 12:40 PM, Andrey Repin wrote:
>
> I've been puzzled by this since… forever, it seems.
> This is something in less, but all the `man less` says is "regular expression
> library provided by your system".

\s is a non-standard regex extension - glibc provides it, Cygwin has not
(at least, historically). POSIX provides [[:space:]] as a portable
alternative (although not all libc have implemented all of POSIX yet),
but it is annoyingly long to type.

Similarly, BSD regex (which is where Cygwin derives its regex from)
supports the non-standard regex extension [[:<:]] as a word boundary,
while glibc has the same feature but spelled \<. I also seem to recall
a patch in the past to teach Cygwin to respect \< by expanding it to
[[:<:]] before calling into the BSD-derived code (although I couldn't
actually find one in a quick search); a similar patch to expand \s into
[[:space:]] would be a reasonable idea.

> I guess this is down to compilation options at this point.

Not so much compilation options of man and less, but rather the code
used in Cygwin itself for handling regex.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
Re: manpage searches "^\s+keyword\s" vs. ???
Greetings, L A Walsh!

> I've always used "^\s+keyword\s" as a way to search for some
> keyword starting a section.

Welcome to the club.

> On linux it still works, but on cygwin it doesn't like '\s' as
> symbol for white space.

> Any idea why there might be a difference?

> I note an option that could do similar in less -- '&pattern'
> turns OFF single special characters, I tried that on linux
> and it turned off the '\s' matching space. That's nice..um
> how about other way?

> Well didn't know if there might be some other op to go the
> other way, but didn't see anything.

> any ideas?

I've been puzzled by this since… forever, it seems.
This is something in less, but all the `man less` says is "regular expression
library provided by your system".
I guess this is down to compilation options at this point.

--
With best regards,
Andrey Repin
Wednesday, January 30, 2019 21:36:27

Sorry for my terrible english...
manpage searches "^\s+keyword\s" vs. ???
I've always used "^\s+keyword\s" as a way to search for some keyword
starting a section.

On linux it still works, but on cygwin it doesn't like '\s' as a symbol
for white space.

Any idea why there might be a difference?

I note an option that could do similar in less -- '&pattern' turns OFF
single special characters. I tried that on linux and it turned off the
'\s' matching space. That's nice..um how about the other way?

Well, I didn't know if there might be some other op to go the other way,
but I didn't see anything.

Any ideas?

thanks...