> Q: I'm assuming any glob patterns would implicitly be anchored to the end of
> the path string (as they are in bash)?

Yes. In ctags, '(<pattern>)' matches against file names, not path names,
like '.c.h'.

> Yes I know... In fact after originally looking at global and ctags
> I thought how potentially dangerous ctags's --force-language option
> was and that's why I called my extension suffixless_langmap.
> My intention was that this option wouldn't force anything but instead
> provide a default language when there wasn't a file suffix.
>
> For example, in project include directories you quite often get other
> artefacts like .c, .texi, .html (I know that these get excluded) and
> .inc files (MSVS). If the --force-language override option is used on
> those include directories then files with a suffix don't automatically
> get handled the way they should. Instead you'd possibly have to put in
> additional more specific --force-language overrides to reinstate default
> behaviour for certain extensions. E.g.:

You are right. It is an important point. You should be able to control
this finely. How about using a 'file list' instead of a direct path?

        --language-force=<lang>:<file list>

A file list is a file which lists file names, e.g.

[cppfiles]
+-----------------------------
|include/c++/4.8/algorithm
|include/c++/4.8/bits/stl_algo.h
|include/c++/5.1/algorithm

$ gtags --language-force=cpp:cppfiles

You can use the find(1) command to make a file list. This will satisfy
your request too, because find(1) has both glob and regex. :)

New priority:

[high]
1. --language-force=<lang>:<file list>
2. langmap=<lang>:<suffix or glob pattern list>
[low]

What do you think?

> If/when someone comes to work on this, my patch is probably still worth
> a look as 70-80% of it is done with respect to the proposal above.
> Either way some of it may be of use.

Thank you so much.
Regards,
Shigio

2016-10-05 4:09 GMT+09:00 Cooper, Anthony <[email protected]>:

> SECURITY CLASSIFICATION: OFFICIAL
>
> Good morning :-)
>
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]] On Behalf Of
> > Shigio YAMAGUCHI
> > Sent: 04 October 2016 01:19
> > To: Cooper, Anthony
> > Cc: [email protected]
> > Subject: Re: GNU Global Parsing Suffixless Files Patch
> >
> > Good morning :)
> > I understood regex version of --language-force is very powerful.
> > However, it seems too powerful for us to manage it completely.
> >
> > How about releasing the real path version and '()' syntax first?
> > It's simple and easy to understand, and is similar to ctags.
> > At the stage now, no one can judge whether regex version is needed,
> > because no one has used even the real path version.
> >
> > > E.g. If I had:
> > >   Default: \
> > >     :GTAGS_OPTIONS=--force-language=yacc\:(sys\$): \
> > >     --force-language='cpp\:(^\\./Microsoft Visual)':
> > >
> > > Then this would say match all files ending in sys and treat them as
> > > yacc and any suffixless files with a path starting with `./Microsoft
> > > Visual' are to be treated as cpp files.
> >
> > Using the real path version and '()' syntax, that is realized easily
> > like this:
> >
> > [gtags.conf]
> > :langmap=yacc\:(*sys):
> >
> > $ gtags --force-language='yacc:Microsoft Visual'
>
> A very minor point: the `Microsoft Visual' examples are different as my RE
> only matches at the head of the path.
>
> I guess I get nervous putting in more limited matching mechanisms inside
> an option that is designed to override the normal default/sane behaviour; I
> would like to be as precise as possible in my overrides. Also most would
> use the simple substring match, but regex's are there for edge cases that
> we haven't thought of. Most devs are comfortable with REs.
>
> Q: I'm assuming any glob patterns would implicitly be anchored to the end
> of the path string (as they are in bash)?
>
> > > One thing to note, made in the man page and help text, is this
> > > switch won't affect any files with a suffix, which some people might
> > > expect with `force' in the name of the switch.
> >
> > In ctags, --language-force option ignores suffixes. I'd like to follow
> > ctags method.
>
> Yes I know... In fact after originally looking at global and ctags I
> thought how potentially dangerous ctags's --force-language option was and
> that's why I called my extension suffixless_langmap. My intention was that
> this option wouldn't force anything but instead provide a default language
> when there wasn't a file suffix.
>
> For example, in project include directories you quite often get other
> artefacts like .c, .texi, .html (I know that these get excluded) and .inc
> files (MSVS). If the --force-language override option is used on those
> include directories then files with a suffix don't automatically get
> handled the way they should. Instead you'd possibly have to put in
> additional more specific --force-language overrides to reinstate default
> behaviour for certain extensions. E.g.:
>
>   --force-language=cpp:include --force-language=c:.c
>   --force-language=makefile:([Mm]akefile) ...
>
> However with REs you could be more selective in your initial
> --force-language setting and avoid the subsequent detailed extension
> overrides.
>
>   --force-language='cpp:(/include/(.*/)*[^/.]?$)'
>
> In a glob pattern as far as I'm aware there's no way of saying `select
> files not containing a period' :-(.
>
> >
> > $ ctags --language-force=c test.php # test.php is treated as C source file
> >
> > How about setting the following priority?
> > (This --language-force is the real path version)
> >
> > [high]
> > 1. --language-force=<lang>:<file>
> > 2. --language-force=<lang>:<directory>
> > 3. langmap=<lang>:<suffix or glob pattern list>
> > [low]
> >
> > e.g.
> > [gtags.conf]
> > :langmap=c\:.x([Mm]ake):
> >
> > $ gtags --language-force=perl:dir1 --language-force=php:php.x
> >
> > ./
> > |-dir1/
> > | |-test.x => perl by --language-force=perl:dir1
> > | |-Make   => perl by --language-force=perl:dir1
> > | |-php.x  => php by --language-force=php:php.x
> > |-dir2
> >   |-test.x => c by langmap=c\:.x([Mm]ake):
> >   |-Make   => c by langmap=c\:.x([Mm]ake):
>
> The priorities look fine to me.
>
> Whilst I think it's a _bit_ of a pity not to have REs for the reasons
> pointed out above, none of the issues are insurmountable with a glob
> implementation, just possibly less obvious? But more consistent as you say
> with ctags. So as you say start off with globs and see :-).
>
> Many thanks for being so helpful and constructive, it is appreciated as is
> Global.
>
> If/when someone comes to work on this, my patch is probably still worth a
> look as 70-80% of it is done with respect to the proposal above. Either way
> some of it may be of use.
>
> Regards,
>
> Tony.
>
> > > Did you correctly receive the new patch for 6.5.5?
> >
> > Sorry but I did not read that at all. I would like to discuss about
> > the specification not about the implementation.
> >
> > Regards,
> > Shigio
> >
> >
> > 2016-10-03 21:34 GMT+09:00 Cooper, Anthony <[email protected]>:
> >
> > > SECURITY CLASSIFICATION: OFFICIAL
> > >
> > > Good morning :-) (See comments below)
> > >
> > > -----Original Message-----
> > > From: [email protected] [mailto:[email protected]] On Behalf Of
> > > Shigio YAMAGUCHI
> > > Sent: 01 October 2016 00:17
> > > To: Cooper, Anthony
> > > Cc: [email protected]
> > > Subject: Re: GNU Global Parsing Suffixless Files Patch
> > >
> > > Before implementation, I would like to make clear the specification.
> > >
> > > > Assorted projects I've come across have include and Include (the
> > > > example below is a trivial but a real one relating to MS-Windows)
> > > > and some even have include dirs names XInclude or something similar
> > > > (can't remember the project now, wasn't X11 but probably an X client).
> > >
> > > Let me ask a couple of questions, please.
> > >
> > > Q1: Is the following (1) and (2) equal?
> > >
> > > (1) --language-force='cpp:([Ii]nclude)'
> > > (2) --language-force='cpp:include' --language-force='cpp:Include'
> > >
> > > If so, you think that (1) is better than (2) since it is shorter?
> >
> > Yes precisely. Although perhaps I gave a rather weak example. A
> > stronger case would be when differentiating between say:
> >   /usr/include/C++/4.8/algorithm
> >   /usr/include/C++/5.1/algorithm
> >   /usr/include/C++/..../algorithm
> > And:
> >   ./project/helper-programs/algorithm/sort/qsort <- script or binary
> >
> > Or to match:
> >   .../include/sys
> > But not:
> >   .../include/system_errors
> >
> > If I wanted to catch the first set of files in both example without
> > tripping up over the second then I could do
> > --language-force=cpp:(algorithm\$) and --language-force=cpp:(sys\$).
> >
> > > Q2: Does (1) above match to the followings?
> > >
> > > ./XXXincludeYYY/
> > > ./XXXincludeYYY.php
> > > ./project/include/release/
> > > ./project/include/release/test.php
> >
> > Yes. The matching is a dumb substring or regex match on the path
> > string available around where decide_lang() is called. No anchoring by
> > default.
> >
> > > Q3: Regex '^' and '$' are available? If so, what does they mean?
> >
> > Yes they are. `^' would mean start matching at the beginning of the
> > path and `$' would mean match the end of the path (particularly useful
> > for just picking up matches against a file name as directories in
> > themselves aren't processed beyond traversal).
> > File globbing doesn't make ^ and $ available and I have come across
> > other programs/situations where I have been frustrated by this for
> > want of a regex. E.g. If I had:
> >   Default: \
> >     :GTAGS_OPTIONS=--force-language=yacc\:(sys\$): \
> >     --force-language='cpp\:(^\\./Microsoft Visual)':
> > Then this would say match all files ending in sys and treat them as
> > yacc and any suffixless files with a path starting with `./Microsoft
> > Visual' are to be treated as cpp files.
> >
> > One thing to note, made in the man page and help text, is this switch
> > won't affect any files with a suffix, which some people might expect
> > with `force' in the name of the switch.
> >
> > Did you correctly receive the new patch for 6.5.5?
> >
> > Many thanks once again :-).
> >
> > Regards Tony.
> >
> > > Regards,
> > > Shigio
> > >
> > > --
> > > Shigio YAMAGUCHI <[email protected]>
> > > PGP fingerprint: D1CB 0B89 B346 4AB6 5663 C4B6 3CA5 BBB3 57BE DDA3
> > >
> > > ______________________________________________________________________
> > > This email has been scanned by the Symantec Email Security.cloud service.
> > > For more information please visit http://www.symanteccloud.com
> > > ______________________________________________________________________
> >
> > ****************************************************************************
> > Communications with GCHQ may be monitored and/or recorded
> > for system efficiency and other lawful purposes. Any views or
> > opinions expressed in this e-mail do not necessarily reflect GCHQ
> > policy. This email, and any attachments, is intended for the
> > attention of the addressee(s) only. Its unauthorised use,
> > disclosure, storage or copying is not permitted. If you are not the
> > intended recipient, please notify [email protected].
> >
> > This information is exempt from disclosure under the Freedom of
> > Information Act 2000 and may be subject to exemption under
> > other UK information legislation. Refer disclosure requests to
> > GCHQ on 01242 221491 ext 30306 (non-secure) or email
> > [email protected]
> >
> > ****************************************************************************
> >
> > --
> > Shigio YAMAGUCHI <[email protected]>
> > PGP fingerprint: D1CB 0B89 B346 4AB6 5663 C4B6 3CA5 BBB3 57BE DDA3

--
Shigio YAMAGUCHI <[email protected]>
PGP fingerprint: D1CB 0B89 B346 4AB6 5663 C4B6 3CA5 BBB3 57BE DDA3
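
To illustrate the glob-versus-regex point raised in the quoted thread, here is a minimal Python sketch. It is not GNU Global or ctags code: Python's fnmatch and re modules stand in for the two matching styles, the sample paths are hypothetical, and the regex is a variant of the quoted pattern with `+` in place of `?`, which is an assumption about the intent. The glob is applied to the file name only, as '(<pattern>)' is described in the reply.

```python
import fnmatch
import os
import re

# Hypothetical sample paths; none of them come from a real project.
paths = [
    "./usr/include/c++/4.8/algorithm",
    "./usr/include/c++/4.8/bits/stl_algo.h",
    "./project/include/sys",
    "./project/include/system_errors",
    "./project/helper/algorithm/sort/qsort",
]

# Regex on the whole path: a file below an include directory whose final
# component contains no period.
# Assumption: `[^/.]+` replaces the `[^/.]?` of the quoted pattern.
suffixless_in_include = re.compile(r"/include/(.*/)*[^/.]+$")


def glob_on_name(path, pattern):
    """Apply a glob to the file name only, never to the directory part."""
    return fnmatch.fnmatchcase(os.path.basename(path), pattern)


regex_hits = [p for p in paths if suffixless_in_include.search(p)]
glob_hits = [p for p in paths if glob_on_name(p, "*sys")]

print("regex:", regex_hits)
print("glob :", glob_hits)
```

In this sketch the regex selects the suffixless files under include/ and skips stl_algo.h, while the file-name glob `*sys` selects only the file named sys and cannot take the directory into account.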
_______________________________________________
Bug-global mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-global
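
The "New priority" proposed in the reply (file list first, then langmap) can be sketched in Python as follows. This is only an illustration under stated assumptions, not how gtags implements it: the function names `load_file_list` and `pick_language` and the in-memory shape of the langmap are hypothetical, and the sample data reuses the [cppfiles] list and the `c\:.x([Mm]ake)` langmap from the thread.

```python
import fnmatch
import os


def load_file_list(lines, lang):
    """Map every path named in a file list to the forced language (hypothetical helper)."""
    return {os.path.normpath(line.strip()): lang for line in lines if line.strip()}


def pick_language(path, forced, langmap):
    """Return the language for `path`, or None if nothing matches.

    forced  -- dict built by load_file_list(); highest priority
    langmap -- list of (lang, suffixes, name_globs); lower priority
    """
    path = os.path.normpath(path)
    # 1. --language-force=<lang>:<file list> wins, whatever the suffix is.
    if path in forced:
        return forced[path]
    # 2. langmap: suffixes and globs are tested against the file name only.
    name = os.path.basename(path)
    suffix = os.path.splitext(name)[1]
    for lang, suffixes, name_globs in langmap:
        if suffix and suffix in suffixes:
            return lang
        if any(fnmatch.fnmatchcase(name, g) for g in name_globs):
            return lang
    return None


# Contents of the [cppfiles] example from the message.
forced = load_file_list(
    [
        "include/c++/4.8/algorithm",
        "include/c++/4.8/bits/stl_algo.h",
        "include/c++/5.1/algorithm",
    ],
    "cpp",
)
# Assumed in-memory form of langmap=c\:.x([Mm]ake)
langmap = [("c", {".x"}, ["[Mm]ake"])]

for p in ["include/c++/4.8/algorithm", "dir2/test.x", "dir2/Make", "dir2/README"]:
    print(p, "=>", pick_language(p, forced, langmap))
```

Because the forced table is consulted before the langmap, a listed file keeps its forced language even when its suffix would map elsewhere, and unlisted files fall through to the normal suffix and glob handling.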
