Re: [PATCH 1/2] gcc symbol database

2012-09-05 Thread Yunfeng ZHANG
What progress about my patch?


Re: [PATCH 1/2] gcc symbol database

2012-08-12 Thread Dodji Seketeli
Hello Yunfeng,

Thank you for following up, and sorry for me reviewing your patches so
lately.  The libcpp changes are coming along nicely, IMHO.  I like the
fact that they are getting pretty minimal.  I just have a few mostly
cosmetic comments at this point.

[...]

> diff -cpr .pc/symdb_enhance_libcpp/libcpp/include/cpplib.h 
> libcpp/include/cpplib.h
> *** .pc/symdb_enhance_libcpp/libcpp/include/cpplib.h  Wed Jul 25 10:57:16 2012
> --- libcpp/include/cpplib.h   Wed Jul 25 10:57:31 2012
> *** struct cpp_callbacks
> *** 526,531 
> --- 526,548 
>be expanded.  */
> cpp_hashnode * (*macro_to_expand) (cpp_reader *, const cpp_token *);
>   
> +   /* The more powerful function extracts token than cpp_get_token. Later
> +* callbacks show it. */
> +   void (*lex_token) (cpp_reader *, const cpp_token*);

For the sake of clarity, I'd change the signature and comment to
something more classically informative like (note that there should be
two spaces after between the '.' and the */):

/* This callback is invoked when a token is lexed.  The RESULT
   parameter represents the lexed token.  */
void (*lex_token) (cpp_reader *, const cpp_token *result);

> +   /* The pair is called when gcc expands macro. Enable lex_token in
> +* macro_start_expand, you can catch all macro tokens. The pair can be 
> called
> +* nested, and the second parameter of macro_end_expand notifies you 
> whether
> +* macro is still expanding. */
> +   void (*macro_start_expand) (cpp_reader *, const cpp_token *,
> +   const cpp_hashnode *);

Likewise, I'd change this into something like what comes below.  Also,
please try to use the same style as the rest of the comments of the
file; for instance, do not start each line of comment with a '*'; put
two spaces after each '.':

/* This callback is invoked whenever a function-like macro
   represented by MACRO_NODE is about to be expanded.  MACRO_TOKEN
   is the token for the macro name and MACRO_NODE represents the
   macro.  */
void (*macro_start_expand) (cpp_reader *, 
const cpp_token *macro_token,
const cpp_hashnode *macro_node);

> +   void (*macro_end_expand) (cpp_reader *, bool);

I believe that you don't need the last parameter, because this
callback should be called only when we know the macro *has* been
expanded.  So later down in _cpp_pop_context, I'd change:

*** _cpp_pop_context (cpp_reader *pfile)
*** 2245,2250 
--- 2250,2257 
/* decrease peak memory consumption by feeing the context.  */
pfile->context->next = NULL;
free (context);
+   if (pfile->cb.macro_end_expand)
+ pfile->cb.macro_end_expand (pfile, in_macro_expansion_p (pfile));
  }

into:

if (in_macro_expansion_p (file) && pfile->cb.macro_end_expand)
  pfile->cb.macro_end_expand (pfile);

The callback that you actually set in your application, in so.c (in
gcc.tgz) would then change from:

static void
cb_macro_end (cpp_reader * pfile, bool in_expansion)
{
  cpp_callbacks *cb = cpp_get_callbacks (pfile);
  if (!in_expansion)
{
  cache_end_let ();
  mo_leave ();
  cb->lex_token = NULL;
  cb->macro_end_expand = NULL;
}
}

to:

static void
cb_macro_end (cpp_reader * pfile)
{
  cpp_callbacks *cb = cpp_get_callbacks (pfile);
  cache_end_let ();
  mo_leave ();
  cb->lex_token = NULL;
  cb->macro_end_expand = NULL;
}

And for the macro_end_expand, I'd add a comment and change the signature to:

/* This callback is invoked whenever the macro previously notified by
   the macro_start_expand has been expanded.  */
void (*macro_end_expand) (cpp_reader *);

> +   /* The pair is called when cpp directive (starting from `#', such as
> +* `#define M', `#endif' etc) is encountered and reaches end. When enable
> +* lex_token in start_directive, the sequence is lex_token("define"),
> +* lex_token("M") ... */
> +   void (*start_directive) (cpp_reader *);

I'd change the comment to:

/* This callback is invoked whenever a preprocessor directive (such as
   #define M) is encountered and about to be handled.  */

> +   void (*end_directive) (cpp_reader *);
> +

And I'd add a comment here that roughly reads:

/* This callback is invoked when the directive previously notified
by a call to the start_directive callback has been handled.
Information about the directive is accessible at the field
PFILE->directive.  */
void (*end_directive) (cpp_reader *pfile);

[...]

>   static _cpp_buff *
>   funlike_invocation_p (cpp_reader *pfile, cpp_hashnode *node,
> !   _cpp_buff **pragma_buff, unsigned *num_args)
>   {
> const cpp_token *token, *padding = NULL;
>   
> --- 963,970 
>  returned buffer.  */
>   static _cpp_buff *
>   funlike_invocation_p (cpp

Re: [PATCH 1/2] gcc symbol database

2012-07-19 Thread Dodji Seketeli
Yunfeng ZHANG  writes:

> Hi Dodji Seketeli:
>> This is what I was trying to tell you in my previous message, when I was
>> saying:
>>
>> > _cpp_pop_context is really the function that marks the end of a
>> > given macro expansion, especially when the predicate
>> > in_macro_expansion_p (introduced recently in trunk for gcc 4.8)
>> > returns true.
>
> So which version I will worked on if I want to release new patch, 4.8
> trunk?

Yes, please.  I think new development of this sort should happen in
trunk, as much as possible.

Thank you for your time.

-- 
Dodji


Re: [PATCH 1/2] gcc symbol database

2012-07-19 Thread Yunfeng ZHANG
Hi Dodji Seketeli:
> This is what I was trying to tell you in my previous message, when I was
> saying:
>
> > _cpp_pop_context is really the function that marks the end of a
> > given macro expansion, especially when the predicate
> > in_macro_expansion_p (introduced recently in trunk for gcc 4.8)
> > returns true.

So which version I will worked on if I want to release new patch, 4.8 trunk?

Yunfeng


Re: [PATCH 1/2] gcc symbol database

2012-07-18 Thread Yunfeng ZHANG
To Dodji Seketeli:

Thanks for you check my patch, I will release it again later.

Yunfeng


Re: [PATCH 1/2] gcc symbol database

2012-07-18 Thread Dodji Seketeli
Yunfeng ZHANG  writes:

>> It took me a couple of minutes to understand what you meant here, so
>> please let me re-phrase to make sure I got it.
>>
>> You are saying that the callback function of the cb_lex_token event is
>> set by the callback function of the macro_start_expand event.
>>
>> Is that correct?
>
> Yes.

Thank you for making this clear.

>> And this makes me wonder why you'd need the second parameter of
>> macro_start_expand (the token).  I believe you should have all the
>> information you need with the first, second, and last parameter.  As I
>> said in my previous email, you can get the file offset of the token in
>> your client code by doing 'file_offset = line + column'.  So that token
>> should not be needed.  Thus, calling macro_start_expand from inside
>> funlike_invocation_p once you are sure the expansion of the macro is
>> going to take place, is possible.
>
> The only thing is the file-offset or source_location of the macro
> leader token.

To try to avoid confusion, I think what you call "source_location of the
macro leader token" is actually the spelling location of the macro.

> I don't know how to get it when macro_start_expand is called in
> funlike_invocation_p intern.

[...]

> BTW, I can change my plugin to use line/column instead of fileoffset, there
> ins't design limitation, only time.

I think you can add a new source_location parameter to
funlike_invocation_p function, and pass it the result->src_loc that you
need.

-- 
Dodji


Re: [PATCH 1/2] gcc symbol database

2012-07-17 Thread Yunfeng ZHANG
> It took me a couple of minutes to understand what you meant here, so
> please let me re-phrase to make sure I got it.
>
> You are saying that the callback function of the cb_lex_token event is
> set by the callback function of the macro_start_expand event.
>
> Is that correct?

Yes.

> And this makes me wonder why you'd need the second parameter of
> macro_start_expand (the token).  I believe you should have all the
> information you need with the first, second, and last parameter.  As I
> said in my previous email, you can get the file offset of the token in
> your client code by doing 'file_offset = line + column'.  So that token
> should not be needed.  Thus, calling macro_start_expand from inside
> funlike_invocation_p once you are sure the expansion of the macro is
> going to take place, is possible.

The only thing is the file-offset or source_location of the macro leader token.
I don't know how to get it when macro_start_expand is called in
funlike_invocation_p intern. The parameter `result' of enter_macro_context
doesn't be delivered down. And file_offset = sum(`all former
column_total_count' + current column_number'), so it can't be deduced from
current line_map + source_location desgin.

BTW, I can change my plugin to use line/column instead of fileoffset, there
ins't design limitation, only time.

Yunfeng


Re: [PATCH 1/2] gcc symbol database

2012-07-17 Thread Dodji Seketeli
Yunfeng ZHANG  writes:

> Please allow me to resend former sample:
> #define Z(a) a
> #define Y Z
> #define X(p) p + Y
> X(1)(2);
> The flow is:
> 1) `X' -- leader macro token by macro_start_expand.
> 2) `(', `1', `)' -- macro tokens, by cb_lex_token.
> 3) macro_end_arg.
> 4) `1', `+' -- macro replacement tokens, by symdb_cpp_token.
> 5) `(', `2', `)' -- macro tokens, by cb_lex_token.
> 6) macro_end_arg.
> 7) `2' -- macro replacement tokens, by symdb_cpp_token.
> 8) macro_end_expand.
>
> The thing I emphasized here is cb_lex_token is set by macro_start_expand
> intern -- it isn't valid anytime.

It took me a couple of minutes to understand what you meant here, so
please let me re-phrase to make sure I got it.

You are saying that the callback function of the cb_lex_token event is
set by the callback function of the macro_start_expand event.

Is that correct?

> So
> buff = funlike_invocation_p (pfile, node, &pragma_buff,
> ...
> if (buff == NULL)
>   {
> ...
>   }
> if macro_start_expand is moved to the clause block `buff != NULL', it's too
> later to set cb_lex_token because funlike_invocation_p has read some macro
> tokens.

Then for function-like macros, you can move the call to
macro_start_expand into funlike_invocation_p, right before it calls
collect_args (collect_args is the function that reads the macro
arguments).

And this makes me wonder why you'd need the second parameter of
macro_start_expand (the token).  I believe you should have all the
information you need with the first, second, and last parameter.  As I
said in my previous email, you can get the file offset of the token in
your client code by doing 'file_offset = line + column'.  So that token
should not be needed.  Thus, calling macro_start_expand from inside
funlike_invocation_p once you are sure the expansion of the macro is
going to take place, is possible.

For non-function-like macros, you can call macro_start_expand after the block

  if (macro->fun_like)
{

inside enter_macro_context.

> Of course I can remove macro_end_arg totally, because from the sample, it's 
> the
> fact that macro tokens aren't always before any macro replacement tokens. But
> macro_end_expand must be prepared to deal with cancel case.

I still think that with the plan above, you don't need any cancel case
because macro_start_expand is going to be called only when we know the
macro is going to be expanded.

-- 
Dodji


Re: [PATCH 1/2] gcc symbol database

2012-07-17 Thread Yunfeng ZHANG
Please allow me to resend former sample:
#define Z(a) a
#define Y Z
#define X(p) p + Y
X(1)(2);
The flow is:
1) `X' -- leader macro token by macro_start_expand.
2) `(', `1', `)' -- macro tokens, by cb_lex_token.
3) macro_end_arg.
4) `1', `+' -- macro replacement tokens, by symdb_cpp_token.
5) `(', `2', `)' -- macro tokens, by cb_lex_token.
6) macro_end_arg.
7) `2' -- macro replacement tokens, by symdb_cpp_token.
8) macro_end_expand.

The thing I emphasized here is cb_lex_token is set by macro_start_expand
intern -- it isn't valid anytime. So
buff = funlike_invocation_p (pfile, node, &pragma_buff,
...
if (buff == NULL)
  {
...
  }
if macro_start_expand is moved to the clause block `buff != NULL', it's too
later to set cb_lex_token because funlike_invocation_p has read some macro
tokens.

Of course I can remove macro_end_arg totally, because from the sample, it's the
fact that macro tokens aren't always before any macro replacement tokens. But
macro_end_expand must be prepared to deal with cancel case.

BTW, currently it isn't necessary to my plugin to collect macro tokens, so if
gcc doesn't support collect macro token event, macro_start_expand and
macro_end_expand can be moved into enter_macro_context and _cpp_pop_context
individually.

Yunfeng


Re: [PATCH 1/2] gcc symbol database

2012-07-16 Thread Dodji Seketeli
Yunfeng ZHANG  writes:

>> But the "meaning" of the macro_end_arg event is really not clear to
>> ...
>>  and so on.
>
> Let's see a sample:
> #define Z(a) a
> #define Y Z
> #define X(p) p + Y
> X(1)(2);
> With my solution, user get
> 1) `X' -- leader macro token by macro_start_expand.
> 2) `(', `1', `)', `(', `2', `)' -- macro tokens, by cb_lex_token.
> 3) macro_end_arg.
> 4) `1', `+', `2' -- macro replacement tokens, by symdb_cpp_token.
> 5) macro_end_expand.

I see.  So the meaning of macro_end_arg is something like:

"The tokens that were just reported by the lex_token event were
 arguments to a function-like macro".

Is that correct?  

If yes, then I guess you can keep that macro_end_arg event, but then I'd
remove its "cancel" argument by doing what I said in my previous
messages.  Which is, don't call macro_start_expand if X is not going to
be expanded.  To do that, move macro_start_expand in
enter_macro_context, like I proposed there.

Of course, that would not change the sequence of the events you'd get in
the example above.  But in this example below:

 1  #define Y 1
 2  
 3  int
 4  Y (int a)
 5  {
 6return a + 1;
 7  }
 8  
 9  int
10  main ()
11  {
12return Y(a);
13  }

When parsing Y(a), what I am proposing won't emit a
macro_start_expansion at line 12, but with your implementation, it will
emit a macro_start_expansion, followed by a macro_end_arg with the
cancel argument set to true.  Which is unnecessary if the
macro_start_expansion is not called at all there.

Of course, you'd have to change the hunk:

*** enter_macro_context (cpp_reader *pfile, 
*** 1015,1020 
--- 1015,1022 
  pfile->state.parsing_args = 1;
  buff = funlike_invocation_p (pfile, node, &pragma_buff,
   &num_args);
+ if (pfile->cb.macro_end_arg)
+   pfile->cb.macro_end_arg (pfile, buff == NULL);
  pfile->state.parsing_args = 0;
  pfile->keep_tokens--;
  pfile->state.prevent_expansion--;

so that "pfile->cb.macro_end_arg (pfile)" is not invoked when
funlike_invocation_p returns NULL.

Does that make sense?

If yes, the comment of that event should be adapted to have a more
meaningful meaning.  I believe the comments of the other events should
be adapted to, in the light of the description that I did in my previous
message.

>> Please reply to these questions so that I understand why you need this
>> field at all, rather than getting it from the linemap infrastructure
>> by calling, for instance, linemap_expand_location.
>
> linemap_expand_location can return the column and row of a token, not file
> offset of its, there's no more explanation why using token file-offset, when I
> started my project, I think file-offset is better than current
> linemap+source_location because it costed less time to encode/decode
> source_location field and can act just like previous solution,

What I meant was that if you prefer storing file offsets in the database
of your plugin, rather than line/column, then why not saying that it's
equal to something like (line + column), instead of adding a new field
in libcpp itself.

> it's an improvement to gcc too, with it gcc can store
> symbol+fileoffset to elf intern for ld/gdb usage, of course,

I wouldn't be so sure.  DWARF already has a sophisticated and compact
way to store line/column information, so I don't think the file offset
gives us anything in practice.

> my database can benefit from it -- less space and use it as sort field
> directly.

Sure, but then you can just compute it from the result of
linemap_expand_location, as I said above.

> The only thing is it make challenge on gcc infrastructure, so I leave
> it to a seperate patch called gcc_negotiate_patch and hope to discuss
> the first two patches only.

If you are not convinced, then OK, let's leave that part aside from now.

> And to the field file-offset, when the token is macro-replacement token, I
> recommend token.file_offset = -1 * .file_offset. I think
> gdb is happy to this.

Right now, for tokens resulting from macro expansion, the location that
is encoded in debug information is already the location of the expansion
point of the macro.  So we already do what you say, in a sense.

Cheers.

-- 
Dodji


Re: [PATCH 1/2] gcc symbol database

2012-07-16 Thread Yunfeng ZHANG
> But the "meaning" of the macro_end_arg event is really not clear to
> ...
>  and so on.

Let's see a sample:
#define Z(a) a
#define Y Z
#define X(p) p + Y
X(1)(2);
With my solution, user get
1) `X' -- leader macro token by macro_start_expand.
2) `(', `1', `)', `(', `2', `)' -- macro tokens, by cb_lex_token.
3) macro_end_arg.
4) `1', `+', `2' -- macro replacement tokens, by symdb_cpp_token.
5) macro_end_expand.
Item 2 is very important since if `X(1)' doesn't return any macro replacement
tokens, then `(2)' is a legal clause.
So please review my former mail again.

And it seems in gcc-4.8, gcc can trace several contiguous macro expansion
contexts can be associated to the same macro, so macro cascaded case has
disappeared.

> Please reply to these questions so that I understand why you need this
> field at all, rather than getting it from the linemap infrastructure
> by calling, for instance, linemap_expand_location.

linemap_expand_location can return the column and row of a token, not file
offset of its, there's no more explanation why using token file-offset, when I
started my project, I think file-offset is better than current
linemap+source_location because it costed less time to encode/decode
source_location field and can act just like previous solution, it's an
improvement to gcc too, with it gcc can store symbol+fileoffset to elf intern
for ld/gdb usage, of course, my database can benefit from  it -- less space and
use it as sort field directly. The only thing is it make challenge on gcc
infrastructure, so I leave it to a seperate patch called gcc_negotiate_patch
and hope to discuss the first two patches only.

And to the field file-offset, when the token is macro-replacement token, I
recommend token.file_offset = -1 * .file_offset. I think
gdb is happy to this.

Sincerely
Yunfeng


Re: [PATCH 1/2] gcc symbol database

2012-07-13 Thread Dodji Seketeli
> I'm sorry recent weeks I've got busy

No problem.  I am sorry to reply this late to your message as well.

> Later is the response from previous mail.

Please do not take this bad, but it will really ease communication (and
the review) if, when you reply to this message, you write your answers
*below* the parts of you are replying to, and remove the parts (of my
message) you are not replying to.  I believe this is a good habit to
have at least when you are interacting with the GCC mailing lists.

Now here are my comments on your patches.

> -
> macro_end_expand can't be moved to _cpp_pop_context. My plugin shows later
> results when `Gs callee _cpp_pop_context'.
>   GS multiple definition:
>   1)libcpp/directives.c 8130 skip_rest_of_line DEF_FUNC
>   2)libcpp/macro.c 67161 expand_arg DEF_FUNC
>   3)libcpp/macro.c 72900 cpp_get_token_1 DEF_FUNC
>   4)libcpp/traditional.c 10560 _cpp_scan_out_logical_line DEF_FUNC

I still think that the macro_end_expand can and should be moved to
_cpp_pop_context, even though that function is called from multiple
places.  That is, in the 'if' block:

  if (context->c.macro)
{

you just have to write something like:

if (in_macro_expansion_p)
  /* We are at the end of the expansion of a macro.  */
  macro_end_expand (...);

This is what I was trying to tell you in my previous message, when I was
saying:

> _cpp_pop_context is really the function that marks the end of a
> given macro expansion, especially when the predicate
> in_macro_expansion_p (introduced recently in trunk for gcc 4.8)
> returns true.

Am I missing something?

> So macro_start_expand can't be moved to enter_macro_context since the pair
> should be matched in the same function cpp_get_token_1.

In the light of what I said above, I think macro_start_expand should
be moved to enter_macro_context, if you agree with my previous
statements.


> 1) macro_start_expand and macro_end_arg are used to tag all macro
  tokens.

I think there are some concepts that need to be a bit more clearly
stated here, otherwise we'll be talking past each other.

You are basically making the pre-processor emit events at some useful
points of its operations.  And then, your client code (the code of
your plugin) reacts whenever these useful events are emitted.  It
reacts to these events to accomplish some useful tasks.

The tasks your wants to accomplish here, somehow, is to mark the tokens
that results from the expansion of a macro.

Right?

I think it will make the code more maintainable to have the events be
emitted at points that are "well defined".  And this is where I am
having issues with your patch.

So, it seems clear to me that macro_start_expand is an event that is
emitted at the beginning of the expansion of a macro.

But the "meaning" of the macro_end_arg event is really not clear to
me.  From your code, it looks like it's a hack that lets you detect if
the macro_start_expand event really was emitted in cases where we are
sure that the macro is going to be expanded.

Let me explain a bit more.

When a macro M is encountered, enter_macro_context triggers its
expansion; but sometimes it does not.  And what I understand is that, in
an ideal world, you want to call macro_start_expand only in cases where
enter_macro_context actually triggers the expansion.  That way, your
plugin can use the tokens of the replacement-list of M (passed to
macro_start_expand) as it sees fit.

But in this patch, you are going a hackish route by unconditionally
calling macro_start_expand whenever M is encountered (right after
enter_macro_context is called), and if enter_macro_context does *not*
trigger the expansion of M, you tell your plugin to "back out", by
calling macro_end_arg with its 'cancel' parameter set to a 'true'
boolean.  Otherwise, if enter_macro_context happens to really have
expanded M, you are calling macro_end_arg with its 'cancel' parameter
set to a 'false' value, effectively telling your plugin "OK, M was
really expanded".

And what I was saying in my previous email is that you could arrange
to emit the macro_start_expand event *only* when you are sure that
enter_macro_context is really going to expand M.

That is why I was saying in my previous message:

> IMHO, it's more maintainable to put the call to
> pfile->cb.macro_start_expand inside enter_macro_context, as that
> is the function that really triggers the macro expansion.

And then, I told you where to put that call in enter_macro_context so
that you are sure to emit the event only when you are sure
enter_macro_context is going to expand M (and I am repeating it here
again):

> Maybe if you put the call to macro_start_expand in this
> enter_macro_context function only after we are sure we are going
> to actually proceed with the expansion 
>
>>            return 0;
>>          }
>
> ... here, then you wouldn't need the macro_end_arg at
> all.  W

Re: [PATCH 1/2] gcc symbol database

2012-06-27 Thread Yunfeng ZHANG
Hi Dodji Seketeli:

Is it possible to gcc to accept libcpp.patch and plugin.patch?

I recently rewrite my doc.txt which mainly add a new section , it's focused on pfile.context usage linking to
macro. I think it's important to use cb_macro_start/end callbacks
because most users only care about the outest macro expansion, test
whether pfile.context.prev == NULL, however if he hasn't known macro
cascaded case, his code will crashed.

Sincerely
 Yunfeng
// vim: foldmarker=<([{,}])> foldmethod=marker
 Gcc symbol database (symdb)
 zyf.zer...@gmail.com
  November 24, 2009
   revised on May 24, 2012

// Purpose <([{
The file is used to record the idea I got -- collecting gcc internal data
(definition, file-dependence etc.) and outputting them into database for
further usage.  Have you knowed cscope? but I think it's more appropriate that
symbols should be collected by gcc itself. Later sections can be cataloged
into two genres

For user (here user is IDE-like develop tools, not final user)
1) Need to know what symdb can do, goto section .
2) Goto section  for how to using symdb.
3)  and  have more about the plugin.
For gcc internal developer
1) Section  defines some new token types used in my symdb.
2) Sections  shows some complex cases linking to
macro expansion, I list calling sequence from plugin-side and stack snapshot
from gcc-side in every section, read them carefully, it's the key to
understand so.c:class mo.  test/testplan.txt and test/macro have the
testcases.
3) Section  makes focus on which files and how are changed in
the patch.

Before we go, let's clear up some terminology or abbreviation used in my symdb
1) cpp abbreviates from c preprocess (follows gcc intern convention); however,
cxx represents c++.
2) In gcc/c-ppoutput.c, gcc defines compilation unit as compiling a file, new
noun `compilation session' means compiling all files of a project.
// }])>

For User
// Feature List <([{
1) The plugin only works on C not C++.
2) Plugin can collect all extern definitins and dump them to database.
3) As the convention, the members of an extern enum are collected and dumped
to database.
4) Funtion call relationship are collected too, just like cscope.
5) You can use table FileDependence of database to reconstruct file dependence
relationship.
6) Not in cscope, you can use `gs addsym/rmsym' to re-edit the database,
remove the duplicate results and append new symbols to the database, see
section .
7) I finished a vim script `helper.vim' to help you using the database in vim.
8) My plugin is better than cscope in any cases, since I can catch definition
after macro expansion (such as tell you where `sys_open' is defined in linux
source) and skip `#ifdef/#if'.
// }])>

// User Manual <([{
Note: Using my plugin on correct code, buggy code maybe cause my plugin
infinite-loop.

Prepare stage (patch on gcc-4.6.3):
1) cp gcc.patches/* gcc.src/patches
2) quilt push -a
3) make # as usual
More about compilation suite (such as crosstool-ng-1.13.2):
1) Since gcc plugin is implemented as shared library, so disable compiling
static toolchain option.
2) add `--enable-plugin' to your gcc configure line, or append
`CT_CC_GCC_ENABLE_PLUGINS=y' to your crossng.config.
3) See section  for a sample command line on gcc-4.6.3.

Compiling source by patched gcc (cd myplugin.src/):
1) make
2) cp gs helper.vim init.sql target.src/ && cd target.src/
3) ./gs initdb ./ # Initialize database. If you want to custom the plugin,
update plugin-control-fields of database:ProjectOverview.
4) Append `-fplugin=/path/to/symdb.so
-fplugin-arg-symdb-dbfile=/target.src/gccsym.db' to your CFLAGS.
5) Since sqlite uses file-lock to synchronize database, so use `make -j1' to
compile source.
6) It will cost more time to compile your project, because my plugin need
compare whether a token has been inserted into database and multi-core can't
help you -- see previous steps, do it overnight.
7) ./gs vacuumdb ./ # Rearrange and defrag your database.
Of course, you can use some short-cuts to compile your projects without any
modification.
alias gcc='gccplugin -fplugin=symdb.so -fplugin-arg-symdb-dbfile=gccsym.db'
alias make='make -j1'

Working with new database:
1) cd /target.src
2) vi
3) execute `:source helper.vim'
4) Using `CTRL-]' to search a definition.
5) Using `CTRL-[' to search which functions calls the function.
6) Using `CTRL-T' to jump back.
7) Using `Gs def yoursymbol' to search a definition.
8) Using `Gs callee yourfunction' to search function call relationship.

Vim quickref:
Since my database stores the file-offset of every token, so
1) Using `:go fileoffset' to jump to the token.
2) Using `g' on the char to get the file-offset.

Testing my plugin:
1) cd test && ./run.sh
// }])>

// Multiple-results from my database and my solution <([{
Things are not always perfect, here

Re: [PATCH 1/2] gcc symbol database

2012-06-04 Thread Yunfeng ZHANG
To Dodji Seketeli:
Thanks, I will republish my libcpp patch later.

2012/6/4 Dodji Seketeli :
> Hello YunFeng,
>
> Thank you for taking the time to work on this.  I cannot accept or
> deny your patches, but I thought I could maybe comment on some parts
> of them.
>
> First, IMHO, some of the new plugin events you are proposing to add to
> libcpp seem to make sense, at least so that people can write
> etags/ctags-like cross-reference tools like the one you are proposing.
>
> For now, I will not comment on the choices you've made for the
> cross-reference tool itself, even though they are interesting to see,
> at least to understand why you need the libcpp events you are
> proposing.
>
> Thus, I'll start with the libcpp changes.
>
> [...]
>
>> diff -upr .pc/symdb_enhance_libcpp/libcpp/directives.c libcpp/directives.c
>
> [...]
>
>> @@ -492,6 +492,8 @@ _cpp_handle_directive (cpp_reader *pfile
>>    else if (skip == 0)
>>      _cpp_backup_tokens (pfile, 1);
>>
>> +  if (pfile->cb.end_directive)
>> +    pfile->cb.end_directive (pfile);
>
> It seems to me that there are possibly several places where we handle
> directives.  In those places, the function start_directive and
> end_directive are called.  So IMHO, it would seems more appropriate
> and maintainable to call pfile->cb.start_directive in the
> start_directive function, and likewise, call pfile->cg.end_directive
> in the end_directive function.
>
> You would then remove the call below from _cpp_lex_token:
>
> @@ -1886,6 +1889,8 @@ _cpp_lex_token (cpp_reader *pfile)
>          handles the directive as normal.  */
>           && pfile->state.parsing_args != 1)
>         {
> +          if (pfile->cb.start_directive)
> +        pfile->cb.start_directive (pfile, result);
>           if (_cpp_handle_directive (pfile, result->flags & PREV_WHITE))
>         {
>           if (pfile->directive_result.type == CPP_PADDING)
>
> [...]
>
>
>> +++ libcpp/include/cpplib.h    2012-05-25 14:56:56.745507332 +0800
>> @@ -218,10 +218,10 @@ struct GTY(()) cpp_identifier {
>>         node;
>>  };
>>
>> -/* A preprocessing token.  This has been carefully packed and should
>> -   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
>> +/* A preprocessing token. */
>>  struct GTY(()) cpp_token {
>>    source_location src_loc;    /* Location of first char of token.  */
>> +  int file_offset;
>
> Enlarging this struct that is *widely* used might not be accepted by
> the maintainers, because of the memory toll it takes.  Or you'd need
> to prove that you cannot do otherwise.
>
> So why is this file_offset member needed?
>
> From the rest of the patch, it looks like you are using it to store
> the offset of the token, from the beginning of its line.  What is the
> difference between that, and the column number of the token, as you
> could get by calling linemap_expand_location on the src_loc of the
> token, effectively doing away with this new file_offset member?  Note
> that the linemap_expand_location call was introduced in 4.7.
>
> [...]
>
>> +++ libcpp/macro.c    2012-05-25 14:56:56.749508416 +0800
>> @@ -1029,9 +1029,13 @@ enter_macro_context (cpp_reader *pfile,
>>            if (pragma_buff)
>>          _cpp_release_buff (pfile, pragma_buff);
>>
>> +          if (pfile->cb.macro_end_arg)
>> +        pfile->cb.macro_end_arg (pfile, true);
>
> Presumably, if we reach this point, I believe it means there was no
> argument to what seemed to be a function-like macro.  Was it?
>
> Thus, I don't understand why calling this pfile->cb.macro_end_arg call
> achieves.  Actually, I think I don't understand the meaning of that
> event callback, to begin with.  I have read and re-read the comment
> below:
>
>> @@ -522,6 +522,24 @@ struct cpp_callbacks
>>       be expanded.  */
>
> [...]
>
>> +  /* Called when a function-like macro stops collecting macro parameters,
>> +   * cancel = true, macro expansion is canceled. */
>> +  void (*macro_end_arg) (cpp_reader *, bool cancel);
>
> and I don't understand why you need the information conveyed by this
> macro_end_arg event.
>
> Maybe if you put the call to macro_start_expand in this
> enter_macro_context function only after we are sure we are going to
> actually proceed with the expansion 
>
>>            return 0;
>>          }
>
> ... here, then you wouldn't need the macro_end_arg at all.  Would that
> make sense?
>
>>        if (macro->paramc > 0)
>>          replace_args (pfile, node, macro,
>>                (macro_arg *) buff->base,
>> @@ -2263,6 +2267,8 @@ cpp_get_token_1 (cpp_reader *pfile, sour
>>        if (pfile->context->c.macro)
>>          ++num_expanded_macros_counter;
>>        _cpp_pop_context (pfile);
>> +      if (pfile->cb.macro_end_expand)
>> +        pfile->cb.macro_end_expand (pfile);
>
> _cpp_pop_context is really the function that marks the end of a given
> macro expansion, especially when the predicate in_macro_expansion_p
> (introduced recently in trunk for gcc 4.8) returns true.
>
> So IMHO, I seems mor

Re: [PATCH 1/2] gcc symbol database

2012-06-04 Thread Dodji Seketeli
Hello YunFeng,

Thank you for taking the time to work on this.  I cannot accept or
deny your patches, but I thought I could maybe comment on some parts
of them.

First, IMHO, some of the new plugin events you are proposing to add to
libcpp seem to make sense, at least so that people can write
etags/ctags-like cross-reference tools like the one you are proposing.

For now, I will not comment on the choices you've made for the
cross-reference tool itself, even though they are interesting to see,
at least to understand why you need the libcpp events you are
proposing.

Thus, I'll start with the libcpp changes.

[...]

> diff -upr .pc/symdb_enhance_libcpp/libcpp/directives.c libcpp/directives.c

[...]

> @@ -492,6 +492,8 @@ _cpp_handle_directive (cpp_reader *pfile
>else if (skip == 0)
>  _cpp_backup_tokens (pfile, 1);
> 
> +  if (pfile->cb.end_directive)
> +pfile->cb.end_directive (pfile);

It seems to me that there are possibly several places where we handle
directives.  In those places, the function start_directive and
end_directive are called.  So IMHO, it would seems more appropriate
and maintainable to call pfile->cb.start_directive in the
start_directive function, and likewise, call pfile->cg.end_directive
in the end_directive function.

You would then remove the call below from _cpp_lex_token:

@@ -1886,6 +1889,8 @@ _cpp_lex_token (cpp_reader *pfile)
  handles the directive as normal.  */
   && pfile->state.parsing_args != 1)
 {
+  if (pfile->cb.start_directive)
+pfile->cb.start_directive (pfile, result);
   if (_cpp_handle_directive (pfile, result->flags & PREV_WHITE))
 {
   if (pfile->directive_result.type == CPP_PADDING)

[...]


> +++ libcpp/include/cpplib.h2012-05-25 14:56:56.745507332 +0800
> @@ -218,10 +218,10 @@ struct GTY(()) cpp_identifier {
> node;
>  };
> 
> -/* A preprocessing token.  This has been carefully packed and should
> -   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
> +/* A preprocessing token. */
>  struct GTY(()) cpp_token {
>source_location src_loc;/* Location of first char of token.  */
> +  int file_offset;

Enlarging this struct that is *widely* used might not be accepted by
the maintainers, because of the memory toll it takes.  Or you'd need
to prove that you cannot do otherwise.

So why is this file_offset member needed?

>From the rest of the patch, it looks like you are using it to store
the offset of the token, from the beginning of its line.  What is the
difference between that, and the column number of the token, as you
could get by calling linemap_expand_location on the src_loc of the
token, effectively doing away with this new file_offset member?  Note
that the linemap_expand_location call was introduced in 4.7.

[...]

> +++ libcpp/macro.c2012-05-25 14:56:56.749508416 +0800
> @@ -1029,9 +1029,13 @@ enter_macro_context (cpp_reader *pfile,
>if (pragma_buff)
>  _cpp_release_buff (pfile, pragma_buff);
> 
> +  if (pfile->cb.macro_end_arg)
> +pfile->cb.macro_end_arg (pfile, true);

Presumably, if we reach this point, I believe it means there was no
argument to what seemed to be a function-like macro.  Was it?

Thus, I don't understand why calling this pfile->cb.macro_end_arg call
achieves.  Actually, I think I don't understand the meaning of that
event callback, to begin with.  I have read and re-read the comment
below:

> @@ -522,6 +522,24 @@ struct cpp_callbacks
>   be expanded.  */

[...]

> +  /* Called when a function-like macro stops collecting macro parameters,
> +   * cancel = true, macro expansion is canceled. */
> +  void (*macro_end_arg) (cpp_reader *, bool cancel);

and I don't understand why you need the information conveyed by this
macro_end_arg event.

Maybe if you put the call to macro_start_expand in this
enter_macro_context function only after we are sure we are going to
actually proceed with the expansion 

>return 0;
>  }

... here, then you wouldn't need the macro_end_arg at all.  Would that
make sense?

>if (macro->paramc > 0)
>  replace_args (pfile, node, macro,
>(macro_arg *) buff->base,
> @@ -2263,6 +2267,8 @@ cpp_get_token_1 (cpp_reader *pfile, sour
>if (pfile->context->c.macro)
>  ++num_expanded_macros_counter;
>_cpp_pop_context (pfile);
> +  if (pfile->cb.macro_end_expand)
> +pfile->cb.macro_end_expand (pfile);

_cpp_pop_context is really the function that marks the end of a given
macro expansion, especially when the predicate in_macro_expansion_p
(introduced recently in trunk for gcc 4.8) returns true.

So IMHO, I seems more maintainable to move the call to
pfile->cb.macro_end_expand in _cpp_pop_context.

>if (pfile->state.in_directive)
>  continue;
>result = &pfile->avoid_paste;
> @@ -2321,8 +2327,14 @@ cpp_get_token_1 (cpp_reader *pfile, sour
>  }
>  }