Hi,
I wrote a small lex scanner that transforms .js files into C-like files (just function prototypes, plus the comments), so that doxygen can be run on them.


For example output, see http://stud4.tuwien.ac.at/~e0225227/xpfedocs/, which documents Mozilla xpfe/ files.


As written, the script also rewrites /* comments to /** and // comments to ///, because most of the Mozilla .js code wasn't written with doxygen-style comments in mind, but some functions do have ordinary comments above them.
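
To illustrate the comment rewriting on its own, here is a minimal standalone C sketch. It is not the attached scanner: promote_comments is a made-up name, and the caller is assumed to supply an output buffer with room for one extra byte per comment. Like the scanner, it leaves string contents alone and does not understand escaped quotes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the attached scanner: copy `in` to `out`,
 * adding one extra '*' or '/' after every comment opener found outside
 * string literals. `out` needs strlen(in) + 1 bytes plus one per comment. */
void promote_comments(const char *in, char *out)
{
    enum { CODE, BLOCK, LINE, SQUOTE, DQUOTE } state = CODE;

    assert(in != NULL && out != NULL);
    while (*in) {
        char c = *in++;
        *out++ = c;
        switch (state) {
        case CODE:
            if (c == '/' && *in == '*') {        /* block comment opener */
                *out++ = *in++;
                *out++ = '*';                    /* the added star */
                state = BLOCK;
            } else if (c == '/' && *in == '/') { /* line comment opener */
                *out++ = *in++;
                *out++ = '/';                    /* the added slash */
                state = LINE;
            } else if (c == '\'') {
                state = SQUOTE;
            } else if (c == '"') {
                state = DQUOTE;
            }
            break;
        case BLOCK:
            if (c == '*' && *in == '/') {        /* copy the closer too */
                *out++ = *in++;
                state = CODE;
            }
            break;
        case LINE:
            if (c == '\n')
                state = CODE;
            break;
        case SQUOTE:
            if (c == '\'')
                state = CODE;
            break;
        case DQUOTE:
            if (c == '"')
                state = CODE;
            break;
        }
    }
    *out = '\0';
}
```

Note that a comment which already starts with /** gains a third star under this scheme, which is also what the scanner's rules do.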


This may be of interest to some people, so I have attached the .l script and a Makefile.

Usage from doxygen:
INPUT_FILTER           = /tmp/runner
FILE_PATTERNS          = *.js

Where /tmp/runner looks something like:
#!/bin/sh
~/Projects/jsparse/scanner < "$1"


There are a few known problems:
1) It only documents global functions and variables, not methods defined on JS objects (for example via .prototype)
2) Due to rewriting of /* to /**, the first function or variable will have the license block as its documentation.
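
One possible workaround for problem 2, which I have not built into the scanner, would be to pass the first block comment of a file through unchanged. A minimal C sketch of finding that comment follows; leading_comment_length is a made-up name, and it assumes the whole file is already in memory as a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return how many leading bytes of `src` are taken up
 * by optional whitespace followed by one block comment, or 0 if the text
 * does not start with a complete block comment. */
size_t leading_comment_length(const char *src)
{
    const char *p = src;
    const char *end;

    assert(src != NULL);
    while (*p == ' ' || *p == '\t' || *p == '\f' || *p == '\r' || *p == '\n')
        p++;                     /* skip whitespace before the comment */
    if (p[0] != '/' || p[1] != '*')
        return 0;                /* first token is not a block comment */
    end = strstr(p + 2, "*/");
    if (end == NULL)
        return 0;                /* unterminated: leave everything alone */
    return (size_t)(end + 2 - src);
}
```

A wrapper could write that many bytes verbatim and hand only the rest of the file to the scanner, so the license block stays an ordinary comment.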


There are a few other bugs as well. Please don't mail me about them.
But all in all, this works surprisingly well.

Attached scanner (lex.l):
%s COMMENT_CPP COMMENT_C STRING_S STRING_D EXPECT_I NESTED

%%
  int nesting = 0;   /* current brace depth */

<COMMENT_C>\*\/  { printf("%s\n", yytext); BEGIN 0; }
<COMMENT_CPP>\n  { ECHO; BEGIN 0; }
<COMMENT_C>\n     ECHO;

<COMMENT_C,COMMENT_CPP>.  { ECHO; } /* pass through unmodified */

<INITIAL>\/\*           { ECHO; putchar('*'); BEGIN COMMENT_C; }
<INITIAL>\/\/           { ECHO; putchar('/'); BEGIN COMMENT_CPP; }

<INITIAL>[-!+*/%=?:]      ECHO; /* operators */

<INITIAL>"function"[ \t\f\n]   |
<INITIAL>"const"[ \t\f\n]   |
<INITIAL>"var"[ \t\f\n]               { ECHO; BEGIN EXPECT_I; }

<EXPECT_I>[a-zA-Z_][_a-zA-Z0-9]*   { ECHO; BEGIN 0; } /* Identifier (variant 1) */
<EXPECT_I>"("                    { BEGIN 0; yyless(1); }
<EXPECT_I>.                      { fprintf(stderr, "WARNING! expected identifier, found %s\n", yytext); } /* stderr keeps the warning out of the filtered source */

  /* There may be a global if statement - ignore it, if so */
<INITIAL>if[ \t]+"("[^)]+")"  ;
<INITIAL>"else"               ;
  /* Same for global for statement */
<INITIAL>for[ \t]+"("[^)]+")"  ;

  /* string literals - keep them in, for initializers */
<STRING_S>\'    { BEGIN 0; ECHO; }
<STRING_D>\"    { BEGIN 0; ECHO; }
<INITIAL>\'              { BEGIN STRING_S; ECHO; }
<INITIAL>\"              { BEGIN STRING_D; ECHO; }

<STRING_S,STRING_D>. ECHO;

<INITIAL>"try"                   ++nesting; /* Avoid printing { */
<NESTED>catch[ \t]+"("[^)]+")"  --nesting; /* balance ++nesting from before */


<INITIAL>"."           printf(".");

<INITIAL>[a-zA-Z_][a-zA-Z_0-9]* printf("%s", yytext); /* Identifier (variant 2) */

<INITIAL>[()]          printf("%s", yytext);

<INITIAL>\{            { printf("{\n"); ++nesting; BEGIN NESTED; }
<NESTED>\{               ++nesting;
<NESTED>\}             { --nesting; if (nesting == 0) { printf("}\n"); BEGIN 0; } }

<NESTED>.             ;

<INITIAL>[0-9]+  |
<INITIAL>0x[0-9A-Fa-f]+  ECHO;  /* Number */


[ \t\f\n]      ;

<INITIAL>[<>&|]         ; /* relational op */


<INITIAL>\;            printf(";\n");


<INITIAL>[\[\]]        ECHO; /* array access */

<INITIAL>,             ECHO;

all: scanner

clean:
	rm -f lex.yy.c scanner

lex.yy.c: lex.l
	lex $^

scanner: lex.yy.c
	$(CC) -o $@ $^ -ll

.PHONY: all clean
