Re: a simple formatter

2008-02-21 Thread Alex Rousskov
On Thu, 2008-02-21 at 20:01 +0200, Tsantilas Christos wrote:

>   I believe, I solved (most of?) the problems in astyle.
> I am attaching the new version of the formatter.pl
> I test it only using the astyle version 1.21.

Christos,

Thanks a lot for polishing this!

Any objections to committing Christos' astyle wrapper and then
applying it to HEAD and selected branches? If there are no objections,
let's set a date for this. How about March, after 3.0.2 is released,
with a final 7-day warning?

Or is it better to reformat before releasing 3.0.2 so that it is easier
to backport future fixes?

> Also I am attaching a simple script I used to check the formated code.
> This scripts removes any space, tabs and newlines from the original file
> and the formated file, and compare their md5 signatures.This script
> requires the tr and md5sum utilities. I test it only on a linux system.
> 
> This script can not detect bad formated code but maybe can detect bad
> modifications (eg source code blocks removal) on source code.
> 
> Running this script I found 5 files with different md5 signatures but
> all of these files had formating modifications similar to the following:
> From
> if (i {
>  ...
> }
> To
> if (i  ...
> }

Ha! Who would have thought that such transpositions are possible? I
guess we need to strip not just whitespace, but comments as well (which
is more difficult) for the MD5 test to work 100%. I would not spend time
on that now.

Thank you,

Alex.




Re: a simple formatter

2008-02-21 Thread Tsantilas Christos
Hi all,
  I believe, I solved (most of?) the problems in astyle.
I am attaching the new version of the formatter.pl
I test it only using the astyle version 1.21.

This version does simpler hooks than the previous versions.
The input filter:
 1) Converts patterns
unsigned int aparam:1;
  to
unsigned int aparam__FORASTYLE__1;

 2) Scans for #endif/#else/#if patterns and adds at input the line:
  "//__ASTYLECOMMENT__\n"
   The comment is important, just a simple newline will not work.

   Example 1:
   #endif /*A comment*/
   {
   f++;

 converted to:

   #endif /*A comment*/
   //__ASTYLECOMMENT__
   {
   f++;

   Example 2:
   /* #endif A comment*/
   {
   f++;

 converted to:

   /* #endif A comment*/
   //__ASTYLECOMMENT__
   {
   f++;

The output filter just replaces the __FORASTYLE__ with a " : " and
removes the line "//__ASTYLECOMMENT__"

Although I looked carefully without finding any bad formated code maybe
there are still exists problems.

 Also I am attaching a simple script I used to check the formated code.
This scripts removes any space, tabs and newlines from the original file
and the formated file, and compare their md5 signatures.This script
requires the tr and md5sum utilities. I test it only on a linux system.

This script can not detect bad formated code but maybe can detect bad
modifications (eg source code blocks removal) on source code.

Running this script I found 5 files with different md5 signatures but
all of these files had formating modifications similar to the following:
From
if (i

formater.pl
Description: Perl program


md5checker.sh
Description: application/shellscript


Re: a simple formatter

2008-02-13 Thread Tsantilas Christos
Alex Rousskov wrote:
> On Wed, 2008-02-13 at 20:35 +0200, Tsantilas Christos wrote:
>> Alex Rousskov wrote:
>>> On Wed, 2008-02-13 at 13:52 +1300, Amos Jeffries wrote:
>>>
 Or is it possible to omit the src/tools.cc src/stat.cc files from astyle
 until they can be cleaned manually to work better.
>>> I bet there are other files with problems. I doubt Christos verified
>>> each and every source file.
>>>
>> I was not found other problematic files but yes it is possible that
>> there are other problematic files too. It is very difficult to examine
>> all the code. Looking the diffs it is possible I am loosing thinks...
> 
> Did you do the whitespace-free-MD5 test we talked about before?


No I did not. This test can not detect most of the errors astyle does.
Do you think will help?



Re: a simple formatter

2008-02-13 Thread Alex Rousskov
On Wed, 2008-02-13 at 20:35 +0200, Tsantilas Christos wrote:
> Alex Rousskov wrote:
> > On Wed, 2008-02-13 at 13:52 +1300, Amos Jeffries wrote:
> > 
> >> Or is it possible to omit the src/tools.cc src/stat.cc files from astyle
> >> until they can be cleaned manually to work better.
> > 
> > I bet there are other files with problems. I doubt Christos verified
> > each and every source file.
> > 
> I was not found other problematic files but yes it is possible that
> there are other problematic files too. It is very difficult to examine
> all the code. Looking the diffs it is possible I am loosing thinks...

Did you do the whitespace-free-MD5 test we talked about before?

Thank you,

Alex.




Re: a simple formatter

2008-02-13 Thread Tsantilas Christos
Alex Rousskov wrote:
> On Wed, 2008-02-13 at 13:52 +1300, Amos Jeffries wrote:
> 
>> Or is it possible to omit the src/tools.cc src/stat.cc files from astyle
>> until they can be cleaned manually to work better.
> 
> I bet there are other files with problems. I doubt Christos verified
> each and every source file.
> 
I was not found other problematic files but yes it is possible that
there are other problematic files too. It is very difficult to examine
all the code. Looking the diffs it is possible I am loosing thinks...

> 
> Agreed. And I think it would be fairly easy to disable that in code
> since there is already a command line option that affects the behavior.
> In fact, Christos, have you tried using "-cc 0"?

Yes, using this parameter the comment go to the previews line:

test++; /*A comment*/

 converted to:
 /*A comment */
test++;


> 
>> Does it still do that if the comments are seperate from code lines? I'm
>> thinking how would this affect the longer comments and auto-docs comments
>> might be a problem.
> 
> Yes, needs checking.

There is not any problem here. The comments have the same indentation as
the code.


> I did not count any showstoppers for bcpp.

I think the (1), (2) (3)  and (4) are just formating styles and there is
not any problem with them. The problematic cases are the (5) (6) and (7)
which maybe can fixed using a perl script like this we trying to build
for use with astyle.

I must say that the bcpp is not trying to add/remove spaces located
inside the code (sorry I did not mentioned before ). For example does
not trying to format the following:
 afunc(a,b,t);
Because of that it is a simpler formatter and this is why it makes less
mistakes.
The astyle in the other hand trying to be a full formatter but in some
cases breaks the code which is bad

Also for bcpp looks easier to touch its code, but I can not say how easy
is to fix thinks ( and possibly it is better to spend this time for
squid ...)

> 
> I doubt the rest is OK for astyle, and we should not forget that astyle
> requires pre- and post-processing. It is likely that similar pre-post
> processing or code modification can fix bcpp problems as well.
> 
> In general, I think astyle _quality_ is much worse (it actually breaks
> valid programs), but bcpp is a dead project (last change - 2005/07/25)
> and astyle does show signs of life once in a while.
> 
> I think the biggest question right now is whether Christos can confirm
> that astyle (with pre- and post-processing) screws up only two source
> files.
> 
> Thank you,
> 
> Alex.
> 
> 
> 



Re: a simple formatter

2008-02-13 Thread Alex Rousskov
On Wed, 2008-02-13 at 13:52 +1300, Amos Jeffries wrote:

> Or is it possible to omit the src/tools.cc src/stat.cc files from astyle
> until they can be cleaned manually to work better.

I bet there are other files with problems. I doubt Christos verified
each and every source file.

> > It changes the current indentation format of squid3 as follows:
> >
> > 1) The class definitions have the following format:
> >
> > class aclass
> > {
> > public:
> >void a func()
> >  .
> > };
> 
> I suppose we could live with that if it was automatically done.

Some of us actually prefer the above formatting!

> > 2) The if else formated as:
> >
> >   if(...) {
> >   }
> >   else {
> >   }
> 
> Thats wanted yes?

Not all want this, but it is not a big deal anyway.

> > 3) When there are comments with code in aline the bcpp will try to ident
> > the comments. For example
> >   i++; /*a comment*/
> >
> >  converted to:
> >   i++; /*a comment*/
> >
> > There is a command line parameter the -cc which defines in which column
> > the comments will be placed. You can not tell to bcpp to not format the
> > comments.
> 
> We could live with this I think in small comments.

Agreed. And I think it would be fairly easy to disable that in code
since there is already a command line option that affects the behavior.
In fact, Christos, have you tried using "-cc 0"?

> Does it still do that if the comments are seperate from code lines? I'm
> thinking how would this affect the longer comments and auto-docs comments
> might be a problem.

Yes, needs checking.

> >
> > 4) The function calls which placed in more than one line formated as:
> >
> >afuctioncall(arg1,arg2,.
> >argn..);
> >
> 
> weird. that would need fixing.

What's wrong with that? I always use that style, I think.

> > 5) Does not indent the multiline preprocessor macros:
> > #define GENGRAPH(X,Y,Z) \
> > GRAPH_TITLE(Y,Z) \
> > GRAPH_PER_MIN(X) \
> > GRAPH_PER_HOUR(X) \
> > GRAPH_END
> >
> 
> aha. nice reason to get rid of those macros sooner :-)

This will make class macros difficult to read, but not a showstopper.
Perhaps there are some /*magic_codes*/ that stop and resume formatting
in bcpp?

> > 6) When a function/method definitions takes more than one lines does not
> > formated well:
> >
> > void aclass::afunction (int a, int b
> > int c,ind);
> 
> should not need to care about line length. but this is a naasty one anyway.

Ugly, but not a show stopper, IMO.

> > 7) For the constructors it uses the following format (does not indent
> > data members) :
> >
> > aconstructor::aconstructor():
> > data1(...),
> > data2(...),
> > data3(...),
> > {
> > }
> 
> Another thing we could possibly live with.

Agreed. Same as the one above. This one is probably easy to fix in the
sources.


>   bcpp:  1-OK, 3-PASSABLE, 2-BAD
>   astyle: 2 problem files. rest OK.

I did not count any showstoppers for bcpp.

I doubt the rest is OK for astyle, and we should not forget that astyle
requires pre- and post-processing. It is likely that similar pre-post
processing or code modification can fix bcpp problems as well.

In general, I think astyle _quality_ is much worse (it actually breaks
valid programs), but bcpp is a dead project (last change - 2005/07/25)
and astyle does show signs of life once in a while.

I think the biggest question right now is whether Christos can confirm
that astyle (with pre- and post-processing) screws up only two source
files.

Thank you,

Alex.




Re: a simple formatter

2008-02-12 Thread Amos Jeffries
> Some comments about astyle and bcpp formatters. Also I am attaching a
> new script  which solves the problems mentioned by Amos.
>
>
> Astyle:
>   Looks that it is difficult to fix all problems of astyle.
> Looks that the "#ifdef...#else ..." confuses the astyle. For example is
> not possible for me to format well the src/tools.cc and  src/stat.cc
> files.
> However with formater.pl script looks that the formated source code is
> correct (but in some cases not well formated).
>
>
> The bcpp  in the other hand looks that it does not breaks the code and
> has more predictable behaviour. But still is not excellent.
>

Given all the problems below. Would it be easy enough to get the bcpp
problems sorted out as bugs and use the fixed version?

Or is it possible to omit the src/tools.cc src/stat.cc files from astyle
until they can be cleaned manually to work better.


> It changes the current indentation format of squid3 as follows:
>
> 1) The class definitions have the following format:
>
> class aclass
> {
> public:
>void a func()
>  .
> };

I suppose we could live with that if it was automatically done.

>
> 2) The if else formated as:
>
>   if(...) {
>   }
>   else {
>   }

Thats wanted yes?


> 3) When there are comments with code in aline the bcpp will try to ident
> the comments. For example
>   i++; /*a comment*/
>
>  converted to:
>   i++; /*a comment*/
>
> There is a command line parameter the -cc which defines in which column
> the comments will be placed. You can not tell to bcpp to not format the
> comments.

We could live with this I think in small comments.

Does it still do that if the comments are seperate from code lines? I'm
thinking how would this affect the longer comments and auto-docs comments
might be a problem.

>
> 4) The function calls which placed in more than one line formated as:
>
>afuctioncall(arg1,arg2,.
>argn..);
>

weird. that would need fixing.

>
> 5) Does not indent the multiline preprocessor macros:
> #define GENGRAPH(X,Y,Z) \
> GRAPH_TITLE(Y,Z) \
> GRAPH_PER_MIN(X) \
> GRAPH_PER_HOUR(X) \
> GRAPH_END
>

aha. nice reason to get rid of those macros sooner :-)

>
> 6) When a function/method definitions takes more than one lines does not
> formated well:
>
> void aclass::afunction (int a, int b
> int c,ind);

should not need to care about line length. but this is a naasty one anyway.

>
> 7) For the constructors it uses the following format (does not indent
> data members) :
>
> aconstructor::aconstructor():
> data1(...),
> data2(...),
> data3(...),
> {
> }

Another thing we could possibly live with.

So
  bcpp:  1-OK, 3-PASSABLE, 2-BAD
vs
  astyle: 2 problem files. rest OK.

Amos




Re: a simple formatter

2008-02-12 Thread Tsantilas Christos
Some comments about astyle and bcpp formatters. Also I am attaching a
new script  which solves the problems mentioned by Amos.


Astyle:
  Looks that it is difficult to fix all problems of astyle.
Looks that the "#ifdef...#else ..." confuses the astyle. For example is
not possible for me to format well the src/tools.cc and  src/stat.cc files.
However with formater.pl script looks that the formated source code is
correct (but in some cases not well formated).


The bcpp  in the other hand looks that it does not breaks the code and
has more predictable behaviour. But still is not excellent.

It changes the current indentation format of squid3 as follows:

1) The class definitions have the following format:

class aclass
{
public:
   void a func()
 .
};

2) The if else formated as:

  if(...) {
  }
  else {
  }

3) When there are comments with code in aline the bcpp will try to ident
the comments. For example
  i++; /*a comment*/

 converted to:
  i++; /*a comment*/

There is a command line parameter the -cc which defines in which column
the comments will be placed. You can not tell to bcpp to not format the
comments.

4) The function calls which placed in more than one line formated as:

   afuctioncall(arg1,arg2,.
   argn..);


5) Does not indent the multiline preprocessor macros:
#define GENGRAPH(X,Y,Z) \
GRAPH_TITLE(Y,Z) \
GRAPH_PER_MIN(X) \
GRAPH_PER_HOUR(X) \
GRAPH_END


6) When a function/method definitions takes more than one lines does not
formated well:

void aclass::afunction (int a, int b
int c,ind);


7) For the constructors it uses the following format (does not indent
data members) :

aconstructor::aconstructor():
data1(...),
data2(...),
data3(...),
{
}



formater.pl
Description: Perl program


Re: a simple formatter

2008-02-11 Thread Amos Jeffries
>
> On Fri, 2008-02-08 at 23:26 +0200, Tsantilas Christos wrote:
>> Alex Rousskov wrote:
>> >
>> > Changes in the current code are only bad for outstanding patches and
>> > forgotten branches, right? If everybody applies the same formatting to
>> > all their branches and HEAD, we should not have many conflicts, right?
>>
>> I don't know, possibly it is OK.
>> Or just format the HEAD, merge HEAD to branches (this will take some
>> hours but...) and then apply formating to the branch. After the first
>> time all will be OK.
>
> with CVS you will be much better off formatting all branches
> simultaneously, then merging.

I think that would cause clashes on any very old branches. They would need
merging up before the format then formatting HEAD & branches and
re-merging.

Amos




Re: a simple formatter

2008-02-11 Thread Robert Collins

On Fri, 2008-02-08 at 23:26 +0200, Tsantilas Christos wrote:
> Alex Rousskov wrote:
> > 
> > Changes in the current code are only bad for outstanding patches and
> > forgotten branches, right? If everybody applies the same formatting to
> > all their branches and HEAD, we should not have many conflicts, right?
> 
> I don't know, possibly it is OK.
> Or just format the HEAD, merge HEAD to branches (this will take some
> hours but...) and then apply formating to the branch. After the first
> time all will be OK.

with CVS you will be much better off formatting all branches
simultaneously, then merging.

-Rob
-- 
GPG key available at: .


signature.asc
Description: This is a digitally signed message part


Re: a simple formatter

2008-02-08 Thread Tsantilas Christos
Alex Rousskov wrote:
> 
> Changes in the current code are only bad for outstanding patches and
> forgotten branches, right? If everybody applies the same formatting to
> all their branches and HEAD, we should not have many conflicts, right?

I don't know, possibly it is OK.
Or just format the HEAD, merge HEAD to branches (this will take some
hours but...) and then apply formating to the branch. After the first
time all will be OK.

There is another good reason for bcpp, its source code looks better than
the source code of astyle. Looks easier to fix bugs in bcpp

I will look again on both solutions (formatter.pl/astyle and bcpp) the
next two days. I will post my findings.
It would be OK if someone else looked at it too, just to share our opinions.

Regards,
   Christos



Re: a simple formatter

2008-02-08 Thread Alex Rousskov

On Fri, 2008-02-08 at 21:11 +0200, Tsantilas Christos wrote:
> Also I am evaluating the following:
>http://invisible-island.net/bcpp/bcpp.html
> 
> It does not have a lot of configuration parameters but looks  safer than
> astyle.
> In the other hand makes a lot of changes in the current code.
> Also it is significant faster than formater.pl/astyle combinations.
> 
> I am running it with the following parameters:
>bcpp -bcl -yb -f 1 -ylcnc -i 4 filename.cc
> 
> At this time the only bad think I found is:
> 
> AConstructor::AConstructor( ...): chld_cnsotr(),chld2 _constr() ...
> {
> }
> 
> converts to:
> 
> AConstructor::Constructor( ...):
> chld_cnsotr(),
> chld2 _constr()
> {
> }

I do not see that as a blocker. Our current init lists are malformed in
worse ways. If bcpp would indent the data member names, that would be
perfect.

Changes in the current code are only bad for outstanding patches and
forgotten branches, right? If everybody applies the same formatting to
all their branches and HEAD, we should not have many conflicts, right?

Thank you,

Alex.




Re: a simple formatter

2008-02-08 Thread Tsantilas Christos
Also I am evaluating the following:
   http://invisible-island.net/bcpp/bcpp.html

It does not have a lot of configuration parameters but looks  safer than
astyle.
In the other hand makes a lot of changes in the current code.
Also it is significant faster than formater.pl/astyle combinations.

I am running it with the following parameters:
   bcpp -bcl -yb -f 1 -ylcnc -i 4 filename.cc

At this time the only bad think I found is:

AConstructor::AConstructor( ...): chld_cnsotr(),chld2 _constr() ...
{
}

converts to:

AConstructor::Constructor( ...):
chld_cnsotr(),
chld2 _constr()
{
}





Re: a simple formatter

2008-02-08 Thread Alex Rousskov
On Sat, 2008-02-09 at 01:23 +1300, Amos Jeffries wrote:

> While I am away, the attached file is the set of test cases I have seen 
> enough in the Squid-3 source to note as having bad syntax somehow 
> (probably astyle previously). Along with a few I threw in just because 
> the developer doc page described the syntax specially.

How about committing the wrapper and the test file to squid3/scripts/?
Or should the test file go into squid3/test-suite/?

Thank you,

Alex.




Re: a simple formatter

2008-02-08 Thread Amos Jeffries

Tsantilas Christos wrote:

On Fri, 2008-02-08 at 16:10 +1300, Amos Jeffries wrote:

The "same()" operations used in those tests may be different from a
"same_MD5s()" operation.

Try inserting an empty line in a middle of a function and recompile. You
will get a different MD5. Stripping the executable does not help in my
tests..


Yes this is true. I tried exactly  the same tests  to evaluate  the
formatter, but they did not work.  I believed that striping the code will
solve the problem but nothing...
In general I was not able to find a safe way to test it. I am just looking
the diffs searching for  bad formated code ...
(Also I must note here that the bazaar repository was very-very  helpful
here, branches/revert/diff etc on local repositories are really fast, but
OK it is an other story :-) )


... Stripping empty lines fron sources does not help if

the formatter

moves brackets and such, changing the number of lines. Does our
formatter do that?


Yes it does. For example the code:
  void some_function(){
  }
convert to:
  void some_function()
 {
  }


that would be desired behaviour though, yes?

While I am away, the attached file is the set of test cases I have seen 
enough in the Squid-3 source to note as having bad syntax somehow 
(probably astyle previously). Along with a few I threw in just because 
the developer doc page described the syntax specially.


If anyone has other please speak up and we'll get the lot testing properly.

Amos
--
Please use Squid 2.6STABLE17+ or 3.0STABLE1+
There are serious security advisories out on all earlier releases.
/*
  A code-formating tool is wanted fro squid3

  However teh squid sources use a number of syntax constructs which it must be 
capable of handling.

  This file is a dead-code file containing demos of those syntax which are 
known to be formatted badly at times

  It is to be used as the desired control to test formattign output against.
*/

/// structures with bit-fields
struct t1 {
unsigned int a:1;
unsigned int b:1;
unsigned int c:1;
};

/// code-blocks inside definition protection
#if SOMETHING
{
;
}
#endif

/// regular function declarations
void
main(int argc, char *argv[])
{
;
}

/// template class explicit Instantiation for some compilers
template class ACLStrategised;

/// global template Instances with macro definition
template cbdata_type List::CBDATA_List;



Re: a simple formatter

2008-02-08 Thread Tsantilas Christos
>
> Excellent. That dropit it down to:
>
> --- astyle-test-cases.cc2008-02-08 12:37:14.0 +1300
> +++ astyle-test-cases.cc-original   2008-02-07 12:13:07.0
> +1300
> @@ -10,9 +10,9 @@
>
>  /// structures with bit-fields
>  struct t1 {
> -unsigned  int a:1;
> -unsigned  int b:1;
> -unsigned  int c:1;
> +unsigned int a:1;

Oops. I did not see the extra space after "unsigned" (The fonts of my mail
reader).
OK fixed locally. I will see if there is something else and I will send
again the fixed script.

Regards,
Christos



Re: a simple formatter

2008-02-08 Thread Tsantilas Christos
> On Fri, 2008-02-08 at 16:10 +1300, Amos Jeffries wrote:
>
> The "same()" operations used in those tests may be different from a
> "same_MD5s()" operation.
>
> Try inserting an empty line in a middle of a function and recompile. You
> will get a different MD5. Stripping the executable does not help in my
> tests..

Yes this is true. I tried exactly  the same tests  to evaluate  the
formatter, but they did not work.  I believed that striping the code will
solve the problem but nothing...
In general I was not able to find a safe way to test it. I am just looking
the diffs searching for  bad formated code ...
(Also I must note here that the bazaar repository was very-very  helpful
here, branches/revert/diff etc on local repositories are really fast, but
OK it is an other story :-) )

>... Stripping empty lines fron sources does not help if
the formatter
> moves brackets and such, changing the number of lines. Does our
> formatter do that?
>
Yes it does. For example the code:
  void some_function(){
  }
convert to:
  void some_function()
 {
  }






Re: a simple formatter

2008-02-07 Thread Alex Rousskov
On Fri, 2008-02-08 at 16:10 +1300, Amos Jeffries wrote:
> > Too bad compilers produce different output for each execution due to
> > timestamps and such.
> 
> What do you mean by this? A standard entry-level compiler test is that it
> produces the same output from the same input every time.

The "same()" operations used in those tests may be different from a
"same_MD5s()" operation.

Try inserting an empty line in a middle of a function and recompile. You
will get a different MD5. Stripping the executable does not help in my
tests. Stripping empty lines fron sources does not help if the formatter
moves brackets and such, changing the number of lines. Does our
formatter do that?

Thanks,

Alex.




Re: a simple formatter

2008-02-07 Thread Amos Jeffries
> On Fri, 2008-02-08 at 14:31 +1300, Amos Jeffries wrote:
>> > On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote:
>> >> Do we need something like that? Any comments/suggestions? Any
>> testers?
>> >
>> > I believe we do and I appreciate you working on it! Please try to fix
>> > the remaining problems Amos pointed out.
>> >
>> > Also, can you apply it to the entire Squid tree, remove all
>> whitespace,
>> > and calculate md5, comparing that with the virgin whitespace-less
>> tree?
>> > The MD5s should be the same, right?
>>
>> Oooh. Nice test.
>>
>> >
>> > This is not a perfect check because spaces within strings/etc are
>> > stripped and not checked, but it is a pretty good one.
>>
>> It will also miss the "#if 0 {" block problem.
>
> Are there free C++ source code obfuscation programs? If they are
> guaranteed to generate the same source code regardless of formatting,
> then applying them would catch even more bugs.
>
> Too bad compilers produce different output for each execution due to
> timestamps and such.

What do you mean by this? A standard entry-level compiler test is that it
produces the same output from the same input every time.

Beginner students are taught that in the bootstrapping lessons: "If it
compiles its own code and produces a different binary, do it again until
it stops or breaks. If it breaks you started with buggy code and still
have work to do."

> Perhaps there is a way to avoid that and compare
> md5s of squid executables?
>
> Cheers,
>
> Alex.
> P.S. A basic test file would be good to have anyway, so thank you for
> bootstrapping that.

I thought it would be helpful if you and Christos decided on an astyle
formatter. Turned out right.

Amos



Re: a simple formatter

2008-02-07 Thread Alex Rousskov
On Fri, 2008-02-08 at 14:31 +1300, Amos Jeffries wrote:
> > On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote:
> >> Do we need something like that? Any comments/suggestions? Any testers?
> >
> > I believe we do and I appreciate you working on it! Please try to fix
> > the remaining problems Amos pointed out.
> >
> > Also, can you apply it to the entire Squid tree, remove all whitespace,
> > and calculate md5, comparing that with the virgin whitespace-less tree?
> > The MD5s should be the same, right?
> 
> Oooh. Nice test.
> 
> >
> > This is not a perfect check because spaces within strings/etc are
> > stripped and not checked, but it is a pretty good one.
> 
> It will also miss the "#if 0 {" block problem.

Are there free C++ source code obfuscation programs? If they are
guaranteed to generate the same source code regardless of formatting,
then applying them would catch even more bugs.

Too bad compilers produce different output for each execution due to
timestamps and such. Perhaps there is a way to avoid that and compare
md5s of squid executables?

Cheers,

Alex.
P.S. A basic test file would be good to have anyway, so thank you for
bootstrapping that.



Re: a simple formatter

2008-02-07 Thread Amos Jeffries
> On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote:
>> Do we need something like that? Any comments/suggestions? Any testers?
>
> I believe we do and I appreciate you working on it! Please try to fix
> the remaining problems Amos pointed out.
>
> Also, can you apply it to the entire Squid tree, remove all whitespace,
> and calculate md5, comparing that with the virgin whitespace-less tree?
> The MD5s should be the same, right?

Oooh. Nice test.

>
> This is not a perfect check because spaces within strings/etc are
> stripped and not checked, but it is a pretty good one.

It will also miss the "#if 0 {" block problem.
Though my unit-test for that will catch it any any other whitsspace-exact
case we care to write the test for.

We may need a compile test for any other similar ones that have not had
test-cases written.

Just have to figure out the best way to have the test file built into the
codebase for automatic testing.

Amos



Re: a simple formatter

2008-02-07 Thread Alex Rousskov
On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote:
> Do we need something like that? Any comments/suggestions? Any testers?

I believe we do and I appreciate you working on it! Please try to fix
the remaining problems Amos pointed out.

Also, can you apply it to the entire Squid tree, remove all whitespace,
and calculate md5, comparing that with the virgin whitespace-less tree?
The MD5s should be the same, right?

This is not a perfect check because spaces within strings/etc are
stripped and not checked, but it is a pretty good one.

Thank you,

Alex.




Re: a simple formatter

2008-02-07 Thread Amos Jeffries
>
>> Tested.
>>
>> I have made a (small) file of the structures I've noticed past
>> formatting
>> errors on. With a manual fix-up to what I think it should look like in
>> readable C++ under the published squid formatting description.
>>
>> The output of your script does this:
>> ..
>
> It is the "--break-blocks" option of the astyle. I think it can removed.
>
>> Also, I noticed that the formatter will accept _anything_ given to it as
>> a
>> filename and create the files. Thats rather nasty when non-valid
>> filename
>> sequences are entered. ie "--help"
>>
>
> OK fixed :-).  Now accepts only files with .cc,.h,.cci and .c extensions
> and ignore all other files.
> I am re-sending the fixed script . In this version also I removed the
> "--break-blocks" astyle option. Formated code looks better without it.
>

Excellent. That dropit it down to:

--- astyle-test-cases.cc2008-02-08 12:37:14.0 +1300
+++ astyle-test-cases.cc-original   2008-02-07 12:13:07.0 +1300
@@ -10,9 +10,9 @@

 /// structures with bit-fields
 struct t1 {
-unsigned  int a:1;
-unsigned  int b:1;
-unsigned  int c:1;
+unsigned int a:1;
+unsigned int b:1;
+unsigned int c:1;
 };


I suppose we could live with that. Though I think it may be a problem with
the formatter.pl final replacement.

Amos



Re: a simple formatter

2008-02-07 Thread Tsantilas Christos

> Tested.
>
> I have made a (small) file of the structures I've noticed past formatting
> errors on. With a manual fix-up to what I think it should look like in
> readable C++ under the published squid formatting description.
>
> The output of your script does this:
> ..

It is the "--break-blocks" option of the astyle. I think it can removed.

> Also, I noticed that the formatter will accept _anything_ given to it as a
> filename and create the files. Thats rather nasty when non-valid filename
> sequences are entered. ie "--help"
>

OK fixed :-).  Now accepts only files with .cc,.h,.cci and .c extensions
and ignore all other files.
I am re-sending the fixed script . In this version also I removed the
"--break-blocks" astyle option. Formated code looks better without it.

--
  Christos

#!/usr/bin/perl
#
# Author: Tsantilas Christos
# email:  [EMAIL PROTECTED]
#
# Distributed under the terms of the GNU General Public License as published
# by the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# The ldap_manager library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Library General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
# See LICENSE or http://www.gnu.org/licenses/gpl.html for details .
#


use IPC::Open2;

$ASTYLE_BIN="/usr/bin/astyle";
$ASTYLE_ARGS ="--mode=c -s4 -O -l";

#$ASTYLE_ARGS="--mode=c -s4 -O --break-blocks -l";
#$ASTYLE_BIN="/usr/local/src/astyle-1.21/bin/astyle";



if(! -e $ASTYLE_BIN){
print "\nFile $ASTYLE_BIN not found\n";
print "Please fix the ASTYLE_BIN variable in this script!\n\n";
exit -1;
}
$ASTYLE_BIN=$ASTYLE_BIN." ".$ASTYLE_ARGS;

$INDENT = "";

$out = shift @ARGV;
#read options, currently no options available
while($out eq "" ||  $out =~ /^-\w+$/){
   if($out eq "-h") {
usage($0);
exit 0;
   }
   else {
   usage($0);
   exit -1;
   }
}


while($out){

if( $out !~ /\.cc$|\.cci$|\.h$|\.c$/) {
 print "Unknown suffix for file $out, ignoring\n";
 $out = shift @ARGV;
 next;
}

$in= "$out.astylebak";
my($new_in) = $in;
my($i) = 0;
while(-e $new_in) {
	$new_in=$in.".".$i;
	$i++;
}
$in=$new_in;
rename($out, $in);

local (*FROM_ASTYLE, *TO_ASTYLE);
$pid_style=open2(\*FROM_ASTYLE, \*TO_ASTYLE, $ASTYLE_BIN);

if(!$pid_style){
	print "An error while open2\n";
	exit -1;
}


if($pid=fork()){
	#do parrent staf
	close(FROM_ASTYLE);
	
	if(!open(IN, "<$in")){
	print "Can not open input file: $in\n";
	exit -1;
	}
	my($line) = '';
	while(){
	$line=$line.$_;
	if(input_filter(\$line)==0){
		next;
	}
	print TO_ASTYLE $line;
	$line = '';
	}
	if($line){
	print TO_ASTYLE $line;
	}
	close(TO_ASTYLE);
	waitpid($pid,0);
}
else{
	# child staf
	close(TO_ASTYLE);
	
	if(!open(OUT,">$out")){
	print "Can't open output file: $out\n";
	exit -1;
	}
	my($line)='';
	while(){
	$line = $line.$_;
	if(output_filter(\$line)==0){
		next;
	}
	print OUT $line;
	$line = '';
	}
	if($line){
	print OUT $line;
	}
	close(OUT);
	exit 0;
}

$out = shift @ARGV;
}


sub input_filter{
my($line)[EMAIL PROTECTED];
 #if we have integer declaration, get it all before processing it..
if($$line =~/.*\s*int\s+.*/  ){ 
	if(index($$line,";") == -1){
	return 0;
	}
	if($$line =~ /(.*)\s*int\s+([^:]*):\s*(\w+)\s*\;(.*)/s){
#	print "> ".$$line."($1)\n";
	$$line= $1." int ".$2."__FORASTYLE__".$3.";".$4;
#	print "->".$$line."\n";
	}
	return 1;
}

return 1;
}

sub output_filter{
my($line)[EMAIL PROTECTED];

if($$line =~ /^\s*$/){
	return 1;
}

#if line is not preprocessor directive keep indentation
if($$line !~  /^(\s*)\#/){
	$$line =~ /^(\s*)./;
	$INDENT = $1;
# here we can handle only the case we have a one line C++/C command.
# but looks enougth..
}

#   The  "#preprocessor_directive {" case
 if($$line =~ /^(\s*)\#(.*){\s*$/ ){
	$$line=$1."#".$2."\n".$INDENT."{\n";
}
	
#   The  "{ #preprocessor directive" case
if($$line =~ /^(\s*)\{(.*)#(if|else|endif)(.*)$/ ){
#  print "From :".$$line."\n";
  $$line= $1."{".$2."\n#".$3.$4."\n";
#  print "-->>".$$line."\n";
}

   # "The "unsigned int:1; case ."
$$line =~ s/__FORASTYLE__/:/;

return 1;
}

sub usage{
my($name)[EMAIL PROTECTED];
print "Usage:\n   $name file1 file2 file3 \n";
}

Re: a simple formatter

2008-02-06 Thread Amos Jeffries
>> Hi all,
>>
>> I wrote a small perl script which fixes (some of) the astyle problems.
>>
>> Before use it, you must adjust the $ASTYLE_BIN variable at the beggining
>> of the formatter.pl file.
>>
>> I am running it using the following command:
>>   # find . -name "*.cc"  -exec formater.pl \{\} \;
>> but it can take multiple files as arguments.
>>
>> I used it with astyle versions 1.18 and 1.21. I did not do extensive
>> tests, so I am sure it contains bugs :-).  Please use it with caution.
>>
>> I tried to implement Alex suggestions.  For the problem discussion look
>> at the thread:
>> http://www.squid-cache.org/mail-archive/squid-dev/200801/0019.html
>>
>> This script starts the astyle and attach one filter in its input and one
>> in its output, using pipes.
>>
>> The input filter search for "unsigned int X:1;" cases and convert
>> them to  "unsigned int X__FORASTYLE__1;".
>>
>> The output filter search for "unsigned int X__FORASTYLE__1;"
>> patterns and convert them back to the original text.
>> Also it search for the cases "#preprocessor directive {" or "{
>> #preprocessor_directive" and fix them.
>>
>>
>> Do we need something like that? Any comments/suggestions? Any testers?
>>
>> Regards,
>> Christos
>>
>
> Tested.
>

Also, I noticed that the formatter will accept _anything_ given to it as a
filename and create the files. Thats rather nasty when non-valid filename
sequences are entered. ie "--help"

Amos



Re: a simple formatter

2008-02-06 Thread Amos Jeffries
> Hi all,
>
> I wrote a small perl script which fixes (some of) the astyle problems.
>
> Before use it, you must adjust the $ASTYLE_BIN variable at the beggining
> of the formatter.pl file.
>
> I am running it using the following command:
>   # find . -name "*.cc"  -exec formater.pl \{\} \;
> but it can take multiple files as arguments.
>
> I used it with astyle versions 1.18 and 1.21. I did not do extensive
> tests, so I am sure it contains bugs :-).  Please use it with caution.
>
> I tried to implement Alex suggestions.  For the problem discussion look
> at the thread:
> http://www.squid-cache.org/mail-archive/squid-dev/200801/0019.html
>
> This script starts the astyle and attach one filter in its input and one
> in its output, using pipes.
>
> The input filter search for "unsigned int X:1;" cases and convert
> them to  "unsigned int X__FORASTYLE__1;".
>
> The output filter search for "unsigned int X__FORASTYLE__1;"
> patterns and convert them back to the original text.
> Also it search for the cases "#preprocessor directive {" or "{
> #preprocessor_directive" and fix them.
>
>
> Do we need something like that? Any comments/suggestions? Any testers?
>
> Regards,
> Christos
>

Tested.

I have made a (small) file of the structures I've noticed past formatting
errors on. With a manual fix-up to what I think it should look like in
readable C++ under the published squid formatting description.

The output of your script does this:
--- astyle-test-cases.cc-original   2008-02-07 12:13:07.0 +1300
+++ astyle-test-cases.cc2008-02-07 12:19:14.0 +1300
@@ -9,10 +9,11 @@
 */

 /// structures with bit-fields
+
 struct t1 {
-unsigned int a:1;
-unsigned int b:1;
-unsigned int c:1;
+unsigned  int a:1;
+unsigned  int b:1;
+unsigned  int c:1;
 };

 /// code-blocks inside definition protection
@@ -20,6 +21,7 @@
 {
 ;
 }
+
 #endif

 /// regular function declarations
@@ -30,6 +32,7 @@
 }

 /// template class explicit Instantiation for some compilers
+
 template class ACLStrategised;

 /// global template Instances with macro definition





NP: breaking the auto-docs away from the documented object is not that
good a thing to do.

Amos



a simple formatter

2008-02-06 Thread Tsantilas Christos
Hi all,

I wrote a small perl script which fixes (some of) the astyle problems.

Before use it, you must adjust the $ASTYLE_BIN variable at the beggining
of the formatter.pl file.

I am running it using the following command:
  # find . -name "*.cc"  -exec formater.pl \{\} \;
but it can take multiple files as arguments.

I used it with astyle versions 1.18 and 1.21. I did not do extensive
tests, so I am sure it contains bugs :-).  Please use it with caution.

I tried to implement Alex suggestions.  For the problem discussion look
at the thread:
http://www.squid-cache.org/mail-archive/squid-dev/200801/0019.html

This script starts the astyle and attach one filter in its input and one
in its output, using pipes.

The input filter search for "unsigned int X:1;" cases and convert
them to  "unsigned int X__FORASTYLE__1;".

The output filter search for "unsigned int X__FORASTYLE__1;"
patterns and convert them back to the original text.
Also it search for the cases "#preprocessor directive {" or "{
#preprocessor_directive" and fix them.


Do we need something like that? Any comments/suggestions? Any testers?

Regards,
Christos



formater.pl
Description: Perl program