On Wed, Jan 29, 2014 at 07:31:29PM -0500, Tom Lane wrote:
> Bruce Momjian <br...@momjian.us> writes:
> > I have cleaned up entab.c so I am ready to add a new option that removes
> > tabs from only comments.  Would you like me to create that and provide a
> > diff at a URL?  It would have to be run against all back branches.
> 
> If you think you can actually tell the difference reliably in entab,
> sure, give it a go.

OK, I have modified entab.c in a private patch so that it processes only
text inside comments and skips leading whitespace; the patch is attached.
I basically ran 'entab -o -t4 -d' on the C files.
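
For illustration, the rule described above (touch tabs only inside a
comment, and only after the first non-blank text on the line) can be
sketched as a standalone function. This is a simplified sketch, not the
code in the attached patch: detab_comment_line and its parameters are
hypothetical names, it only expands tabs rather than reproducing entab's
space accumulation, and it ignores string literals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of entab.c.  Copies "in" to "out",
 * replacing a tab with spaces only when the tab is inside a C comment
 * and follows the first non-blank text on the line.  *in_comment carries
 * the comment state from one line to the next.
 */
static void
detab_comment_line(const char *in, char *out, size_t out_size,
                   int tab_size, int *in_comment)
{
    size_t len = strlen(in);
    size_t o = 0;
    size_t i;
    int    col = 0;
    int    leading = 1;     /* still in the line's leading whitespace? */

    assert(tab_size > 0 && out_size > 0);

    for (i = 0; i < len; i++)
    {
        char c = in[i];
        /* columns this character occupies; a tab runs to the next stop */
        int  width = (c == '\t') ? tab_size - (col % tab_size) : 1;
        int  k;

        /* same backward/forward look as the patch uses for comment state */
        if (!*in_comment && i > 0 && in[i - 1] == '/' && c == '*')
            *in_comment = 1;
        else if (*in_comment && c == '*' && in[i + 1] == '/')
            *in_comment = 0;

        if (c == '\t' && *in_comment && !leading)
        {
            /* tab inside comment text: expand it to spaces */
            for (k = 0; k < width && o + 1 < out_size; k++)
                out[o++] = ' ';
        }
        else
        {
            /* a leading '*' inside a comment still counts as leading */
            if (leading && c != ' ' && c != '\t' &&
                (!*in_comment || c != '*'))
                leading = 0;
            if (o + 1 < out_size)
                out[o++] = c;
        }
        col += width;
    }
    out[o] = '\0';
}
```

With tab_size 4, a tab between code and a trailing comment is left alone,
as is the tab right after a comment line's leading star, while a tab in
the middle of the comment text becomes spaces.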

The results are here, in context, plain, and unified diff formats:

        http://momjian.us/expire/entab_comment.cdiff
        http://momjian.us/expire/entab_comment.pdiff
        http://momjian.us/expire/entab_comment.udiff

and their line counts:

        89741 entab_comment.cdiff
        26351 entab_comment.pdiff
        50503 entab_comment.udiff

By my count, 6627 lines are modified.  What I did not do is restrict the
change to _only_ tabs that are preceded by a period.  Should I try that?

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/tools/entab/entab.c b/src/tools/entab/entab.c
new file mode 100644
index 3b849f2..3fd2997
*** a/src/tools/entab/entab.c
--- b/src/tools/entab/entab.c
*************** main(int argc, char **argv)
*** 51,63 ****
  {
  	int			tab_size = 8,
  				min_spaces = 2,
  				protect_quotes = FALSE,
  				del_tabs = FALSE,
  				clip_lines = FALSE,
  				prv_spaces,
  				col_in_tab,
  				escaped,
! 				nxt_spaces;
  	char		in_line[BUFSIZ],
  				out_line[BUFSIZ],
  			   *src,
--- 51,66 ----
  {
  	int			tab_size = 8,
  				min_spaces = 2,
+ 				only_comments = FALSE,
  				protect_quotes = FALSE,
  				del_tabs = FALSE,
  				clip_lines = FALSE,
+ 				in_comment = FALSE,
  				prv_spaces,
  				col_in_tab,
  				escaped,
! 				nxt_spaces,
! 				leading_whitespace;
  	char		in_line[BUFSIZ],
  				out_line[BUFSIZ],
  			   *src,
*************** main(int argc, char **argv)
*** 74,80 ****
  	if (strcmp(cp, "detab") == 0)
  		del_tabs = 1;
  
! 	while ((ch = getopt(argc, argv, "cdhqs:t:")) != -1)
  		switch (ch)
  		{
  			case 'c':
--- 77,83 ----
  	if (strcmp(cp, "detab") == 0)
  		del_tabs = 1;
  
! 	while ((ch = getopt(argc, argv, "cdhoqs:t:")) != -1)
  		switch (ch)
  		{
  			case 'c':
*************** main(int argc, char **argv)
*** 83,88 ****
--- 86,94 ----
  			case 'd':
  				del_tabs = TRUE;
  				break;
+ 			case 'o':
+ 				only_comments = TRUE;
+ 				break;
  			case 'q':
  				protect_quotes = TRUE;
  				break;
*************** main(int argc, char **argv)
*** 97,102 ****
--- 103,109 ----
  				fprintf(stderr, "USAGE: %s [ -cdqst ] [file ...]\n\
  	-c (clip trailing whitespace)\n\
  	-d (delete tabs)\n\
+ 	-o (only comments)\n\
  	-q (protect quotes)\n\
  	-s minimum_spaces\n\
  	-t tab_width\n",
*************** main(int argc, char **argv)
*** 134,146 ****
  			if (escaped == FALSE)
  				quote_char = ' ';
  			escaped = FALSE;
  
  			/* process line */
  			while (*src != NUL)
  			{
  				col_in_tab++;
  				/* Is this a potential space/tab replacement? */
! 				if (quote_char == ' ' && (*src == ' ' || *src == '\t'))
  				{
  					if (*src == '\t')
  					{
--- 141,163 ----
  			if (escaped == FALSE)
  				quote_char = ' ';
  			escaped = FALSE;
+ 			leading_whitespace = TRUE;
  
  			/* process line */
  			while (*src != NUL)
  			{
  				col_in_tab++;
+ 
+ 				/* look backward so we handle slash-star-slash properly */
+ 				if (!in_comment && src > in_line &&
+ 					*(src - 1) == '/' && *src == '*')
+ 					in_comment = TRUE;
+ 				else if (in_comment && *src == '*' && *(src + 1) == '/')
+ 					in_comment = FALSE;
+ 
  				/* Is this a potential space/tab replacement? */
! 				if ((!only_comments || (in_comment && !leading_whitespace)) &&
! 					quote_char == ' ' && (*src == ' ' || *src == '\t'))
  				{
  					if (*src == '\t')
  					{
*************** main(int argc, char **argv)
*** 192,197 ****
--- 209,218 ----
  				/* Not a potential space/tab replacement */
  				else
  				{
+ 					/* allow leading stars in comments */
+ 					if (leading_whitespace && *src != ' ' && *src != '\t' &&
+ 						(!in_comment || *src != '*'))
+ 						leading_whitespace = FALSE;
  					/* output accumulated spaces */
  					output_accumulated_spaces(&prv_spaces, &dst);
  					/* This can only happen in a quote. */
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
