[Siglinux] using sed, no PERL!, to remove latex comments

William L. Jarrold Thu, 18 Nov 2004 13:06:47 -0800

Hi,

I need to map through a whole bunch of latex .tex files and remove
everything on a line after a latex comment.  And I must be sure that
this is absolutely correct.  In other words, I wanna remove all the
characters on a line after a...


% 

...but NOT after a...

\%

...why?  Because \% in latex means "print out the % sign" whereas % in
latex starts a comment, analogous to # in perl or shell.

So, how do I do this?  Well, you'll see my long sad story below.  I
first tried sed and gave up.  As an unimportant academic exercise, any
advice on how to do this with sed would be fun.  But what is most
important is ...

(1) can you see any bugs in my perl solution? 
(2) do you know how to do this without the FNORD hack (see below)?

So, most important is to answer questions 1 and 2.  But feel free to
comment on any aspect of the below.

Maybe I should never use sed and always use perl?

...Okay now for the details.

Well here is a start.  Thinking I could do this qwik and dirty with sed
I tried this...

for i in *.tex ; do sed -f sedcmdfile $i > OutDir/$i ; done

...okay, so what is in sedcmdfile?  That is the magic question.  Here
is what I have so far...

drawbridge.cs.utexas.edu$ cat sedcmdfile 
s/[^\\]%.*//g
s/^%.*//g

...it works decently but not perfectly. (btw google pointed me to
http://spacsun.rice.edu/FAQ/sed.html which helped inspire the above)
The problem is that the first command will turn a line in foo.tex from
this...

\hline% (+ 14 15 17) = 46 (/ (- 46 44.7) 11.2) = 0.12

...to this...

\hlin

...So, as you can see, the first line of sedcmdfile is causing my script
to be overzealous.

In a nutshell the problem is, how do I, for each line, make it remove
all %'s and everything after EXCEPT when the % is preceded by a \.
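
One way to express this in sed, as a sketch rather than a vetted answer:
match the % together with whatever precedes it, and put the preceding
part back.  The function name strip_tex_comments is made up for
illustration, and the -E flag (extended regular expressions) is assumed
to be available, as it is in GNU and BSD sed.  The sketch treats every
unescaped % as a comment, so a % inside e.g. a verbatim environment
would be cut as well.

```shell
#!/bin/sh
# strip_tex_comments: hypothetical helper name, not from the original post.
# Reads LaTeX source on stdin, writes it to stdout with comments removed.
strip_tex_comments() {
    # (^|[^\\])   start of line, or one character that is not a backslash
    # ((\\\\)*)   zero or more PAIRS of backslashes; after an even run
    #             (such as the \\ line break) the % still starts a comment
    # %.*         the comment itself, through the end of the line
    # \1\2        puts back everything that preceded the comment
    sed -E 's/(^|[^\\])((\\\\)*)%.*/\1\2/'
}
```

Without the g flag sed replaces only the leftmost match, so the cut
happens at the first unescaped % on the line.  In the loop above it
could be used as:
for i in *.tex ; do strip_tex_comments < "$i" > "OutDir/$i" ; done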

HRM, MAYBE I SHOULDA USED PERL?  

I did a google search and found this incantation at
(http://natura.di.uminho.pt/~jj/pln/def.pl)...

 s/^%.*\n|([^\\])%.*\n/$1/go 

...which might be close but I have spent so long on this and there is
prolly someone out there who knows the answer in 5 min....Well, after
some messing around, the above perl thing seemed buggy/bad/wrong but
it inspired this...

I created the file try-to-remove-latex-comments.pl and its contents
are below...

#!/usr/bin/perl -w

use strict;

while (<>) {
    s/^%.*//go ; # remove any line beginning with %
    s/\\%/FNORD/g ;# convert free standing \%'s to FNORD's
    s/(.*)%.*/$1/g ;# get rid of anything after a comment
    s/FNORD/\\%/g ;# convert FNORD's back to \%'s
    print $_;
}

...so then, you see, I did this...

for i in *.tex ; do ~/Perl/try-to-remove-latex-comments.pl $i > OutDir/$i ; done

...it seemed to work!  I was able to latex everything -- no errors!  A
qwik visual inspect of foo.tex and of foo.dvi showed no obvious
problems.
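
For a check that goes beyond a visual skim, one option is to list
exactly which original lines were altered, so every removal can be read
over.  This is only a sketch: show_removed is a hypothetical helper
name, and it assumes the stripped copies live in OutDir as in the loop
above.

```shell
#!/bin/sh
# show_removed ORIGINAL STRIPPED
# Hypothetical helper: prints each line of ORIGINAL that was changed or
# dropped in STRIPPED, using diff's normal output format ("< " lines).
show_removed() {
    # diff exits with status 1 when the files differ, which is the
    # expected case here, so that status is deliberately ignored.
    { diff "$1" "$2" || true; } | sed -n 's/^< //p'
}
```

Something like this would page through every affected line:
for i in *.tex ; do show_removed "$i" "OutDir/$i" ; done | less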

BUT, I have spent very long on this.  I know the g option is for
greedy but I do not know what o is for.  

Most importantly:

FNORD is a hack.  How can I do this without FNORD?

...and also...

Can you see any bugs in my perl solution? 
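
On questions (1) and (2), a few observations and a sketch, offered
tentatively.  In LaTeX a % also swallows the end of its line, so
emptying a comment-only line leaves a blank line (a paragraph break),
and cutting the % off a trailing comment lets the line break act as a
space.  The greedy (.*) in the third substitution also backs up only to
the last % on a line, so an earlier unescaped % survives.  The sed
sketch below needs no placeholder because it matches the character
before the % instead; it deletes comment-only lines outright and keeps
the % itself on trailing comments.  The function name is hypothetical
and sed -E is assumed to be available.

```shell
#!/bin/sh
# strip_tex_comments_keep_eol: hypothetical name for this sketch.
# Reads LaTeX source on stdin, writes the stripped source to stdout.
strip_tex_comments_keep_eol() {
    # First command: drop lines holding nothing but a comment, so no
    # blank line (paragraph break) is left behind.
    # Second command: on other lines, remove the comment text but keep
    # the % so it still swallows the end of the line. An even run of
    # backslashes before the % (such as \\) does not escape it.
    sed -E \
        -e '/^[[:space:]]*%/d' \
        -e 's/(^|[^\\])((\\\\)*)%.*/\1\2%/'
}
```

The output still contains bare % characters at the ends of lines, which
is intentional here: only the comment text is gone.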

Thanks,

Bill



_______________________________________________
Siglinux mailing list
[EMAIL PROTECTED]
http://machito.utacm.org/mailman/listinfo/siglinux
