Re: [vox-tech] adding line numbers to an HTML file

2006-10-26 Thread Micah Cowan
On Thu, 2006-10-26 at 14:23 -0700, Dylan Beaudette wrote:
> Hi everyone,
>
> wondering if there is a simple way to add line numbers to every non-html tag
> in a webpage:
>
> here is a dirty hack that does not work very well:
>
> lynx -source http://casoilresource.lawr.ucdavis.edu/drupal/node/319/print |
> cat -n > test.html
>
> or - if there is a way to add line numbers to non-tag data, similar to how
> the paste (http://rafb.net/paste/) service works.
>
> any ideas would be very helpful!

The link you show doesn't seem to distinguish tag data, and it's
really not clear to me exactly what you're trying to accomplish. Perhaps
if you could post a short before-and-after example?

Depending on what you want, Perl or Python--or possibly even just
awk--should be able to meet your needs, but I can't really give you a
solution until I understand the problem properly :)

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


___
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech


Re: [vox-tech] adding line numbers to an HTML file

2006-10-26 Thread Dylan Beaudette
On Thursday 26 October 2006 14:25, Micah Cowan wrote:
> On Thu, 2006-10-26 at 14:23 -0700, Dylan Beaudette wrote:
> > Hi everyone,
> >
> > wondering if there is a simple way to add line numbers to every non-html
> > tag in a webpage:
> >
> > here is a dirty hack that does not work very well:
> >
> > lynx -source http://casoilresource.lawr.ucdavis.edu/drupal/node/319/print
> > | cat -n > test.html
> >
> > or - if there is a way to add line numbers to non-tag data, similar to
> > how the paste (http://rafb.net/paste/) service works.
> >
> > any ideas would be very helpful!
>
> The link you show doesn't seem to distinguish tag data, and it's
> really not clear to me exactly what you're trying to accomplish. Perhaps
> if you could post a short before-and-after example?
>
> Depending on what you want, Perl or Python--or possibly even just
> awk--should be able to meet your needs, but I can't really give you a
> solution until I understand the problem properly :)

some clarification is indeed warranted:

the page in question
(http://casoilresource.lawr.ucdavis.edu/drupal/node/319/print) produces
printer-friendly output (simple html). I would like to add line numbers to
this document so that the students in my class can easily refer to specific
lines of code. In my hack posted above, i add a line number to *every* line -
even html tags like <head>, <body>, etc. I would like to add line numbers to
the text in-between html elements. i.e.

<body>
1 something
2 about
3 some other thing
4 here
...
</body>

perhaps some regex-fu is required?





-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341


Re: [vox-tech] adding line numbers to an HTML file

2006-10-26 Thread Micah Cowan
On Thu, 2006-10-26 at 15:13 -0700, Dylan Beaudette wrote:
> On Thursday 26 October 2006 14:25, Micah Cowan wrote:
> > On Thu, 2006-10-26 at 14:23 -0700, Dylan Beaudette wrote:
> > > Hi everyone,
> > >
> > > wondering if there is a simple way to add line numbers to every non-html
> > > tag in a webpage:
> > >
> > > here is a dirty hack that does not work very well:
> > >
> > > lynx -source http://casoilresource.lawr.ucdavis.edu/drupal/node/319/print
> > > | cat -n > test.html
> > >
> > > or - if there is a way to add line numbers to non-tag data, similar to
> > > how the paste (http://rafb.net/paste/) service works.
> > >
> > > any ideas would be very helpful!
> >
> > The link you show doesn't seem to distinguish tag data, and it's
> > really not clear to me exactly what you're trying to accomplish. Perhaps
> > if you could post a short before-and-after example?
> >
> > Depending on what you want, Perl or Python--or possibly even just
> > awk--should be able to meet your needs, but I can't really give you a
> > solution until I understand the problem properly :)
>
> some clarification is indeed warranted:
>
> the page in question
> (http://casoilresource.lawr.ucdavis.edu/drupal/node/319/print) produces
> printer-friendly output (simple html). I would like to add line numbers to
> this document so that the students in my class can easily refer to specific
> lines of code. In my hack posted above, i add a line number to *every* line -
> even html tags like <head>, <body>, etc. I would like to add line numbers to
> the text in-between html elements. i.e.
>
> <body>
> 1 something
> 2 about
> 3 some other thing
> 4 here
> ...
> </body>
>
> perhaps some regex-fu is required?

A quick awk script I whipped up earlier when you first posted was:

  awk 'BEGIN{ l=1 }   $1 ~ /^[^<]/ { $0 = l++ " " $0 }   { print }'

Which takes its input and prepends an incrementing line number before
any lines that don't start with a < as their first non-space
character. That's almost as stupid as cat -n, though, as it will add
numbers before lines that are part of a tag spread across multiple
lines, and will fail to add line numbers before lines that consist of
mostly textual content (such as a line that is wrapped in a <b> tag).
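
The first of those two problems (a tag spread across multiple lines) could
probably be worked around by remembering whether the previous line left a tag
open. The following is only a rough sketch under that assumption: the function
name number_text_lines and the in_tag variable are made up for illustration,
and it will still be fooled by a stray < or > inside text, comments, or
attribute values. It does nothing about lines wrapped in a <b> tag.

```shell
# Hypothetical helper: like the one-liner above, but skips lines that
# begin while a tag opened on an earlier line is still unclosed.
number_text_lines() {
  awk '
    BEGIN { n = 1; in_tag = 0 }
    {
      starts_in_tag = in_tag

      # Find the positions of the last "<" and the last ">" on this line.
      lt = 0; gt = 0
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        if (c == "<") lt = i
        else if (c == ">") gt = i
      }

      # A trailing "<" means a tag is still open; a trailing ">" closes it.
      # Lines with neither character leave the state unchanged.
      if (lt > gt) in_tag = 1
      else if (gt > lt) in_tag = 0

      # Number only lines that start outside a tag and not with "<".
      if (!starts_in_tag && $1 ~ /^[^<]/) $0 = n++ " " $0
      print
    }'
}
```

It reads standard input just like the one-liner, so something like
`lynx -source URL | number_text_lines > test.html` should work.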

Given your specific example, I notice that most of the examples consist
of lines ending in <br>, so for this specific case, you might use:

  awk 'BEGIN{ l=1 }   $0 ~ /<br>$/ && $1 ~ /^[^<]/ { $0 = l++ "&nbsp;" $0 }   { print }'

Which helps limit it to the lines in the examples.

If you want them to reset for each example, the following might work:

  awk 'BEGIN{ l=1 }

  {
    if ($0 ~ /<br>$/ && $1 ~ /^[^<]/)
      $0 = l++ "&nbsp;" $0;
    else
      l=1;

    print;
  }'
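
To make the resetting behaviour concrete, here is a small sketch that wraps
the same logic in a shell function and feeds it some made-up input. The
function name number_examples and the sample lines are assumptions for
illustration only; they are not taken from the actual page.

```shell
# Hypothetical wrapper around the per-example numbering logic.
number_examples() {
  awk 'BEGIN { l = 1 }
  {
    # Number lines that end in <br> and do not start with a tag;
    # any other line restarts the count for the next example.
    if ($0 ~ /<br>$/ && $1 ~ /^[^<]/)
      $0 = l++ "&nbsp;" $0
    else
      l = 1
    print
  }'
}

# Made-up sample: two short examples separated by a paragraph.
printf '%s\n' 'x = 1<br>' 'y = 2<br>' '<p>next example</p>' 'z = 3<br>' |
  number_examples
```

With that input, the first two lines should come out numbered 1 and 2, the
<p> line is passed through untouched, and the last line starts again at 1.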

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

