Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Trevor M. Lango
On Tuesday 23 November 2004 23:41, Foo Lim wrote:
 On Tue, 23 Nov 2004, Trevor M. Lango wrote:
  I have been reading the man pages and I'm lost.  I want to scan through
  an input file for an expression with this pattern:
 
      h.*.JPG
 
  and replace it with an expression with the following pattern:
 
      *.h.JPG
 
  Perl and awk both appear to be ideal candidates for just such a task
  but I'm a serious newbie to both of 'em.  Any help much appreciated!

 Hi Trevor,

 Does the pattern h.*.JPG match something like this: h.abc123.JPG ?

Something like this: h.#-##--.###.JPG

 Since the period . is a metacharacter in regular expressions.  If that's
 the case, then a perl script like this would work:

 while (<>) {
   s/h\.(.*)\.JPG/$1.h.JPG/g;
   print;
 }

 FL

___
vox-tech mailing list
[EMAIL PROTECTED]
http://lists.lugod.org/mailman/listinfo/vox-tech


Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Foo Lim
On Wed, 24 Nov 2004, Trevor M. Lango wrote:

 On Tuesday 23 November 2004 23:41, Foo Lim wrote:
  Hi Trevor,
 
  Does the pattern h.*.JPG match something like this: h.abc123.JPG ?
 
 Something like this: h.#-##--.###.JPG
 
  Since the period . is a metacharacter in regular expressions.  If that's
  the case, then a perl script like this would work:
 
  while (<>) {
    s/h\.(.*)\.JPG/$1.h.JPG/g;
    print;
  }
 
  FL

The code above should work.  If a line can contain more than one file
name, you may want to change the regex to this:

  s/h\.(.*?)\.JPG/$1.h.JPG/g;

so that it does a minimal match instead of a greedy match.

HTH,
FL



[vox-tech] Understanding a C hello world program

2004-11-24 Thread Peter Jay Salzman
Consider this program:


   #include <stdio.h>

   int main(void)
   {
      printf("hello world\n");

      return 0;
   }


The more I think about it, the more fascinated I am by it.  I can get the
length of the sections using the size command:


   $ size hello_world-1.o
      text    data     bss     dec     hex filename
        48       0       0      48      30 hello_world-1.o

I assume that data is the initialized data segment where programmer
initialized variables are stored.  That's zero because I have no global
variables.

I also assume that bss is zero because I have no programmer-uninitialized
global variables.

When I disassemble the object file, I get 35 bytes:

   Dump of assembler code for function main:
   0x00000000 <main+0>:    push   %ebp
   0x00000001 <main+1>:    mov    %esp,%ebp
   0x00000003 <main+3>:    sub    $0x8,%esp
   0x00000006 <main+6>:    and    $0xfffffff0,%esp
   0x00000009 <main+9>:    mov    $0x0,%eax
   0x0000000e <main+14>:   sub    %eax,%esp
   0x00000010 <main+16>:   movl   $0x0,(%esp)
   0x00000017 <main+23>:   call   0x18 <main+24>
   0x0000001c <main+28>:   mov    $0x0,%eax
   0x00000021 <main+33>:   leave
   0x00000022 <main+34>:   ret
   End of assembler dump.

However, size reports a text size of 48.  Where do the extra 13 bytes that
size reports come from?  Probably from "hello world\n\0", which is 13 bytes.
But if that's true, the string lives in the text segment.  I always pictured
the text segment as being straight opcodes.  There must be a structure to the
text segment that I was unaware of.
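
The 13-byte figure can be checked from C itself.  A minimal sketch (the
function name literal_size is made up for illustration and is not part of
the program above): sizeof applied to a string literal counts the
terminating NUL, so it should come to 13, which is exactly 48 minus the
35 bytes of opcodes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the original program:
 * returns the storage size of the "hello world\n" literal.
 * sizeof on a string literal includes the trailing '\0'. */
size_t literal_size(void)
{
    return sizeof("hello world\n");   /* 12 characters + NUL */
}
```

Calling literal_size() from a small test program and printing the result
with %zu is enough to confirm the count.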

But then, looking at the output of the picture that objdump has of the object
file...

$ objdump -h hello_world-1.o

hello_world-1.o:     file format elf32-i386

Sections:
Idx Name            Size      VMA       LMA       File off  Algn
  0 .text           00000023  00000000  00000000  00000034  2**2
                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data           00000000  00000000  00000000  00000058  2**2
                    CONTENTS, ALLOC, LOAD, DATA
  2 .bss            00000000  00000000  00000000  00000058  2**2
                    ALLOC
  3 .rodata         0000000d  00000000  00000000  00000058  2**0
                    CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .note.GNU-stack 00000000  00000000  00000000  00000065  2**0
                    CONTENTS, READONLY
  5 .comment        00000026  00000000  00000000  00000065  2**0
                    CONTENTS, READONLY


I wish objdump identified hex numbers as hex numbers.  In any event, the text
section has length 0x23, which is 35.  The opcodes plus the string.  Further
evidence that the string lives in the text segment.

However, I note that there's a section called .rodata, which I've never heard
of before, but I'm assuming that it stands for "read only data section".
It's 13 bytes - just the size of "hello world\n\0".  That's probably why this
program segfaults:

   #include <stdio.h>

   int main(void)
   {
      char *string = "hello world\n";

      string[3] = 'T';

      puts(string);

      return 0;
   }

because string is the address of something that lives in a section of memory
marked read-only.  Whammo -- SIGSEGV.  I remember Mark posting about 3 or 4
years ago that this actually worked on some other Unices (not Linux).
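
For comparison, here is a sketch of a variant that should not fault,
assuming the goal is just to end up with a modified copy of the text.  C
leaves writing through a pointer to a string literal undefined, which is
why the result differs between systems; copying the literal into ordinary
writable storage first avoids the problem.  The helper name writable_hello
is hypothetical.

```c
#include <string.h>

/* Hypothetical helper for illustration: copies the literal into a
 * caller-supplied buffer (ordinary writable memory) and then modifies
 * the copy.  buf must hold at least 13 bytes. */
char *writable_hello(char *buf, size_t buflen)
{
    static const char src[] = "hello world\n";

    if (buflen < sizeof src)
        return NULL;               /* not enough room for the copy */

    memcpy(buf, src, sizeof src);  /* includes the trailing '\0' */
    buf[3] = 'T';                  /* fine: buf is writable storage */
    return buf;
}
```

Inside main, the shortest equivalent is to declare the string as an array
initialized from the literal (char string[] = "hello world\n";) rather than
as a pointer to it.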


There was originally a question attached to this email, but as I typed more
and more, I realized I'm not even sure what my question is anymore.  Maybe I
have too many of them.

Maybe my immediate question is -- where do read-only strings live?  In the
text section or the .rodata section?  I've seen evidence that they live in
both sections.

If anyone cares to riff off this, I'd certainly be interested in anything
anyone says.

Pete



-- 
The mathematics of physics has become ever more abstract, rather than more
complicated.  The mind of God appears to be abstract but not complicated.
He also appears to like group theory.  --  Tony Zee's Fearful Symmetry

PGP Fingerprint: B9F1 6CF3 47C4 7CD8 D33E  70A9 A3B9 1945 67EA 951D


Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Trevor M. Lango
On Wednesday 24 November 2004 00:15, Foo Lim wrote:
 The code above should work.  If it's possible to have multiple files on a
 line, you may want to change the regex to this:

   s/h\.(.*?)\.JPG/$1.h.JPG/g;

 instead, so it will minimal match instead of do a greedy match.

Okay, I am not having any success.  Perhaps I need to be more specific: I am
trying to scan through HTML files to replace the image references in lines
like this one:

   <img align="right" src="/IMAGES/C/h.I-LP-CEUR-AD.003.jpg">

In this particular example, I need to replace:

h.I-LP-CEUR-AD.003.jpg

with:

I-LP-CEUR-AD.003.h.jpg

Thank you for your responses!

 HTH,
 FL



Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Ken Bloom
On Wed, Nov 24, 2004 at 08:57:03AM -0800, Trevor M. Lango wrote:
 Okay I am not having any success.  Perhaps I need to be more specific - I am 
 trying to scan through html files to replace the image references in lines 
 like this one:
 
    <img align="right" src="/IMAGES/C/h.I-LP-CEUR-AD.003.jpg">
 
 In this particular example, I need to replace:
 
 h.I-LP-CEUR-AD.003.jpg
 
 with:
 
 I-LP-CEUR-AD.003.h.jpg
 
 Thank you for your responses!

Capitalization counts.  If the files are named with a .jpg, then your
regexp pattern has to say .jpg.  If the files are named with a .JPG,
then your regexp pattern has to say .JPG.  There is a flag that you can
add at the end (where the g is) to do a case-insensitive match, but
nothing that makes the replacement string case insensitive.

-- 
I usually have a GPG digital signature included as an attachment.
See http://www.gnupg.org/ for info about these digital signatures.




Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Foo Lim
On Wed, 24 Nov 2004, Ken Bloom wrote:

 Capitalization counts.  If the files are named with a .jpg, then your
 regexp pattern has to say .jpg.  If the files are named with a .JPG,
 then your regexp pattern has to say .JPG.  There is a flag that you can
 add at the end (where the g is) to do a case-insensitive match, but
 nothing that makes the replacement string case insensitive.

Add an i to the end of the statement:

s/h\.(.*?)\.JPG/$1.h.JPG/gi;

However, this will match files that start with a lowercase h as well as an 
uppercase H.  If that's fine, then this regex will do the job.

FL



Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Mitch Patenaude
On Nov 24, 2004, at 9:56 AM, Foo Lim wrote:
 Add an i to the end of the statement:

 s/h\.(.*?)\.JPG/$1.h.JPG/gi;

 However, this will match files that start with a lowercase h as well
 as an uppercase H.  If that's fine, then this regex will do the job.

No... the problem is that it will take

  h.foo1.jpg

and replace it with

  foo1.h.JPG

which probably won't work (since the case of the extension changes and
most *nix are case sensitive...).  The regex should be

  s/h\.(.*?)\.(JPG|jpg)/$1.h.$2/g;

Or even more generally:

  s/h\.(.*?)\.([Jj][Pp][Gg])/$1.h.$2/g;

Which you can wrap up in a single command like so

  perl -pi.orig -e 's/h.(.*?).([Jj][Pp][Gg])/$1.h.$2/g;' *.html

which will do the replacement on all HTML files, saving the originals
with the extension .orig in case something goes horribly wrong and you
need to undo it.

  -- Mitch


Re: [vox-tech] Understanding a C hello world program

2004-11-24 Thread Bryan Richter

Peter Jay Salzman wrote:
 Consider this program:
 
 <snip>
 
 $ size hello_world-1.o
    text    data     bss     dec     hex filename
      48       0       0      48      30 hello_world-1.o
 
 When I disassemble the object file, I get 35 bytes:
 
 <snip>
 
 $ objdump -h hello_world-1.o
 
 <snip>
 
 
 I wish objdump identified hex numbers as hex numbers.  In any event, the text
 section has length x23, which is 35.  The opcodes plus the string.  Further
 evidence that the string lives in the text segment.

Actually, objdump agrees with the disassembly: the string doesn't live in the
text section (35 bytes in both).

 
 Maybe my immediate question is -- where do read-only strings live?  In the
 text section or the .rodata section?  I've seen evidence that it lives in
 both section.

Maybe gcc -S would be enlightening? It doesn't enlighten me, but then I don't
know what a lot of the directives do (.section, for one). There is no .data
directive, however, and the "hello world\n" definitely comes before the .text
directive. It looks like:

---
        .section        .rodata
.LC0:
        .string "hello world\n"
        .text
---

Maybe size, which doesn't seem to recognize .rodata, just lumps .data and
everything else?

-Bryan
-- 
Bryan Richter
UCDTT President
UC Davis Undergrad, Physics Dept.
-
A PGP signature is (probably) attached to this email. 




Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Mitch Patenaude
On Nov 24, 2004, at 10:15 AM, Mitch Patenaude wrote:
 perl -pi.orig -e 's/h.(.*?).([Jj][Pp][Gg])/$1.h.$2/g;' *.html

oops.. that should be

  perl -pi.orig -e 's/h\.(.*?)\.([Jj][Pp][Gg])/$1.h.$2/g;' *.html

(forgot to escape the literal periods.  This is why my mom detests
regex... it ends up looking an awful lot like old serial line noise.
;-)
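
For anyone who would rather see the substitution spelled out than decode
the line noise, here is a rough C sketch of the same left-to-right,
shortest-match logic, done with plain string scanning instead of a regex
library.  It is only an illustration of what the corrected pattern does,
not a replacement for the one-liner; the names is_dot_jpg and
swap_h_prefix are made up, and it assumes ASCII input.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if p starts with ".jpg" in any mix of upper and lower case. */
static int is_dot_jpg(const char *p)
{
    return p[0] == '.' &&
           tolower((unsigned char)p[1]) == 'j' &&
           tolower((unsigned char)p[2]) == 'p' &&
           tolower((unsigned char)p[3]) == 'g';
}

/* Hypothetical helper: rewrites every "h.NAME.jpg" in `in` as "NAME.h.jpg",
 * keeping the extension's original case.  The output has the same length
 * as the input, so `out` must hold at least strlen(in) + 1 bytes. */
void swap_h_prefix(const char *in, char *out)
{
    size_t i = 0, o = 0;

    while (in[i] != '\0') {
        if (in[i] == 'h' && in[i + 1] == '.') {
            /* Shortest match: stop at the first ".jpg" on this line. */
            size_t j = i + 2;
            while (in[j] != '\0' && in[j] != '\n' && !is_dot_jpg(in + j))
                j++;
            if (is_dot_jpg(in + j)) {
                size_t name_len = j - (i + 2);
                memcpy(out + o, in + i + 2, name_len);  /* NAME      */
                o += name_len;
                memcpy(out + o, ".h.", 3);              /* .h.       */
                o += 3;
                memcpy(out + o, in + j + 1, 3);         /* jpg / JPG */
                o += 3;
                i = j + 4;                              /* skip past ".jpg" */
                continue;
            }
        }
        out[o++] = in[i++];                             /* copy unchanged */
    }
    out[o] = '\0';
}
```

Stopping at the first ".jpg" is what the ? in (.*?) buys in the regex:
with two file names on one line, each is rewritten separately instead of
being swallowed into a single greedy match.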

  -- Mitch


Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Peter Jay Salzman
On Wed 24 Nov 04, 10:20 AM, Mitch Patenaude said:
 On Nov 24, 2004, at 10:15 AM, Mitch Patenaude wrote:
 perl -pi.orig -e 's/h.(.*?).([Jj][Pp][Gg])/$1.h.$2/g;' *.html
 
 oops.. that should be
   perl -pi.orig -e 's/h\.(.*?)\.([Jj][Pp][Gg])/$1.h.$2/g;' *.html
 
 (forgot to escape the literal periods.  This is why my mom detests 
 regex... it ends up looking an awful lot like old serial line noise. 
 ;-)
 
   -- Mitch

LOL.  That deserves to be in someone's .sig...   ;-)

Pete


Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Jay Strauss
Not to beat a dead horse:

  perl -i.bak -p -e 's/(h)\.(.*?)\.(jpg)/$2.$1.$3/ig' input.txt

A perl one-liner that makes a backup of the original and preserves
capitalization.


Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Foo Lim
On Wed, 24 Nov 2004, Mitch Patenaude wrote:

 On Nov 24, 2004, at 10:15 AM, Mitch Patenaude wrote:
  perl -pi.orig -e 's/h.(.*?).([Jj][Pp][Gg])/$1.h.$2/g;' *.html
 
 oops.. that should be
perl -pi.orig -e 's/h\.(.*?)\.([Jj][Pp][Gg])/$1.h.$2/g;' *.html
 
 (forgot to escape the literal periods.  This is why my mom detests 
 regex... it ends up looking an awful lot like old serial line noise. 
 ;-)

Good catch(es).  I stand corrected.  :-)

FL



Re: [vox-tech] yet another SQL question...[solved]

2004-11-24 Thread Dylan Beaudette
On Monday 22 November 2004 07:06 pm, David Hummel wrote:
 On Mon, Nov 22, 2004 at 02:00:59PM -0800, Dylan Beaudette wrote:
  I would like to make a table that displays the dominant component
  (i.e.  comppct_r is the largest for a given larger unit) and
  associated attributes for each larger unit.

 I would use CREATE TABLE ... SELECT.

 I think the following SELECT will work:

   select
     mukey,
     max(comppct_r),
     taxorder,
     taxsuborder,
     taxgrtgroup
   from component
   group by mukey;

 -David

(Feels like an idiot)

Thanks,

-- 
Dylan Beaudette
Soil Science Graduate Group
University of California at Davis


Re: [vox-tech] Understanding a C hello world program

2004-11-24 Thread Mark K. Kim
On Wed, 24 Nov 2004, Peter Jay Salzman wrote:
[snip]
 $ size hello_world-1.o
    text    data     bss     dec     hex filename
      48       0       0      48      30 hello_world-1.o
[snip]

What the... how did you make your program so small?  My output:

  $ gcc hello.c -o hello
  $ size hello
     text    data     bss     dec     hex filename
      960     256       4    1220     4c4 hello
  $ strip hello
  $ size hello
     text    data     bss     dec     hex filename
      960     256       4    1220     4c4 hello
  $ ls -l hello
  -rwxr-xr-x  1 mark mark 2948 Nov 24 10:24 hello*
  $ gcc --version
  gcc (GCC) 3.3.5 (Debian 1:3.3.5-2)
  [snip]

-_-'

 I assume that data is the initialized data segment where programmer
 initialized variables are stored.  That's zero because I have no global
 variables.

Yes.

 I also assume that bss is zero because I have no programmer-uninitialized
 global variables.

What is bss?  Is that space dynamically generated in memory at runtime?

 When I disassemble the object file, I get 35 bytes:
[snip]
 however, size reports a text size of 48.  Where do the extra 13 bytes come
 from that size reports?  Probably from hello world\n\0, which is 13 bytes.

Yes.

 But if that's true, the string lives in the text segment.  I always pictured
 the text segment as being straight opcodes.  There must be a structure to the
 text segment that I was unaware of.

To the CPU, there's no distinction between opcodes and read-only data.
Pretty much the only differences between text and data sections are:

  Text: r-x (readable, not writable, executable)
  Data: rw- (readable, writable, not executable)

At least in theory.  In practice the data section is also executable, which
allows buffer overflows to be used to execute malicious code.  But we
can't simply block execution of the data section because I think
function pointers need to be executed outside the text area..???

Anyway, so if you think about a constant string like "hello, world",
there's no reason not to store it in the text section, because the string
isn't supposed to be changeable.  If you were to store it in the data
section, however, you could accidentally overwrite it, which is fine from
a single-program point of view, but you can save some memory if you put it
in the text section, because the text section isn't replicated (only the
other sections are) if you run multiple instances of the same program.  This
makes it more important that the text section be non-writable and that the
policy be enforced, so that one program accidentally modifying the text
section doesn't affect another instance of the same program.

Or that's how I understand it.

 But then, looking at the output of the picture that objdump has of the object
 file...
[snip]
 However, I note that there's a section called .rodata, which I've never heard
 of before, but I'm assumming that it stands for read only data section.
 It's 13 bytes - just the size of hello world\n\0.

Looking at the assembly output (thanks for that idea, Bryan!), .rodata is
a custom name of a section:

   .section .rodata

whereas the .text section is reserved:

   .text

They might as well have named the .rodata section .whatever instead of
.rodata and it would still work the same way.  What the compiler does with
this section is, I think, linker-dependent, but it probably takes the
section and sticks it in one of the canonical sections -- .text, .data,
etc. -- according to some predefined rule.  (It's probably the .text section
by default, unless specified otherwise.)

These custom sections are useful because it allows the linker to shuffle
them around to fit any hardware constraints that may exist.  The linker
won't break up a section, though.

From the assembly output, the .rodata section goes before the .text
section, so it probably gets mapped to some default section which happens
to be .text.  It's possible gcc knows to map the custom section named
.rodata to the .text area, but I just tried changing the name from .rodata
to .whatever and it still compiles without complaining.

 That's probably why this
 program segfaults:

 #include <stdio.h>

 int main(void)
 {
    char *string = "hello world\n";

    string[3] = 'T';

    puts(string);

    return 0;
 }

 because string is the address of something that lives in a section of memory
 marked read-only.  Whammo -- sigsegv.  I remember Mark posting about 3 or 4
 years ago that this actually worked on some other Unicies (not Linux).

Wow... you still remember that?  Yeah I seem to recall trying out
something like that.  I think it was one of Sun's OSes.  Might have been
HP/UX, too, since those two were the primary other unices I had access to.

 Maybe my immediate question is -- where do read-only strings live?  In the
 text section or the .rodata section?  I've seen evidence that it lives in
 both section.

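.rodata is a custom section name which lives in the .text section.  Think
of it as an alias to an offset into the .text section.  At least that's
how it worked in one assembler I used to use, and this seems to work the
same way from what you've discovered.  That's pretty cool!

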

Re: [vox-tech] Understanding a C hello world program

2004-11-24 Thread Ken Bloom
On Wed, Nov 24, 2004 at 11:10:34AM -0800, Mark K. Kim wrote:
 On Wed, 24 Nov 2004, Peter Jay Salzman wrote:
 [snip]
 $ size hello_world-1.o
    text    data     bss     dec     hex filename
      48       0       0      48      30 hello_world-1.o
 [snip]
 
 What the... how did you make your program so small?  My output:
 
   $ gcc hello.c -o hello
   $ size hello
      text    data     bss     dec     hex filename
       960     256       4    1220     4c4 hello
   $ strip hello
   $ size hello
      text    data     bss     dec     hex filename
       960     256       4    1220     4c4 hello
   $ ls -l hello
   -rwxr-xr-x  1 mark mark 2948 Nov 24 10:24 hello*
   $ gcc --version
   gcc (GCC) 3.3.5 (Debian 1:3.3.5-2)
   [snip]
 
 -_-'

He's working with hello.o, the output of gcc -c hello.c.
You're working with hello, the output of gcc -o hello hello.c.
So yours is a completely linked version of the file, and his is the
object file pre-linking. He could use ld -o hello $LIBRARIES hello.o to
get what you have.

--Ken Bloom

-- 
I usually have a GPG digital signature included as an attachment.
See http://www.gnupg.org/ for info about these digital signatures.




Re: [vox-tech] Using awk or perl to find and replace

2004-11-24 Thread Trevor M. Lango
On Wednesday 24 November 2004 10:20, Mitch Patenaude wrote:
 On Nov 24, 2004, at 10:15 AM, Mitch Patenaude wrote:
  perl -pi.orig -e 's/h.(.*?).([Jj][Pp][Gg])/$1.h.$2/g;' *.html

 oops.. that should be
perl -pi.orig -e 's/h\.(.*?)\.([Jj][Pp][Gg])/$1.h.$2/g;' *.html

Absolutely perfect!  Thanks!

 (forgot to escape the literal periods.  This is why my mom detests
 regex... it ends up looking an awful lot like old serial line noise.
 ;-)

-- Mitch



Re: [vox-tech] Understanding a C hello world program

2004-11-24 Thread Peter Jay Salzman
On Wed 24 Nov 04, 11:10 AM, Mark K. Kim lugodatcbreakdotorg said:
 On Wed, 24 Nov 2004, Peter Jay Salzman wrote:
 [snip]
 $ size hello_world-1.o
    text    data     bss     dec     hex filename
      48       0       0      48      30 hello_world-1.o
 [snip]
 
 What the... how did you make your program so small?  My output:
 
Heh.  I was working with the object file.  No linking was done.  :)


 What is bss?  Is that space dynamically generated in memory at runtime?
 
Yeah -- my understanding is that the data segment is divided into two parts:
a segment which gets initialized by the programmer and a segment that gets
initialized by the C library at run time.  The latter is called bss, which
was a command used by an old assembler.  It stood for block started by
symbol.
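
A small C illustration of the two kinds of globals being discussed.  This
is only a sketch with made-up names; where each variable actually lands is
up to the toolchain, though on typical ELF systems the first goes into
.data and the second into .bss, which size or objdump -h on the object
file can confirm.  What the C language itself promises is that a static
object without an initializer starts out as zero.

```c
/* Illustrative globals; the names are made up for this sketch. */
static int initialized_counter = 42;  /* has an initializer: typically .data */
static int zeroed_counter;            /* no initializer: typically .bss      */

/* C guarantees that objects with static storage duration and no
 * initializer start out as zero, wherever the toolchain places them. */
int read_zeroed_counter(void)
{
    return zeroed_counter;
}

int read_initialized_counter(void)
{
    return initialized_counter;
}
```

Only the initialized variable needs its value stored in the file; the
zeroed one only needs its size recorded, which is why objdump lists .bss
as ALLOC without CONTENTS.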

 At least in theory. data section is also executable in practice which
 allows buffer overflow to be used to execute malicious codes.  But we
 can't simply block out execution of the data section because I think
 function pointers need to be executed outside the text area..???

Is that true?  I'd think that function pointers necessarily need to point
inside the text section.
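
For reference, a minimal sketch of what a function pointer is at the C
level, with hypothetical function names: the pointer simply holds the
address of compiled code, and calling through it jumps there.  For an
ordinary function compiled into the program, that address would normally
be somewhere in the program's text.

```c
/* Hypothetical example functions. */
int add_one(int x) { return x + 1; }
int twice(int x)   { return 2 * x; }

/* Calls whatever function fn points at.  fn holds the address of
 * compiled code, just as a data pointer holds the address of data. */
int apply(int (*fn)(int), int x)
{
    return fn(x);
}
```

Nothing here requires the data section to be executable: apply only ever
jumps to code the compiler emitted for add_one or twice.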

 Anyway, so if you think about a constant string like, hello, world,
 there's no reason not to store it in the text section, because the string
 isn't supposed to be changeable.  If you were to store it in the data
 section, however, you could accidentally overwrite it, which is fine from
 a single-program point of view, but you can save some memory if you put it
 in the text section, because the text section isn't replicated (only the
 other sections) if you run multiple versions of the same program.  This
 makes it more important that the text section be non-writable and that
 policy be enforced, so that one program accidentally modifying the text
 section doesn't affect another instance of the same program.
 
 Or that's how I understand it.

No -- that makes perfect sense.  I think that has to be true.  Good thinking!

  However, I note that there's a section called .rodata, which I've never
  heard of before, but I'm assumming that it stands for read only data
  section.  It's 13 bytes - just the size of hello world\n\0.
 
 Looking at the assembly output (thanks for that idea, Bryan!), .rodata is a
 custom name of a section:
 
.section .rodata
 
 whereas the .text section is reserved:
 
.text
 
 They might as well have named .rodata section as .whatever instead of
 .rodata and it'll still work the same way.  What the compiler does with
 this section is I think linker-dependent, but it probably takes the section
 and stick it in one of the canonical sections -- .text, .data, etc.
 according to some predefined rule.  (It's probably .text section by
 default, unless specified otherwise.)
 
More coolness.

 These custom sections are useful because it allows the linker to shuffle
 them around to fit any hardware constraints that may exist.  The linker
 won't break up a section, though.
 
YMC.  Where did you learn this from?  Have you taken a compiler class?  At
some point in my life, I'd like to learn more about this.

  That's probably why this program segfaults:
 
 #include <stdio.h>
 
 int main(void)
 {
    char *string = "hello world\n";
 
    string[3] = 'T';
 
    puts(string);
 
    return 0;
 }
 
  because string is the address of something that lives in a section of memory
  marked read-only.  Whammo -- sigsegv.  I remember Mark posting about 3 or 4
  years ago that this actually worked on some other Unicies (not Linux).
 
 Wow... you still remember that?  Yeah I seem to recall trying out
 something like that.  I think it was one of Sun's OSes.  Might have been
 HP/UX, too, since those two were the primary other unices I had access to.
 
Heh.  I have just about every post you ever made to vox-tech saved on my hard
drive.  ;-)   I have subject folders and people folders.  Some people
can't help but say interesting stuff.

  Maybe my immediate question is -- where do read-only strings live?  In the
  text section or the .rodata section?  I've seen evidence that it lives in
  both section.
 
 .rodata is a custom section name which lives in the .text section.  Think
 of it as an alias to an offset into the .text section.  At least that's
 how it worked in one assembler I used to use, and this seems to work the
 same way from what you've discovered.  That's pretty cool!
 
  If anyone cares to riff off this, I'd certainly be interested in anything
  anyone says.
 
 I think it's cool that you've done all this analysis.  I didn't know about
 the size program and I haven't really seen much usage of objdump so it's
 cool to see it used here.  Thanks Peter!

I just asked the questions... you and Bryan answered them!  :)

So just to reiterate.  It appears that read-only strings get placed into a
custom section called .rodata by the compiler.  Then, .rodata gets placed
into the .text segment during linking.

I've 

[vox-tech] OT: are there any bigloo or objective CAML users?

2004-11-24 Thread Henry House
Has anyone used Bigloo (a compiled Scheme dialect) or Objective Caml?

-- 
Henry House
Please don't send me HTML mail! My mail system will reject it.
The unintelligible text that may follow is a digital signature.
See http://hajhouse.org/pgp to find out how to use it.
My OpenPGP key: http://hajhouse.org/hajhouse.asc.


