On Sat, 8 Jan 2005 [EMAIL PROTECTED] wrote:

I'd like to report a bug (buffer overflow?) in the function sub(..., perl = TRUE).

I wanted to implement the familiar Perl function for removing whitespace
before and after a character string:
sub trimwhitespace($)
{
        my $string = shift;
        $string =~ s/^\s+//;
        $string =~ s/\s+$//;
        return $string;
}

So in R this would (presumably) become:

trimwhitespace <- function(x) {
   x <- sub('^\\s+', '', x, perl = TRUE) ## Removes preceding white spaces
   x <- sub('\\s+$', '', x, perl = TRUE) ## Removes trailing white spaces
   x
}
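
For comparison, the same trimming can be sketched without perl = TRUE by using the POSIX character class [[:space:]] in place of \s. This is only an illustrative variant under that assumption; the name trimwhitespace_posix is hypothetical and does not appear in the original report.

```r
# Hypothetical variant (not from the original report): the same trimming
# written with a POSIX character class, so perl = TRUE is not needed.
trimwhitespace_posix <- function(x) {
  x <- sub("^[[:space:]]+", "", x)  # remove leading whitespace
  x <- sub("[[:space:]]+$", "", x)  # remove trailing whitespace
  x
}

trimwhitespace_posix("  abc  ")
```

Like sub() itself, this is vectorised over x, so it can be applied to a whole character vector at once.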

Expected behavior:
trimwhitespace(" abc")
[1] "abc"

On Windows:
trimwhitespace(" abc")
[1] "abc\0\220\277\036\001\220°ß\08iW\001p±ß\0X°ß\0"        ## That's not good! Looks like a buffer overflow

On Linux:
[1] "abc\0\0\002\0\0 \377\0\0\0\002\0\0\0\006\0\0/\377\0\0" ## Linux goofs as well!

Debugging shows that it is the first line in the function that produces the overflow. The overflow seems proportional to the amount of preceding whitespace. I'm not sure if this is exploitable or not, but it might be possible to run arbitrary code stored in a character object using this.

Don't think so. It's actually just a printing issue: the length used for printing is marked incorrectly (as the length of the original string), and
you won't be able to access the character string past the \0 in any other way.


I've fixed it now.
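
A quick way to confirm that a given build carries the fix is to look at the result's length and bytes rather than only its printed form. The following is a minimal sketch that reuses the trimwhitespace() definition from the report; the variable name res is just for illustration, and the values noted in the comments are what one would expect on a build that includes the fix.

```r
# trimwhitespace() as given in the original report
trimwhitespace <- function(x) {
  x <- sub('^\\s+', '', x, perl = TRUE)  # remove leading whitespace
  x <- sub('\\s+$', '', x, perl = TRUE)  # remove trailing whitespace
  x
}

res <- trimwhitespace("   abc")

identical(res, "abc")        # expected TRUE once the fix is in place
nchar(res, type = "bytes")   # expected 3: no stale length from the input
charToRaw(res)               # expected 61 62 63: only the bytes of "abc"
```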

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel