Re: new string module

Mauro Tortonesi Mon, 03 Jan 2005 14:59:01 -0800

On Sunday 2 January 2005 at 22:09, Jan Minar wrote:
> On Sun, Jan 02, 2005 at 01:37:36AM +0100, Mauro Tortonesi wrote:
> > i have just committed the new string.c module which includes a mechanism
> > to fix the bug reported by noël köthe:
> >
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=271931
>
> #271931 is:
> >>> From: Ambrose Li <[EMAIL PROTECTED]>
> >>> Subject: Weird escaping makes wget verbose output completely
> >>>  unreadable in non-English locales
> >>> Message-ID: <[EMAIL PROTECTED]>
>
> Perhaps You meant [0]the #261755?
>
> [0] http://bugs.debian.org/261755


yes, both of them.

> > the code was inspired by felix von leitner's libowfat and by jan minar's
> > bug fixing patch.
> >
> > unfortunately i haven't fixed the bug yet since i don't like jan minar's
> > approach (changing logprintf in a not so portable way to encode every
> > string passed to the function as an argument) because of its
> > inefficiency.
>
> That was a hotfix.  You know, that thing that you do in order not to
> have a security hole Right Now.

ok, then i don't think you should be offended just because i said that i think 
your patch is too inefficient to be merged in the wget cvs repository. 
especially after you've posted a bug report on bugtraq (which was more a 
personal attack than a professional bug report) saying that wget authors are 
all incompetent...

> > as Fumitoshi UKAI suggested, the best choice would be to escape only the
> > strings that need to be escaped. so, i think we should probably check
> > together which strings passed to logprintf in the wget code need to be
> > escaped. anyone willing to help?
>
> You don't want to check whether this or that string accidentally needs
> or doesn't need to get escaped. The right way is to sanitize *all*
> untrusted input before you even start thinking about using it.

mmmh, i don't think so. why would you for example want or need to escape 
format strings (that are retrieved via gettext and are already in your local 
charset), the URLs to download or the configuration data read from wgetrc?
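
to make the "escape only what needs escaping" idea concrete, here is a minimal sketch. escape_untrusted and log_remote_value are hypothetical names, not the escape_buffer function from string.c, and the \xHH output format is only an assumption. the point is that only the server-supplied argument goes through the escaping step, while the format string is left alone:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not wget's escape_buffer): copy src into dst,
   replacing every byte outside printable ASCII, and the backslash
   itself, with a \xHH escape.  Returns the number of bytes written,
   excluding the terminating NUL, or (size_t) -1 if dst is too small. */
size_t
escape_untrusted (const char *src, size_t srclen, char *dst, size_t dstlen)
{
  static const char hex[] = "0123456789abcdef";
  size_t i, o = 0;

  assert (src != NULL && dst != NULL);

  for (i = 0; i < srclen; i++)
    {
      unsigned char c = (unsigned char) src[i];
      if (c >= 0x20 && c < 0x7f && c != '\\')
        {
          if (o + 1 >= dstlen)
            return (size_t) -1;
          dst[o++] = (char) c;
        }
      else
        {
          if (o + 4 >= dstlen)
            return (size_t) -1;
          dst[o++] = '\\';
          dst[o++] = 'x';
          dst[o++] = hex[c >> 4];
          dst[o++] = hex[c & 0x0f];
        }
    }
  if (o >= dstlen)
    return (size_t) -1;
  dst[o] = '\0';
  return o;
}

/* Only the server-supplied value is escaped; the format string (which
   would come from gettext in wget) is passed through untouched. */
int
log_remote_value (FILE *fp, const char *value)
{
  char buf[256];

  if (escape_untrusted (value, strlen (value), buf, sizeof buf)
      == (size_t) -1)
    return -1;
  return fprintf (fp, "Remote value: %s\n", buf);
}
```

with this sketch a terminal escape sequence sent by a server would show up in the log as literal \x1b text instead of being interpreted by the terminal.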

anyway, simone piunno and i have been talking a lot about this problem and 
we've found that apart from a couple of minor problems (very easy to fix) the 
current implementation of escape_buffer works fine. the problem is when you 
pass escaped multibyte strings as arguments to printf. if these strings 
contain a 0x00 byte, it will be incorrectly interpreted by printf as a string 
termination character. simone says for example that UTF-16 strings can contain 
null bytes.
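
a small sketch of the truncation problem, assuming a UTF-16LE encoded string (the buffer and the function names below are made up for illustration and do not come from the wget sources). "%s" stops at the first 0x00 byte, while a length-based write such as fwrite passes every byte through:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* "hi" encoded as UTF-16LE: every second byte is 0x00. */
static const char utf16le_hi[] = { 'h', 0x00, 'i', 0x00 };

/* Format the buffer with %s, as a printf-style logging call would.
   Returns the number of payload bytes that ended up in out. */
size_t
format_with_percent_s (char *out, size_t outlen)
{
  assert (out != NULL && outlen > 0);
  /* %s stops reading at the first 0x00 byte, which here is the
     second byte of the buffer. */
  snprintf (out, outlen, "%s", utf16le_hi);
  return strlen (out);
}

/* Write the same buffer with an explicit length instead.
   Returns the number of bytes written. */
size_t
write_with_length (FILE *fp)
{
  return fwrite (utf16le_hi, 1, sizeof utf16le_hi, fp);
}
```

in this sketch format_with_percent_s keeps only the single byte 'h', whereas write_with_length hands all four bytes to the stream. any fix that keeps printf-style formatting would have to guarantee that the strings passed as "%s" arguments contain no embedded 0x00 bytes.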

i don't really have any clue on how to solve this problem. simone suggests 
changing the internal format of strings in wget to UTF-8, but of course i would 
prefer a less invasive solution if possible... i don't even know if we could 
keep using gettext in that case.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute of Human & Machine Cognition   http://www.ihmc.us
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it
