At 22:09 on Sunday 2 January 2005, Jan Minar wrote:
> On Sun, Jan 02, 2005 at 01:37:36AM +0100, Mauro Tortonesi wrote:
> > i have just committed the new string.c module which includes a mechanism
> > to fix the bug reported by noël köthe:
> >
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=271931
>
> #271931 is:
>
> >>> From: Ambrose Li <[EMAIL PROTECTED]>
> >>> Subject: Weird escaping makes wget verbose output completely
> >>> unreadable in non-English locales
> >>> Message-ID: <[EMAIL PROTECTED]>
>
> Perhaps You meant [0]the #261755?
>
> [0] http://bugs.debian.org/261755

yes, both of them.

> > the code was inspired by felix von leitner's libowfat and by jan minar's
> > bug fixing patch.
> >
> > unfortunately i haven't fixed the bug yet since i don't like jan minar's
> > approach (changing logprintf in a not so portable way to encode every
> > string passed to the function as an argument) because of its
> > inefficiency.
>
> That was a hotfix. You know, that thing that you do in order not to
> have a security hole Right Now.

ok, then i don't think you should be offended just because i said that
i think your patch is too inefficient to be merged into the wget cvs
repository. especially after you've posted a bug report on bugtraq
(which was more a personal attack than a professional bug report)
saying that wget authors are all incompetent...

> > as Fumitoshi UKAI suggested, the best choice would be to escape only the
> > strings that need to be escaped. so, i think we should probably check
> > together which strings passed to logprintf in the wget code need to be
> > escaped. anyone willing to help?
>
> You don't want to check whether this or that string accidentally needs
> or doesn't need to get escaped. The right way is to sanitize *all*
> untrusted input before you even start thinking about using it.

mmmh, i don't think so. why would you, for example, want or need to
escape format strings (which are retrieved via gettext and are already
in your local charset), the URLs to download, or the configuration data
read from wgetrc?

anyway, simone piunno and i have been talking a lot about this problem
and we've found that, apart from a couple of minor problems (very easy
to fix), the current implementation of escape_buffer works fine. the
problem arises when you pass escaped multibyte strings as arguments to
printf. if these strings contain a 0x00 byte, printf will incorrectly
interpret it as a string termination character. simone says, for
example, that UTF-16 strings can contain null bytes.
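
to make the 0x00 issue concrete, here is a minimal sketch in C. it is
not wget's escape_buffer: escape_bytes is a hypothetical helper written
only for illustration, and the \xHH output format is an assumption. the
one point it tries to show is that an escaping routine which takes an
explicit length can see bytes that a NUL-terminated interface such as
printf("%s") or strlen never reaches.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (NOT wget's escape_buffer): returns a newly
 * allocated, NUL-terminated copy of the first len bytes of src in which
 * every byte outside printable ASCII (0x20..0x7e), and the backslash
 * itself, is replaced by a \xHH escape. Because the length is passed
 * explicitly, an embedded 0x00 byte is escaped instead of ending the
 * input. The caller must free() the result. Returns NULL on failure. */
char *escape_bytes(const unsigned char *src, size_t len)
{
    static const char hex[] = "0123456789abcdef";
    char *out;
    size_t i, o = 0;

    assert(src != NULL || len == 0);

    /* Worst case: every input byte expands to 4 output characters. */
    if (len > (SIZE_MAX - 1) / 4)
        return NULL;
    out = malloc(len * 4 + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c >= 0x20 && c <= 0x7e && c != '\\') {
            out[o++] = (char)c;          /* printable: copy unchanged */
        } else {
            out[o++] = '\\';             /* anything else: \xHH */
            out[o++] = 'x';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0f];
        }
    }
    out[o] = '\0';
    return out;
}
```

for instance, "Hi" encoded as UTF-16LE is the four bytes 48 00 69 00.
strlen on that buffer reports 1, so printf("%s") would print only "H",
whereas escape_bytes(buf, 4) yields the string H\x00i\x00, which is
safe to hand to printf afterwards. a control byte such as ESC (0x1b)
comes out as \x1b in the same way. this only helps if the code that
calls the escaping routine still knows the real length of the data.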

i don't really have any clue how to solve this problem. simone suggests
changing the internal format of strings in wget to UTF-8, but of course
i would prefer a less invasive solution if possible... i don't even
know if we could keep using gettext in that case.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute of Human & Machine Cognition   http://www.ihmc.us
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it