oops, bad paste ...

>>> import locale
>>> print locale.getdefaultlocale()[1]
mac-roman

On Fri, Oct 9, 2009 at 3:00 PM, Raman Gupta <[email protected]> wrote:
> Please keep replies on list...
>
> Benson Margulies wrote:
> > The point is that it only uses the encoding to write the file. It reads
> > the bytes from the log raw, and pushes them into the codec to write them
> > into the file. Thus, it is assuming that the input is UTF-8, and asking
> > for the output to be in the default locale. That's how the codecs work.
> > It isn't using a codec to convert from input, only to convert the output.
>
> I'm sorry Benson, but I believe you are operating under some
> fundamental misconceptions... Of course it has to use a codec to
> convert from input ("input" here is the svn log output).
>
> Any time one reads bytes that one knows are characters (as output by
> svn log), one needs to apply a codec to the bytes to understand what
> those characters are. You contradict yourself by saying that it is
> assuming the input is UTF-8 -- UTF-8 is just another codec, no
> different from other codecs except in the actual byte value(s) used to
> represent characters. Assuming UTF-8 would indeed mean using a codec
> to decode the input.
>
> Here is what it is really doing:
>
> def recode_stdout_to_file(s):
>     [... if statement snipped ...]
>     u = s.decode(sys.stdout.encoding)
>     return u.encode(locale.getdefaultlocale()[1])
>
> i.e. svnmerge.py is decoding the bytes of the svn log output using the
> codec returned by sys.stdout.encoding. This may be UTF-8, but it may
> be something else depending on your local platform and settings. There
> is *no assumption* of UTF-8 here. Then it is encoding those characters
> back into bytes (and eventually writing these bytes to a file), using
> the codec returned by locale.getdefaultlocale()[1]. This encoding is
> what svn expects in the content of files that it reads commit log
> messages from via the -F parameter.
>
> The possible error here is that our assumption of what encoding svn
> uses when printing a log to stdout (i.e. sys.stdout.encoding) or what
> encoding svn uses when reading a commit log file for creating a commit
> message (i.e. locale.getdefaultlocale()[1]) is wrong. If either of
> these assumptions is wrong, then yes, there is a problem that needs to
> be fixed. It has nothing to do with "assuming" UTF-8.
>
> > And this makes sense. It's completely wrong to assume that the svn log
> > messages are in the current user's default locale locale encoding. It
> > makes some sense that users would want to edit a file in their current
> > encoding, it just doesn't always work.
>
> Huh? Do you have some evidence that svn, when writing a commit log to
> standard output, does not write the data in the encoding specified by
> the python sys.stdout.encoding value? If so, great -- please provide
> such evidence and a patch with your fix.
>
> Cheers,
> Raman
>
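
For reference, the decode-then-encode step described in the quoted
message could be sketched roughly as below. This is only an
illustration, not the actual svnmerge.py code: the in_encoding and
out_encoding parameters are hypothetical, added so the two codecs can be
set explicitly when experimenting, and it uses
locale.getpreferredencoding() in place of locale.getdefaultlocale()[1]
because the latter is deprecated in recent Python versions.

```python
import locale
import sys


def recode_stdout_to_file(raw, in_encoding=None, out_encoding=None):
    """Decode bytes captured from a command, then re-encode them for a file.

    in_encoding and out_encoding are hypothetical parameters (not in the
    quoted snippet); when omitted, the same kind of defaults are used.
    """
    # Assumption: the bytes were produced in the console's encoding.
    # sys.stdout.encoding can be None when output is redirected.
    in_encoding = in_encoding or sys.stdout.encoding or "utf-8"
    # Assumption: the file should be in the user's locale encoding.
    out_encoding = out_encoding or locale.getpreferredencoding(False) or "utf-8"

    text = raw.decode(in_encoding)     # bytes -> characters (input codec)
    return text.encode(out_encoding)   # characters -> bytes (output codec)


if __name__ == "__main__":
    # The same character has different byte values under different codecs.
    utf8_bytes = "\u00e9".encode("utf-8")
    print(repr(recode_stdout_to_file(utf8_bytes, "utf-8", "latin-1")))
```

Both steps involve a codec; if either guess about the encoding is wrong,
decode() or encode() may raise an error or silently produce the wrong
characters.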
_______________________________________________
Svnmerge mailing list
[email protected]
http://www.orcaware.com/mailman/listinfo/svnmerge
