Re: request to clarify and improve Subversion property name specification

Garret Wilson Mon, 23 Jan 2012 10:57:56 -0800

On 1/23/2012 10:38 AM, Philip Martin wrote:

Garret Wilson<[email protected]>  writes:

On 1/23/2012 9:55 AM, Philip Martin wrote:

I thought you were proposing to write the code?

I'm fine with that as well. Looks like I would have to add a few lines
to decode UTF-8 (surely such code already exists in the Subversion
codebase somewhere) and change a few if(...){} statements. I should be
able to handle that. I would imagine it will take more effort on my
part to get permission to change the code than actually writing the
code itself.

The function receives a string of bytes, I think it's already in UTF-8.
The problem is that while Subversion has functions to validate UTF-8 it
doesn't have a system for extracting individual UTF-8 code points.  At
present it only ever needs to extract the ASCII subset which is trivial.

Ah. Well, like I said---I would be happy to write the UTF-8 extraction code. It would be worth it to me to get this functionality in; it would be a fun exercise for me; it would be a good introduction to the codebase for me; it's a small (very small), low-risk task; and the Subversion codebase would be better off in the end. (I'm sure it can be used elsewhere.) It's a win-win for everyone! :D

This is really a small thing. Here's an example in just a few lines: http://bjoern.hoehrmann.de/utf-8/decoder/dfa/

Or see DecodeUTF8BytesToChar at tidy.sourceforge.net/cgi-bin/lxr/source/src/utf8.c .

I would be happy even precluding code points from supplementary planes (i.e. those over U+FFFF), if anyone is worried about the code being too complicated.
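
To give a sense of the size of the task, here is a minimal sketch in C of extracting a single code point from a UTF-8 byte string. It is only an illustration: the name utf8_decode_one and its signature are hypothetical, it is not taken from the Subversion codebase or from either example linked above, and it uses a plain branch on the lead byte rather than the DFA technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an existing Subversion function.
 * Decodes one code point from the start of buf (len bytes available).
 * Returns the number of bytes consumed (1-4) and stores the code point
 * in *cp, or returns 0 if the sequence is truncated or malformed. */
static size_t
utf8_decode_one(const unsigned char *buf, size_t len, uint32_t *cp)
{
  uint32_t c;    /* code point being assembled */
  uint32_t min;  /* smallest value allowed for this sequence length */
  size_t need;   /* total bytes in the sequence */
  size_t i;

  assert(buf != NULL && cp != NULL);
  if (len == 0)
    return 0;

  if (buf[0] < 0x80)
    {
      *cp = buf[0];  /* ASCII: the trivial case */
      return 1;
    }
  else if ((buf[0] & 0xE0) == 0xC0)
    { c = buf[0] & 0x1F; need = 2; min = 0x80; }
  else if ((buf[0] & 0xF0) == 0xE0)
    { c = buf[0] & 0x0F; need = 3; min = 0x800; }
  else if ((buf[0] & 0xF8) == 0xF0)
    { c = buf[0] & 0x07; need = 4; min = 0x10000; }
  else
    return 0;  /* stray continuation byte or invalid lead byte */

  if (len < need)
    return 0;  /* truncated sequence */

  for (i = 1; i < need; i++)
    {
      if ((buf[i] & 0xC0) != 0x80)
        return 0;  /* not a continuation byte */
      c = (c << 6) | (buf[i] & 0x3F);
    }

  if (c < min)
    return 0;  /* overlong encoding */
  if (c >= 0xD800 && c <= 0xDFFF)
    return 0;  /* UTF-16 surrogate, not a valid code point */
  if (c > 0x10FFFF)
    return 0;  /* beyond the Unicode range */

  *cp = c;
  return need;
}
```

A caller would advance through the string by the returned byte count and stop on a return value of 0. Precluding the supplementary planes would amount to dropping the four-byte branch, so that any lead byte of 0xF0 or above is rejected.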
