On 1/23/2012 9:55 AM, Philip Martin wrote:
So lots of high range Unicode points are allowed.

Yes.

How will we validate that?

The same way the code (as given to me by others) currently validates the names---it iterates the characters provided, and if one of them doesn't meet the definition, it returns false. Basically what goes inside if(...){} would be changed.
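A rough sketch of that loop, in Python for brevity rather than Subversion's C. The function and predicate names are hypothetical, and the ASCII rule is only an assumption based on the approximate expression discussed in this thread. The point is that the loop stays the same and only the per-character test changes.

```python
import string

# Assumed ASCII-only rule: letter, ':' or '_' first; then also digits and '.'.
ASCII_START = set(string.ascii_letters + ":_")
ASCII_REST = ASCII_START | set(string.digits + ".")


def ascii_start(ch):
    return ch in ASCII_START


def ascii_rest(ch):
    return ch in ASCII_REST


def relaxed_start(ch):
    # Illustration only: adds U+00C0..U+00D6, one of the ranges in XML's
    # NameStartChar. A real change would add the remaining ranges too.
    return ascii_start(ch) or 0xC0 <= ord(ch) <= 0xD6


def relaxed_rest(ch):
    return ascii_rest(ch) or 0xC0 <= ord(ch) <= 0xD6


def is_valid_name(name, is_start, is_rest):
    """Iterate the characters; return False as soon as one fails the test."""
    if not name or not is_start(name[0]):
        return False
    for ch in name[1:]:
        if not is_rest(ch):  # this condition is what would be relaxed
            return False
    return True
```

Calling `is_valid_name(name, ascii_start, ascii_rest)` gives the strict behaviour, and passing the relaxed predicates instead accepts a name such as "ÉTÉ" without touching the loop.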

Do we have any suitable code in Subversion?

In my original email I provided the name of the method that currently imposes the arbitrary restriction. The if(...){} block would change to relax its current restrictions. I don't see what is difficult about it, although perhaps I'm being naive. However, noting that SVN+DAV works just fine with this relaxed restriction, and that JavaHL works just fine /reading/ values with relaxed restriction, my best guess is that all you have to do is change a few lines in that method and things will all work nicely.

Do we write an XML validator?

Nowhere was there ever a hint of XML validation. In fact, I wasn't even proposing verification of XML well-formedness. There is no XML markup involved. I'm simply proposing we use the same definition that XML does of a name.

The definition of a name is conceptually a set of characters. Think of it as a regular expression. Currently Subversion uses something like /[a-zA-Z:_][a-zA-Z0-9\.:_]*/. I'm simply proposing we relax this using XML's "regular expression" instead of the one we use now. There is no XML involved. We are simply re-using a definition from their specification.
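To make the comparison concrete, here is a sketch of the two definitions as Python regular expressions. It assumes the Name production of XML 1.0 (Fifth Edition); the thread does not say which edition is meant, and the constant names are hypothetical. The "current" pattern is only the approximate expression quoted above, with the letter range written as A-Z, not a transcription of Subversion's actual code.

```python
import re

# Approximate current rule, as quoted in this thread (an assumption).
CURRENT_NAME = re.compile(r"[a-zA-Z:_][a-zA-Z0-9.:_]*")

# NameStartChar ranges from XML 1.0 (Fifth Edition), written as the body of
# a character class. The re module interprets the \u and \U escapes.
_NAME_START = (
    r":A-Z_a-z"
    r"\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    r"\u0370-\u037D\u037F-\u1FFF"
    r"\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
    r"\U00010000-\U000EFFFF"
)

# NameChar adds '-', '.', digits, U+00B7 and two combining/connector ranges.
_NAME_CHAR = _NAME_START + r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

XML_NAME = re.compile("[" + _NAME_START + "][" + _NAME_CHAR + "]*")


def matches(pattern, name):
    """True if the whole name matches the pattern."""
    return pattern.fullmatch(name) is not None
```

With these, `matches(CURRENT_NAME, "prénom")` is false while `matches(XML_NAME, "prénom")` is true. Both patterns still reject a name that starts with a digit or contains a space.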

Currently there are at least two "official" Subversion clients. One is using XML's definition of a property name. Another is ("for now" the code says) using another definition. Whatever we do, I would propose they both use the same definition. I would vote for XML's definition.

Use some other existing validator? Do we have to extract UTF8 multibyte characters first?

We would have to interpret the incoming bytes as UTF-8 and parse them accordingly before validating the characters, yes. In fact, this should be happening anyway. Remember that clients such as Subclipse and TortoiseSVN are already /reading/ these property name values as UTF-8, so the code that validates them should be interpreting them as UTF-8 as well.
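The decode-then-validate order could look roughly like the following Python sketch. The function names are hypothetical, and the character-level check is passed in as a parameter because this sketch does not assume any particular name definition. Bytes that are not well-formed UTF-8 are rejected before any character is examined.

```python
from typing import Callable, Optional


def decode_name(raw: bytes) -> Optional[str]:
    """Return the name as text, or None if raw is not well-formed UTF-8."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


def is_valid_name_bytes(raw: bytes, is_valid_text: Callable[[str], bool]) -> bool:
    """Decode first, then apply the character-level rule to code points."""
    text = decode_name(raw)
    return text is not None and is_valid_text(text)


def placeholder_check(text: str) -> bool:
    # Stand-in rule for demonstration only: non-empty and no spaces.
    return text != "" and " " not in text
```

For example, `is_valid_name_bytes("prénom".encode("utf-8"), placeholder_check)` passes, while the Latin-1 bytes `b"pr\xe9nom"` fail at the decoding step regardless of which character rule is plugged in.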

I thought you were proposing to write the code?

I'm fine with that as well. Looks like I would have to add a few lines to decode UTF-8 (surely such code already exists in the Subversion codebase somewhere) and change a few if(...){} statements. I should be able to handle that. I would imagine it will take more effort on my part to get permission to change the code than actually writing the code itself.

Basically I'm proposing that we set publicly what constitutes a valid Subversion name, and then make whatever code changes are needed to conform. A test suite comes to mind as a tool to assist in this, but that's another subject altogether.

Subversion has a testsuite.

Either 1) the test suite does not cover property name validity, or 2) the DAV+SVN client isn't run through the test suite, because the DAV+SVN client doesn't comply with the property name validation present behind JavaHL.

Garret
