Lisa Dusseault wrote:
For related work, you could look at the email spam filtering stuff from SIEVE: <http://www.ietf.org/internet-drafts/draft-ietf-sieve-spamtestbis-02.txt>
Thanks for the pointer. I was aware of SIEVE's Filtering Extension but didn't file it mentally as related to the Atom Rank Extension.
The similar problem is that many spam libraries already produce some kind of linear or similar scale of severity/likelihood rating for emails, much like existing services already provide their own scale of post rankings. The approach SIEVE took, very roughly, was simply to tell implementors to find some way to map their implementation's ranking scheme to a canonical range of numerical values. A SIEVE implementation might have one algorithm for converting SpamAssassin rankings to the canonical scale, and a different algorithm for some different library.
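To make the mapping idea concrete, here is a minimal sketch of one converter per library. Everything in it is an assumption for illustration: the canonical range 0..10, the library names, and the native score ranges are hypothetical and are not taken from the SIEVE draft or from any real spam library.

```python
from typing import Callable, Dict

# Assumed canonical range, chosen for illustration only.
CANONICAL_MIN, CANONICAL_MAX = 0, 10


def linear_mapper(lo: float, hi: float) -> Callable[[float], int]:
    """Build a converter from native scores in [lo, hi] to the canonical range."""
    def to_canonical(score: float) -> int:
        clamped = min(max(score, lo), hi)       # out-of-range scores are clamped
        fraction = (clamped - lo) / (hi - lo)   # position within the native range
        return CANONICAL_MIN + round(fraction * (CANONICAL_MAX - CANONICAL_MIN))
    return to_canonical


# One conversion algorithm per library; names and ranges are hypothetical.
MAPPERS: Dict[str, Callable[[float], int]] = {
    "spamassassin-like": linear_mapper(0.0, 10.0),
    "probability": linear_mapper(0.0, 1.0),
}


def canonical_score(library: str, native_score: float) -> int:
    """Convert a library-specific score to the shared canonical scale."""
    return MAPPERS[library](native_score)
```

Note that `canonical_score("probability", 0.91)` and `canonical_score("probability", 0.93)` land on the same canonical value, so distinct native scores can become indistinguishable after conversion.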
Unfortunately this approach does not suit ranking entries very well; the only "canonical range" I can think of that is a superset of every conceivable Ranking Scheme is xsd:decimal. Smaller sets simply won't do. (While loss of information might be acceptable when classifying spam, other ranking use cases, e.g., grades, are less tolerant of it.)
Now this would essentially mean dropping the r:scheme element altogether, leaving only r:rank, which was, IIRC, a simplification suggested by Robert Sayre some time ago.
But even when dropping r:scheme's capability to define a set of allowed values, having an r:scheme-like element around would still be useful:
- It provides a single place to specify a scheme's @significance (ascending or descending). Otherwise each and every r:rank element needs to carry its own @significance -- which is especially pointless since @significance has no significance (no pun intended) for a single ranking value; it only ever makes sense for a set of values.
- Furthermore, a scheme can also have a @label attached (e.g. "Five Stars"), which might then be used in a UI, complementing r:rank's @label (e.g. "Good"). But this issue is really independent of the question of whether r:scheme itself actually has the ability to restrict the rankings to {1, 2, 3, 4, 5} or not.
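The two points above can be sketched as a small data model in which @significance and the scheme-level @label live on the scheme, while each rank carries only its value and an optional label of its own. This is a rough illustration, not the extension's actual processing model: the class and function names are hypothetical, and it assumes that "ascending" means larger values rank higher.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Scheme:
    label: str          # scheme-level label, e.g. "Five Stars"
    significance: str   # "ascending" or "descending", stated once per scheme


@dataclass(frozen=True)
class Rank:
    value: Decimal                # unrestricted decimal value
    label: Optional[str] = None   # rank-level label, e.g. "Good"


def best_first(scheme: Scheme, ranks: List[Rank]) -> List[Rank]:
    """Order a set of ranks from best to worst using the scheme's significance.

    Assumption: "ascending" means larger values are better.
    """
    if scheme.significance not in ("ascending", "descending"):
        raise ValueError(f"unknown significance: {scheme.significance!r}")
    larger_is_better = scheme.significance == "ascending"
    return sorted(ranks, key=lambda r: r.value, reverse=larger_is_better)


five_stars = Scheme(label="Five Stars", significance="ascending")
chart = Scheme(label="Chart Position", significance="descending")
ranks = [Rank(Decimal("3")), Rank(Decimal("5"), "Good"), Rank(Decimal("1"))]

stars_order = [r.value for r in best_first(five_stars, ranks)]
chart_order = [r.value for r in best_first(chart, ranks)]
```

A single `Rank` never needs to know the direction; it only comes into play once a set of values is ordered, which is why it sits on `Scheme` here.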
To summarize, dropping r:scheme's datatyping capabilities might be acceptable, but dropping the concepts of Ranking Scheme and Domain is not. I would, however, like to see datatyping included as well -- iff we can come up with a solution which can describe all the common use cases out there!
Andreas Sewe
