On Wed Mar 5 11:00:34 2008, Richard Dobson wrote:
Sometimes flexibility comes back to bite you. I'd prefer to keep
things simple if we can. What does the extra flexibility really do
for us, and is it worth the cost?
Well, I think it is better to be flexible in this case; otherwise
it's forcing server implementors to implement this in a particular
way when there is no real need to. What if, in a particular
implementation, it's more efficient to implement it as GUIDs, for
example, as Alex suggested, because of how it's clustered or how the
database is implemented? In my case I'd want to implement this as a
compressed timestamp rather than as an increasing integer. As far as
the spec is currently written, and how it would work for clients, it
wouldn't make any difference what the version identifier is, as the
client doesn't (and IMO MUST NOT) take any meaning from the value of
the version identifier, so things stay nice and simple. As soon as
clients start taking any meaning from the version identifier, you
are introducing a whole raft of potential interoperability issues
and bugs that need not exist.
You can't use timestamps - they're not strictly increasing, for
various reasons.
Firstly, two roster changes could happen at precisely the same
moment. To be fair, by introducing cluster node identifiers, and
having a strict strong ordering of them, you could avoid this.
Secondly, the clock on a computer can, and surprisingly often does,
go backwards. That's a much harder problem to solve.
Thirdly, in a clustering situation, you'd have to ensure that the
time on each cluster node was perfectly synchronized.
So the closest you can do would be a modified timestamp that had
additional logic during generation to ensure it never went backwards,
in which case you don't need the cluster identifier anymore, and
that's effectively the same as having a strictly increasing integer
sequence anyway, so it's easier to just do that. But even if you did
want to use timestamps, just representing them as an integer is
pretty trivial. Look at the definition of "modtime" in ACAP (RFC
2244), which defines a strictly increasing modified timestamp
represented using digits.
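
A minimal sketch of the "modified timestamp" idea above, in Python.
The class name, the microsecond resolution, and the injectable clock
are all assumptions made for illustration; none of them come from the
spec or from ACAP. It only shows the extra logic needed so that the
value never repeats and never goes backwards on a single node:

```python
import threading
import time


class MonotonicVersion:
    """Hypothetical generator: timestamp-derived, strictly increasing."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so it can be tested
        self._last = 0               # last value handed out
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            # Wall-clock time as integer microseconds since the epoch.
            now = int(self._clock() * 1_000_000)
            # If the clock stalled or stepped backwards, step just past
            # the previous value instead of trusting the clock.
            self._last = max(now, self._last + 1)
            return self._last
```

Once the clock is no longer trusted, what remains is a counter that
happens to jump forward now and then, which is the point being made
above.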
The only reason this scares me is that strictly increasing numeric
sequences have proved useful in the IMAP world, because clients
can spot when things go wrong much more easily.
I think it's a very, very bad idea for the clients to take any
meaning from the version identifier, as explained above. It's just
opening a Pandora's box of potential bugs and issues; far better
for it to just be an opaque string as far as the client is
concerned, which also helps to keep things as simple as possible.
It's useful for clients to be able to determine the ordering locally,
on occasion. If we removed this, we'd also have to ensure that roster
pushes were sent to the client in-order, which currently we don't
mandate. (Making this a SHOULD is sane, but in the cluster case, it's
quite hard).
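
As a rough illustration of what "determine the ordering locally"
could look like on the client side, here is a sketch in Python. The
class, its method, and the idea of remembering a version per roster
item are hypothetical and not taken from the spec; the sketch only
assumes that each push carries a strictly increasing integer version:

```python
class LocalRoster:
    """Hypothetical client-side cache that tolerates reordered pushes."""

    def __init__(self):
        self.version = 0   # highest version seen so far
        self.items = {}    # jid -> (version, data)

    def apply_push(self, version, jid, data):
        current = self.items.get(jid)
        # A push older than what we already hold for this item arrived
        # late; applying it would roll the item back, so ignore it.
        if current is not None and current[0] >= version:
            return False
        self.items[jid] = (version, data)
        self.version = max(self.version, version)
        return True
```

With an opaque string in place of the integer, the comparison in
apply_push has nothing to work with, and the client can only rely on
the order of arrival.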
Plus, nobody can get it wrong.
How exactly are they going to get it wrong if it's an identifier
that only the server is interpreting the meaning of?
It's the server I'm worried about. :-)
There's no way that even a 32-bit unsigned integer is going to
overflow - if you did an update every second, it'd take 136 years
- but if that still unnerves you (in case PSA turns into the
undead, or something), use a 64-bit unsigned integer.
It's possible even if unlikely. What if several updates were made in
one second because of some kind of bulk update, for example? At the
very least the spec MUST define what should happen if the number is
going to overflow, i.e. resetting to 0, even though it is unlikely.
You just use a 128-bit unsigned integer. There is no upper limit here
- in particular, there is no upper limit specified anywhere in this
document - XSD merely states that an xs:nonNegativeInteger is a
sequence of digits, and has "countably infinite" cardinality.
If you really and truly believe that practical limits of 64-bit
unsigned integers can cause problems in the real world, I honestly
don't know what to say except show you the figures - you could have
a thousand updates every millisecond, and still last over half a
million years - 584,542 roughly, assuming a fixed year length of
365.25 days.
I'm all for designing for the future, but you have to draw the line
somewhere, and besides, I figure we'll be on something bigger than
64-bit well before then - a jump to 128-bit gains us 10^25 years of
breathing space, and I'd like to imagine we can think up a solution
within that time, assuming that's prior to the heat death of the
universe.
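
For anyone who wants to check the arithmetic, a small Python sketch
follows. The function name is made up for illustration, and the
update rates are just the ones used in this thread: one per second
for 32-bit, and a thousand per millisecond (a million per second)
for 64-bit and 128-bit:

```python
# Fixed year length of 365.25 days, as assumed above.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_until_overflow(bits, updates_per_second):
    """Years before an unsigned counter of the given width wraps."""
    return 2 ** bits / updates_per_second / SECONDS_PER_YEAR


print(years_until_overflow(32, 1))            # about 136 years
print(years_until_overflow(64, 1_000_000))    # over half a million years
print(years_until_overflow(128, 1_000_000))   # on the order of 10^25 years
```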
Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
- acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
- http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade