On Wed, 15 Oct 2014 06:47:25 -0700 (PDT) "Edward K. Ream" <edream...@gmail.com> wrote:
> On Wednesday, October 15, 2014 8:28:05 AM UTC-5, Terry Brown wrote:
>
> > I don't think the post-scan does require Bob's always incrementing
> > timestamp fix, I think the post-scan is a more general solution which
> > addresses the ".leo files from other sources" aspect of the
> > duplicated gnx problem, as well as the "loading the same file twice
> > in one second" aspect of the problem fixed by Bob.
>
> I just don't see how collisions can happen except in automated
> situations. The gnx contains the committer id plus a timestamp. That
> combination *is* going to be unique in general, unless the timestamps
> collide, which isn't going to happen except in Bob's case.
>
> > So for maximum code cleanliness Bob's fix could be removed.
>
> Again, I don't see how that statement can be correct. Each
> invocation of Leo must be based on a unique timestamp.

That's not really an absolute - the absolute here is the statement you
make below, "We must not *ever* reassign gnx's, so any scheme that
guarantees no *new* collisions will suffice." Having a system that
generates always-incrementing timestamps locally doesn't address files
from other sources. Of course the full gnx is unlikely to collide when
you include the username etc., but unlikely is not never. When was the
last time anyone got their first choice of username signing up for a
web-based service :-) We probably don't want to be relying on Leo
having a small number of users to keep the chances of usernames
colliding in gnxs low :-) :-)

> > It seems that collecting the maximum index value for the gnx
> > timestamp for a particular c /could/ be done in one of the existing
> > scans, and that that seems like a less invasive fix than changing
> > the gnx format.
>
> Less visible externally, but I really don't like it. And no,
> existing scans aren't up to the job because they happen too early,
> before we know what gnx's have been read.
You know the read code better than anyone; I was just assuming that
there would be opportunity to collect info on all gnxs in a c prior to
the new post-scan, even if that collection was spread across different
parts of the read cycle. I can certainly see how the post-scan would
be the simplest / cleanest to implement.

> > Also, I'm not convinced any non-scan based solutions (other than a
> > significant number of random bits) can really guarantee no
> > collisions. The ".leo files from other sources" case basically
> > means you have no idea what gnxs lurk in the file, unless you look.
>
> If that were true, Leo would have been fundamentally broken all these
> many years. But no, the combination of id and timestamp is almost
> always unique.

"Almost always", vs. "not *ever*". We're definitely targeting
situations which will impact 99% of Leo users zero times in their
Leo-using career; there's no argument that the present system is
almost always sufficient.

> We must not *ever* reassign gnx's, so any scheme that guarantees no
> *new* collisions will suffice. That's what uuids or file numbers do.

I'm not seeing how the file number lets me know that the gnx
tbrown.20141015060623.1234.6 does not already exist in the .leo file
being read, particularly when the .leo file was created on another
system.

Cheers -Terry

-- 
You received this message because you are subscribed to the Google Groups "leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email to leo-editor+unsubscr...@googlegroups.com.
To post to this group, send email to leo-editor@googlegroups.com.
Visit this group at http://groups.google.com/group/leo-editor.
For more options, visit https://groups.google.com/d/optout.
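For reference, the post-scan scheme Terry argues for can be sketched
roughly like this (hypothetical helper names, not Leo's actual read
code): after a .leo file is fully read, collect every gnx already in
the outline, then allocate new gnxs only outside that set, so existing
gnxs are never reassigned regardless of where the file came from.

```python
def post_scan_gnxs(gnxs_in_file):
    """Collect every gnx found in the outline just read."""
    return set(gnxs_in_file)

def assign_new_gnx(existing, user_id, stamp):
    """Return the first id.stamp.n not already present, and record it."""
    n = 1
    while "%s.%s.%d" % (user_id, stamp, n) in existing:
        n += 1
    gnx = "%s.%s.%d" % (user_id, stamp, n)
    existing.add(gnx)
    return gnx

# A file from "another source" may already contain gnxs we would
# otherwise mint locally; the scan makes the clash visible and the
# allocator steps past it rather than reassigning anything.
existing = post_scan_gnxs(["tbrown.20141015060623.1",
                           "tbrown.20141015060623.2"])
print(assign_new_gnx(existing, "tbrown", "20141015060623"))
```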