There is a lot of info here, thanks for the write up!

On Fri, Aug 31, 2018 at 15:47:34 -0700, Gregory Szorc wrote:
...
> Assuming you only have primitive data retrieval commands, you are now
> issuing a lot more commands.

While I'm all for allowing simpler servers (and hopefully clients too),
I'm worried about the chattiness of such a protocol - specifically the
number of network round-trips that depend on previous commands
completing.

Over the years, I've seen plenty of protocols evolve to reduce
chattiness. For example, NFSv4 added compounds (a way to pack up several
RPCs and send them as a unit), SMB/CIFS reduced the number of RPCs, and
so on. I realize that both those examples are file systems, but I'd
argue that their lessons apply here as well.

Somewhat relatedly: the JMAP IETF working group [1] is working on a new
way to access email - ideally replacing IMAP. The interesting thing here
is that the entire design is visibly targeting high-latency links.
(Personally, I think this is because the authors are from Australia and
therefore they are very sensitive to latency.) I don't know if there are
any lessons in JMAP that would apply here, but I would certainly
encourage testing on high-latency & high-bandwidth links if there is any
concern about chattiness in the new protocol.

[1] https://datatracker.ietf.org/group/jmap/about/

...
> At the end of the day, the wire protocol command set will be driven by
> practical needs, not by ivory tower architecting. We'll see what shortcuts
> we need to employ in the name of performance and we'll implement them.

That's good to hear. I just hope that these "bonus" commands will fit
more or less nicely into the new protocol design. It'd be rather
unfortunate if in the process of adding these bonus commands you
reinvented getbundle.

...
> Since we are effectively talking about a new VCS at the wire protocol
> level, let's talk about other crazy ideas. As Augie likes to say, once we
> decide to incur a backwards compatibility break, we can drive a truck
> through it.
>
> Let's talk about hashes.
>
> Mercurial uses SHA-1 for content indexing. We know we want to transition
> off of SHA-1 eventually due to security weaknesses.
...
> In addition, Mercurial has 2 ways to store manifests: flat and tree.
...
>
> One of the ideas I'm exploring in the new wire protocol is the idea of
> "hash namespaces." Essentially, the server's capabilities will advertise
> which hash flavors are supported. Example hash flavors could be
> "hg-sha1-flat" for flat manifests using SHA-1 and "hg-blake2b-tree" for
> tree manifests using blake2b. When a client makes a request, that request
> will be associated with a "hash namespace" such that any nodes referenced
> by that command are in the requested "hash namespace."

While this idea is intriguing, it also means AFAICT that a changeset no
longer has one globally unique ID. E.g., consider the world where there
are:

  hg-sha256-flat
  hg-blake2b-flat

or:

  hg-blake2b-flat
  hg-blake2b-tree

In both cases, the node id will be 32 bytes/64 hex chars long. I can no
longer paste at you a hash I see in 'hg log' and (1) know what hash
function generated it, and (2) be certain that you can grep your 'hg log'
output for it and find it.

This whole thing gets even more fun when you share abbreviated hashes -
e.g., abc may be the shortest unique node prefix in both namespaces, but
may map to completely different revisions.

As a side note, wouldn't it be possible to deal with flat<->tree
transitions by making a "dummy" commit that rewrites the manifest to the
new format and sets some flag in .hg/requires?

Anyway, as intriguing as this idea is, I'm skeptical that the resulting
UX will be good. It is also possible that I'm not fully understanding
your idea here :)

> This feature, if implemented, would allow a server/repository to index and
> serve data under multiple hashing methodologies simultaneously. For
> example, pushes to the repository would be indexed under SHA-1 flat, SHA-1
> tree, blake2b flat, and blake2b tree. Assuming the server operator opts
> into this feature, new clones would use whatever format is
> supported/recommended at that time.
> Existing clones would continue to
> receive SHA-1 flat manifests. New clones would receive blake2b tree
> manifests.

See above about UX. Regardless, it is certainly something to experiment
with and either keep or throw away.

Thanks for all the work you've put in,

Jeff.

-- 
Once you have their hardware. Never give it back.
(The First Rule of Hardware Acquisition)
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel