Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
On 9/5/2013 5:23 PM, Junio C Hamano wrote: > Brad King writes: >> create SP NUL NUL >> update SP NUL NUL [] NUL > > That SP in '-z' format looks strange. Was there a reason why NUL > was inappropriate? The precedent I saw in the -z survey I posted is that NUL is used to terminate fields that can contain arbitrary text but otherwise SP or TAB is still used to terminate simple fields. I recall no cases that use NUL after fields that only contain simple values. >> option:: >> Specify an option to take effect for following commands. > > This last one is somewhat peculiar, especially because it says > "following command*s*". > > How would I request to update refs HEAD and next in an all-or-none > fashion, while applying 'no-deref' only to HEAD but not next? You're right. It will be simpler for clients to generate each sequence of options followed by a command without worrying about options possibly generated for preceding s. So: option:: Modify behavior of the next command naming a . The only valid option is `no-deref` to avoid dereferencing a symbolic ref. > When I said "create or update" in the message you are responding to, > I did not mean that we should have two separate commands. The nice thing about "create" is that it implies oldvalue=zero, just as "delete" implies newvalue=zero. The symmetry feels clean. Also, in -z mode we need to treat an empty string as missing rather than zero. The only way to specify create with the "update" command is with literal 40 "0" as . > The regular command line does create-or-update; if it exists already, > it is an update, and if it does not, it is a create. The proposed update command can be used for create, update, delete, or verify with proper arguments and use of 40 "0". The other commands are for convenience, shorter input, and statement of intent. > create-or-update-no-deref [] > delete-no-deref [] > > Also how would one set the reflog message for the proposed update? This is why I proposed a separate option command. It can be used both for no-deref and for other options we want to add later e.g. option SP message SP LF and with -z: option SP message SP NUL For now I think it is sufficient for the -m command-line option to set the same message for all updates. The option command leaves room to add a per-ref message later. BTW, the reasons I propose message as an option rather than its own command are: * It is simpler to document the reach of the one option command (affects next ) in one place rather than separately for each option-like command. * It reduces the number of commands that do not take a . -Brad -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
Brad King writes: > On 09/04/2013 05:27 PM, Junio C Hamano wrote: >> I am not saying the above is the best format, but the point is that >> the mode of the operation defines the structure > > Great, thanks for your comments. Based on that I've prototyped a > new format. Rather than jumping straight to the patch, here is my > proposed format documentation for review: > > - > With `--stdin`, update-ref reads instructions from standard input and > performs all modifications together. Specify commands of the form: > > create SP SP LF > update SP SP [SP ] LF > delete SP [SP ] LF > verify SP [SP ] LF > option SP LF > > Quote fields containing whitespace as if they were strings in C source > code. Alternatively, use `-z` to specify commands without quoting: > > create SP NUL NUL > update SP NUL NUL [] NUL > delete SP NUL [] NUL > verify SP NUL [] NUL > option SP NUL That SP in '-z' format looks strange. Was there a reason why NUL was inappropriate? > Lines of any other format or a repeated produce an error. > Command meanings are: > > create:: > Create with only if it does not exist. > > update:: > Update to be , verifying if given. > Specify a zero to delete a ref and/or a zero >to make sure that a ref does not exist. > > delete:: > Delete , verifying if given. > > verify:: > Verify against but do not change it. If >zero or missing, the ref must not exist. > > option:: > Specify an option to take effect for following commands. > Valid options are `deref` and `no-deref` to specify whether > to dereference symbolic refs. This last one is somewhat peculiar, especially because it says "following command*s*". How would I request to update refs HEAD and next in an all-or-none fashion, while applying 'no-deref' only to HEAD but not next? update refs/heads/next option --no-deref update HEAD sounds somewhat convoluted. When I said "create or update" in the message you are responding to, I did not mean that we should have two separate commands. The regular command line does create-or-update; if it exists already, it is an update, and if it does not, it is a create. If we were to have two, I would say we should have: create-or-update [] create-or-update-no-deref [] An old value of 0{40} would mean "I should be the one creating; if somebody else created it while I was preparing this request, please fail". Similarly for delete: delete [] delete-no-deref [] Also how would one set the reflog message for the proposed update? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
On 09/04/2013 05:27 PM, Junio C Hamano wrote: > I am not saying the above is the best format, but the point is that > the mode of the operation defines the structure Great, thanks for your comments. Based on that I've prototyped a new format. Rather than jumping straight to the patch, here is my proposed format documentation for review: - With `--stdin`, update-ref reads instructions from standard input and performs all modifications together. Specify commands of the form: create SP SP LF update SP SP [SP ] LF delete SP [SP ] LF verify SP [SP ] LF option SP LF Quote fields containing whitespace as if they were strings in C source code. Alternatively, use `-z` to specify commands without quoting: create SP NUL NUL update SP NUL NUL [] NUL delete SP NUL [] NUL verify SP NUL [] NUL option SP NUL Lines of any other format or a repeated produce an error. Command meanings are: create:: Create with only if it does not exist. update:: Update to be , verifying if given. Specify a zero to delete a ref and/or a zero to make sure that a ref does not exist. delete:: Delete , verifying if given. verify:: Verify against but do not change it. If zero or missing, the ref must not exist. option:: Specify an option to take effect for following commands. Valid options are `deref` and `no-deref` to specify whether to dereference symbolic refs. Use 40 "0" or the empty string to specify a zero value, except that with `-z` an empty is considered missing. If all s can be locked with matching s simultaneously, all modifications are performed. Otherwise, no modifications are performed. Note that while each individual is updated or deleted atomically, a concurrent reader may still see a subset of the modifications. - Thanks, -Brad -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
Brad King writes: > Nothing else uses LF NUL. I chose it as a starting point for > this very discussion, which I asked about in $gmane/233653. The primary reason why LF raised my eyebrow was because the reason why many subcommands use "-z" (and NUL) is often because the payload may have LF in a record and LF cannot be used as a record separator without escaping. And they use NUL knowing that the payload data in fields cannot contain a NUL. If we used LF as a signal to define the structure of the record, it pretty much defeats the whole point of defining "-z" format. The -m reason string will be made into a single liner deep in the codepath but it _can_ contain LF. I would have been more receptive to, say, double-NUL as a record terminator, while using a NUL as a field terminator, or something, but then we would need to have a way to express an empty field. > In this particular use case we know the last field will never > be LF but that may not be so for future cases. There is no way > to represent sentinel-terminated arbitrary variable-width records > of NUL-terminated fields without some kind of escaping for the > sentinel value, but the whole point of -z is to avoid escaping. Indeed, but one escape hatch we have is that payload will not contain NUL anywhere, so whenever we see a NUL, we can trust that it defines the structure of the record, and is not a part of the payload. Stepping back a bit, here are some observations on the arguments update-ref can take: * "-m " is a reason given for this entire update. As the point of this new feature is to give an all-or-none update to one or more refs, I think we should not have to accept more than one reason (more specifically, the -m option does _not_ belong to a specific record that describes what happens to a single ref). * "-d " is a way to delete a ref. may be missing. * "--no-deref " and " " are ways to update or create a ref. Again may be missing. So it looks to me that one possible format that is easy to generate by machine without ambiguity may be: * The first record could be m NUL NUL but it is optional. The reason string may contain LF but just like invocation from the command line, LF will eventually cleaned up into a SP. * Then a series of records of different kinds follow. - A delete record looks like this: d NUL NUL NUL If you want to delete the ref without "oldvalue" protection, just say d NUL NUL NUL - A create/update record looks like one of these: u NUL NUL NUL NUL n NUL NUL NUL NUL Again, if you want to delete the ref without "oldvalue" protection, just say u NUL NUL NUL NUL n NUL NUL NUL NUL * EOF signals the end of the request. I am not saying the above is the best format, but the point is that the mode of the operation defines the structure, so unlike parsing xml or json where you first parse the structure and then interpret what each element means, you can define a simple format where the kind of element comes upfront to allow the parser/interpreter know what is expected to follow it. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
On 09/04/2013 02:23 PM, Junio C Hamano wrote: > "whitespace-separated" implies that we may allow fields separated with not a > single SP, but with double SPs or even HTs between them. I personally do not > think we should be so loose Okay, I will look at making it more strict. See proposed format below. >> +Quote arguments containing whitespace as if in C source code. > > Probably "as if they were strings in C source code". Fixed. >> +terminate each argument by NUL and each line by LF NUL. > > This is a somewhat interesting choice of the record terminator. Do we have a > precedent to use LF NUL elsewhere? If this is the first such case where we > need to express variable number of NUL-separated fields in a record, I think > I am fine with LF NUL (but I am sure other people would give us a > better convention if we ask them politely ;-)), but I just want to make sure > we do it the same way as other codepaths, if exist, that have to handle this > kind of thing. Nothing else uses LF NUL. I chose it as a starting point for this very discussion, which I asked about in $gmane/233653. In this particular use case we know the last field will never be LF but that may not be so for future cases. There is no way to represent sentinel-terminated arbitrary variable-width records of NUL-terminated fields without some kind of escaping for the sentinel value, but the whole point of -z is to avoid escaping. Below is a survey of all mentions of NUL and \0 in documented formats as of v1.8.4. The summary is that most are fixed-width records but a few have variable width allowing n or n+1 fields. In all variable-width cases there is structured information in the first field that indicates the number of NUL-terminated fields to expect. In the motivating case here, we could use a --no-old or --have-old option to indicate in one field how many more to expect in the record, but that will be quite verbose. Side note: I'd like to reserve room for the leading options to include things like "-m NUL NUL" so we cannot keep them all in in a single NUL-terminated, SP-separated field. Another approach is to introduce a way to represent "not here" for the argument that is not an otherwise valid value. This would make the non-option part of the record have a fixed width of 3 fields. For example, we could use SP in -z mode: [- NUL]... NUL NUL (|SP) NUL and the last field can be optional in non-z mode anyway: [- SP]... SP [SP ] LF Or we could use a character like "~" (other ideas?): [- NUL]... NUL NUL (|~) NUL and make it available in non-z mode too: [- SP]... SP [SP (|~)] LF Thoughts? -Brad Survey of NUL in documented formats: * Documentation/diff-format.txt: The -z mode for --numstat prints NUL-terminated lines but there is exactly one path at the end of each entry and the earlier fields are separated by TAB because they are structured. * Documentation/diff-options.txt: The -z mode for diff-tree output prints structured SP/TAB-separated fields in a NUL-terminated field followed by either one or two NUL-terminated paths. This is variable width but the first field tells us how wide. * Documentation/git-apply.txt: The -z mode forwards to --numstat diff options. * Documentation/git-check-attr.txt: The -z mode for stdin reads NUL-terminated paths. * Documentation/git-check-ignore.txt: The -z mode for stdin reads NUL-terminated paths. The -z mode for output prints a fixed-width table with every group of four NUL-terminated fields forming a row. * Documentation/git-checkout-index.txt: The -z mode reads NUL-terminated paths. * Documentation/git-commit.txt: The -z mode forwards to git-status. * Documentation/git-grep.txt: The -z mode separates file names from the matched line by a NUL. Therefore NUL divides LF-terminated lines into two pieces. * Documentation/git-ls-files.txt: The -z mode prints NUL-terminated lines but there is exactly one path at the end of each entry and the earlier fields are separated by SP and TAB because they are structured. * Documentation/git-ls-tree.txt: The -z mode prints NUL-terminated lines but there is exactly one path at the end of each entry and the earlier fields are separated by SP and TAB because they are structured. * Documentation/git-mktree.txt: The -z mode reads NUL-terminated lines as output by ls-tree -z. * Documentation/git-status.txt: The -z mode of --porcelain separates a variable number of entries by NUL. The beginning of each entry allows one to know the number of NUL-terminated fields to expect (A = 1 total NUL, R = 2 total NULs, etc.). * Documentation/git-update-index.txt: The -z mode of --stdin separates paths by NUL. The -z mode of --index-info separates entries by NUL but there is exactly one path at the end of each entry and the earlier fields are separated by SP and TAB because they are structured. * Documenta
Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
On 09/04/2013 03:17 PM, Junio C Hamano wrote: > Brad King writes: >> +static void update_refs_stdin_read_n() >> +static void update_refs_stdin_read_z() > > These need to be defined with explicit (void) argument list. Oops, fixed. Thanks, -Brad -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
Brad King writes: > +static void update_refs_stdin_read_n() > +{ > + struct strbuf line = STRBUF_INIT; > + > + while (strbuf_getline(&line, stdin, '\n') != EOF) > + update_refs_stdin_parse_line(line.buf); > + > + strbuf_release(&line); > +} > + > +static void update_refs_stdin_read_z() > +{ These need to be defined with explicit (void) argument list. static void foo(void) { ... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 7/8] update-ref: support multiple simultaneous updates
Brad King writes: > +With `--stdin`, update-ref reads instructions from standard input and > +performs all modifications together. Empty lines are ignored. > +Each non-empty line is parsed as whitespace-separated arguments. "whitespace-separated" implies that we may allow fields separated with not a single SP, but with double SPs or even HTs between them. I personally do not think we should be so loose; it is not like we are allowing the user to type these from the terminal, so limiting "separated by a single SP" may be perfectly adequate, and make it less error prone (e.g. there will not be a confusion "Is 'A SP SP B' A followed by B, or A followed by an empty string followed by B?"). > +Quote arguments containing whitespace as if in C source code. Probably "as if they were strings in C source code". > +Specify updates with lines of the form: > + > + [--no-deref] [--] [] > + > +Lines of any other format or a repeated produce an error. > +Specify a zero to delete a ref and/or a zero > +to make sure that a ref not exist. Use either 40 "0" or the > +empty string (written as "") to specify a zero value. Use `-z` > +to specify input with no whitespace, quoting, or escaping, and > +terminate each argument by NUL and each line by LF NUL. This is a somewhat interesting choice of the record terminator. Do we have a precedent to use LF NUL elsewhere? If this is the first such case where we need to express variable number of NUL-separated fields in a record, I think I am fine with LF NUL (but I am sure other people would give us a better convention if we ask them politely ;-)), but I just want to make sure we do it the same way as other codepaths, if exist, that have to handle this kind of thing. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html