Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 11:22 AM, Dave Borowitz dborow...@google.com wrote: The alternatives for someone writing a parser are: a. Store the original pkt-line framing. Or obviously, a2. Frame in some other way, e.g. JSON array of strings (complete straw man, not seriously proposing this). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Shawn Pearce spea...@spearce.org writes: push cert format is like commit or tag format. You need those LFs. We can't just go declare them optional because of the way pkt-line read function is implemented in git-core. As I said, I view each of the packets between push-cert and push-cert-end packets representing the meat of each line in the cert. The sending end takes a cert as a long multi-line string, splits them into an array, each of whose element represents a line in it (iow certlines = certstring.split('\n')), and sends them packetised. The receiver receives a sequence of packets, notices push-cert packet, collects packets until it sees push-cert-end packet and treats them as elements of this array. pkt-line deframing process would have to strip optional LFs to reconstruct the original array the sender had (i.e. the above certlines array). The receiver needs to join the array with LF to recover the long multi-line string once it received the array. But this LF does not have anything to do with the optional trailing LF in pkt-line. If you sent the original certlines array via different RPC mechanism, you need to join them together with your own LF to reconstruct the multi-line srring. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Shawn Pearce spea...@spearce.org writes: push cert format is like commit or tag format. You need those LFs. We can't just go declare them optional because of the way pkt-line read function is implemented in git-core. As I said, I view each of the packets between push-cert and push-cert-end packets representing the meat of each line in the cert. The sending end takes a cert as a long multi-line string, splits them into an array, each of whose element represents a line in it (iow certlines = certstring.split('\n')), and sends them packetised. The receiver receives a sequence of packets, notices push-cert packet, collects packets until it sees push-cert-end packet and treats them as elements of this array. pkt-line deframing process would have to strip optional LFs to reconstruct the original array the sender had (i.e. the above certlines array). The receiver needs to join the array with LF to recover the long multi-line string once it received the array. But this LF does not have anything to do with the optional trailing LF in pkt-line. If you sent the original certlines array via different RPC mechanism, you need to join them together with your own LF to reconstruct the multi-line string. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: Another way of looking at the problem with my assumptions is, I was assuming pkt-line framing was the same thing as pkt-line header. You seem to be saying the definition of pkt-line framing is header, and optional trailing newline. Yes. I thought that was what Server SHOULD terminate with LF; client MUST NOT require it in the existing text meant. Ah, that reminds me of one thing I already said elsewhere. We need to correct the above with s/Server/Sender/; s/Client/Receiver/; I think. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: That existing implementations of the receivers treat an empty packet (i.e. 0004) or 0005\n ;) Is that true? I think len = pkt_line(); if (!len) break; /* flush */ would give you len == 1 and would not confuse it with a flush. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 2:06 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: On Mon, Jul 6, 2015 at 1:34 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: Another way of looking at the problem with my assumptions is, I was assuming pkt-line framing was the same thing as pkt-line header. You seem to be saying the definition of pkt-line framing is header, and optional trailing newline. Yes. I thought that was what Server SHOULD terminate with LF; client MUST NOT require it in the existing text meant. Unfortunately, the existing text is littered with examples of PKT-LINE(foo SP bar LF). If we assume PKT-LINE(...) means apply pkt-line framing to the [...], then this strongly implies that pkt-line framing does _not_ include the trailing LF. (Or the logical but bizarre alternative reading that such an example might have _two_ trailing LFs :) Yes, But I never viewed PKT-LINE() as an element that strictly defines the grammar of the packet protocol ;-) By clarifying that sender SHOULD terminate with LF, receiver MUST NOT require it is the rule (and fixing the existing implementations at places where they violate the MUST NOT part, which I think are very small number of places), I think we can drop these LF (or LF? for that matter) from all of the PKT-LINE() in the construction in the pack-protocol.txt, which would be a very good thing to do. Completely agree, and that is what I meant when I said The additional upside [to explicitly defining pkt-line framing in this way] is that we could then potentially remove all or almost all LFs from this document. The example in your sentence will become PKT-LINE(foo SP bar) and the there may be an LF at the end would only be at one place, as a part of the definition of PKT-LINE(). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: Unfortunately, optional LFs still make the stored certs for later auditing and parsing a bit illegible. This is one way in which push certs are fundamentally different from the rest of the wire protocol, which is not intended to be persisted. Hmm, I am not sure I follow. The corner case I pointed out before where nonce runs into commands is not the only one. Consider the following cert fragment: 001fpushee git://localhost/repo 0029nonce 1433954361-bde756572d665bba81d8 A naive cert storage/auditing implementation would store the raw payload that needs to be verified, without the pkt-line framing. In this case: pushee git://localhost/repononce 1433954361-bde756572d665bba81d8 Without the pkt-line framing is fine, but my understanding (or, the intention of the original implementor) of this part of the protocol is that packets between the push-cert packet and the push-cert-end packet carry the meat of the each line of the certificate, one packet per line. If pkt-line is allowed to omit the terminating LFs, then it follows that the receiving ends can simply do something like what I illustrated in $gmane/273196 (in java or whatever other implementation platform they use) to collect packets between push-cert and push-cert-end, knowing that the packets may or may not have terminating LF and supplying the omitted LFs themselves when they receive the cert before verifying and storing. So in order to reconstitute the raw payload without pkt-line framing, the omitted LF obviously needs to be added. Why is that a problem? Side note: think of it in a different way. The key word of the first paragraph above is the meat of; if your cert has two lines pushee $URLLFnonce 1234-5670LF the lines in it are pushee $URLLF and nonce 1234-5670LF but the meat of them are pushee $URL and nonce 1234-5670. The protocol wants to carry an array with two elements, (pushee $URL, nonce 1234-5670), as the hypothetical cert has two lines. And then \n.join(the cert array) . \n would be how you reconstruct the original payload. The illustration in $gmane/273196 is slightly cheating in that sense. Instead of first creating an array of plain strings without LF termination and joining them together later, it knows that we will LF-join in the end, and abuses the LF in the original payload that came from the sender and supplies its own if the sender omitted it. It is very similar to and in the opposite of how each ref advertisement is handled. Until the first flush, each packet is expected to carry the object name and the ref name. A pkt-line framing may add terminating LF but that obviously is not part of the ref name. A naive parser that wants to find the pushee would look for pushee urlish, which would be wrong in this case. (To say nothing of the fact that pushee might actually be -0700pushee.) The alternatives for someone writing a parser are: a. Store the original pkt-line framing. b. Write a parser in some other clever way, e.g. parsing the entire cert in reverse might work. Neither of these is very satisfying, and both reduce human legibility of the stored payload. If LF is optional, then with that approach you might end up with a section of that buffer like: I think I touched on this in my previous message. You cannot send an empty line anywhere, and this is not limited to push-cert section of the protocol. Strictly speaking, the wire level allows it, but I do not think the deployed client APIs easily lets you deal with it. So you must follow the SHOULD terminate with LF for an empty line, even when you choose to ignore the SHOULD in most other places. I do not think it is such a big loss, as long as it is properly documented. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 12:57 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: The server can munge pkt-lines and reinsert LFs, but it _must_ have some way of reconstructing the payload that the client signed in order to verify the signature. If we just naively insert LFs where missing, we lose the ability to verify the signature. I still do not understand this part. There is no way to naively insert, is there? You have an array of lines (each of which you have already stripped its terminating LF at its end). How else other than adding one LF at the end of each element do you reconstruct the original multi-line string the client signed? Are there other ways that makes the result ambiguous?? I think I understand the confusion now. I think you and I are working from different assumptions about the client behavior. My assumption was: the intended behavior for the client is to sign the exact payload that was sent over the wire, minus pkt-line headers. For example, under my assumption, if the client sent: 0008foo\n 0007bar 0008baz\n then this indicates the client signed: foo\nbarbaz\n Under this assumption, naively inserting LF means inserting an LF after bar. Thus the server would record the following in a persistent store: foo\nbar\nbaz\n If we only store this string, and do not remember the fact that the client originally omitted one of those LFs, then when we go back to verify that signature later, it will fail. That was my assumption. Your assumption, IIUC, is that the payload the client signed MUST have contained LFs in between each line. When framing the content for the wire, the client MUST send one logical line, which has no trailing LF, per pkt-line, and furthermore the pkt-line content MAY contain an additional trailing LF. Under your assumption, the server always has enough information to reconstruct the original signed payload. The problem with the documentation, then, is that the documentation does not say anything to indicate that the signed payload is anything other than what is on the wire. So maybe this series should include an explicit description of the singed payload outside of the context of a push. Then, in the push section, we can describe the set of transformations that the client MUST perform (splitting on LF; adding pkt-line headers) and MAY perform (adding LFs). If we say the payload the client signs MUST have LFs only in certain places, then that gives the server enough information to reconstruct the payload and verify the signature. But if we say the signed payload MUST have LFs and the wire payload MAY have LFs, then now we have two completely different formats, only one of which is documented. I thought that was what I was saying. The wire protocol sends the contents of each line (both what is signed and the signature) on a separate packet. When I say contents of a line, I do not include the terminating LF as part of the line (iow, LF is not even optional; the terminating LF is not considered as part of the contents of a line). It becomes irrelevant that a pkt-line may or may not have a trailing optional LF. If there is LF at the end of a packet between push-cert and push-cert-end packets, that LF by definition cannot be part of the contents of a line in a certificate. It is just a pkt-line framing artifact you can and should remove if you are doing a split to array, join with LF implementation to recover the original string. And that is very much consistent with the way we send other things with pkt-line protocol. Each packet up to the first flush is expected to have object name and refname as ref advertisement. The pkt-line framing may or may not add a trailing LF, but LF is not part of refname. It is not even part of the payload; merely an artifact of pkt-line framing. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 1:30 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: I think I understand the confusion now. I think you and I are working from different assumptions about the client behavior. I agree that we now both understand where we come from ;-) And sorry for not being clear when I did the push-cert originally in the documentation. As I already said, packets between push-cert and push-cert-end are contents of individual lines of the GPG signed push certificate This sentence makes sense to me now, but only because we now agree that contents does not include the LF. Different people may have different initial assumptions about whether the contents of a line includes the trailing newline or not. was the design meant from day one, and a85b377d (push: the beginning of git push --signed, 2014-09-12) could have made it clearer. The problem with the documentation, then, is that the documentation does not say anything to indicate that the signed payload is anything other than what is on the wire. Yeah, that was untold assumption, as I considered what is on the wire to include pkt-line framing when I wrote a85b377d (push: the beginning of git push --signed, 2014-09-12). So maybe this series should include an explicit description of the singed payload outside of the context of a push. Then, in the push section, we can describe the set of transformations that the client MUST perform (splitting on LF; adding pkt-line headers) and MAY perform (adding LFs). Yes, and the latter is not limited to push-cert but anything sent on pkt-line. That incidentally is the only point I deeply care about. I just want to minimize the protocol is this way in general, but only for this one you must do it differently. Understood, and I'm glad we have finally come to an arrangement that is both consistent and easy to implement on the server side. One example of only for this one you must do it differently is another caveat for protocol implementors for the sending side (again not limited to push cert). That existing implementations of the receivers treat an empty packet (i.e. 0004) or 0005\n ;) as if it is the same as a flush packet (i.e. ), so even if the sending side chooses to ignore the SHOULD terminate each non-flush line using LF, it is strongly advised not to do so when it wants to send an empty payload. This needs to be documented. The receiving end SHOULD NOT treat 0004 the same way as . I think that must be documented and implementations (including our own) should be fixed. Agreed. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 1:11 PM, Dave Borowitz dborow...@google.com wrote: On Mon, Jul 6, 2015 at 12:57 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: The server can munge pkt-lines and reinsert LFs, but it _must_ have some way of reconstructing the payload that the client signed in order to verify the signature. If we just naively insert LFs where missing, we lose the ability to verify the signature. I still do not understand this part. There is no way to naively insert, is there? You have an array of lines (each of which you have already stripped its terminating LF at its end). How else other than adding one LF at the end of each element do you reconstruct the original multi-line string the client signed? Are there other ways that makes the result ambiguous?? I think I understand the confusion now. I think you and I are working from different assumptions about the client behavior. My assumption was: the intended behavior for the client is to sign the exact payload that was sent over the wire, minus pkt-line headers. For example, under my assumption, if the client sent: 0008foo\n 0007bar 0008baz\n then this indicates the client signed: foo\nbarbaz\n Under this assumption, naively inserting LF means inserting an LF after bar. Thus the server would record the following in a persistent store: foo\nbar\nbaz\n If we only store this string, and do not remember the fact that the client originally omitted one of those LFs, then when we go back to verify that signature later, it will fail. That was my assumption. Your assumption, IIUC, is that the payload the client signed MUST have contained LFs in between each line. When framing the content for the wire, the client MUST send one logical line, which has no trailing LF, per pkt-line, and furthermore the pkt-line content MAY contain an additional trailing LF. Under your assumption, the server always has enough information to reconstruct the original signed payload. The problem with the documentation, then, is that the documentation does not say anything to indicate that the signed payload is anything other than what is on the wire. Another way of looking at the problem with my assumptions is, I was assuming pkt-line framing was the same thing as pkt-line header. You seem to be saying the definition of pkt-line framing is header, and optional trailing newline. A quick scan of pack-protocol.txt did not turn up anything one way or the other on this issue, so perhaps we could make it more explicit. The additional upside here is that we could then potentially remove all or almost all LFs from this document. So maybe this series should include an explicit description of the singed payload outside of the context of a push. Then, in the push section, we can describe the set of transformations that the client MUST perform (splitting on LF; adding pkt-line headers) and MAY perform (adding LFs). If we say the payload the client signs MUST have LFs only in certain places, then that gives the server enough information to reconstruct the payload and verify the signature. But if we say the signed payload MUST have LFs and the wire payload MAY have LFs, then now we have two completely different formats, only one of which is documented. I thought that was what I was saying. The wire protocol sends the contents of each line (both what is signed and the signature) on a separate packet. When I say contents of a line, I do not include the terminating LF as part of the line (iow, LF is not even optional; the terminating LF is not considered as part of the contents of a line). It becomes irrelevant that a pkt-line may or may not have a trailing optional LF. If there is LF at the end of a packet between push-cert and push-cert-end packets, that LF by definition cannot be part of the contents of a line in a certificate. It is just a pkt-line framing artifact you can and should remove if you are doing a split to array, join with LF implementation to recover the original string. And that is very much consistent with the way we send other things with pkt-line protocol. Each packet up to the first flush is expected to have object name and refname as ref advertisement. The pkt-line framing may or may not add a trailing LF, but LF is not part of refname. It is not even part of the payload; merely an artifact of pkt-line framing. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: The server can munge pkt-lines and reinsert LFs, but it _must_ have some way of reconstructing the payload that the client signed in order to verify the signature. If we just naively insert LFs where missing, we lose the ability to verify the signature. I still do not understand this part. There is no way to naively insert, is there? You have an array of lines (each of which you have already stripped its terminating LF at its end). How else other than adding one LF at the end of each element do you reconstruct the original multi-line string the client signed? Are there other ways that makes the result ambiguous?? If we say the payload the client signs MUST have LFs only in certain places, then that gives the server enough information to reconstruct the payload and verify the signature. But if we say the signed payload MUST have LFs and the wire payload MAY have LFs, then now we have two completely different formats, only one of which is documented. I thought that was what I was saying. The wire protocol sends the contents of each line (both what is signed and the signature) on a separate packet. When I say contents of a line, I do not include the terminating LF as part of the line (iow, LF is not even optional; the terminating LF is not considered as part of the contents of a line). It becomes irrelevant that a pkt-line may or may not have a trailing optional LF. If there is LF at the end of a packet between push-cert and push-cert-end packets, that LF by definition cannot be part of the contents of a line in a certificate. It is just a pkt-line framing artifact you can and should remove if you are doing a split to array, join with LF implementation to recover the original string. And that is very much consistent with the way we send other things with pkt-line protocol. Each packet up to the first flush is expected to have object name and refname as ref advertisement. The pkt-line framing may or may not add a trailing LF, but LF is not part of refname. It is not even part of the payload; merely an artifact of pkt-line framing. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 1:34 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: Another way of looking at the problem with my assumptions is, I was assuming pkt-line framing was the same thing as pkt-line header. You seem to be saying the definition of pkt-line framing is header, and optional trailing newline. Yes. I thought that was what Server SHOULD terminate with LF; client MUST NOT require it in the existing text meant. Unfortunately, the existing text is littered with examples of PKT-LINE(foo SP bar LF). If we assume PKT-LINE(...) means apply pkt-line framing to the [...], then this strongly implies that pkt-line framing does _not_ include the trailing LF. (Or the logical but bizarre alternative reading that such an example might have _two_ trailing LFs :) Ah, that reminds me of one thing I already said elsewhere. We need to correct the above with s/Server/Sender/; s/Client/Receiver/; I think. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: On Mon, Jul 6, 2015 at 1:34 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: Another way of looking at the problem with my assumptions is, I was assuming pkt-line framing was the same thing as pkt-line header. You seem to be saying the definition of pkt-line framing is header, and optional trailing newline. Yes. I thought that was what Server SHOULD terminate with LF; client MUST NOT require it in the existing text meant. Unfortunately, the existing text is littered with examples of PKT-LINE(foo SP bar LF). If we assume PKT-LINE(...) means apply pkt-line framing to the [...], then this strongly implies that pkt-line framing does _not_ include the trailing LF. (Or the logical but bizarre alternative reading that such an example might have _two_ trailing LFs :) Yes, But I never viewed PKT-LINE() as an element that strictly defines the grammar of the packet protocol ;-) By clarifying that sender SHOULD terminate with LF, receiver MUST NOT require it is the rule (and fixing the existing implementations at places where they violate the MUST NOT part, which I think are very small number of places), I think we can drop these LF (or LF? for that matter) from all of the PKT-LINE() in the construction in the pack-protocol.txt, which would be a very good thing to do. The example in your sentence will become PKT-LINE(foo SP bar) and the there may be an LF at the end would only be at one place, as a part of the definition of PKT-LINE(). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 12:28 PM, Junio C Hamano gits...@pobox.com wrote: Shawn Pearce spea...@spearce.org writes: push cert format is like commit or tag format. You need those LFs. We can't just go declare them optional because of the way pkt-line read function is implemented in git-core. As I said, I view each of the packets between push-cert and push-cert-end packets representing the meat of each line in the cert. The sending end takes a cert as a long multi-line string, splits them into an array, each of whose element represents a line in it (iow certlines = certstring.split('\n')), and sends them packetised. Right now the sending end sends newlines. The receiver receives a sequence of packets, notices push-cert packet, collects packets until it sees push-cert-end packet and treats them as elements of this array. pkt-line deframing process would have to strip optional LFs to reconstruct the original array the sender had (i.e. the above certlines array). The problem is the signature. Today, the client computes the signature over the payload it actually sends (minus pkt-line headers) The server can munge pkt-lines and reinsert LFs, but it _must_ have some way of reconstructing the payload that the client signed in order to verify the signature. If we just naively insert LFs where missing, we lose the ability to verify the signature. If we say the payload the client signs MUST have LFs only in certain places, then that gives the server enough information to reconstruct the payload and verify the signature. But if we say the signed payload MUST have LFs and the wire payload MAY have LFs, then now we have two completely different formats, only one of which is documented. The receiver needs to join the array with LF to recover the long multi-line string once it received the array. But this LF does not have anything to do with the optional trailing LF in pkt-line. If you sent the original certlines array via different RPC mechanism, you need to join them together with your own LF to reconstruct the multi-line string. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: I think I understand the confusion now. I think you and I are working from different assumptions about the client behavior. I agree that we now both understand where we come from ;-) And sorry for not being clear when I did the push-cert originally in the documentation. As I already said, packets between push-cert and push-cert-end are contents of individual lines of the GPG signed push certificate was the design meant from day one, and a85b377d (push: the beginning of git push --signed, 2014-09-12) could have made it clearer. The problem with the documentation, then, is that the documentation does not say anything to indicate that the signed payload is anything other than what is on the wire. Yeah, that was untold assumption, as I considered what is on the wire to include pkt-line framing when I wrote a85b377d (push: the beginning of git push --signed, 2014-09-12). So maybe this series should include an explicit description of the singed payload outside of the context of a push. Then, in the push section, we can describe the set of transformations that the client MUST perform (splitting on LF; adding pkt-line headers) and MAY perform (adding LFs). Yes, and the latter is not limited to push-cert but anything sent on pkt-line. That incidentally is the only point I deeply care about. I just want to minimize the protocol is this way in general, but only for this one you must do it differently. One example of only for this one you must do it differently is another caveat for protocol implementors for the sending side (again not limited to push cert). That existing implementations of the receivers treat an empty packet (i.e. 0004) as if it is the same as a flush packet (i.e. ), so even if the sending side chooses to ignore the SHOULD terminate each non-flush line using LF, it is strongly advised not to do so when it wants to send an empty payload. This needs to be documented. The receiving end SHOULD NOT treat 0004 the same way as . I think that must be documented and implementations (including our own) should be fixed. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Junio C Hamano gits...@pobox.com writes: By clarifying that sender SHOULD terminate with LF, receiver MUST NOT require it is the rule (and fixing the existing implementations at places where they violate the MUST NOT part, which I think are very small number of places), I think we can drop these LF (or LF? for that matter) from all of the PKT-LINE() in the construction in the pack-protocol.txt, which would be a very good thing to do. The example in your sentence will become PKT-LINE(foo SP bar) and the there may be an LF at the end would only be at one place, as a part of the definition of PKT-LINE(). I quickly scanned both the sources where we use packet_write() in the code and say PKT-LINE in the doc; aside from the actual packfile transfer that happens on the sideband, which technically _is_ a user of PKT-LINE, we do not send anything that does not end with a text in PKT-LINE. I just wanted to make sure that there may or may not be an LF at the end; if there is, it is not part of the payload but is part of the framing does not invite new implementors to break their binary transfer by reading the definition of PKT-LINE too literally to mean ok, so I stuffed this 998 byte binary gunk to the packet and insert an optional LF before sending the remainder in separate packets. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Wed, Jul 1, 2015 at 4:49 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? I'm not sure I understand your suggestion. Are you saying, you would prefer to make LFs optional in the push cert, for consistency with LFs being optional elsewhere? Absolutely. It is not make it optional, but even though it is optional, the receiver has not been following the spec, and it is not too late to fix it. The earliest these documentation updates can hit the public is 2.6; by that time I'd expect the deployed receivers would be fixed with 2.5.1 and 2.4.7 maintenance releases. If some third-party reimplemented their client not to terminate with LF, they wouldn't be working correctly with the deployed servers right now *anyway*. And with the more lenient receive-pack in 2.5.1 or 2.4.7, they will start working. And we will not change our client to drop LF termination. So overall I do not see that it is too much a price to pay for consistency across the protocol. Ok, I understand your proposal now, thank you. I will drop this documentation patch from this series, and abandon https://git.eclipse.org/r/51071 in JGit. I am not volunteering to rewrite push cert handling in git-core though ;) If LF is optional, then with that approach you might end up with a section of that buffer like: I think I touched on this in my previous message. You cannot send an empty line anywhere, and this is not limited to push-cert section of the protocol. Strictly speaking, the wire level allows it, but I do not think the deployed client APIs easily lets you deal with it. So you must follow the SHOULD terminate with LF for an empty line, even when you choose to ignore the SHOULD in most other places. I do not think it is such a big loss, as long as it is properly documented. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Wed, Jul 1, 2015 at 1:49 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? I'm not sure I understand your suggestion. Are you saying, you would prefer to make LFs optional in the push cert, for consistency with LFs being optional elsewhere? Absolutely. It is not make it optional, but even though it is optional, the receiver has not been following the spec, and it is not too late to fix it. This is madness. For all the reasons Dave points out later in the thread. You can't store and make sense of the push cert without the LF record delimiters. push cert format is like commit or tag format. You need those LFs. We can't just go declare them optional because of the way pkt-line read function is implemented in git-core. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 11:22 AM, Dave Borowitz dborow...@google.com wrote: b. Write a parser in some other clever way, e.g. parsing the entire cert in reverse might work. ...as long as is illegal in nonce and pushee values, which it may be but is not explicitly documented. I still have no desire to write such a parser. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 11:27 AM, Dave Borowitz dborow...@google.com wrote: On Mon, Jul 6, 2015 at 11:22 AM, Dave Borowitz dborow...@google.com wrote: b. Write a parser in some other clever way, e.g. parsing the entire cert in reverse might work. ...as long as is illegal in nonce and pushee values, which it may be but is not explicitly documented. I still have no desire to write such a parser. TBQH at this point I would prefer, as a protocol implementor, to restore the original proposal of this patch, which is to require \n in push certificates. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Mon, Jul 6, 2015 at 10:46 AM, Dave Borowitz dborow...@google.com wrote: On Wed, Jul 1, 2015 at 4:49 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? I'm not sure I understand your suggestion. Are you saying, you would prefer to make LFs optional in the push cert, for consistency with LFs being optional elsewhere? Absolutely. It is not make it optional, but even though it is optional, the receiver has not been following the spec, and it is not too late to fix it. The earliest these documentation updates can hit the public is 2.6; by that time I'd expect the deployed receivers would be fixed with 2.5.1 and 2.4.7 maintenance releases. If some third-party reimplemented their client not to terminate with LF, they wouldn't be working correctly with the deployed servers right now *anyway*. And with the more lenient receive-pack in 2.5.1 or 2.4.7, they will start working. And we will not change our client to drop LF termination. So overall I do not see that it is too much a price to pay for consistency across the protocol. Ok, I understand your proposal now, thank you. I will drop this documentation patch from this series, and abandon https://git.eclipse.org/r/51071 in JGit. I am not volunteering to rewrite push cert handling in git-core though ;) Unfortunately, optional LFs still make the stored certs for later auditing and parsing a bit illegible. This is one way in which push certs are fundamentally different from the rest of the wire protocol, which is not intended to be persisted. The corner case I pointed out before where nonce runs into commands is not the only one. Consider the following cert fragment: 001fpushee git://localhost/repo 0029nonce 1433954361-bde756572d665bba81d8 A naive cert storage/auditing implementation would store the raw payload that needs to be verified, without the pkt-line framing. In this case: pushee git://localhost/repononce 1433954361-bde756572d665bba81d8 A naive parser that wants to find the pushee would look for pushee urlish, which would be wrong in this case. (To say nothing of the fact that pushee might actually be -0700pushee.) The alternatives for someone writing a parser are: a. Store the original pkt-line framing. b. Write a parser in some other clever way, e.g. parsing the entire cert in reverse might work. Neither of these is very satisfying, and both reduce human legibility of the stored payload. If LF is optional, then with that approach you might end up with a section of that buffer like: I think I touched on this in my previous message. You cannot send an empty line anywhere, and this is not limited to push-cert section of the protocol. Strictly speaking, the wire level allows it, but I do not think the deployed client APIs easily lets you deal with it. So you must follow the SHOULD terminate with LF for an empty line, even when you choose to ignore the SHOULD in most other places. I do not think it is such a big loss, as long as it is properly documented. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Fri, Jul 3, 2015 at 11:07 AM, Jeff King p...@peff.net wrote: I wondered briefly whether this impacted the keepalives we added to `upload-pack` in 05e9515; those are implemented as 0-byte data packets, which we send during the potentially long counting/delta-compression phase before we send out pack data. It works there because the packets actually contain a single sideband byte, so they are never mistaken for a flush packet. Interesting. I didn't know about 05e9515. This is a great hack. We have similar issues in our server-server system at $DAY_JOB, they use --quiet as no human is watching. So we did a different hack for the same reason. Related, I recently ran into a case where I think we should do the same for pushes. After receiving the pack, `index-pack` may chew on the result for literally minutes (try pushing up the entire linux.git history sometime). We say nothing at all on the wire until we've finished that, run check_everything_connected, and run all hooks. Some clients (or intermediates on the connection) may give up after a few minutes of silence. I think we should have: 1. Some progress eye-candy from the server to tell us that something is happening, and how close we are to finishing (basically index-pack -v). JGit receive-pack has done this for years. We output a progress monitor for the resolving delta phase, and the counting during the graph connectivity check, as JGit being in Java is slow as snot and cannot digest the linux kernel instantly. 2. When progress is disabled, similar keepalive packets saying nope, no output yet. Yea this is a problem so I think JGit ignores the client's request for quiet here and shovels progress messages anyway as a hack to force keep-alive. Never considered the empty side-band message that 05e9515 introduces. For (2), hopefully we can implement it in the same way, and rely on empty sideband-0 packets. I haven't tested it in practice, though (I have some very rough patches for (1) already). sideband-0 is not going to work for JGit clients. JGit clients are strict about the sideband stream being 1,2,3 and fail hard if they get any other stream from the server[1]. [1] https://eclipse.googlesource.com/jgit/jgit/+/master/org.eclipse.jgit/src/org/eclipse/jgit/transport/SideBandInputStream.java#169 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Jeff King p...@peff.net writes: On Wed, Jul 01, 2015 at 01:39:49PM -0700, Junio C Hamano wrote: There is a slight complication on sending an empty line without any termination, though ;-) The reader that calls packet_read() cannot tell such a payload from a flush packet, I think. *That* may be something we want to document. Usually flush packets are , and an empty data packet is 0004. Or are you talking about some kind of flush inside the pkt-data stream? Neither. At the wire level there is a difference, but the callers of most often used function in pkt-line API, packet_read(), says while (1) { len = packet_read(); if (!len) { /* flush */ break; } ... do things on the len bytes received ... ... and then on to the next packet ... } I think the least intrusive change to the caller side would be to teach packet_read() to keep a static and let the callers do this: while (1) { len = packet_read(); if (!len packet_last_was_flush()) { /* flush */ break; } ... do things on the len bytes received ... ... and then on to the next packet ... } even though that looks very ugly. len = packet_read(..., flag); if (!len (flag PKT_LAST_WAS_FLUSH)) { /* flush */ ... might be better. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Fri, Jul 03, 2015 at 11:43:33AM -0700, Shawn Pearce wrote: For (2), hopefully we can implement it in the same way, and rely on empty sideband-0 packets. I haven't tested it in practice, though (I have some very rough patches for (1) already). sideband-0 is not going to work for JGit clients. Er, sorry, mental hiccup. It is 0-length sideband-1 packets that I meant. The same as for upload-pack keepalives. JGit clients are strict about the sideband stream being 1,2,3 and fail hard if they get any other stream from the server[1]. I think git does, too. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Fri, Jul 03, 2015 at 10:45:59AM -0700, Junio C Hamano wrote: Usually flush packets are , and an empty data packet is 0004. Or are you talking about some kind of flush inside the pkt-data stream? Neither. At the wire level there is a difference, but the callers of most often used function in pkt-line API, packet_read(), says while (1) { len = packet_read(); if (!len) { /* flush */ break; } ... do things on the len bytes received ... ... and then on to the next packet ... } Ah, I see. Yeah, that is a problem. The solutions you proposed seem like good workarounds to me, but we are unfortunately stuck with existing clients and servers which did not behave that way. I wondered briefly whether this impacted the keepalives we added to `upload-pack` in 05e9515; those are implemented as 0-byte data packets, which we send during the potentially long counting/delta-compression phase before we send out pack data. It works there because the packets actually contain a single sideband byte, so they are never mistaken for a flush packet. Related, I recently ran into a case where I think we should do the same for pushes. After receiving the pack, `index-pack` may chew on the result for literally minutes (try pushing up the entire linux.git history sometime). We say nothing at all on the wire until we've finished that, run check_everything_connected, and run all hooks. Some clients (or intermediates on the connection) may give up after a few minutes of silence. I think we should have: 1. Some progress eye-candy from the server to tell us that something is happening, and how close we are to finishing (basically index-pack -v). 2. When progress is disabled, similar keepalive packets saying nope, no output yet. For (2), hopefully we can implement it in the same way, and rely on empty sideband-0 packets. I haven't tested it in practice, though (I have some very rough patches for (1) already). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Wed, Jul 01, 2015 at 01:39:49PM -0700, Junio C Hamano wrote: There is a slight complication on sending an empty line without any termination, though ;-) The reader that calls packet_read() cannot tell such a payload from a flush packet, I think. *That* may be something we want to document. Usually flush packets are , and an empty data packet is 0004. Or are you talking about some kind of flush inside the pkt-data stream? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: Signed-off-by: Dave Borowitz dborow...@google.com --- Documentation/technical/pack-protocol.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt index 1386840..2d8b1a1 100644 --- a/Documentation/technical/pack-protocol.txt +++ b/Documentation/technical/pack-protocol.txt @@ -534,6 +534,9 @@ A push certificate begins with a set of header lines. After the header and an empty line, the protocol commands follow, one per line. +Note that (unlike other portions of the protocol), all LFs in the +`push-cert` specification above MUST be present. + Currently, the following header fields are defined: `pusher` ident:: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Dave Borowitz dborow...@google.com writes: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? I'm not sure I understand your suggestion. Are you saying, you would prefer to make LFs optional in the push cert, for consistency with LFs being optional elsewhere? Absolutely. It is not make it optional, but even though it is optional, the receiver has not been following the spec, and it is not too late to fix it. The earliest these documentation updates can hit the public is 2.6; by that time I'd expect the deployed receivers would be fixed with 2.5.1 and 2.4.7 maintenance releases. If some third-party reimplemented their client not to terminate with LF, they wouldn't be working correctly with the deployed servers right now *anyway*. And with the more lenient receive-pack in 2.5.1 or 2.4.7, they will start working. And we will not change our client to drop LF termination. So overall I do not see that it is too much a price to pay for consistency across the protocol. If LF is optional, then with that approach you might end up with a section of that buffer like: I think I touched on this in my previous message. You cannot send an empty line anywhere, and this is not limited to push-cert section of the protocol. Strictly speaking, the wire level allows it, but I do not think the deployed client APIs easily lets you deal with it. So you must follow the SHOULD terminate with LF for an empty line, even when you choose to ignore the SHOULD in most other places. I do not think it is such a big loss, as long as it is properly documented. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Junio C Hamano gits...@pobox.com writes: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? I think that something like this should be sufficient. As the receiving end, we must not complain if there is no terminator. ... And the change we are *not* going to make, but I made temporarily only for testing, on the sending side to violate our sender SHOULD terminate with LF rule would look like this: There is a slight complication on sending an empty line without any termination, though ;-) The reader that calls packet_read() cannot tell such a payload from a flush packet, I think. *That* may be something we want to document. diff --git a/send-pack.c b/send-pack.c index 2a64fec..1a743db 100644 --- a/send-pack.c +++ b/send-pack.c @@ -273,9 +273,11 @@ static int generate_push_cert(struct strbuf *req_buf, packet_buf_write(req_buf, push-cert%c%s, 0, cap_string); for (cp = cert.buf; cp cert.buf + cert.len; cp = np) { + int len; np = next_line(cp, cert.buf + cert.len - cp); + len = (np = cp + 1) ? 1 : (np - cp - 1); packet_buf_write(req_buf, -%.*s, (int)(np - cp), cp); +%.*s, len, cp); } packet_buf_write(req_buf, push-cert-end\n); -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
Junio C Hamano gits...@pobox.com writes: Dave Borowitz dborow...@google.com writes: Signed-off-by: Dave Borowitz dborow...@google.com --- Documentation/technical/pack-protocol.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt index 1386840..2d8b1a1 100644 --- a/Documentation/technical/pack-protocol.txt +++ b/Documentation/technical/pack-protocol.txt @@ -534,6 +534,9 @@ A push certificate begins with a set of header lines. After the header and an empty line, the protocol commands follow, one per line. +Note that (unlike other portions of the protocol), all LFs in the +`push-cert` specification above MUST be present. + Currently, the following header fields are defined: `pusher` ident:: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? I think that something like this should be sufficient. As the receiving end, we must not complain if there is no terminator. diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 94d0571..303a1dd 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -1415,9 +1415,12 @@ static struct command *read_head_info(struct sha1_array *shallow) true_flush = 1; break; } - if (!strcmp(certbuf, push-cert-end\n)) + if (!strcmp(certbuf, push-cert-end) || + !strcmp(certbuf, push-cert-end\n)) break; /* end of cert */ strbuf_addstr(push_cert, certbuf); + if (certbuf[len - 1] != '\n') + strbuf_addch(push_cert, '\n'); } if (true_flush) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required
On Wed, Jul 1, 2015 at 1:00 PM, Junio C Hamano gits...@pobox.com wrote: Dave Borowitz dborow...@google.com writes: Signed-off-by: Dave Borowitz dborow...@google.com --- Documentation/technical/pack-protocol.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt index 1386840..2d8b1a1 100644 --- a/Documentation/technical/pack-protocol.txt +++ b/Documentation/technical/pack-protocol.txt @@ -534,6 +534,9 @@ A push certificate begins with a set of header lines. After the header and an empty line, the protocol commands follow, one per line. +Note that (unlike other portions of the protocol), all LFs in the +`push-cert` specification above MUST be present. + Currently, the following header fields are defined: `pusher` ident:: I am moderately negative about this; wouldn't it make the end result cleaner to fix the implementation? I'm not sure I understand your suggestion. Are you saying, you would prefer to make LFs optional in the push cert, for consistency with LFs being optional elsewhere? C git servers in the wild already require LFs when extracting the nonce value from the certificate (see find_header). If we make the LFs optional, then a conforming client may not send LFs, which will cause nonce verification to fail when pushing to an unfixed server. That is why I think we are stuck with this. (Also, this is probably not insurmountable, but the cert processing code in receive-pack.c would have to be substantially rewritten if we loosened this requirement. Currently it concatenates the cert contents without pkt-line framing into a buffer, and searches around for \n and \n\n. If LF is optional, then with that approach you might end up with a section of that buffer like: nonce 1234-abcd deadbeefdeadbeefdeadbeefdeadbeefdeadbeef refs/heads/master where it is impossible to distinguish between the end of the nonce and the start of the first command.) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html