Re: Git *accepts* a branch name, it can't identity in the future?
Thanks, but Johannes has already found the issue and given a solution.
Regardless, I'm replying to the questions for the record.

On Sun, 2017-08-20 at 04:33 -0400, Jeff King wrote:
> What does "git for-each-ref" say about which branches you _do_ have?
>
> Also, what platform are you on?
>

I use "Debian GNU/Linux buster/sid 64-bit".

> I'm wondering specifically if you have a filesystem (like HFS+ on MacOS)
> that silently rewrites invalid unicode in filenames we create. That
> would mean your branches are still there, but probably with some funny
> filename like "done/%xxdoc-fix". Git wouldn't know that name because the
> filesystem rewriting happened behind its back (though I'd think that a
> further open() call would find the same file, so maybe this is barking
> up the wrong tree).
>

That sounds dangerous!

> Another line of thinking: are you sure the � you are writing on the
> command line is identical to the one generated by the corruption (and if
> you cut and paste, is perhaps a generic glyph placed in the buffer by
> your terminal to replace an invalid codepoint, rather than the actual
> bytes)?
>

This was the issue. I wasn't providing git with the actual bytes that
my sloppy script had produced.

> [you didn't say how your script works, so let's use git to rename]

I know of no other way to rename a branch, so I didn't mention it :)

> $ broken=$(printf '\223')
>
> [and we can rename it using that knowledge]
> $ git branch -m ${broken}doc-fix doc-fix
>

Johannes has already given a solution; this one works too.

--
Kaartic
Re: Git *accepts* a branch name, it can't identity in the future?
On Sun, 2017-08-20 at 10:20 +0200, Johannes Sixt wrote:
> It is not Git's fault that your terminal converts an invalid UTF-8
> sequence (that your script produces) to �. Nor is it when you paste that
> character onto the command line, that it is passed as a (correct) UTF-8
> character.
>

You're right. I just now realise how I missed the line between "what's
seen by us" and "what's seen by the program".

> Perhaps this helps (untested):
>
> $ git branch -m done/$(printf '\x93')doc-fix done/dic-fix
>

This one helped, thanks.

--
Kaartic
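A side note on the escape in the quoted command, offered as a sketch rather than a correction: `\x93` is understood by the printf built into bash, but POSIX only specifies octal escapes for printf, so other shells (dash, for one) may print the sequence literally. The octal spelling `\223` names the same byte and should be the more portable choice. The variable name `stray` is made up for illustration.

```shell
# 0x93 is 223 in octal; octal escapes are the form POSIX specifies for printf.
stray=$(printf '\223')

# Dump the result as hex to confirm that exactly one byte, 0x93, was produced.
printf '%s' "$stray" | od -An -t x1 | tr -d ' \n'
echo
```

If the shell handled the escape as expected, this prints `93`.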
Re: Git *accepts* a branch name, it can't identity in the future?
On Sun, Aug 20, 2017 at 01:21:29PM +0530, Kaartic Sivaraam wrote:

> I made a small assumption in the script which turned out to be false. I
> thought the unicode prefixes I used corresponded to only two bytes.
> This led to the issue. The unicode character '✓' corresponds to three
> bytes and as a result instead of removing it, my script replaced
> it with the unknown character '�'. So, the branch named '✓doc-fix'
> became 'done/�doc-fix'. Here's the issue. I couldn't use
>
> $ git branch -m done/�doc-fix done/dic-fix
>
> to rename the branch. Nor could I refer to it in any way. Git simply
> says,
>
> error: pathspec 'done/�doc-fix' did not match any file(s) known to git.

What does "git for-each-ref" say about which branches you _do_ have?

Also, what platform are you on?

I'm wondering specifically if you have a filesystem (like HFS+ on MacOS)
that silently rewrites invalid unicode in filenames we create. That
would mean your branches are still there, but probably with some funny
filename like "done/%xxdoc-fix". Git wouldn't know that name because the
filesystem rewriting happened behind its back (though I'd think that a
further open() call would find the same file, so maybe this is barking
up the wrong tree).

Another line of thinking: are you sure the � you are writing on the
command line is identical to the one generated by the corruption (and if
you cut and paste, is perhaps a generic glyph placed in the buffer by
your terminal to replace an invalid codepoint, rather than the actual
bytes)?

> I just wanted to know why git accepted a branch name which it can't
> identify later?
>
> If it had rejected that name in the first place it would have been
> better. In case you would like to know how I got that weird name,
> here's a way to get that
>
> $ echo '✓doc-fix' | cut -c3-100

[a few defines to make it easy to prod git]
$ check=$(printf '\342\234\223')
$ broken=$(printf '\223')

[this is your starting state, a branch with the unicode name]
$ git branch ${check}doc-fix

[you didn't say how your script works, so let's use git to rename]
$ git branch -m ${check}doc-fix ${broken}doc-fix

[my terminal doesn't show the unknown-character glyph, but we can see
the funny character with "cat -A"]:
$ git for-each-ref --format='%(refname)' | cat -A
refs/heads/master$
refs/heads/M-^Sdoc-fix$

[and we can rename it using that knowledge]
$ git branch -m ${broken}doc-fix doc-fix

-Peff
Re: Git *accepts* a branch name, it can't identity in the future?
On 20.08.2017 at 09:51, Kaartic Sivaraam wrote:
> I made a small assumption in the script which turned out to be false. I
> thought the unicode prefixes I used corresponded to only two bytes.
> This led to the issue. The unicode character '✓' corresponds to three
> bytes and as a result instead of removing it, my script replaced
> it with the unknown character '�'. So, the branch named '✓doc-fix'
> became 'done/�doc-fix'. Here's the issue. I couldn't use
>
> $ git branch -m done/�doc-fix done/dic-fix
>
> to rename the branch. Nor could I refer to it in any way. Git simply
> says,
>
> error: pathspec 'done/�doc-fix' did not match any file(s) known to git.
>
> It's not a big issue as I haven't lost anything out of it. The branches
> have been merged into 'master'.
>
> I just wanted to know why git accepted a branch name which it can't
> identify later?
>
> If it had rejected that name in the first place it would have been
> better. In case you would like to know how I got that weird name,
> here's a way to get that
>
> $ echo '✓doc-fix' | cut -c3-100

See, these two are different:

$ echo '✓doc-fix' | cut -c3-100 | od -t x1
000 93 64 6f 63 2d 66 69 78 0a
011
$ echo '�doc-fix' | od -t x1
000 ef bf bd 64 6f 63 2d 66 69 78 0a
013

It is not Git's fault that your terminal converts an invalid UTF-8
sequence (that your script produces) to �. Nor is it when you paste that
character onto the command line, that it is passed as a (correct) UTF-8
character.

Perhaps this helps (untested):

$ git branch -m done/$(printf '\x93')doc-fix done/dic-fix

In Git's database, branch names are just sequences of bytes. It is
outside Git's scope to verify that all input is encoded correctly.

-- Hannes
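Since Git stores branch names as bytes and does not check their encoding, a renaming script can make that check itself before it calls git. What follows is only a minimal sketch, assuming the `iconv` utility is installed and exits non-zero on an invalid input sequence; `is_valid_utf8` is a hypothetical helper name, not a Git command.

```shell
# Hypothetical helper: succeed only if the argument is well-formed UTF-8.
# The exit status of the pipeline is the exit status of iconv.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

good=$(printf '\342\234\223doc-fix')   # the check mark followed by "doc-fix"
bad=$(printf '\223doc-fix')            # lone continuation byte, as left by cut

for name in "$good" "$bad"; do
    if is_valid_utf8 "$name"; then
        echo "ok: name can be passed on to git branch -m"
    else
        echo "refusing: name is not valid UTF-8"
    fi
done
```

With GNU iconv the first name should pass and the second should be refused. Whether to reject such names or merely warn about them is a choice left to the script.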
Git *accepts* a branch name, it can't identity in the future?
Hello all,

First of all, I would like to say that this happened completely by
accident and it's partly my mistake. Here's what happened.

I recently started creating 'feature branches' a lot for the few patches
that I sent to this mailing list. To identify the status of the patch
corresponding to that branch I prefixed them with special unicode
characters like ✓, ˅ etc. instead of using conventional hierarchical
names like 'done/', 'archived/'.

Then I started finding it difficult to distinguish these
unicode-prefixed names, probably because they had only one unicode
character in common. So, I thought of switching to the conventional way
of using scoped branch names (old is gold, you see). I wrote a tiny
script to rename the branches by replacing a specific unicode prefix
with a corresponding hierarchy. For example, the script would convert a
branch named '✓doc-fix' to 'done/doc-fix'.

I made a small assumption in the script which turned out to be false. I
thought the unicode prefixes I used corresponded to only two bytes.
This led to the issue. The unicode character '✓' corresponds to three
bytes and as a result instead of removing it, my script replaced
it with the unknown character '�'. So, the branch named '✓doc-fix'
became 'done/�doc-fix'. Here's the issue. I couldn't use

$ git branch -m done/�doc-fix done/dic-fix

to rename the branch. Nor could I refer to it in any way. Git simply
says,

error: pathspec 'done/�doc-fix' did not match any file(s) known to git.

It's not a big issue as I haven't lost anything out of it. The branches
have been merged into 'master'.

I just wanted to know why git accepted a branch name which it can't
identify later?

If it had rejected that name in the first place it would have been
better. In case you would like to know how I got that weird name,
here's a way to get that

$ echo '✓doc-fix' | cut -c3-100

--
Kaartic
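One way to sidestep the byte-counting assumption described in this message is to strip the prefix by value rather than by position. The following is a sketch using POSIX parameter expansion. The variable names are illustrative, and this is not the script from the message, which was never posted.

```shell
# The prefix to remove and a sample branch name, spelled as octal bytes so the
# snippet does not depend on how this text itself is encoded.
prefix=$(printf '\342\234\223')        # U+2713 CHECK MARK, three bytes in UTF-8
old=$(printf '\342\234\223doc-fix')

# ${old#"$prefix"} removes the whole prefix, however many bytes it occupies.
# Quoting "$prefix" makes the shell match it literally, not as a glob pattern.
new="done/${old#"$prefix"}"

echo "$new"
# The actual rename would then be: git branch -m "$old" "$new"
```

This prints `done/doc-fix`. Because the expansion matches the prefix itself, the same line works for prefixes of any encoded length, such as the two-byte '˅'.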