Re: change the filetype from binary to text after the file is commited to a git repo
Thanks Jeff, I will try it tomorrow.
Re: change the filetype from binary to text after the file is commited to a git repo
On Mon, Jul 24, 2017 at 10:26:22PM +0200, tonka3...@gmail.com wrote:

> > I'm not sure exactly what you're trying to accomplish. If you're unhappy
> > with the file as utf-16, then you should probably convert to utf-8 as a
> > single commit (since the diff will otherwise be unreadable) and then
> > make further changes in utf-8.
>
> That was exactly what I was looking for. The utf-16 back in the day
> was by accident (thanks to Visual Studio). So if the last commit and the
> actual change are both utf-8, the diff should work again. Just for my
> understanding: Git just takes the bytes of the whole file on every
> commit, so there is no general problem with that; the utf-16 version
> is just twice as big as the utf-8 one. Is that correct?

Right. The diff switching the encodings will be listed as "binary" (and
you should write a good commit message explaining what's going on!), but
then after that the changes to the utf-8 version will display as normal
text. Git only looks at the actual bytes being diffed, not older
versions of the file.

-Peff
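The size and "binary" points in this exchange can be checked outside of Git. The following is a minimal Python sketch, not part of the original thread; the helper name `utf16_to_utf8` is made up for illustration, and it assumes the UTF-16 data starts with a byte-order mark, as files saved by Windows editors typically do.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 bytes (BOM-aware) and re-encode them as UTF-8."""
    return data.decode("utf-16").encode("utf-8")


if __name__ == "__main__":
    text = "int main() { return 0; }\n"
    as_utf16 = text.encode("utf-16")    # BOM plus two bytes per ASCII character
    as_utf8 = utf16_to_utf8(as_utf16)   # one byte per ASCII character

    # For ASCII-only source code the UTF-16 form is about twice as large.
    print(len(as_utf16), len(as_utf8))

    # The UTF-16 form is full of zero bytes; the UTF-8 form has none.
    print(b"\0" in as_utf16, b"\0" in as_utf8)
```

For ASCII-only content, every character in the UTF-16 version carries a zero byte, which is why the conversion commit itself shows up as a binary diff while later UTF-8 to UTF-8 changes do not.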
Re: change the filetype from binary to text after the file is commited to a git repo
On Mon, Jul 24, 2017 at 09:02:12PM +0200, tonka3...@gmail.com wrote:

> There is no .gitattributes file in the repo. I think that the git
> heuristic will also detect utf-16 files as binary (in windows), so i
> think that is the reason why my file is binary (i have to check that
> tomorrow).

Correct. UTF-16 _is_ binary, if you are trying to include it alongside
ASCII content (like the rest of the text diff headers). The two cannot
mix.

> If i add a .gitattributes file i have the problem that git
> diff will treat the old and the new blob as utf-8, which generates
> garbage.

Git's diff doesn't look at encodings at all; it does a diff of the
actual bytes without respect to any encoding. So yes, if you use "-a" or
a gitattribute to ask git to show you the bytes, the UTF-16 is likely to
look like garbage (and a commit rewriting from utf-16 to utf-8 will
basically be a rewrite of the whole file contents).

> Do you have another idea? Could it be possible to add only a space in
> code (utf-8) and then add the real content in a second commit, so the
> old and the new one are both utf-8?

I'm not sure exactly what you're trying to accomplish. If you're unhappy
with the file as utf-16, then you should probably convert to utf-8 as a
single commit (since the diff will otherwise be unreadable) and then
make further changes in utf-8.

If you need the file to remain utf-16 but you want more readable diffs
for those versions, you can ask git to convert to utf-8 before
performing the diff. Such a diff couldn't be applied, but would be
useful for reading. E.g., try:

  echo 'file diff=utf16' >.gitattributes
  git config diff.utf16.textconv 'iconv -f utf16 -t utf8'

You can read more about how this works in the "textconv" section of "git
help attributes".

Note that I'm relying on the external "iconv" tool to do the conversion
there. It's pretty standard on most Unix systems, but I don't know what
would be the best tool on Windows.

-Peff
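Where iconv is not available (for example on Windows), a small script can play the same role, since a textconv command only has to accept a file name as its single argument and write the converted text to stdout. The following is a rough Python sketch and not something proposed in the thread; the script name `utf16_to_utf8.py` and the function `convert` are hypothetical, and undecodable input is replaced rather than rejected so that a diff can still be displayed.

```python
import sys


def convert(path: str) -> bytes:
    """Read a UTF-16 file and return its contents as UTF-8 bytes."""
    with open(path, "rb") as handle:
        raw = handle.read()
    # errors="replace" is an assumption: show something rather than fail.
    return raw.decode("utf-16", errors="replace").encode("utf-8")


# Git appends the name of the file to convert as the only argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.buffer.write(convert(sys.argv[1]))
```

Assuming the script is saved somewhere Git can find it, it would be wired up in place of the iconv command above, for instance with git config diff.utf16.textconv 'python /path/to/utf16_to_utf8.py' (the path is a placeholder). As with the iconv variant, the resulting diff is for reading only and cannot be applied.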
Re: change the filetype from binary to text after the file is commited to a git repo
Hey Jeff,

Thanks for your answer. There is no .gitattributes file in the repo. I
think the Git heuristic also detects utf-16 files as binary (on
Windows), so I think that is the reason why my file is binary (I have to
check that tomorrow). If I add a .gitattributes file, I have the problem
that git diff will treat the old and the new blob as utf-8, which
generates garbage.

Do you have another idea? Would it be possible to add only a space in
the code (as utf-8) and then add the real content in a second commit, so
that the old and the new versions are both utf-8?
Re: change the filetype from binary to text after the file is commited to a git repo
On Mon, Jul 24, 2017 at 07:11:06AM +0200, tonka tonka wrote:

> I have a problem with an already committed file into my repo. This git
> repo was converted from svn to git some years ago. Last week I have
> change some lines in a file and I saw in the diff that it is marked as
> binary (it's a simple .cpp file). I think on the first commit it was
> detected as an utf-16 file (on windows). But no matter what I do I
> can't get it back to a "normal text" text file (git does not detect
> that), but it is now only utf-8. I also replace the whole content of
> the file with just 'a' and git say it's binary.

Git doesn't store a flag for "binary-ness" on each file (though see
below). As the diffs are generated on the fly when you ask to compare
two versions, so too is the determination of "is this binary".

The default heuristic looks at file size (by default, if the file is
over 500MB it's considered binary) and whether it has any zero-byte
characters in the first few kilobytes. But note that if _either_ side of
a diff is considered binary, then Git won't show a text diff.

If you want a particular diff to show all content, even if it doesn't
look like text, add "-a" to your git invocation (e.g., "git show -a").

That said, you can also use .gitattributes (see "git help attributes")
to mark a file as binary or not-binary, skipping the heuristic check.
I'm guessing since you converted from svn that you don't have a
.gitattributes file, but it's possible that somebody later added one
that marks the file as binary (and so the solution would be to drop that
entry).

-Peff
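The heuristic described in this reply can be approximated in a few lines. This is only a sketch in Python, not Git's actual code: the 8000-byte window is an assumption standing in for "the first few kilobytes", the file-size check is left out, and the function names are made up for illustration.

```python
# Assumed window size, standing in for "the first few kilobytes".
FIRST_FEW_BYTES = 8000


def looks_binary(data: bytes) -> bool:
    """Return True if a zero byte appears in the leading window of data."""
    return b"\0" in data[:FIRST_FEW_BYTES]


def diff_shown_as_text(old: bytes, new: bytes) -> bool:
    """A text diff is only shown when neither side looks binary."""
    return not (looks_binary(old) or looks_binary(new))


if __name__ == "__main__":
    old_utf16 = "int main() { return 0; }\n".encode("utf-16")
    new_utf8 = b"a\n"

    print(looks_binary(old_utf16))                   # True: UTF-16 ASCII text has zero bytes
    print(looks_binary(new_utf8))                    # False
    print(diff_shown_as_text(old_utf16, new_utf8))   # False: the old side is still UTF-16
```

The last line mirrors the observation in the question: replacing the whole file with just 'a' still yields a binary diff, because the previous version on the other side of the comparison contains zero bytes.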
change the filetype from binary to text after the file is commited to a git repo
Hey everybody,

I have a problem with a file that is already committed to my repo. This
git repo was converted from svn to git some years ago. Last week I
changed some lines in a file and saw in the diff that it is marked as
binary (it's a simple .cpp file). I think on the first commit it was
detected as a utf-16 file (on Windows). But no matter what I do, I can't
get it back to a "normal" text file (git does not detect that), even
though it is now only utf-8. I also replaced the whole content of the
file with just 'a' and git still says it's binary.

Is this the only way to get it back to text mode?

* copy a utf-8 version of the original file
* delete the file
* make a commit
* add the old file as a new one

I think that will work, but it will also break my history. Is there a
better way to get this behavior without losing history?

Best regards
Tonka