Re: ediff displays gibberish output
In article <[EMAIL PROTECTED]>, Eli Zaretskii <[EMAIL PROTECTED]> writes:

> > From: Stefan Monnier <[EMAIL PROTECTED]>
> >
> > I think the case where both files use the same encoding is the common case
> > rather than the exception.

> In this case, one file was UTF-8, the other was pure 7-bit ASCII.  I
> think this case is also very common.

To handle those cases, I think changing the code that reads the process output to use the `undecided' coding system is enough.

> And then there's the case when one file is ISO-8859-x, the other is
> UTF-8; also very common.

If two files contain identical characters (only the encodings differ), I'm not sure what the right thing is. If we decode both hunks correctly, the user will see no difference and wonder why Emacs says those lines are different.

---
Kenichi Handa
[EMAIL PROTECTED]

___
emacs-pretest-bug mailing list
emacs-pretest-bug@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug
Re: ediff displays gibberish output
> In this case, one file was UTF-8, the other was pure 7-bit ASCII.  I
> think this case is also very common.

Right, we may also want to handle cases where one encoding is a subset of the other. But that can come as a second step.

> And then there's the case when one file is ISO-8859-x, the other is
> UTF-8; also very common.

But much more difficult to handle, since we then have to decode each half of a hunk differently. This can probably wait until the point where only utf-8 is relevant.

> Patches welcome, as always (but please hold your horses until after
> the release).

Seconded,

Stefan
Re: ediff displays gibberish output
> Cc: Leo <[EMAIL PROTECTED]>, emacs-pretest-bug@gnu.org
> From: Stefan Monnier <[EMAIL PROTECTED]>
> Date: Mon, 18 Dec 2006 05:21:08 -0500
>
> I think the case where both files use the same encoding is the common case
> rather than the exception.

In this case, one file was UTF-8, the other was pure 7-bit ASCII. I think this case is also very common.

And then there's the case when one file is ISO-8859-x, the other is UTF-8; also very common. Etc., etc.

Patches welcome, as always (but please hold your horses until after the release).
Re: ediff displays gibberish output
>> All my files are in utf-8 encoding and that's why I am surprised to
>> see such output.

> But that's a very special case.  We cannot make Emacs work only in
> special cases.

I think the case where both files use the same encoding is the common case rather than the exception.

Stefan
Re: ediff displays gibberish output
> From: Leo <[EMAIL PROTECTED]>
> Date: Mon, 18 Dec 2006 02:02:55 +
>
> All my files are in utf-8 encoding and that's why I am surprised to
> see such output.

But that's a very special case. We cannot make Emacs work only in special cases.
Re: ediff displays gibberish output
* Kenichi Handa (2006-12-18 10:17 +0900) said:
[...]
> That's perhaps because Emacs reads the output of process while
> decoding by a detected coding system.  That method works for your
> test case, but fails in a case that two files contain non-ascii
> characters in different encoding (e.g. UTF-8 vs GBK).

In ediff, two files that are both UTF-8 encoded and contain Chinese characters still produce diff output displayed as raw UTF-8 bytes (those \234 etc.).

All my files are in utf-8 encoding and that's why I am surprised to see such output.

-- 
Leo (GPG Key: 9283AA3F)
Re: ediff displays gibberish output
In article <[EMAIL PROTECTED]>, Leo <[EMAIL PROTECTED]> writes:

> * Eli Zaretskii (2006-12-17 06:30 +0200) said:
>>> From: Leo <[EMAIL PROTECTED]>
>>> Date: Sun, 17 Dec 2006 03:15:59 +
>>>
>>> But a file with eight-bit characters can have a correct diff
>>> output.  What makes ediff fail where diff succeeds?
> >
> > I'm not sure what you mean, but my crystal ball says that you are
> > looking at the output of Diff in a terminal that supports UTF-8
> > encoded characters.  If that's the case, then you will only see
> > correct output from Diff with UTF-8 encoded files; other encodings
> > will show gibberish.
> >
> > By contrast, Emacs does not support a single encoding, it supports
> > many different ones.  It needs to know the right encoding to display
> > the characters as readable.

> I mean in emacs running diff-buffer-with-file or vc-diff.  The diff
> output displays correctly.

That's perhaps because Emacs reads the process output while decoding it with a detected coding system. That method works for your test case, but fails when the two files contain non-ASCII characters in different encodings (e.g. UTF-8 vs GBK).

---
Kenichi Handa
[EMAIL PROTECTED]
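The failure mode described above can be illustrated outside Emacs. The following is a minimal Python sketch, not Emacs code; the helper name `try_decode` and the sample text are assumptions made for the illustration. It builds a two-line "diff" whose first line comes from a UTF-8 file and whose second line comes from a GBK file, then tries to decode the whole output with a single encoding.

```python
from typing import Optional


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


text = "中文"
utf8_line = b"< " + text.encode("utf-8") + b"\n"  # line taken from a UTF-8 file
gbk_line = b"> " + text.encode("gbk") + b"\n"     # same characters from a GBK file
mixed = utf8_line + gbk_line                      # both lines in one diff output
expected = "< 中文\n> 中文\n"

# Each line decodes fine on its own, but no single encoding recovers both.
for encoding in ("utf-8", "gbk"):
    result = try_decode(mixed, encoding)
    print(encoding, "recovers the intended text:", result == expected)
```

Whichever single encoding is detected for the combined output, at least one of the two lines is either rejected or decoded to the wrong characters.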
Re: ediff displays gibberish output
>> I'm not sure we can do any better here, unless we somehow know the
>> encoding of each of the two files.
[...]
> This is also what I basically told Leo.

Clearly, we can take a look at each file, figure out their respective encodings, and if both use the same one, then use that encoding to decode the diff output.

I haven't looked at the code to see what kind of effort would be required to do that, but it's likely to be a fairly common case, so it'd be good to handle it right.

Stefan
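As a rough illustration of that idea, here is a minimal Python sketch, not the Ediff implementation. The names `guess_encoding`, `decode_diff` and the short `CANDIDATES` list are assumptions for the example; Emacs has its own coding-system detection, which this does not model.

```python
from typing import Optional

# Hypothetical, deliberately short candidate list; real detection is richer.
CANDIDATES = ("ascii", "utf-8")


def guess_encoding(data: bytes) -> Optional[str]:
    """Return the first candidate encoding that decodes the data without error."""
    for encoding in CANDIDATES:
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return None


def decode_diff(diff_output: bytes, file_a: bytes, file_b: bytes) -> Optional[str]:
    """Decode the diff output only when both files appear to share one encoding.

    Returns None when the encodings differ or cannot be guessed, in which
    case a caller would fall back to showing the raw bytes.
    """
    enc_a = guess_encoding(file_a)
    enc_b = guess_encoding(file_b)
    if enc_a is None or enc_a != enc_b:
        return None
    return diff_output.decode(enc_a)
```

With this strict equality check, a UTF-8 file compared against a pure ASCII file still falls back to the raw display; treating one encoding as a subset of the other would be a further step.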
Re: ediff displays gibberish output
> > From: Leo <[EMAIL PROTECTED]>
> > Date: Wed, 13 Dec 2006 01:22:50 +
> >
> > How to reproduce:
> > 1. save the attachments to file1.txt and file2.txt
> >
> > 2. M-x ediff RET and choose file1.txt file2.txt respectively
> >
> > 3. Type 'D' in ediff panel window and you see the difference
> >    output is gibberish as shown in the screenshot.
>
> Thank you for your report.
>
> I'm not sure this is a bug, though: `D' displays the raw output from
> Diff, whose encoding is not clear, because it comes from two different
> files that could be each encoded differently.  Ediff reads the output
> of Diff in raw-text form, precisely for this reason, and that is how
> the buffer is displayed to you when you press `D'.  What you see is
> not gibberish, but the UTF-8 encoding of the non-ASCII characters,
> exactly as they are written in file1.txt on disk.
>
> I'm not sure we can do any better here, unless we somehow know the
> encoding of each of the two files.
>
> Michael and Handa-san, could you please comment on this?

This is also what I basically told Leo.

--michael
Re: ediff displays gibberish output
* Eli Zaretskii (2006-12-17 06:30 +0200) said:
>> From: Leo <[EMAIL PROTECTED]>
>> Date: Sun, 17 Dec 2006 03:15:59 +
>>
>> But a file with eight-bit characters can have a correct diff
>> output.  What makes ediff fail where diff succeeds?
>
> I'm not sure what you mean, but my crystal ball says that you are
> looking at the output of Diff in a terminal that supports UTF-8
> encoded characters.  If that's the case, then you will only see
> correct output from Diff with UTF-8 encoded files; other encodings
> will show gibberish.
>
> By contrast, Emacs does not support a single encoding, it supports
> many different ones.  It needs to know the right encoding to display
> the characters as readable.

I mean in emacs running diff-buffer-with-file or vc-diff. The diff output displays correctly.

-- 
Leo (GPG Key: 9283AA3F)
Re: ediff displays gibberish output
> From: Leo <[EMAIL PROTECTED]>
> Date: Sun, 17 Dec 2006 03:15:59 +
>
> But a file with eight-bit characters can have a correct diff
> output.  What makes ediff fail where diff succeeds?

I'm not sure what you mean, but my crystal ball says that you are looking at the output of Diff in a terminal that supports UTF-8 encoded characters. If that's the case, then you will only see correct output from Diff with UTF-8 encoded files; other encodings will show gibberish.

By contrast, Emacs does not support a single encoding, it supports many different ones. It needs to know the right encoding to display the characters as readable.
Re: ediff displays gibberish output
* Kenichi Handa (2006-12-17 12:04 +0900) said:
[...]
> If efficiency doesn't matter, we can read two files, check
> which coding systems are used for each file, and parse
> output of diff and decode parts of each file by
> the corresponding coding systems.
>
> But, perhaps, it is good enough to make the command work
> with universal-coding-system-argument (C-x RET c) (if not
> yet done), and warn user to use it when the output diff
> contains eight-bit characters.
>
> ---
> Kenichi Handa
> [EMAIL PROTECTED]

But a file with eight-bit characters can have a correct diff output. What makes ediff fail where diff succeeds?

-- 
Leo (GPG Key: 9283AA3F)
Re: ediff displays gibberish output
In article <[EMAIL PROTECTED]>, Eli Zaretskii <[EMAIL PROTECTED]> writes:

> I'm not sure this is a bug, though: `D' displays the raw output from
> Diff, whose encoding is not clear, because it comes from two different
> files that could be each encoded differently.  Ediff reads the output
> of Diff in raw-text form, precisely for this reason, and that is how
> the buffer is displayed to you when you press `D'.  What you see is
> not gibberish, but the UTF-8 encoding of the non-ASCII characters,
> exactly as they are written in file1.txt on disk.

Right.

> I'm not sure we can do any better here, unless we somehow know the
> encoding of each of the two files.

If efficiency doesn't matter, we can read the two files, check which coding system is used for each, parse the output of diff, and decode the parts that come from each file with the corresponding coding system.

But perhaps it is good enough to make the command work with universal-coding-system-argument (C-x RET c) (if not yet done), and to warn the user to use it when the diff output contains eight-bit characters.

---
Kenichi Handa
[EMAIL PROTECTED]
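The per-file decoding described in the first suggestion could look roughly like the following Python sketch. It is an illustration only, not Ediff code: the function name `decode_normal_diff` is hypothetical, it assumes diff's default ("normal") output format in which lines from the first file start with `<` and lines from the second start with `>`, and it assumes both encodings are ASCII-compatible so that splitting on newline bytes is safe.

```python
def decode_normal_diff(diff_output: bytes, enc_a: str, enc_b: str) -> str:
    """Decode normal-format diff output line by line.

    Lines starting with "<" come from the first file and are decoded with
    enc_a; lines starting with ">" come from the second file and are decoded
    with enc_b; everything else (hunk headers, "---") is treated as ASCII.
    """
    decoded = []
    for line in diff_output.split(b"\n"):
        if line.startswith(b"<"):
            decoded.append(line.decode(enc_a))
        elif line.startswith(b">"):
            decoded.append(line.decode(enc_b))
        else:
            decoded.append(line.decode("ascii"))
    return "\n".join(decoded)


# The same two characters stored in a UTF-8 file and in a GBK file.
sample = (
    b"1c1\n< " + "中文".encode("utf-8")
    + b"\n---\n> " + "中文".encode("gbk") + b"\n"
)
print(decode_normal_diff(sample, "utf-8", "gbk"))
```

In this sample both sides of the hunk decode to the same characters, so the displayed lines look identical even though diff reported them as different at the byte level.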
Re: ediff displays gibberish output
> From: Leo <[EMAIL PROTECTED]>
> Date: Wed, 13 Dec 2006 01:22:50 +
>
> How to reproduce:
> 1. save the attachments to file1.txt and file2.txt
>
> 2. M-x ediff RET and choose file1.txt file2.txt respectively
>
> 3. Type 'D' in ediff panel window and you see the difference
>    output is gibberish as shown in the screenshot.

Thank you for your report.

I'm not sure this is a bug, though: `D' displays the raw output from Diff, whose encoding is not clear, because it comes from two different files that could each be encoded differently. Ediff reads the output of Diff in raw-text form, precisely for this reason, and that is how the buffer is displayed to you when you press `D'. What you see is not gibberish, but the UTF-8 encoding of the non-ASCII characters, exactly as they are written in file1.txt on disk.

I'm not sure we can do any better here, unless we somehow know the encoding of each of the two files.

Michael and Handa-san, could you please comment on this?
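To make the "raw output" point concrete, here is a small Python sketch (not Emacs code; `raw_text_display` is a hypothetical helper) that renders bytes roughly the way an undecoded buffer shows them: printable ASCII as-is, every other byte as a backslash followed by three octal digits.

```python
def raw_text_display(data: bytes) -> str:
    """Show printable ASCII, tab and newline as-is; other bytes as \\ooo octal escapes."""
    shown = []
    for byte in data:
        if 0x20 <= byte < 0x7F or byte in (0x09, 0x0A):
            shown.append(chr(byte))
        else:
            shown.append("\\%03o" % byte)
    return "".join(shown)


encoded = "是".encode("utf-8")          # three bytes: 0xE6 0x98 0xAF
print(raw_text_display(encoded))       # \346\230\257
print(encoded.decode("utf-8"))         # 是
```

The escaped form and the readable character are the same three bytes; the only difference is whether they have been decoded as UTF-8.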
ediff displays gibberish output
Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

How to reproduce:

1. save the attachments to file1.txt and file2.txt

2. M-x ediff RET and choose file1.txt file2.txt respectively

3. Type 'D' in ediff panel window and you see the difference
   output is gibberish as shown in the screenshot.

GNU Emacs是什么? What is GNU Emacs?

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
    `bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/opt/free.APPS/emacs/share/emacs/23.0.0/etc/DEBUG for instructions.

In GNU Emacs 23.0.0.3 (i686-pc-linux-gnu, GTK+ Version 2.8.20)
 of 2006-11-22 on Fedora
X server distributor `The X.Org Foundation', version 11.0.7000
configured using `configure '--prefix=/opt/free.APPS/emacs' '--with-gtk' '--with-freetype' '--with-xft' '--enable-font-backend' '--with-pop=yes' '--enable-locallisppath=/opt/share/emacs/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/site-lisp' '--without-xim' 'CFLAGS=-O2 -march=pentium-m -pipe -fomit-frame-pointer''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_GB.UTF-8
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Emacs-Lisp

Minor modes in effect:
  erc-page-mode: t
  erc-services-mode: t
  erc-autojoin-mode: t
  erc-button-mode: t
  erc-ring-mode: t
  erc-pcomplete-mode: t
  erc-track-mode: t
  erc-match-mode: t
  erc-fill-mode: t
  erc-stamp-mode: t
  erc-netsplit-mode: t
  erc-smiley-mode: t
  erc-scrolltobottom-mode: t
  paredit-mode: t
  dired-omit-mode: t
  recentf-mode: t
  icomplete-mode: t
  show-paren-mode: t
  delete-selection-mode: t
  global-auto-revert-mode: t
  display-time-mode: t
  shell-dirtrack-mode: t
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
1 6 6 9 , 2 5 2 1 6 6 9 , 2 5 C-e q y C-x 1 C-x k C-x k C-x k C-x C-b q M-x r e p o r b u

Recent messages:
Buffer A: Processing difference region 0 of 2
Buffer B: Processing difference region 0 of 2
Processing difference regions ... done
Mark set [3 times]
Auto-saving...done
Mark set
Quit this Ediff session? (y or n)
Updating buffer list...done
Commands: m, u, t, RET, g, k, S, D, Q; q to quit; h for help
Loading emacsbug...done