Re: ediff displays gibberish output

2006-12-18 Thread Kenichi Handa
In article <[EMAIL PROTECTED]>, Eli Zaretskii <[EMAIL PROTECTED]> writes:

> > From: Stefan Monnier <[EMAIL PROTECTED]>
> > 
> > I think the case where both files use the same encoding is the common case
> > rather than the exception.

> In this case, one file was UTF-8, the other was pure 7-bit ASCII.  I
> think this case is also very common.

To handle those cases, I think changing the code that reads the
process output to use the `undecided' coding system is enough.

> And then there's the case when one file is ISO-8859-x, the other is
> UTF-8; also very common.

If the two files contain identical characters (only the encodings
differ), I'm not sure what the right thing is.  If we decode both
hunks correctly, a user will see no difference and wonder why Emacs
says those lines are different.
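To make the puzzle concrete, here is a minimal Python sketch (not Emacs code). It takes the sample line from the original report and encodes it in UTF-8 and in GBK, the two encodings mentioned elsewhere in this thread. The variable names are illustrative only.

```python
# The same characters, stored in two different encodings.
line = "GNU Emacs是什么?"
as_utf8 = line.encode("utf-8")
as_gbk = line.encode("gbk")

# diff compares bytes, so it reports these two lines as different ...
bytes_differ = as_utf8 != as_gbk

# ... but once each side is decoded with its own encoding, the text is equal.
text_same = as_utf8.decode("utf-8") == as_gbk.decode("gbk")

print(bytes_differ, text_same)
```

Both values come out true, which is exactly the situation where a correctly decoded hunk would show two visually identical lines.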

---
Kenichi Handa
[EMAIL PROTECTED]


___
emacs-pretest-bug mailing list
emacs-pretest-bug@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug


Re: ediff displays gibberish output

2006-12-18 Thread Stefan Monnier
> In this case, one file was UTF-8, the other was pure 7-bit ASCII.  I
> think this case is also very common.

Right, we may also want to handle cases where one encoding is a subset of
the other.  But that can come as a second step.
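A minimal Python sketch of the subset case, assuming one pure-ASCII file and one UTF-8 file as in the report. The helper `decodes_as` is a hypothetical name, not an Emacs or Ediff function. Because every ASCII byte sequence is also valid UTF-8, the whole hunk can be decoded with the wider encoding.

```python
def decodes_as(raw: bytes, encoding: str) -> bool:
    """Return True if raw is a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

utf8_line = "GNU Emacs是什么?".encode("utf-8")
ascii_line = "What is GNU Emacs?".encode("ascii")

# A normal-format diff hunk mixing a line from each file.
hunk = b"1c1\n< " + utf8_line + b"\n---\n> " + ascii_line + b"\n"

# ASCII is a subset of UTF-8, so both sides fit the wider encoding.
both_fit_utf8 = decodes_as(utf8_line, "utf-8") and decodes_as(ascii_line, "utf-8")
decoded = hunk.decode("utf-8") if both_fit_utf8 else None
print(decoded)
```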

> And then there's the case when one file is ISO-8859-x, the other is
> UTF-8; also very common.

But much more difficult to handle, since we then have to decode each half of
a hunk differently.  This can probably wait until the point where only utf-8
is relevant.
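As a rough Python sketch of what "decode each half of a hunk differently" could look like, assuming diff's normal output format (`<` lines come from the first file, `>` lines from the second) and assuming both encodings are already known. The function name `decode_normal_diff` is hypothetical. Splitting on the newline byte is only safe for ASCII-compatible encodings such as ISO-8859-x, UTF-8, or GBK.

```python
def decode_normal_diff(raw: bytes, enc_a: str, enc_b: str) -> str:
    """Decode normal-format diff output, one encoding per side of each hunk."""
    out = []
    for line in raw.split(b"\n"):
        if line.startswith(b"< "):
            out.append(line.decode(enc_a))    # text taken from the first file
        elif line.startswith(b"> "):
            out.append(line.decode(enc_b))    # text taken from the second file
        else:
            out.append(line.decode("ascii"))  # control lines such as "1c1" or "---"
    return "\n".join(out)

raw = (b"1c1\n< " + "café".encode("iso-8859-1")
       + b"\n---\n> " + "café!".encode("utf-8") + b"\n")
print(decode_normal_diff(raw, "iso-8859-1", "utf-8"))
```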

> Patches welcome, as always (but please hold your horses until after
> the release).

Seconded,


Stefan




Re: ediff displays gibberish output

2006-12-18 Thread Eli Zaretskii
> Cc: Leo <[EMAIL PROTECTED]>,  emacs-pretest-bug@gnu.org
> From: Stefan Monnier <[EMAIL PROTECTED]>
> Date: Mon, 18 Dec 2006 05:21:08 -0500
> 
> I think the case where both files use the same encoding is the common case
> rather than the exception.

In this case, one file was UTF-8, the other was pure 7-bit ASCII.  I
think this case is also very common.

And then there's the case when one file is ISO-8859-x, the other is
UTF-8; also very common.

Etc., etc.

Patches welcome, as always (but please hold your horses until after
the release).




Re: ediff displays gibberish output

2006-12-18 Thread Stefan Monnier
>> All my files are in utf-8 encoding and that's why I am surprised to
>> see such output.

> But that's a very special case.  We cannot make Emacs work only in
> special cases.

I think the case where both files use the same encoding is the common case
rather than the exception.


Stefan




Re: ediff displays gibberish output

2006-12-17 Thread Eli Zaretskii
> From: Leo <[EMAIL PROTECTED]>
> Date: Mon, 18 Dec 2006 02:02:55 +
> 
> All my files are in utf-8 encoding and that's why I am surprised to
> see such output.

But that's a very special case.  We cannot make Emacs work only in
special cases.




Re: ediff displays gibberish output

2006-12-17 Thread Leo
* Kenichi Handa (2006-12-18 10:17 +0900) said:
[...]
> That's perhaps because Emacs decodes the process output with a
> detected coding system as it reads it.  That method works for your
> test case, but fails when the two files contain non-ASCII
> characters in different encodings (e.g. UTF-8 vs GBK).

In ediff, even when both files are UTF-8 encoded and contain Chinese
characters, the diff is still shown as raw UTF-8 bytes (those \234 etc.)

All my files are in utf-8 encoding and that's why I am surprised to
see such output.

-- 
Leo  (GPG Key: 9283AA3F)





Re: ediff displays gibberish output

2006-12-17 Thread Kenichi Handa
In article <[EMAIL PROTECTED]>, Leo <[EMAIL PROTECTED]> writes:

> * Eli Zaretskii (2006-12-17 06:30 +0200) said:
>>> From: Leo <[EMAIL PROTECTED]>
>>> Date: Sun, 17 Dec 2006 03:15:59 +
>>> 
>>> But a file with eight-bit characters can have a correct diff
>>> output. What makes ediff fail where diff succeeds?
> >
> > I'm not sure what you mean, but my crystal ball says that you are
> > looking at the output of Diff in a terminal that supports UTF-8
> > encoded characters.  If that's the case, then you will only see
> > correct output from Diff with UTF-8 encoded files; other encodings
> > will show gibberish.
> >
> > By contrast, Emacs does not support a single encoding, it supports
> > many different ones.  It needs to know the right encoding to display
> > the characters as readable.

> I mean in emacs running diff-buffer-with-file or vc-diff. The diff
> output displays correctly.

That's perhaps because Emacs decodes the process output with a
detected coding system as it reads it.  That method works for
your test case, but fails when the two files contain
non-ASCII characters in different encodings
(e.g. UTF-8 vs GBK).

---
Kenichi Handa
[EMAIL PROTECTED]




Re: ediff displays gibberish output

2006-12-17 Thread Stefan Monnier
>> I'm not sure we can do any better here, unless we somehow know the
>> encoding of each of the two files.
[...]
> This is also what I basically told Leo.

Clearly, we can take a look at each file, figure out their respective
encoding, and if both use the same, then use that encoding to decode the
diff output.
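A minimal Python sketch of that idea, not of any existing Ediff code. The helpers `detect` and `decode_diff_if_same` are hypothetical, and the trial-decoding detection with a fixed candidate list is a deliberately crude stand-in for Emacs's own coding-system detection.

```python
def detect(raw: bytes, candidates=("ascii", "utf-8", "gbk")):
    """Return the first candidate encoding that decodes raw cleanly, else None."""
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            pass
    return None

def decode_diff_if_same(file_a: bytes, file_b: bytes, diff_output: bytes):
    """Decode the diff output only when both files share one encoding."""
    enc_a, enc_b = detect(file_a), detect(file_b)
    if enc_a is not None and enc_a == enc_b:
        return diff_output.decode(enc_a)
    return None  # caller falls back to the raw display

a = "GNU Emacs是什么?\n".encode("utf-8")
b = "GNU Emacs是什么呢?\n".encode("utf-8")
diff_output = b"1c1\n< " + a + b"---\n> " + b
print(decode_diff_if_same(a, b, diff_output))
```

When the two detected encodings differ (for instance UTF-8 vs GBK), the sketch returns `None` and leaves the raw display in place.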

I haven't looked at the code to see what kind of effort would be required to
do that, but it's likely to be a fairly common case, so it'd be good to
handle it right.


Stefan




Re: ediff displays gibberish output

2006-12-17 Thread Michael Kifer

> > From: Leo <[EMAIL PROTECTED]>
> > Date: Wed, 13 Dec 2006 01:22:50 +
> > 
> > How to reproduce:
> >   1. save the attachments to file1.txt and file2.txt
> >   
> >   2. M-x ediff RET and choose file1.txt file2.txt respectively
> >   
> >   3. Type 'D' in ediff panel window and you see the difference
> >      output is gibberish as shown in the screenshot.
> 
> Thank you for your report.
> 
> I'm not sure this is a bug, though: `D' displays the raw output from
> Diff, whose encoding is not clear, because it comes from two different
> files that could be each encoded differently.  Ediff reads the output
> of Diff in raw-text form, precisely for this reason, and that is how
> the buffer is displayed to you when you press `D'.  What you see is
> not gibberish, but the UTF-8 encoding of the non-ASCII characters,
> exactly as they are written in file1.txt on disk.
> 
> I'm not sure we can do any better here, unless we somehow know the
> encoding of each of the two files.
> 
> Michael and Handa-san, could you please comment on this?
> 


This is also what I basically told Leo.


--michael  




Re: ediff displays gibberish output

2006-12-16 Thread Leo
* Eli Zaretskii (2006-12-17 06:30 +0200) said:
>> From: Leo <[EMAIL PROTECTED]>
>> Date: Sun, 17 Dec 2006 03:15:59 +
>> 
>> But a file with eight-bit characters can have a correct diff
>> output. What makes ediff fail where diff succeeds?
>
> I'm not sure what you mean, but my crystal ball says that you are
> looking at the output of Diff in a terminal that supports UTF-8
> encoded characters.  If that's the case, then you will only see
> correct output from Diff with UTF-8 encoded files; other encodings
> will show gibberish.
>
> By contrast, Emacs does not support a single encoding, it supports
> many different ones.  It needs to know the right encoding to display
> the characters as readable.

I mean in emacs running diff-buffer-with-file or vc-diff. The diff
output displays correctly.

-- 
Leo  (GPG Key: 9283AA3F)





Re: ediff displays gibberish output

2006-12-16 Thread Eli Zaretskii
> From: Leo <[EMAIL PROTECTED]>
> Date: Sun, 17 Dec 2006 03:15:59 +
> 
> But a file with eight-bit characters can have a correct diff
> output. What makes ediff fail where diff succeeds?

I'm not sure what you mean, but my crystal ball says that you are
looking at the output of Diff in a terminal that supports UTF-8
encoded characters.  If that's the case, then you will only see
correct output from Diff with UTF-8 encoded files; other encodings
will show gibberish.

By contrast, Emacs does not support a single encoding, it supports
many different ones.  It needs to know the right encoding to display
the characters as readable.
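A small Python illustration of the terminal case, under the assumption that the terminal decodes everything as UTF-8. GBK stands in here for "some other encoding", and `errors="replace"` imitates a terminal that substitutes a placeholder for invalid bytes.

```python
original = "是什么"
gbk_bytes = original.encode("gbk")

# A UTF-8-only viewer has no way to know these bytes are GBK.
shown = gbk_bytes.decode("utf-8", errors="replace")

# With the right encoding, the same bytes are perfectly readable.
readable = gbk_bytes.decode("gbk")

print(repr(shown), readable)
```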




Re: ediff displays gibberish output

2006-12-16 Thread Leo
* Kenichi Handa (2006-12-17 12:04 +0900) said:
[...]
> If efficiency doesn't matter, we can read the two files, check
> which coding system is used for each file, parse the
> output of diff, and decode the parts from each file with
> the corresponding coding system.
>
> But, perhaps, it is good enough to make the command work
> with universal-coding-system-argument (C-x RET c) (if not
> yet done), and warn the user to use it when the diff output
> contains eight-bit characters.
>
> ---
> Kenichi Handa
> [EMAIL PROTECTED]

But a file with eight-bit characters can have a correct diff
output. What makes ediff fail where diff succeeds?

-- 
Leo  (GPG Key: 9283AA3F)





Re: ediff displays gibberish output

2006-12-16 Thread Kenichi Handa
In article <[EMAIL PROTECTED]>, Eli Zaretskii <[EMAIL PROTECTED]> writes:

> I'm not sure this is a bug, though: `D' displays the raw output from
> Diff, whose encoding is not clear, because it comes from two different
> files that could be each encoded differently.  Ediff reads the output
> of Diff in raw-text form, precisely for this reason, and that is how
> the buffer is displayed to you when you press `D'.  What you see is
> not gibberish, but the UTF-8 encoding of the non-ASCII characters,
> exactly as they are written in file1.txt on disk.

Right.

> I'm not sure we can do any better here, unless we somehow know the
> encoding of each of the two files.

If efficiency doesn't matter, we can read the two files, check
which coding system is used for each file, parse the
output of diff, and decode the parts from each file with
the corresponding coding system.

But, perhaps, it is good enough to make the command work
with universal-coding-system-argument (C-x RET c) (if not
yet done), and warn the user to use it when the diff output
contains eight-bit characters.
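The "warn when the output contains eight-bit characters" check is cheap. A minimal Python sketch, with `has_eight_bit` as a hypothetical helper name and the warning text invented for illustration:

```python
def has_eight_bit(raw: bytes) -> bool:
    """Return True if any byte has its high bit set (i.e. is not 7-bit ASCII)."""
    return any(b >= 0x80 for b in raw)

diff_output = b"1c1\n< " + "GNU Emacs是什么?".encode("utf-8") + b"\n---\n> What is GNU Emacs?\n"

if has_eight_bit(diff_output):
    # Illustrative wording only; the real message would be up to Ediff.
    print("Diff output contains eight-bit characters; "
          "consider C-x RET c to specify a coding system.")
```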

---
Kenichi Handa
[EMAIL PROTECTED]




Re: ediff displays gibberish output

2006-12-16 Thread Eli Zaretskii
> From: Leo <[EMAIL PROTECTED]>
> Date: Wed, 13 Dec 2006 01:22:50 +
> 
> How to reproduce:
>   1. save the attachments to file1.txt and file2.txt
>   
>   2. M-x ediff RET and choose file1.txt file2.txt respectively
>   
>   3. Type 'D' in ediff panel window and you see the difference
>      output is gibberish as shown in the screenshot.

Thank you for your report.

I'm not sure this is a bug, though: `D' displays the raw output from
Diff, whose encoding is not clear, because it comes from two different
files that could be each encoded differently.  Ediff reads the output
of Diff in raw-text form, precisely for this reason, and that is how
the buffer is displayed to you when you press `D'.  What you see is
not gibberish, but the UTF-8 encoding of the non-ASCII characters,
exactly as they are written in file1.txt on disk.
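To illustrate what such a raw view looks like, here is a Python sketch that renders every non-ASCII byte as a backslash plus three octal digits, the way undecoded bytes appear in the `D' buffer. The function `raw_text_view` is a hypothetical imitation of that display, not Emacs's implementation.

```python
def raw_text_view(raw: bytes) -> str:
    """Show ASCII bytes as-is and every other byte as a \\ooo octal escape."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in raw)

# The character U+662F is the three bytes E6 98 AF in UTF-8.
print(raw_text_view("GNU Emacs是".encode("utf-8")))
```

This prints `GNU Emacs\346\230\257`: the bytes are intact, just not decoded.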

I'm not sure we can do any better here, unless we somehow know the
encoding of each of the two files.

Michael and Handa-san, could you please comment on this?




ediff displays gibberish output

2006-12-12 Thread Leo

Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

How to reproduce:
  1. save the attachments to file1.txt and file2.txt
  
  2. M-x ediff RET and choose file1.txt file2.txt respectively
  
  3. Type 'D' in ediff panel window and you see the difference
     output is gibberish as shown in the screenshot.

GNU Emacs是什么?
What is GNU Emacs?


If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
`bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/opt/free.APPS/emacs/share/emacs/23.0.0/etc/DEBUG for instructions.


In GNU Emacs 23.0.0.3 (i686-pc-linux-gnu, GTK+ Version 2.8.20)
 of 2006-11-22 on Fedora
X server distributor `The X.Org Foundation', version 11.0.7000
configured using `configure '--prefix=/opt/free.APPS/emacs' '--with-gtk' 
'--with-freetype' '--with-xft' '--enable-font-backend' '--with-pop=yes' 
'--enable-locallisppath=/opt/share/emacs/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/site-lisp'
 '--without-xim' 'CFLAGS=-O2 -march=pentium-m -pipe -fomit-frame-pointer''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_GB.UTF-8
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Emacs-Lisp

Minor modes in effect:
  erc-page-mode: t
  erc-services-mode: t
  erc-autojoin-mode: t
  erc-button-mode: t
  erc-ring-mode: t
  erc-pcomplete-mode: t
  erc-track-mode: t
  erc-match-mode: t
  erc-fill-mode: t
  erc-stamp-mode: t
  erc-netsplit-mode: t
  erc-smiley-mode: t
  erc-scrolltobottom-mode: t
  paredit-mode: t
  dired-omit-mode: t
  recentf-mode: t
  icomplete-mode: t
  show-paren-mode: t
  delete-selection-mode: t
  global-auto-revert-mode: t
  display-time-mode: t
  shell-dirtrack-mode: t
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
 
1 6 6 9 , 2 5  
   
   
   
2
  1 6 6 9 , 2 5  C-e  
 
q 
y C-x 1 C-x k  C-x k  C-x k  
C-x C-b q
   M-x r e p 
o r  b u  

Recent messages:
Buffer A: Processing difference region 0 of 2
Buffer B: Processing difference region 0 of 2
Processing difference regions ... done
Mark set [3 times]
Auto-saving...done
Mark set
Quit this Ediff session? (y or n) 
Updating buffer list...done
Commands: m, u, t, RET, g, k, S, D, Q; q to quit; h for help
Loading emacsbug...done