[PATCH] emacs: use 'gnus-decoded in notmuch-mm-display-part-inline ()

2012-05-20 Thread Tomi Ollila
When a mail message is read from emacs, the message structure
obtained may contain parts which have content included
(`text/plain` for example) and other parts where content is not
included (`text/html` for example).

In case content is included, the string is already available in
emacs' internal format and therefore the mm-... functions should not
attempt any further decoding of the data in the temp buffer
provided to them.

Currently, when a reply buffer is created,
notmuch-mm-display-part-inline () is used to provide the quoted reply
content. This change makes the mm-... functions called by it use
'gnus-decoded as charset whenever the content is already available.

File .../emacs-23.3/lisp/gnus/mm-uu.el mentions:
"`gnus-decoded' is a fake charset, which means no further decoding."
---

I propose this patch is taken into 0.13.1.

Please note that I'm not entirely sure my description above
is right. So those who know more about these issues, please check
my "facts".

Note that before this change the only reference to the word 'gnus' in
the whole notmuch repository is:
./emacs/notmuch-wash.el:223:  ;; `gnus-art.el'.

if anybody thinks this matters...

 emacs/notmuch-lib.el |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index 7fa441a..e99b48d 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -244,7 +244,12 @@ the given type."
 current buffer, if possible."
   (let ((display-buffer (current-buffer)))
     (with-temp-buffer
-      (let* ((charset (plist-get part :content-charset))
+      ;; In case there is :content, the content string is already converted
+      ;; into emacs internal format. `gnus-decoded' is a fake charset,
+      ;; which means no further decoding (to be done by mm- functions).
+      (let* ((charset (if (plist-member part :content)
+                          'gnus-decoded
+                        (plist-get part :content-charset)))
              (handle (mm-make-handle (current-buffer) `(,content-type (charset . ,charset)))))
         ;; If the user wants the part inlined, insert the content and
         ;; test whether we are able to inline it (which includes both
-- 
1.7.1
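
For readers who want to try the charset selection outside of
notmuch, here is a minimal sketch of the same test as a standalone
function. The name my-part-charset and the two sample plists are
made up for illustration and are not part of notmuch; only
plist-member and plist-get are used, as in the patch.

```elisp
;; Hypothetical helper, not part of notmuch: it mirrors the charset
;; selection done inside the patched `let*' form.
(defun my-part-charset (part)
  "Return the charset to hand to the mm- functions for PART.
PART is a property list like the ones built from the JSON output."
  (if (plist-member part :content)
      ;; Content was delivered inline, so it is already decoded.
      'gnus-decoded
    ;; No inline content: fall back to the declared charset, if any.
    (plist-get part :content-charset)))

;; Sample plists, invented for this example.
(my-part-charset '(:content-type "text/plain" :content "hello"))
(my-part-charset '(:content-type "text/html" :content-charset "iso-8859-1"))
```

The first call returns the symbol gnus-decoded and the second returns
"iso-8859-1". Because plist-member tests for the presence of the key
rather than its value, a part whose :content is nil is also treated
as already decoded.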



emacs complains about encoding?

2012-05-20 Thread Adam Wolfe Gordon
On Wed, May 16, 2012 at 3:24 AM, Tomi Ollila wrote:
> Haa, It doesn't matter which is the original encoding of the message;
>
> notmuch reply id:20120515194455.B7AD5100646@guru.guru-group.fi
>
> where  notmuch show --format=raw ^^^  outputs (among other lines):
>
>   Content-Type: text/plain; charset="iso-8859-1"
>   Content-Transfer-Encoding: quoted-printable
>
> and
>
> notmuch reply id:"878vgsbprq.fsf@nikula.org"
>
> where  notmuch show --format=raw ^^^  outputs (among other lines):
>
>   Content-Type: text/plain; charset="utf-8"
>   Content-Transfer-Encoding: base64
>
> produce correct reply content, both in utf-8.
>
> So it is the emacs side which breaks replies.

It turns out it's actually not the emacs side, but an interaction
between our JSON reply format and emacs.

The JSON reply (and show) code includes part content for all text/*
parts except text/html. Because all JSON is required to be UTF-8, it
handles the encoding itself, puts UTF-8 text in, and omits a
content-charset field from the output. Emacs passes on the
content-charset field to mm-display-part-inline if it's available, but
for text/plain parts it's not, leaving mm-display-part-inline to its
own devices for figuring out what the charset is. It seems
mm-display-part-inline correctly figures out that it's UTF-8, and puts
in the series of ugly \nnn characters because that's what emacs does
with UTF-8 sometimes.
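
To make the distinction concrete, here is a small sketch of the
difference between raw UTF-8 bytes and an already-decoded string in
emacs. It does not reproduce the \nnn display itself. It only shows
that decoding belongs to the raw bytes, and that text handed over
as :content is already past that step. The function name
my-decode-once-demo is made up for this example.

```elisp
;; Hypothetical demo function, not part of notmuch or Gnus.
(defun my-decode-once-demo ()
  "Compare an already-decoded string with its raw UTF-8 bytes."
  (let* ((decoded "hyv\u00e4\u00e4")                   ; multibyte text
         (raw (encode-coding-string decoded 'utf-8)))  ; unibyte bytes
    (list
     ;; Decoding the raw bytes once gives back the original text.
     (string-equal (decode-coding-string raw 'utf-8) decoded)
     ;; The text is a multibyte string, the raw bytes are not.
     (multibyte-string-p decoded)
     (multibyte-string-p raw))))

(my-decode-once-demo)
```

The call returns (t t nil): one decoding step turns the bytes back
into the text, and only the raw form is a unibyte string.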

In the original reply stuff (pre-JSON reply format) emacs used the
output of notmuch reply verbatim, so all the charset stuff was handled
in notmuch. Before f6c170fabca8f39e74705e3813504137811bf162, emacs was
using the JSON reply format, but was inserting the text itself instead
of using mm-display-part-inline, so emacs still wasn't trying to do
any charset manipulation. Using mm-display-part-inline is desirable
because it lets us handle non-text/plain (e.g. text/html) parts
correctly in reply, and makes the display more consistent (since we
use it for show). But, it leads to this problem.

So, there are three solutions I can see:

1) Have the JSON formats include the original content-charset even
though they're actually outputting UTF-8. Of the solutions I tried,
this is the best, even though it doesn't sound like a good thing to
do.

2) Have the JSON formats include content only if it's actually UTF-8.
This means that for non-UTF-8 parts (including ASCII parts), the emacs
interface has to do more work to display the part content, since it
must fetch it from outside first. When I tried this, it worked but
caused the \nnn to show up when viewing messages in emacs. I suspect
this is because it sets a charset for the whole buffer, and can't
accommodate messages with different charsets in the same buffer
properly. Reply works correctly, though.

3) Have the JSON formats include the charset for all parts, but make
it UTF-8 for all parts they include content for (since we're actually
outputting UTF-8). This doesn't seem to fix the problem, even though
it seems like it should.

If no one has a better idea or a strong reason not to, I'll send a
patch for solution (1).

-- Adam

