The overloading of show (was Re: [PATCH] Output unmodified Content-Type header value for JSON format.)

2012-01-16 Thread David Edmondson
On Sat, 14 Jan 2012 19:36:17 -0500, Austin Clements wrote:
> ...there are several levels of structure here:
> 
> 1. Threads (query results)
> 2. Thread structure
> 3. Message structure (MIME)
> 4. Part content
> 
> Currently, search returns 1; show --format=json returns 2, 3, and
> sometimes 4 (but sometimes not); and show --format=raw returns 4.
> Notably, 1 does not require opening message files (neither does 2),
> which I consider an important distinction between search and show.
> 
> Some of the discussion has been about putting 4 squarely in the realm
> of show --format=raw.  One counterargument (which has grown on me
> since this discussion) is that the part content included in
> --format=json can be thought of as pre-fetching content that clients
> are likely to need in order to avoid re-parsing the message in most
> circumstances.  I believe this is not the *intent* of the current
> code, though without a specification of the JSON format it's hard to
> tell.

The JSON output included what was considered useful (mostly for the
Emacs UI), but how much deep thought went into 'useful' I couldn't say.

> Other discussion (more interesting, in my mind) has been about
> separating retrieving thread structure, 2, from retrieving message
> structure, 3.  To me, splitting these feels much more natural than
> what we do now, which seems to be inflexibly bound to the specific way
> the Emacs show mode currently works.  The thread structure is readily
> available from the database, so I think separating these would open up
> some new UI opportunities, particularly easy and fast thread outlining
> and navigation.

Given that the current output already includes both 2 and 3, anything
that could be done with 2 can be done with the current output, so
there's no block on the kind of innovation that you describe other than
possibly some performance problems.
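
To make that concrete, here is a minimal sketch of reducing show-style
JSON to thread structure only (level 2). It assumes the output is a list
of threads, each a list of [message, [replies]] pairs, with each message
carrying "id" and "headers"; the sample data and the function names
outline and thread_outlines are hypothetical, written for illustration
and not taken from notmuch.

```python
import json


def outline(nodes, depth=0):
    """Yield (depth, message id, subject) per message, skipping MIME parts."""
    for message, replies in nodes:
        # Assumption: a node's message may be null (None) if it was not selected.
        if message is not None:
            subject = message.get("headers", {}).get("Subject", "")
            yield depth, message["id"], subject
        yield from outline(replies, depth + 1)


def thread_outlines(show_json):
    """Reduce show-style JSON text to one outline list per thread."""
    return [list(outline(thread)) for thread in json.loads(show_json)]


# Hypothetical, hand-written sample in the assumed shape; not real notmuch output.
sample = json.dumps([[
    [{"id": "a@example", "headers": {"Subject": "The overloading of show"},
      "body": [{"id": 1, "content-type": "text/plain", "content": "..."}]},
     [
        [{"id": "b@example", "headers": {"Subject": "Re: The overloading of show"},
          "body": []},
         []],
     ]],
]])

for thread in thread_outlines(sample):
    for depth, msg_id, subject in thread:
        print("  " * depth + f"{subject} <{msg_id}>")
```

The sketch still has to parse the message and part data it then throws
away, which is where the possible performance cost comes from.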

notmuch-lkml.el[1] was a quick prototype of an alternative way to find
messages to read based on suggestions from Aneesh. It could have used
the proposed 'thread structure only' output.

The changes you allude to make sense. My only concern would be any
potential impact on the current Emacs UI's use of JSON output. Switching
to a model where a typical 'notmuch-show' buffer requires many calls to
notmuch (and commensurate JSON parsing) may perform significantly worse
than the current approach.

> I believe it would also simplify the code and address some irritating
> asymmetries in the way notmuch show handles the --part argument.

Footnotes: 
[1]  http://dme.org/data/emacs/notmuch-lkml.el




___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


The overloading of show (was Re: [PATCH] Output unmodified Content-Type header value for JSON format.)

2012-01-14 Thread Austin Clements
(was in reply to id:87ehv2proa.fsf at praet.org, but I wanted to start a
new top-level thread)

Quoth Pieter Praet on Jan 14 at 10:19 am:
> On Thu, 12 Jan 2012 12:28:40 -0500, Austin Clements wrote:
> > Quoth Pieter Praet on Jan 12 at  6:07 pm:
> > > On Tue, 22 Nov 2011 22:40:21 -0500, Austin Clements wrote:
> > > > Quoth Jameson Graef Rollins on Nov 20 at 12:10 pm:
> > > > > The open question seems to be how we handle the content encoding
> > > > > parameters.  My argument is that those should either be used by
> > > > > notmuch to properly encode the content for the consumer.  If
> > > > > that's not possible, then just those parameters needed by the
> > > > > consumer to decode the content should be output.
> > > > 
> > > > If notmuch is going to include part content in the JSON output (which
> > > > perhaps it shouldn't, as per recent IRC discussions), then it must
> > > > handle content encodings because JSON must be Unicode and therefore
> > > > the content strings in the JSON must be Unicode.
> > > 
> > > Having missed the IRC discussions: what is the rationale for not
> > > including (specific types of?) part content in the JSON output ?
> > > Eg. how about inline attached text/x-patch ?
> > 
> > Technically the IRC discussion was about not including *any* part
> > content in the JSON output, and always using show --format=raw or
> > similar to retrieve desired parts.  Currently, notmuch includes part
> > content in the JSON only for text/*, *except* when it's text/html.  I
> > assume non-text parts are omitted because binary data is hard to
> > represent in JSON and text/html is omitted because some people don't
> > need it.  However, this leads to some peculiar asymmetry in the Emacs
> > code where sometimes it pulls part content out of the JSON and
> > sometimes it retrieves it using show --format=raw.  This in turn leads
> > to asymmetry in content encoding handling, since notmuch handles
> > content encoding for parts included in the JSON (and there's no good
> > way around that since JSON is Unicode), but not for parts retrieved as
> > raw.
> > 
> > The idea discussed on IRC was to remove all part content from the JSON
> > output and to always use show to retrieve it, possibly beefing up
> > show's support for content decoding (and possibly introducing a way to
> > retrieve multiple raw parts at once to avoid re-parsing).  This would
> > get the JSON format out of the business of guessing what consumers
> > need, simplify the Emacs code, and normalize content encoding
> > handling.
> 
> Full ACK.
> 
> One concern though (IIUC): Due to the prevalence of retarded MUA's, not
> outputting 'text/plain' and/or 'text/html' parts is unfortunately all
> too often equivalent to not outputting anything at all, so wouldn't we,
> in essence, be reducing `show --format=json' to an ever-so-slightly
> augmented `search --format=json' ?

I'm not sure I fully understand what you're saying, but there are
several levels of structure here:

1. Threads (query results)
2. Thread structure
3. Message structure (MIME)
4. Part content

Currently, search returns 1; show --format=json returns 2, 3, and
sometimes 4 (but sometimes not); and show --format=raw returns 4.
Notably, 1 does not require opening message files (neither does 2),
which I consider an important distinction between search and show.

Some of the discussion has been about putting 4 squarely in the realm
of show --format=raw.  One counterargument (which has grown on me
since this discussion) is that the part content included in
--format=json can be thought of as pre-fetching content that clients
are likely to need in order to avoid re-parsing the message in most
circumstances.  I believe this is not the *intent* of the current
code, though without a specification of the JSON format it's hard to
tell.

Other discussion (more interesting, in my mind) has been about
separating retrieving thread structure, 2, from retrieving message
structure, 3.  To me, splitting these feels much more natural than
what we do now, which seems to be inflexibly bound to the specific way
the Emacs show mode currently works.  The thread structure is readily
available from the database, so I think separating these would open up
some new UI opportunities, particularly easy and fast thread outlining
and navigation.  I believe it would also simplify the code and address
some irritating asymmetries in the way notmuch show handles the --part
argument.
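
The content-encoding asymmetry mentioned in the quoted discussion can be
illustrated with a small sketch. It is not notmuch's implementation: the
function part_for_json and the sample bytes are hypothetical, and the
charset is assumed to come from the part's Content-Type header. Part
content embedded in JSON has to be decoded to Unicode first, whereas a
raw part is handed over as undecoded bytes.

```python
import json


def part_for_json(raw: bytes, charset: str) -> str:
    """Decode a text part with its declared charset and embed it in JSON."""
    # Undecodable bytes are replaced rather than raising; a policy assumption.
    text = raw.decode(charset, errors="replace")
    return json.dumps({"content-type": "text/plain", "content": text})


# Hypothetical part body as stored on disk in ISO-8859-1.
raw = "Grüße".encode("iso-8859-1")

# The raw bytes are not valid UTF-8, so they cannot go into JSON unchanged.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

encoded = part_for_json(raw, "iso-8859-1")
print(valid_utf8, json.loads(encoded)["content"])
```

A consumer that fetches the same part as raw output would have to apply
the charset itself, which is the asymmetry described above.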

