Re: [PATCH v2] e1000: Convert debug macros into tracepoints.

2024-04-03 Thread Austin Clements
At this point there's not much of my original code left. :D Don, you're
welcome to take the credit in the commit.

On Wed, Apr 3, 2024, 9:46 AM Don Porter  wrote:

> From: Austin Clements 
>
> The E1000 debug messages are very useful for developing drivers.
> Make these available to users without recompiling QEMU.
>
> Signed-off-by: Austin Clements 
> [geo...@ldpreload.com: Rebased on top of 2.9.0]
> Signed-off-by: Geoffrey Thomas 
> Signed-off-by: Don Porter 
> ---
>  hw/net/e1000.c  | 90 +++--
>  hw/net/trace-events | 25 -
>  2 files changed, 54 insertions(+), 61 deletions(-)
>
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index 43f3a4a701..24475636a3 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -44,26 +44,6 @@
>  #include "trace.h"
>  #include "qom/object.h"
>
> -/* #define E1000_DEBUG */
> -
> -#ifdef E1000_DEBUG
> -enum {
> -DEBUG_GENERAL,  DEBUG_IO,   DEBUG_MMIO, DEBUG_INTERRUPT,
> -DEBUG_RX,   DEBUG_TX,   DEBUG_MDIC, DEBUG_EEPROM,
> -DEBUG_UNKNOWN,  DEBUG_TXSUM,DEBUG_TXERR,DEBUG_RXERR,
> -DEBUG_RXFILTER, DEBUG_PHY,  DEBUG_NOTYET,
> -};
> -#define DBGBIT(x)(1< -static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL);
> -
> -#define DBGOUT(what, fmt, ...) do { \
> -if (debugflags & DBGBIT(what)) \
> -fprintf(stderr, "e1000: " fmt, ## __VA_ARGS__); \
> -} while (0)
> -#else
> -#define DBGOUT(what, fmt, ...) do {} while (0)
> -#endif
> -
>  #define IOPORT_SIZE   0x40
>  #define PNPMMIO_SIZE  0x2
>
> @@ -351,8 +331,7 @@ e1000_mit_timer(void *opaque)
>  static void
>  set_ics(E1000State *s, int index, uint32_t val)
>  {
> -DBGOUT(INTERRUPT, "set_ics %x, ICR %x, IMR %x\n", val,
> s->mac_reg[ICR],
> -s->mac_reg[IMS]);
> +trace_e1000_set_ics(val, s->mac_reg[ICR], s->mac_reg[IMS]);
>  set_interrupt_cause(s, 0, val | s->mac_reg[ICR]);
>  }
>
> @@ -425,8 +404,7 @@ set_rx_control(E1000State *s, int index, uint32_t val)
>  s->mac_reg[RCTL] = val;
>  s->rxbuf_size = e1000x_rxbufsize(val);
>  s->rxbuf_min_shift = ((val / E1000_RCTL_RDMTS_QUAT) & 3) + 1;
> -DBGOUT(RX, "RCTL: %d, mac_reg[RCTL] = 0x%x\n", s->mac_reg[RDT],
> -   s->mac_reg[RCTL]);
> +trace_e1000_set_rx_control(s->mac_reg[RDT], s->mac_reg[RCTL]);
>  timer_mod(s->flush_queue_timer,
>qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 1000);
>  }
> @@ -440,16 +418,16 @@ set_mdic(E1000State *s, int index, uint32_t val)
>  if ((val & E1000_MDIC_PHY_MASK) >> E1000_MDIC_PHY_SHIFT != 1) // phy #
>  val = s->mac_reg[MDIC] | E1000_MDIC_ERROR;
>  else if (val & E1000_MDIC_OP_READ) {
> -DBGOUT(MDIC, "MDIC read reg 0x%x\n", addr);
> +trace_e1000_mdic_read_register(addr);
>  if (!(phy_regcap[addr] & PHY_R)) {
> -DBGOUT(MDIC, "MDIC read reg %x unhandled\n", addr);
> +trace_e1000_mdic_read_register_unhandled(addr);
>  val |= E1000_MDIC_ERROR;
>  } else
>  val = (val ^ data) | s->phy_reg[addr];
>  } else if (val & E1000_MDIC_OP_WRITE) {
> -DBGOUT(MDIC, "MDIC write reg 0x%x, value 0x%x\n", addr, data);
> +trace_e1000_mdic_write_register(addr, data);
>  if (!(phy_regcap[addr] & PHY_W)) {
> -DBGOUT(MDIC, "MDIC write reg %x unhandled\n", addr);
> +trace_e1000_mdic_write_register_unhandled(addr);
>  val |= E1000_MDIC_ERROR;
>  } else {
>  if (addr < NPHYWRITEOPS && phyreg_writeops[addr]) {
> @@ -471,8 +449,8 @@ get_eecd(E1000State *s, int index)
>  {
>  uint32_t ret = E1000_EECD_PRES|E1000_EECD_GNT |
> s->eecd_state.old_eecd;
>
> -DBGOUT(EEPROM, "reading eeprom bit %d (reading %d)\n",
> -   s->eecd_state.bitnum_out, s->eecd_state.reading);
> +trace_e1000_get_eecd(s->eecd_state.bitnum_out, s->eecd_state.reading);
> +
>  if (!s->eecd_state.reading ||
>  ((s->eeprom_data[(s->eecd_state.bitnum_out >> 4) & 0x3f] >>
>((s->eecd_state.bitnum_out & 0xf) ^ 0xf))) & 1)
> @@ -511,9 +489,8 @@ set_eecd(E1000State *s, int index, uint32_t val)
>  s->eecd_state.reading = (((s->eecd_state.val_in >> 6) & 7) ==
>  EEPROM_READ_OPCODE_MICROWIRE);
>  }
> -DBGOUT(EEPROM, "eeprom bitnum in %d out %d, reading %d\n",
> -   s->eecd_state.bitn

Re: [go-nuts] Re: Go 1.18 Beta 1 is released

2021-12-15 Thread 'Austin Clements' via golang-nuts
Jan, assuming you're running on an AMD CPU, this is go.dev/issue/34988 (if
you're not running on an AMD CPU, that would be very interesting to know!)
The TL;DR is that this appears to be a kernel bug, and we have a C
reproducer, but we do not yet have a fix or a workaround.


On Wed, Dec 15, 2021 at 7:57 AM Jan Mercl <0xj...@gmail.com> wrote:

> On Tue, Dec 14, 2021 at 8:51 PM Cherry Mui  wrote:
>
> > We have just released go1.18beta1, a beta version of Go 1.18.
> > It is cut from the master branch at the revision tagged go1.18beta1.
> >
> > Please try your production load tests and unit tests with the new
> version.
> > Your help testing these pre-release versions is invaluable.
> >
> > Report any problems using the issue tracker:
> > https://golang.org/issue/new
>
> The link requires a Github account, which I don't have so instead
> reporting here:
>
> When trying to build from source on netbsd/amd64 in qemu, both
> 'all.bash' and 'make.bash' crashes, logs attached. FYI, with Go1.17.5
> and the same qemu VM, 'all.bash' fails, but does not crash IIRC and
> 'make.bash' completes without issues.
>
> -j
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/CAA40n-UsTmUHXa-ao%3DykO9ejG-J8OD8scdmKiQtoxC2npC4HOg%40mail.gmail.com
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CALgmw1-p0wSuBQt2hatAGEb%2BcgK9ezU%2BGW6HpK%2BV1R69x%3DeV2A%40mail.gmail.com.


Re: [go-nuts] Re: Considering dropping GO386=387

2020-07-16 Thread 'Austin Clements' via golang-nuts
Thanks for that data point, Nick. It's a good idea to make the build fail
if GO386 is set to 387 if we drop support. It already fails if GO386 is set
to any unsupported value, but we could continue to check GO386 even though
there would only be one supported value, and perhaps give a nicer error if
it's set to 387.


On Wed, Jul 15, 2020 at 1:21 PM Nick Craig-Wood  wrote:

> I make a GO386=387 build for rclone, eg
>
> https://github.com/rclone/rclone/issues/437
>
> People love running rclone on ancient computers to rescue data off them I
> guess.
>
> This would affect a very small percentage of users and there are always
> older versions of rclone they can use so I'm not too bothered if support is
> dropped.
>
> I haven't tried compiling rclone with gccgo for a while.
>
> It would be helpful if the build fails rather than silently ignoring the
> GO386 flag if this proposal does go forward.
>
> On Tuesday, 14 July 2020 at 13:56:58 UTC+1 aus...@google.com wrote:
>
>> Hi everyone. We’re exploring the possibility of dropping 387
>> floating-point support and requiring SSE2 support for GOARCH=386 in the
>> native gc compiler, potentially in Go 1.16. This would raise the minimum
>> GOARCH=386 requirement to the Intel Pentium 4 (released in 2000) or AMD
>> Opteron/Athlon 64 (released in 2003).
>>
>> There are several reasons we’re considering this:
>>
>>1. While 387 support isn’t a huge maintenance burden, it does take
>>time away from performance and feature work and represents a fair amount 
>> of
>>latent complexity.
>>2. 387 support has been a regular source of bugs (#36400
>>, #27516
>>, #22429
>>, #17357
>>, #13923
>>, #12970
>>, #4798 ,
>>just to name a few).
>>3. 387 bugs often go undetected for a long time because we don’t have
>>builders that support only 387 (so unsupported instructions can slip in
>>unnoticed).
>>4. Raising the minimum requirement to SSE2 would allow us to also
>>assume many other useful architectural features, such as proper memory
>>fences and 128 bit registers, which would simplify the compiler and 
>> runtime
>>and allow for much more efficient implementations of core functions like
>>memmove on 386.
>>5. We’re exploring switching to a register-based calling convention
>>in Go 1.16, which promises significant performance improvements, but
>>retaining 387 support will definitely complicate this and slow our 
>> progress.
>>
>>
>> The gccgo toolchain will continue to support 387 floating-point, so this
>> remains an option for projects that absolutely must use 387 floating-point.
>>
>> We’d like to know if there are still significant uses of GO386=387,
>> particularly for which using gccgo would not be a viable option.
>>
>> Thanks!
>>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/babd66fd-fe7f-4840-b1aa-8cb32b499b67n%40googlegroups.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CALgmw1-sDfxmeQ7CUvJ8wSUWm3vWCzzb2ZRrH-vdu_Yvqvb_MQ%40mail.gmail.com.


[go-nuts] Considering dropping GO386=387

2020-07-14 Thread 'Austin Clements' via golang-nuts
Hi everyone. We’re exploring the possibility of dropping 387 floating-point
support and requiring SSE2 support for GOARCH=386 in the native gc
compiler, potentially in Go 1.16. This would raise the minimum GOARCH=386
requirement to the Intel Pentium 4 (released in 2000) or AMD Opteron/Athlon
64 (released in 2003).

There are several reasons we’re considering this:

   1. While 387 support isn’t a huge maintenance burden, it does take time
   away from performance and feature work and represents a fair amount of
   latent complexity.
   2. 387 support has been a regular source of bugs (#36400
   , #27516
   , #22429 ,
   #17357 , #13923
   , #12970 ,
   #4798 , just to name a few).
   3. 387 bugs often go undetected for a long time because we don’t have
   builders that support only 387 (so unsupported instructions can slip in
   unnoticed).
   4. Raising the minimum requirement to SSE2 would allow us to also assume
   many other useful architectural features, such as proper memory fences and
   128 bit registers, which would simplify the compiler and runtime and allow
   for much more efficient implementations of core functions like memmove on
   386.
   5. We’re exploring switching to a register-based calling convention in
   Go 1.16, which promises significant performance improvements, but retaining
   387 support will definitely complicate this and slow our progress.


The gccgo toolchain will continue to support 387 floating-point, so this
remains an option for projects that absolutely must use 387 floating-point.

We’d like to know if there are still significant uses of GO386=387,
particularly for which using gccgo would not be a viable option.

Thanks!

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CALgmw190ym7heSN2pb3xS%2BiA_UN97aP-2KW4tqC_9FFtYt9W3Q%40mail.gmail.com.


Re: [go-nuts] Change in virtual memory patterns in Go 1.12

2019-04-16 Thread 'Austin Clements' via golang-nuts
On Tue, Apr 16, 2019 at 1:23 AM Rémy Oudompheng 
wrote:

> Thanks Austin,
>
> The application workload is a kind of fragmentation torture test as it
> involves a mixture of many long-lived small and large (>100 MB)
> objects, with regularly allocated short-lived small and large objects.
> I have tried creating a sample synthetic reproducer but did not
> succeed at the moment.
>
> Regarding the max_map_count, your explanation is very clear and I
> apparently missed the large comment in the runtime explaining all of
> that.
> Do you expect a significant drawback between choosing 2MB or 16MB as
> the granularity of the huge page flag manipulation in the case of huge
> heaps ?
>

Most likely this will just cause less use of huge pages in your
application. This could slow it down by putting more pressure on the TLB.
In a sense, this is a self-compounding issue since huge pages can be highly
beneficial to huge heaps.

Regarding the virtual memory footprint, it changed radically with Go
> 1.12. It basically looks like a leak and I saw it grow to more than
> 1TB where the actual heap total size never exceeds 180GB.
> Although I understand that it is easy to construct a situation where
> there is repeatedly no available contiguous interval of >100MB in the
> address space, it is a significant difference from Go 1.11 where the
> address space would grow to 400-500GB for a similar workload and stay
> flat after that, and I could not find an obvious change in the
> allocator explaining the phenomenon (and unfortunately my resources do
> not allow for an easy live comparison of both program lifetimes).
>
> Am I right saying that scavenging method or frequency does not
> (cannot) affect at all virtual memory footprint and dynamics ?
>

It certainly can affect virtual memory footprint because of how scavenging
affects the allocator's placement policy. Though even with the increased
VSS, I would expect your application to have lower RSS with 1.12. In almost
all cases, lower RSS with higher VSS is a fine trade-off, though lower RSS
with the same VSS would obviously be better. But it can be problematic when
it causes the map count (which is roughly proportional to the VSS) to grow
too large. It's also unfortunate that Linux even has this limit; it's the
only OS Go runs on that limits the map count.

We've seen one other application experience VSS growth with the 1.12
changes, and it does seem to require a pretty unique allocation pattern.
Michael (cc'd) may be zeroing in on the causes of this and may have some
patches for you to try if you don't mind. :)

Regards,
> Rémy.
>
> Le mar. 2 avr. 2019 à 16:15, Austin Clements  a écrit :
> >
> > Hi Rémy. We often fight with vm.max_map_count in the runtime, sadly.
> Most likely this comes from the way the runtime interacts with Linux's
> transparent huge page support. When we scavenge (release to the OS) only
> part of a huge page, we tell the OS not to turn that huge page frame back
> into a huge page since that would just make that memory used again.
> Unfortunately, each time we do this counts as a separate "mapping" just to
> track that one flag. These "mappings" are always at least 2MB, but you have
> a large enough virtual address space to reach the max_map_count even then.
> You can see this in /proc/PID/smaps, which should list mostly contiguous
> neighboring regions that differ only in a single "VmFlags" bit.
> >
> > We did make memory scavenging more aggressive in Go 1.12 (+Michael
> Knyszek), though I would have expected it to converge to roughly the same
> "huge page flag fragmentation" as before over the course of five to ten
> minutes. Is your application's virtual memory footprint the same between
> 1.11 and 1.12 or do that grow?
> >
> > You could try disabling the huge page flag manipulation to confirm
> and/or fix this. In $GOROOT/src/runtime/internal/sys/arch_amd64.go (or
> whichever GOARCH is appropriate), set HugePageSize to 0. Though there's a
> danger that Linux's transparent huge pages could blow up your application's
> resident set size if you do that.
> >
> > On Tue, Apr 2, 2019 at 3:49 AM Rémy Oudompheng 
> wrote:
> >>
> >> Hello,
> >>
> >> In a large heap program I am working on, I noticed a peculiar change in
> the way virtual memory is reserved by the runtime : with comparable heap
> size (about 150GB) and virtual memory size (growing to 400-500GB probably
> due to a kind of fragmentation), the number of distinct memory mappings has
> apparently increased between Go 1.11 and Go 1.12 reaching the system limit
> (Linux setting vm.max_map_count).
> >>
> >> Is it something that has been experienced by someone else ? I don't
> believe this class

Re: [go-nuts] Why will it deadlock if a goroutine acquire a mutex while pinned to its P?

2019-04-09 Thread 'Austin Clements' via golang-nuts
Acquiring a mutex while pinned can cause deadlock because pinning prevents
a stop-the-world. For example, the following sequence could result in a
deadlock:

M1: Acquires mutex l.
M2: Pins the M.
M2: Attempts to acquire mutex l.
M3: Initiates stop-the-world
M3: Stops M1
M3: Attempts to stop M2, but can't because M2 is pinned.

At this point, M1 can't make progress to release mutex l because M3 stopped
it, which means M2 won't be able to finish acquiring the mutex (so it will
never release the pin), which means M3 won't be able to finish stopping the
world (so it will never start M1 back up).

On Mon, Apr 8, 2019 at 2:31 AM Cholerae Hu  wrote:

> I'm reading this commit
> https://github.com/golang/go/commit/d5fd2dd6a17a816b7dfd99d4df70a85f1bf0de31 .
> Inside runtime_procPin we only increases the m.lock count, why will it
> cause deadlock when acquiring a mutex after pin?
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [go-nuts] Change in virtual memory patterns in Go 1.12

2019-04-02 Thread 'Austin Clements' via golang-nuts
Hi Rémy. We often fight with vm.max_map_count in the runtime, sadly. Most
likely this comes from the way the runtime interacts with Linux's
transparent huge page support. When we scavenge (release to the OS) only
part of a huge page, we tell the OS not to turn that huge page frame back
into a huge page since that would just make that memory used again.
Unfortunately, each time we do this counts as a separate "mapping" just to
track that one flag. These "mappings" are always at least 2MB, but you have
a large enough virtual address space to reach the max_map_count even then.
You can see this in /proc/PID/smaps, which should list mostly contiguous
neighboring regions that differ only in a single "VmFlags" bit.

We did make memory scavenging more aggressive in Go 1.12 (+Michael Knyszek
), though I would have expected it to converge to
roughly the same "huge page flag fragmentation" as before over the course
of five to ten minutes. Is your application's virtual memory footprint the
same between 1.11 and 1.12 or do that grow?

You could try disabling the huge page flag manipulation to confirm and/or
fix this. In $GOROOT/src/runtime/internal/sys/arch_amd64.go (or whichever
GOARCH is appropriate), set HugePageSize to 0. Though there's a danger that
Linux's transparent huge pages could blow up your application's resident
set size if you do that.

On Tue, Apr 2, 2019 at 3:49 AM Rémy Oudompheng 
wrote:

> Hello,
>
> In a large heap program I am working on, I noticed a peculiar change in
> the way virtual memory is reserved by the runtime : with comparable heap
> size (about 150GB) and virtual memory size (growing to 400-500GB probably
> due to a kind of fragmentation), the number of distinct memory mappings has
> apparently increased between Go 1.11 and Go 1.12 reaching the system limit
> (Linux setting vm.max_map_count).
>
> Is it something that has been experienced by someone else ? I don't
> believe this classifies as a bug, but I was a bit surprised (especially as
> I wasn't aware of that system limit).
>
> Rémy
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [go-nuts] runtime.GC - documentation

2016-11-30 Thread 'Austin Clements' via golang-nuts
On Tue, Nov 29, 2016 at 6:57 PM, Rick Hudson  wrote:

> That is correct.


... but not guaranteed. :)

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Bug#843127: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?

2016-11-12 Thread Austin Clements
Quoth David Bremner on Nov 04 at  1:26 pm:
> 
> Paul Wise wrote:
> 
> > Last night I got this error from my `notmuch new --quiet` cron job. The
> > file that the error message complains about is now in the cur directory
> > of the maildir at the following path.
> >
> > /path/to/mail/cur/1478190211.H80553P18378.chianamo:2,
> >
> > I wonder if this some kind of race condition in `notmuch new` processing.
> > Perhaps it should be using inotify to find out about file movements?
> >
> > Unexpected error with file 
> > /path/to/mail/new/1478190211.H80553P18378.chianamo
> > add_file: Something went wrong trying to read or write a file
> > Error opening /path/to/mail/new/1478190211.H80553P18378.chianamo: No such 
> > file or directory
> > Note: A fatal error was encountered: Something went wrong trying to read or 
> > write a file
> 
> I agree it looks like a race condition. inotify sounds a bit
> overcomplicated and perhaps non-portable? It should probably just
> tolerate disappearing files better, consider that a warning.

Inotify really *is* the solution. This is a symptom of a much bigger
problem: scandir makes no guarantees in the presence of concurrent
directory modification. If you delete or rename a file while notmuch
new is running, it may think *completely unrelated* files in the same
directory were also deleted. Even if scandir were atomic, if you move
a mail from one directory to another between notmuch scanning the
destination directory and notmuch scanning the source directory, it'll
think the mail has been deleted and potentially remove it from the DB.

The "recommended" solution is to scandir is to start an inotify watch
before the scan and redo (or update) the scan if there are any
changes. For notmuch, it would make sense to extend that to watching
all directories to make sure it can catch renames during the scan.

A possible alternative, though I haven't worked out the details, might
be to keep a close eye on the directory mtimes. Roughly, for each
directory, check the mtime before scanning, wait if necessary until
the mtime != the current time, do the scan and process the files
optimistically. Once all directories are processed, re-check all of
the mtimes and if any have changed, do something like starting over
but hopefully more intelligent.



Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?

2016-11-12 Thread Austin Clements
Quoth David Bremner on Nov 04 at  1:26 pm:
> 
> Paul Wise wrote:
> 
> > Last night I got this error from my `notmuch new --quiet` cron job. The
> > file that the error message complains about is now in the cur directory
> > of the maildir at the following path.
> >
> > /path/to/mail/cur/1478190211.H80553P18378.chianamo:2,
> >
> > I wonder if this some kind of race condition in `notmuch new` processing.
> > Perhaps it should be using inotify to find out about file movements?
> >
> > Unexpected error with file 
> > /path/to/mail/new/1478190211.H80553P18378.chianamo
> > add_file: Something went wrong trying to read or write a file
> > Error opening /path/to/mail/new/1478190211.H80553P18378.chianamo: No such 
> > file or directory
> > Note: A fatal error was encountered: Something went wrong trying to read or 
> > write a file
> 
> I agree it looks like a race condition. inotify sounds a bit
> overcomplicated and perhaps non-portable? It should probably just
> tolerate disappearing files better, consider that a warning.

Inotify really *is* the solution. This is a symptom of a much bigger
problem: scandir makes no guarantees in the presence of concurrent
directory modification. If you delete or rename a file while notmuch
new is running, it may think *completely unrelated* files in the same
directory were also deleted. Even if scandir were atomic, if you move
a mail from one directory to another between notmuch scanning the
destination directory and notmuch scanning the source directory, it'll
think the mail has been deleted and potentially remove it from the DB.

The "recommended" solution is to scandir is to start an inotify watch
before the scan and redo (or update) the scan if there are any
changes. For notmuch, it would make sense to extend that to watching
all directories to make sure it can catch renames during the scan.

A possible alternative, though I haven't worked out the details, might
be to keep a close eye on the directory mtimes. Roughly, for each
directory, check the mtime before scanning, wait if necessary until
the mtime != the current time, do the scan and process the files
optimistically. Once all directories are processed, re-check all of
the mtimes and if any have changed, do something like starting over
but hopefully more intelligent.

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[Bug 1637683] Re: ubuntu 16.10 internal speakers don't work, but headphones do

2016-10-31 Thread Austin Clements
PSA: This may be related to a broken headphone jack that is supposed to
detect whether headphones are plugged in and change output accordingly.
Work around in ubuntu right now is to run alsamixer from terminal,
select correct soundcard, then disable auto-mute, then unmute and turn
up speakers.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1637683

Title:
  ubuntu 16.10 internal speakers don't work, but headphones do

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1637683/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Desktop-packages] [Bug 1637683] Re: ubuntu 16.10 internal speakers don't work, but headphones do

2016-10-31 Thread Austin Clements
PSA: This may be related to a broken headphone jack that is supposed to
detect whether headphones are plugged in and change output accordingly.
Work around in ubuntu right now is to run alsamixer from terminal,
select correct soundcard, then disable auto-mute, then unmute and turn
up speakers.

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to alsa-driver in Ubuntu.
https://bugs.launchpad.net/bugs/1637683

Title:
  ubuntu 16.10 internal speakers don't work, but headphones do

Status in alsa-driver package in Ubuntu:
  New

Bug description:
  Under sound settings only headphone output is listed as an option, no
  internal speakers are listed. Headphone playback does work. In
  alsamixer, speaker output was at 0 and muted, but unmuting and
  increasing volume does not produce playback in speakers.

  ASUS X550LB laptop

  ProblemType: Bug
  DistroRelease: Ubuntu 16.10
  Package: alsa-base 1.0.25+dfsg-0ubuntu5
  ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
  Uname: Linux 4.8.0-26-generic x86_64
  ApportVersion: 2.20.3-0ubuntu8
  Architecture: amd64
  AudioDevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  CurrentDesktop: Unity
  Date: Fri Oct 28 20:42:56 2016
  InstallationDate: Installed on 2016-10-29 (0 days ago)
  InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
  PackageArchitecture: all
  SourcePackage: alsa-driver
  Symptom: audio
  Symptom_AlsaPlaybackTest: ALSA playback test through plughw:PCH failed
  Symptom_Card: Built-in Audio - HDA Intel PCH
  Symptom_DevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  Symptom_Jack: Speaker, Internal
  Symptom_Type: No sound at all
  Title: [X550LB, Realtek ALC3236, Speaker, Internal] No sound at all
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2014
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: X550LB.403
  dmi.board.asset.tag: ATN12345678901234567
  dmi.board.name: X550LB
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: 1.0
  dmi.chassis.asset.tag: ATN12345678901234567
  dmi.chassis.type: 10
  dmi.chassis.vendor: ASUSTeK COMPUTER INC.
  dmi.chassis.version: 1.0
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvrX550LB.403:bd06/26/2014:svnASUSTeKCOMPUTERINC.:pnX550LB:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnX550LB:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
  dmi.product.name: X550LB
  dmi.product.version: 1.0
  dmi.sys.vendor: ASUSTeK COMPUTER INC.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1637683/+subscriptions

-- 
Mailing list: https://launchpad.net/~desktop-packages
Post to : desktop-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~desktop-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 1637683] Re: ubuntu 16.10 internal speakers don't work, but headphones do

2016-10-31 Thread Austin Clements
PSA: This may be related to a broken headphone jack that is supposed to
detect whether headphones are plugged in and change output accordingly.
Work around in ubuntu right now is to run alsamixer from terminal,
select correct soundcard, then disable auto-mute, then unmute and turn
up speakers.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to alsa-driver in Ubuntu.
https://bugs.launchpad.net/bugs/1637683

Title:
  ubuntu 16.10 internal speakers don't work, but headphones do

Status in alsa-driver package in Ubuntu:
  New

Bug description:
  Under sound settings only headphone output is listed as an option, no
  internal speakers are listed. Headphone playback does work. In
  alsamixer, speaker output was at 0 and muted, but unmuting and
  increasing volume does not produce playback in speakers.

  ASUS X550LB laptop

  ProblemType: Bug
  DistroRelease: Ubuntu 16.10
  Package: alsa-base 1.0.25+dfsg-0ubuntu5
  ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
  Uname: Linux 4.8.0-26-generic x86_64
  ApportVersion: 2.20.3-0ubuntu8
  Architecture: amd64
  AudioDevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  CurrentDesktop: Unity
  Date: Fri Oct 28 20:42:56 2016
  InstallationDate: Installed on 2016-10-29 (0 days ago)
  InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
  PackageArchitecture: all
  SourcePackage: alsa-driver
  Symptom: audio
  Symptom_AlsaPlaybackTest: ALSA playback test through plughw:PCH failed
  Symptom_Card: Built-in Audio - HDA Intel PCH
  Symptom_DevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  Symptom_Jack: Speaker, Internal
  Symptom_Type: No sound at all
  Title: [X550LB, Realtek ALC3236, Speaker, Internal] No sound at all
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2014
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: X550LB.403
  dmi.board.asset.tag: ATN12345678901234567
  dmi.board.name: X550LB
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: 1.0
  dmi.chassis.asset.tag: ATN12345678901234567
  dmi.chassis.type: 10
  dmi.chassis.vendor: ASUSTeK COMPUTER INC.
  dmi.chassis.version: 1.0
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvrX550LB.403:bd06/26/2014:svnASUSTeKCOMPUTERINC.:pnX550LB:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnX550LB:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
  dmi.product.name: X550LB
  dmi.product.version: 1.0
  dmi.sys.vendor: ASUSTeK COMPUTER INC.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1637683/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Desktop-packages] [Bug 1637683] [NEW] ubuntu 10.16 internal speakers don't work, but headphones do

2016-10-28 Thread Austin Clements
Public bug reported:

Under sound settings only headphone output is listed as an option, no
internal speakers are listed. Headphone playback does work. In
alsamixer, speaker output was at 0 and muted, but unmuting and
increasing volume does not produce playback in speakers.

ASUS X550LB laptop

ProblemType: Bug
DistroRelease: Ubuntu 16.10
Package: alsa-base 1.0.25+dfsg-0ubuntu5
ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
Uname: Linux 4.8.0-26-generic x86_64
ApportVersion: 2.20.3-0ubuntu8
Architecture: amd64
AudioDevicesInUse:
 USERPID ACCESS COMMAND
 /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
 /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
CurrentDesktop: Unity
Date: Fri Oct 28 20:42:56 2016
InstallationDate: Installed on 2016-10-29 (0 days ago)
InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
PackageArchitecture: all
SourcePackage: alsa-driver
Symptom: audio
Symptom_AlsaPlaybackTest: ALSA playback test through plughw:PCH failed
Symptom_Card: Built-in Audio - HDA Intel PCH
Symptom_DevicesInUse:
 USERPID ACCESS COMMAND
 /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
 /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
Symptom_Jack: Speaker, Internal
Symptom_Type: No sound at all
Title: [X550LB, Realtek ALC3236, Speaker, Internal] No sound at all
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/26/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: X550LB.403
dmi.board.asset.tag: ATN12345678901234567
dmi.board.name: X550LB
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: 1.0
dmi.chassis.asset.tag: ATN12345678901234567
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK COMPUTER INC.
dmi.chassis.version: 1.0
dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvrX550LB.403:bd06/26/2014:svnASUSTeKCOMPUTERINC.:pnX550LB:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnX550LB:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
dmi.product.name: X550LB
dmi.product.version: 1.0
dmi.sys.vendor: ASUSTeK COMPUTER INC.

** Affects: alsa-driver (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug yakkety

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to alsa-driver in Ubuntu.
https://bugs.launchpad.net/bugs/1637683

Title:
  ubuntu 10.16 internal speakers don't work, but headphones do

Status in alsa-driver package in Ubuntu:
  New

Bug description:
  Under sound settings only headphone output is listed as an option, no
  internal speakers are listed. Headphone playback does work. In
  alsamixer, speaker output was at 0 and muted, but unmuting and
  increasing volume does not produce playback in speakers.

  ASUS X550LB laptop

  ProblemType: Bug
  DistroRelease: Ubuntu 16.10
  Package: alsa-base 1.0.25+dfsg-0ubuntu5
  ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
  Uname: Linux 4.8.0-26-generic x86_64
  ApportVersion: 2.20.3-0ubuntu8
  Architecture: amd64
  AudioDevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  CurrentDesktop: Unity
  Date: Fri Oct 28 20:42:56 2016
  InstallationDate: Installed on 2016-10-29 (0 days ago)
  InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
  PackageArchitecture: all
  SourcePackage: alsa-driver
  Symptom: audio
  Symptom_AlsaPlaybackTest: ALSA playback test through plughw:PCH failed
  Symptom_Card: Built-in Audio - HDA Intel PCH
  Symptom_DevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  Symptom_Jack: Speaker, Internal
  Symptom_Type: No sound at all
  Title: [X550LB, Realtek ALC3236, Speaker, Internal] No sound at all
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2014
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: X550LB.403
  dmi.board.asset.tag: ATN12345678901234567
  dmi.board.name: X550LB
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: 1.0
  dmi.chassis.asset.tag: ATN12345678901234567
  dmi.chassis.type: 10
  dmi.chassis.vendor: ASUSTeK COMPUTER INC.
  dmi.chassis.version: 1.0
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvrX550LB.403:bd06/26/2014:svnASUSTeKCOMPUTERINC.:pnX550LB:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnX550LB:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
  dmi.product.name: X550LB
  dmi.product.version: 1.0
  dmi.sys.vendor: ASUSTeK COMPUTER INC.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1637683/+subscriptions

-- 
Mailing list: https://launchpad.net/~desktop-packages
Post to : desktop-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~desktop-packages
More help   : https://help.launchpad.net/ListHelp


[Bug 1637683] [NEW] ubuntu 10.16 internal speakers don't work, but headphones do

2016-10-28 Thread Austin Clements
Public bug reported:

Under sound settings only headphone output is listed as an option, no
internal speakers are listed. Headphone playback does work. In
alsamixer, speaker output was at 0 and muted, but unmuting and
increasing volume does not produce playback in speakers.

ASUS X550LB laptop

ProblemType: Bug
DistroRelease: Ubuntu 16.10
Package: alsa-base 1.0.25+dfsg-0ubuntu5
ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
Uname: Linux 4.8.0-26-generic x86_64
ApportVersion: 2.20.3-0ubuntu8
Architecture: amd64
AudioDevicesInUse:
 USERPID ACCESS COMMAND
 /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
 /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
CurrentDesktop: Unity
Date: Fri Oct 28 20:42:56 2016
InstallationDate: Installed on 2016-10-29 (0 days ago)
InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
PackageArchitecture: all
SourcePackage: alsa-driver
Symptom: audio
Symptom_AlsaPlaybackTest: ALSA playback test through plughw:PCH failed
Symptom_Card: Built-in Audio - HDA Intel PCH
Symptom_DevicesInUse:
 USERPID ACCESS COMMAND
 /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
 /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
Symptom_Jack: Speaker, Internal
Symptom_Type: No sound at all
Title: [X550LB, Realtek ALC3236, Speaker, Internal] No sound at all
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/26/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: X550LB.403
dmi.board.asset.tag: ATN12345678901234567
dmi.board.name: X550LB
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: 1.0
dmi.chassis.asset.tag: ATN12345678901234567
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK COMPUTER INC.
dmi.chassis.version: 1.0
dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvrX550LB.403:bd06/26/2014:svnASUSTeKCOMPUTERINC.:pnX550LB:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnX550LB:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
dmi.product.name: X550LB
dmi.product.version: 1.0
dmi.sys.vendor: ASUSTeK COMPUTER INC.

** Affects: alsa-driver (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug yakkety

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1637683

Title:
  ubuntu 10.16 internal speakers don't work, but headphones do

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1637683/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Touch-packages] [Bug 1637683] [NEW] ubuntu 10.16 internal speakers don't work, but headphones do

2016-10-28 Thread Austin Clements
Public bug reported:

Under sound settings only headphone output is listed as an option, no
internal speakers are listed. Headphone playback does work. In
alsamixer, speaker output was at 0 and muted, but unmuting and
increasing volume does not produce playback in speakers.

ASUS X550LB laptop

ProblemType: Bug
DistroRelease: Ubuntu 16.10
Package: alsa-base 1.0.25+dfsg-0ubuntu5
ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
Uname: Linux 4.8.0-26-generic x86_64
ApportVersion: 2.20.3-0ubuntu8
Architecture: amd64
AudioDevicesInUse:
 USERPID ACCESS COMMAND
 /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
 /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
CurrentDesktop: Unity
Date: Fri Oct 28 20:42:56 2016
InstallationDate: Installed on 2016-10-29 (0 days ago)
InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
PackageArchitecture: all
SourcePackage: alsa-driver
Symptom: audio
Symptom_AlsaPlaybackTest: ALSA playback test through plughw:PCH failed
Symptom_Card: Built-in Audio - HDA Intel PCH
Symptom_DevicesInUse:
 USERPID ACCESS COMMAND
 /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
 /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
Symptom_Jack: Speaker, Internal
Symptom_Type: No sound at all
Title: [X550LB, Realtek ALC3236, Speaker, Internal] No sound at all
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/26/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: X550LB.403
dmi.board.asset.tag: ATN12345678901234567
dmi.board.name: X550LB
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: 1.0
dmi.chassis.asset.tag: ATN12345678901234567
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK COMPUTER INC.
dmi.chassis.version: 1.0
dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvrX550LB.403:bd06/26/2014:svnASUSTeKCOMPUTERINC.:pnX550LB:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnX550LB:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
dmi.product.name: X550LB
dmi.product.version: 1.0
dmi.sys.vendor: ASUSTeK COMPUTER INC.

** Affects: alsa-driver (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug yakkety

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to alsa-driver in Ubuntu.
https://bugs.launchpad.net/bugs/1637683

Title:
  ubuntu 10.16 internal speakers don't work, but headphones do

Status in alsa-driver package in Ubuntu:
  New

Bug description:
  Under sound settings only headphone output is listed as an option, no
  internal speakers are listed. Headphone playback does work. In
  alsamixer, speaker output was at 0 and muted, but unmuting and
  increasing volume does not produce playback in speakers.

  ASUS X550LB laptop

  ProblemType: Bug
  DistroRelease: Ubuntu 16.10
  Package: alsa-base 1.0.25+dfsg-0ubuntu5
  ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
  Uname: Linux 4.8.0-26-generic x86_64
  ApportVersion: 2.20.3-0ubuntu8
  Architecture: amd64
  AudioDevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  CurrentDesktop: Unity
  Date: Fri Oct 28 20:42:56 2016
  InstallationDate: Installed on 2016-10-29 (0 days ago)
  InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
  PackageArchitecture: all
  SourcePackage: alsa-driver
  Symptom: audio
  Symptom_AlsaPlaybackTest: ALSA playback test through plughw:PCH failed
  Symptom_Card: Built-in Audio - HDA Intel PCH
  Symptom_DevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  austinclem1   1647 F pulseaudio
   /dev/snd/controlC0:  austinclem1   1647 F pulseaudio
  Symptom_Jack: Speaker, Internal
  Symptom_Type: No sound at all
  Title: [X550LB, Realtek ALC3236, Speaker, Internal] No sound at all
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2014
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: X550LB.403
  dmi.board.asset.tag: ATN12345678901234567
  dmi.board.name: X550LB
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: 1.0
  dmi.chassis.asset.tag: ATN12345678901234567
  dmi.chassis.type: 10
  dmi.chassis.vendor: ASUSTeK COMPUTER INC.
  dmi.chassis.version: 1.0
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvrX550LB.403:bd06/26/2014:svnASUSTeKCOMPUTERINC.:pnX550LB:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnX550LB:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
  dmi.product.name: X550LB
  dmi.product.version: 1.0
  dmi.sys.vendor: ASUSTeK COMPUTER INC.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1637683/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread Austin Clements
Quoth Gaute Hope on Jun 06 at  8:08 pm:
> Austin Clements writes on juni 6, 2016 21:20:
> >
> >The experiment was specifically for regexp matching subject, but it should
> >work for any header we store a literal copy of in the database.
> 
> Does it work for terms in the body of the message?

No. It's not impossible that it could be made to work, but it might be
slow and unintuitive. It would have to iterate over all of the terms
in the database and see which ones match the regexp. These are
available, but I don't know how much time it takes to iterate over all
of them. It might be okay. It might not.

It could also expand to a very large query if the regexp matches many
terms, akin to how searching for "a*" can be quite expensive.

And it might not match what you expect. It could only match individual
terms, so a regexp containing any punctuation (including but not
limited to a space) simply wouldn't match anything.
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: searching: '*analysis' vs 'reanalysis'

2016-06-06 Thread Austin Clements
On Mon, Jun 6, 2016 at 1:29 PM, David Bremner  wrote:

> Sebastian Fischmeister  writes:
>
> >
> > I ran into this problem before as well. Storage is cheap. Notmuch could
> > index all emails with reversed text to get around some of this
> > problem. It doesn't solve the problem of *analysis*, but it's still an
> > improvement.
>
> It would probably be more useful to have brute force regexp searches on
> headers.  Austin did some experiments that sounded promising, where you
> basically postprocess the result of a xapian query with a regexp. OTOH,
> I don't know what kept him from proposing this for mainline. If it was
> just parser issues, those are probably more or less solved now, at least
> for people using xapian 1.3+
>

The experiment was specifically for regexp matching subject, but it should
work for any header we store a literal copy of in the database. The code is
here, though in its current form it builds on my custom query parser:
https://github.com/aclements/notmuch/commit/ce41b29aba4d9b84e2f1eb6ed8df67065196c960.
Based on my understanding of Xapian 1.3+ field processors, these days it
should be quite easy to hook the PostingSource in that commit into the
Xapian QueryProcessor.
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH v4 7/7] complete ghost-on-removal-when-shared-thread-exists

2016-04-19 Thread Austin Clements
On Mon, 11 Apr 2016, Daniel Kahn Gillmor  wrote:
> On Sun 2016-04-10 20:33:02 -0400, David Bremner  wrote:
>> Daniel Kahn Gillmor  writes:
>>
>>> To fully complete the ghost-on-removal-when-shared-thread-exists
>>> proposal, we need to clear all ghost messages when the last active
>>> message is removed from a thread.
>>
>> For me this patch breaks T530 as follows
>>
>> T530-upgrade: Testing database upgrade
>>  FAIL   ghost message retains thread ID
>>  --- T530-upgrade.13.expected2016-04-11 00:25:19.482196274 +
>>  +++ T530-upgrade.13.output  2016-04-11 00:25:19.482196274 +
>>  @@ -1 +1 @@
>>  -thread:001d
>>  +thread:0011
>> No new mail.
>> No new mail. Removed 1 message.
>>
>> I pushed my rebased version of the patches to
>>
>>   
>> https://git.notmuchmail.org/git?p=notmuch;a=shortlog;h=refs/heads/thread-fix
>>
>> in case the problem is some mistake on my part.
>
> After having done "make download-test-databases", I can confirm that
> this is happening for me as well: it's not an issue with bremner's
> config.
>
> however, i think this particular test is wrong, and should probably be
> removed.
>
> For review, here's the final test:
>
> --
> # Ghost messages are difficult to test since they're nearly invisible.
> # However, if the upgrade works correctly, the ghost message should
> # retain the right thread ID even if all of the original messages in
> # the thread are deleted.  That's what we test.  This won't detect if
> # the upgrade just plain didn't happen, but it should detect if
> # something went wrong.
> test_begin_subtest "ghost message retains thread ID"
> # Upgrade database
> notmuch new
> # Get thread ID of real message
> thread=$(notmuch search --output=threads id:4efc743a.3060...@april.org)
> # Delete all real messages in that thread
> rm $(notmuch search --output=files $thread)
> notmuch new
> # "Deliver" ghost message
> add_message '[subject]=Ghost' '[id]=4efc3931.6030...@april.org'
> # If the ghost upgrade worked, the new message should be attached to
> # the existing thread ID.
> nthread=$(notmuch search --output=threads id:4efc3931.6030...@april.org)
> test_expect_equal "$thread" "$nthread"
> -
>
> I don't think this reasoning is sensible.  If the entire thread is
> deleted, and a new message comes in, it should *not* get the same mesage
> ID.  ghosts should only exist in the database when other messages point
> to them.
>
> So i'd be fine with killing this entire last test, unless someone can
> propose a good reason to keep it.

I agree that it's fine to remove this test. From what I recall, it
wasn't intended to test that ghost messages retained their thread IDs;
this was just the implementation property it exploited to test that
ghosts were working.

I haven't looked through this series, but it would be nice if there was
still some tests that ghosts were working across upgrade.
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 2/6] lib: Add per-message last modification tracking

2015-08-08 Thread Austin Clements
Hi David. I haven't had a chance to look back at the original code, but
your follow-up expanded comment agrees with how I remember this code
working.
On Aug 7, 2015 4:41 PM, "David Bremner"  wrote:

> Daniel Schoepe  writes:
>
>
> > On Fri, 05 Jun 2015 19:28 +0200, David Bremner wrote:
> >> +/* Prior to NOTMUCH_FEATURE_LAST_MOD, messages did not
> >> + * track modification revisions.  Give all messages a
> >> + * revision of 1.
> >> + */
> >> +if (new_features & NOTMUCH_FEATURE_LAST_MOD)
> >> +_notmuch_message_upgrade_last_mod (message);
> >> [..]
> >> +/* Upgrade a message to support NOTMUCH_FEATURE_LAST_MOD.  The caller
> >> + * must call _notmuch_message_sync. */
> >> +void
> >> +_notmuch_message_upgrade_last_mod (notmuch_message_t *message)
> >> +{
> >> +/* _notmuch_message_sync will update the last modification
> >> + * revision; we just have to ask it to. */
> >> +message->modified = TRUE;
> >> +}
> >> +
> >
> > The comment in the first part says that message without LAST_MOD get a
> > revision of 1, but as far as I can tell, _notmuch_message_sync will
> > actually put the next revision number on the message. If this is what's
> > happening, either the comment or the behavior should be changed,
> > depending on what's intended.
>
> I think the behaviour is OK, but you're correct the comment is
> wrong. I'll see if Austin has any comment on that. I guess it would be
> harmless to initialize upgraded messages to some low revision number,
> but I don't see the benefit.
>
-- next part --
An HTML attachment was scrubbed...
URL: 



Re: [PATCH 2/6] lib: Add per-message last modification tracking

2015-08-08 Thread Austin Clements
Hi David. I haven't had a chance to look back at the original code, but
your follow-up expanded comment agrees with how I remember this code
working.
On Aug 7, 2015 4:41 PM, David Bremner da...@tethera.net wrote:

 Daniel Schoepe dan...@schoepe.org writes:


  On Fri, 05 Jun 2015 19:28 +0200, David Bremner wrote:
  +/* Prior to NOTMUCH_FEATURE_LAST_MOD, messages did not
  + * track modification revisions.  Give all messages a
  + * revision of 1.
  + */
  +if (new_features  NOTMUCH_FEATURE_LAST_MOD)
  +_notmuch_message_upgrade_last_mod (message);
  [..]
  +/* Upgrade a message to support NOTMUCH_FEATURE_LAST_MOD.  The caller
  + * must call _notmuch_message_sync. */
  +void
  +_notmuch_message_upgrade_last_mod (notmuch_message_t *message)
  +{
  +/* _notmuch_message_sync will update the last modification
  + * revision; we just have to ask it to. */
  +message-modified = TRUE;
  +}
  +
 
  The comment in the first part says that message without LAST_MOD get a
  revision of 1, but as far as I can tell, _notmuch_message_sync will
  actually put the next revision number on the message. If this is what's
  happening, either the comment or the behavior should be changed,
  depending on what's intended.

 I think the behaviour is OK, but you're correct the comment is
 wrong. I'll see if Austin has any comment on that. I guess it would be
 harmless to initialize upgraded messages to some low revision number,
 but I don't see the benefit.

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Understanding the "replied" tag

2015-06-13 Thread Austin Clements
Hi Xu. I may be misunderstanding your email, but it sounds like you want to 
know if a message has *any* reply message. That's not what the replied tag 
indicates. The replied tag indicates that *you* have sent a reply to a message. 
Mechanically, when you hit, say, r to start a reply and then send that message, 
notmuch tags the message you hit r on as "replied". That's the only time 
notmuch automatically sets this tag.

On June 11, 2015 10:25:44 AM PDT, Xu Wang  wrote:
>Dear all,
>
>First, I am extremely excited to be a part of this list now. notmuch
>has really helped me. Thank you go all individuals working to improve
>it and to help others to know how to use it.
>
>I would really like to know if a message has been replied to (e.g.
>using a certain message id). It seems that all I need to do is check
>for the "replied" tag. But often this tag is not there, even when
>there has been a reply (I have confirmed this through the thread
>display and checking the message that replied to the message to make
>sure it indeed has header "replied-to:").
>
>I have looked in mutt, and also I see many situations where there is
>no 'r' flag, especially for emails sent from me.
>
>The following returns true:
>notmuch config get maildir.synchronize_flags
>
>So at least both are giving same answer, but I'm not sure why not
>saying "replied" thread is correct.
>
>What is incorrect for my way of thinking about "replied"? Do I
>misunderstand what it is supposed to do or am I not updating the flags
>correctly?
>
>Kind regards,
>
>Xu
>___
>notmuch mailing list
>notmuch at notmuchmail.org
>http://notmuchmail.org/mailman/listinfo/notmuch



Re: Understanding the replied tag

2015-06-13 Thread Austin Clements
Hi Xu. I may be misunderstanding your email, but it sounds like you want to 
know if a message has *any* reply message. That's not what the replied tag 
indicates. The replied tag indicates that *you* have sent a reply to a message. 
Mechanically, when you hit, say, r to start a reply and then send that message, 
notmuch tags the message you hit r on as replied. That's the only time 
notmuch automatically sets this tag.

On June 11, 2015 10:25:44 AM PDT, Xu Wang xuwang...@gmail.com wrote:
Dear all,

First, I am extremely excited to be a part of this list now. notmuch
has really helped me. Thank you go all individuals working to improve
it and to help others to know how to use it.

I would really like to know if a message has been replied to (e.g.
using a certain message id). It seems that all I need to do is check
for the replied tag. But often this tag is not there, even when
there has been a reply (I have confirmed this through the thread
display and checking the message that replied to the message to make
sure it indeed has header replied-to:MSG-ID).

I have looked in mutt, and also I see many situations where there is
no 'r' flag, especially for emails sent from me.

The following returns true:
notmuch config get maildir.synchronize_flags

So at least both are giving same answer, but I'm not sure why not
saying replied thread is correct.

What is incorrect for my way of thinking about replied? Do I
misunderstand what it is supposed to do or am I not updating the flags
correctly?

Kind regards,

Xu
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


notmuch_thread_get_authors

2015-04-21 Thread Austin Clements
Quoth Ronny Chevalier on Apr 22 at  4:01 am:
> On Wed, Apr 22, 2015 at 3:28 AM, Austin Clements
>  wrote:
> > On Tue, 21 Apr 2015, Ronny Chevalier  wrote:
> >> On Tue, Apr 21, 2015 at 1:35 AM, David Bremner  
> >> wrote:
> >>> Ronny Chevalier  writes:
> >> Austin Clements wrote:
> >>> And I think there's a fairly easy way to do it in C code that will
> >>> also prevent library interface bloat: instead of introducing new
> >>> library APIs to get at this information, just use the existing
> >>> notmuch_thread_get_messages API and construct the matched and
> >>> non-matched lists in the CLI.  Doing it in the CLI wouldn't require
> >>> the library to export yet another string list structure, which is
> >>> always a huge pain (thanks C!), and wouldn't introduce more "helper"
> >>> functions into the library API.
> >>
> >> I disagree with what Austin said. Because this does not solve the
> >> issue at all (or I misunderstood). The issue is with the notmuch API,
> >> if someone is using this library there no way it can parse properly
> >> the authors.
> >> In my case I am not using the CLI but the notmuch library, fixing this
> >> in the CLI is just an hack, and it does not fix the issue for the
> >> library users.
> >
> > My suggestion was in no way specific to the CLI. That was the context of
> > the discussion at the time, but for the purposes of this discussion, the
> > CLI is just another library user.
> 
> Ok, sorry for misunderstanding.
> 
> >
> > You're completely right that there's no way to reliably parse the
> > authors list returned by notmuch_thread_get_authors. So don't do
> > that. Just use notmuch_thread_get_messages, walk the messages list, and
> > build your own authors list. There's no need to introduce additional
> > complexity and surface area into the library API for this specific use
> > case (IMO, even notmuch_thread_get_authors shouldn't exist, but it's
> > there for legacy reasons.) Then you can get author lists for matched,
> > non-matched, matching a specific tag, just the to, just the from, counts
> > of how many times each author appeared, whatever you want.
> >
> 
> Ok thanks!
> 
> If I read the code correctly, _notmuch_thread_create in lib/thread.cc
> process every message to get information like tags, subject and
> authors. Since notmuch_thread_get_authors is here for legacy reasons,
> would it be better to generate the list of authors only when requested
> with notmuch_thread_get_authors (and cache the result of course)?
> Because, new code will not use this and will do this work manually,
> the generation of the list in intern consumes resources for nothing.

It might be worth making this lazy. I'd be surprised if this has
noticeable CPU or memory cost in the grand scheme of putting together
a thread, but I don't have any numbers to back this up.


notmuch_thread_get_authors

2015-04-21 Thread Austin Clements
On Tue, 21 Apr 2015, Ronny Chevalier  wrote:
> On Tue, Apr 21, 2015 at 1:35 AM, David Bremner  wrote:
>> Ronny Chevalier  writes:
> Austin Clements wrote:
>> And I think there's a fairly easy way to do it in C code that will
>> also prevent library interface bloat: instead of introducing new
>> library APIs to get at this information, just use the existing
>> notmuch_thread_get_messages API and construct the matched and
>> non-matched lists in the CLI.  Doing it in the CLI wouldn't require
>> the library to export yet another string list structure, which is
>> always a huge pain (thanks C!), and wouldn't introduce more "helper"
>> functions into the library API.
>
> I disagree with what Austin said. Because this does not solve the
> issue at all (or I misunderstood). The issue is with the notmuch API,
> if someone is using this library there no way it can parse properly
> the authors.
> In my case I am not using the CLI but the notmuch library, fixing this
> in the CLI is just an hack, and it does not fix the issue for the
> library users.

My suggestion was in no way specific to the CLI. That was the context of
the discussion at the time, but for the purposes of this discussion, the
CLI is just another library user.

You're completely right that there's no way to reliably parse the
authors list returned by notmuch_thread_get_authors. So don't do
that. Just use notmuch_thread_get_messages, walk the messages list, and
build your own authors list. There's no need to introduce additional
complexity and surface area into the library API for this specific use
case (IMO, even notmuch_thread_get_authors shouldn't exist, but it's
there for legacy reasons.) Then you can get author lists for matched,
non-matched, matching a specific tag, just the to, just the from, counts
of how many times each author appeared, whatever you want.

> Furthermore, I do not see why providing a string list NULL-terminated
> in C is a huge pain?

See the notmuch_tags_* and notmuch_filesnames_* APIs. Those are just
string lists.

> Otherwise, I agree with Mark Walters comments on the patch.
>
> If no one is working to fix this at the moment, I can send a patch?
>
> Ronny


Re: notmuch_thread_get_authors

2015-04-21 Thread Austin Clements
Quoth Ronny Chevalier on Apr 22 at  4:01 am:
 On Wed, Apr 22, 2015 at 3:28 AM, Austin Clements
 acleme...@csail.mit.edu wrote:
  On Tue, 21 Apr 2015, Ronny Chevalier chevalier.ro...@gmail.com wrote:
  On Tue, Apr 21, 2015 at 1:35 AM, David Bremner da...@tethera.net wrote:
  Ronny Chevalier chevalier.ro...@gmail.com writes:
  Austin Clements wrote:
  And I think there's a fairly easy way to do it in C code that will
  also prevent library interface bloat: instead of introducing new
  library APIs to get at this information, just use the existing
  notmuch_thread_get_messages API and construct the matched and
  non-matched lists in the CLI.  Doing it in the CLI wouldn't require
  the library to export yet another string list structure, which is
  always a huge pain (thanks C!), and wouldn't introduce more helper
  functions into the library API.
 
  I disagree with what Austin said. Because this does not solve the
  issue at all (or I misunderstood). The issue is with the notmuch API,
  if someone is using this library there no way it can parse properly
  the authors.
  In my case I am not using the CLI but the notmuch library, fixing this
  in the CLI is just an hack, and it does not fix the issue for the
  library users.
 
  My suggestion was in no way specific to the CLI. That was the context of
  the discussion at the time, but for the purposes of this discussion, the
  CLI is just another library user.
 
 Ok, sorry for misunderstanding.
 
 
  You're completely right that there's no way to reliably parse the
  authors list returned by notmuch_thread_get_authors. So don't do
  that. Just use notmuch_thread_get_messages, walk the messages list, and
  build your own authors list. There's no need to introduce additional
  complexity and surface area into the library API for this specific use
  case (IMO, even notmuch_thread_get_authors shouldn't exist, but it's
  there for legacy reasons.) Then you can get author lists for matched,
  non-matched, matching a specific tag, just the to, just the from, counts
  of how many times each author appeared, whatever you want.
 
 
 Ok thanks!
 
 If I read the code correctly, _notmuch_thread_create in lib/thread.cc
 process every message to get information like tags, subject and
 authors. Since notmuch_thread_get_authors is here for legacy reasons,
 would it be better to generate the list of authors only when requested
 with notmuch_thread_get_authors (and cache the result of course)?
 Because, new code will not use this and will do this work manually,
 the generation of the list in intern consumes resources for nothing.

It might be worth making this lazy. I'd be surprised if this has
noticeable CPU or memory cost in the grand scheme of putting together
a thread, but I don't have any numbers to back this up.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: notmuch_thread_get_authors

2015-04-21 Thread Austin Clements
On Tue, 21 Apr 2015, Ronny Chevalier chevalier.ro...@gmail.com wrote:
 On Tue, Apr 21, 2015 at 1:35 AM, David Bremner da...@tethera.net wrote:
 Ronny Chevalier chevalier.ro...@gmail.com writes:
 Austin Clements wrote:
 And I think there's a fairly easy way to do it in C code that will
 also prevent library interface bloat: instead of introducing new
 library APIs to get at this information, just use the existing
 notmuch_thread_get_messages API and construct the matched and
 non-matched lists in the CLI.  Doing it in the CLI wouldn't require
 the library to export yet another string list structure, which is
 always a huge pain (thanks C!), and wouldn't introduce more helper
 functions into the library API.

 I disagree with what Austin said. Because this does not solve the
 issue at all (or I misunderstood). The issue is with the notmuch API,
 if someone is using this library there no way it can parse properly
 the authors.
 In my case I am not using the CLI but the notmuch library, fixing this
 in the CLI is just an hack, and it does not fix the issue for the
 library users.

My suggestion was in no way specific to the CLI. That was the context of
the discussion at the time, but for the purposes of this discussion, the
CLI is just another library user.

You're completely right that there's no way to reliably parse the
authors list returned by notmuch_thread_get_authors. So don't do
that. Just use notmuch_thread_get_messages, walk the messages list, and
build your own authors list. There's no need to introduce additional
complexity and surface area into the library API for this specific use
case (IMO, even notmuch_thread_get_authors shouldn't exist, but it's
there for legacy reasons.) Then you can get author lists for matched,
non-matched, matching a specific tag, just the to, just the from, counts
of how many times each author appeared, whatever you want.

 Furthermore, I do not see why providing a string list NULL-terminated
 in C is a huge pain?

See the notmuch_tags_* and notmuch_filesnames_* APIs. Those are just
string lists.

 Otherwise, I agree with Mark Walters comments on the patch.

 If no one is working to fix this at the moment, I can send a patch?

 Ronny
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Bug#762950: Bug most likely in solid

2015-03-28 Thread Austin Clements
Quoth Scott Kitterman on Mar 28 at  5:36 pm:
 On Sunday, March 15, 2015 12:43:51 PM Scott Kitterman wrote:
  I've poked around in core/libs/database/collectionmanager.cpp and it appears
  that the digikam code tries to do the right thing and the most likely issue
  is something about how solid handles these cases, so reassigning.
 
 For the cases in question for this bug, please try running:
 
 $ solid-hardware list

Here's the output for me:

$ solid-hardware list
udi = '/org/kde/solid/udev/sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input4'
udi = 
'/org/kde/solid/udev/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0D:00/input/input2'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:16.3/tty/ttyS0'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:19.0/net/eth0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1a.0/usb1/1-1/1-1.6/1-1.6:1.0/video4linux/video0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/audio'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/dsp'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/hwC0D0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/pcmC0D0c'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/pcmC0D0p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/hwC0D3'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D3p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D7p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D8p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/mixer'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/controlC0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1c.1/:03:00.0/net/wlan0'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS1'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS2'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS3'
udi = '/org/kde/solid/udev/sys/devices/platform/thinkpad_acpi/sound/card29'
udi = 
'/org/kde/solid/udev/sys/devices/platform/thinkpad_acpi/sound/card29/controlC29'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu0'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu1'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu2'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu3'
udi = '/org/kde/solid/udev/sys/devices/virtual/net/docker0'
udi = '/org/kde/solid/udev/sys/devices/virtual/net/lo'
udi = '/org/kde/solid/udev/sys/devices/virtual/sound/timer'
udi = '/org/freedesktop/UDisks2/block_devices/loop7'
udi = '/org/freedesktop/UDisks2/block_devices/loop6'
udi = '/org/freedesktop/UDisks2/block_devices/loop5'
udi = '/org/freedesktop/UDisks2/block_devices/loop4'
udi = '/org/freedesktop/UDisks2/block_devices/loop3'
udi = '/org/freedesktop/UDisks2/block_devices/loop2'
udi = '/org/freedesktop/UDisks2/block_devices/loop1'
udi = '/org/freedesktop/UDisks2/block_devices/loop0'
udi = '/org/freedesktop/UDisks2/block_devices/sda5'
udi = '/org/freedesktop/UDisks2/block_devices/sda3'
udi = '/org/freedesktop/UDisks2/block_devices/sda2'
udi = '/org/freedesktop/UDisks2/block_devices/sda1'
udi = '/org/freedesktop/UDisks2/block_devices/sda'
udi = '/org/freedesktop/UDisks2/block_devices/sda6'
udi = '/org/freedesktop/UDisks2/drives/INTEL_SSDSA2BW160G3L_BTPR217001GM160DGN'
udi = '/org/freedesktop/UPower'
udi = '/org/freedesktop/UPower/devices/line_power_AC'
udi = '/org/freedesktop/UPower/devices/battery_BAT0'
udi = '/org/kde/fstab'

This is the same regardless of whether or not Docker is running.

 Then find the UID of the mount point in question and run:
 
 $ solid-hardware details '$UID'

My digikam album lives on sda6. Here's the output *without* Docker
running

$ sudo /etc/init.d/docker stop
$ solid-hardware details /org/freedesktop/UDisks2/block_devices/sda6
udi = '/org/freedesktop/UDisks2/block_devices/sda6'
  parent = 
'/org/freedesktop/UDisks2/drives/INTEL_SSDSA2BW160G3L_BTPR217001GM160DGN'  
(string)
  vendor = ''  (string)
  product = 'INTEL SSDSA2BW160G3L'  (string)
  description = '116.9 GiB Hard Drive'  (string)
  Block.major = 8  (0x8)  (int)
  Block.minor = 6  (0x6)  (int)
  Block.device = '/dev/sda6'  (string)
  StorageAccess.accessible = true  (bool)
  StorageAccess.filePath = '/'  (string)
  StorageAccess.ignored = false  (bool)
  StorageVolume.ignored = false  (bool)
  StorageVolume.usage = 'FileSystem'  (0x2)  (enum)
  StorageVolume.fsType = 'ext4'  (string)
  StorageVolume.label = ''  (string)
  StorageVolume.uuid = '86efd6d8-2751-47e1-a6d7-2c7bf04f974b'  (string)
  

Bug#762950: Bug most likely in solid

2015-03-28 Thread Austin Clements
Quoth Scott Kitterman on Mar 28 at  5:36 pm:
 On Sunday, March 15, 2015 12:43:51 PM Scott Kitterman wrote:
  I've poked around in core/libs/database/collectionmanager.cpp and it appears
  that the digikam code tries to do the right thing and the most likely issue
  is something about how solid handles these cases, so reassigning.
 
 For the cases in question for this bug, please try running:
 
 $ solid-hardware list

Here's the output for me:

$ solid-hardware list
udi = '/org/kde/solid/udev/sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input4'
udi = 
'/org/kde/solid/udev/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0D:00/input/input2'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:16.3/tty/ttyS0'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:19.0/net/eth0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1a.0/usb1/1-1/1-1.6/1-1.6:1.0/video4linux/video0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/audio'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/dsp'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/hwC0D0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/pcmC0D0c'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/pcmC0D0p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/hwC0D3'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D3p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D7p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D8p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/mixer'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/controlC0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1c.1/:03:00.0/net/wlan0'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS1'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS2'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS3'
udi = '/org/kde/solid/udev/sys/devices/platform/thinkpad_acpi/sound/card29'
udi = 
'/org/kde/solid/udev/sys/devices/platform/thinkpad_acpi/sound/card29/controlC29'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu0'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu1'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu2'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu3'
udi = '/org/kde/solid/udev/sys/devices/virtual/net/docker0'
udi = '/org/kde/solid/udev/sys/devices/virtual/net/lo'
udi = '/org/kde/solid/udev/sys/devices/virtual/sound/timer'
udi = '/org/freedesktop/UDisks2/block_devices/loop7'
udi = '/org/freedesktop/UDisks2/block_devices/loop6'
udi = '/org/freedesktop/UDisks2/block_devices/loop5'
udi = '/org/freedesktop/UDisks2/block_devices/loop4'
udi = '/org/freedesktop/UDisks2/block_devices/loop3'
udi = '/org/freedesktop/UDisks2/block_devices/loop2'
udi = '/org/freedesktop/UDisks2/block_devices/loop1'
udi = '/org/freedesktop/UDisks2/block_devices/loop0'
udi = '/org/freedesktop/UDisks2/block_devices/sda5'
udi = '/org/freedesktop/UDisks2/block_devices/sda3'
udi = '/org/freedesktop/UDisks2/block_devices/sda2'
udi = '/org/freedesktop/UDisks2/block_devices/sda1'
udi = '/org/freedesktop/UDisks2/block_devices/sda'
udi = '/org/freedesktop/UDisks2/block_devices/sda6'
udi = '/org/freedesktop/UDisks2/drives/INTEL_SSDSA2BW160G3L_BTPR217001GM160DGN'
udi = '/org/freedesktop/UPower'
udi = '/org/freedesktop/UPower/devices/line_power_AC'
udi = '/org/freedesktop/UPower/devices/battery_BAT0'
udi = '/org/kde/fstab'

This is the same regardless of whether or not Docker is running.

 Then find the UID of the mount point in question and run:
 
 $ solid-hardware details '$UID'

My digikam album lives on sda6. Here's the output *without* Docker
running

$ sudo /etc/init.d/docker stop
$ solid-hardware details /org/freedesktop/UDisks2/block_devices/sda6
udi = '/org/freedesktop/UDisks2/block_devices/sda6'
  parent = 
'/org/freedesktop/UDisks2/drives/INTEL_SSDSA2BW160G3L_BTPR217001GM160DGN'  
(string)
  vendor = ''  (string)
  product = 'INTEL SSDSA2BW160G3L'  (string)
  description = '116.9 GiB Hard Drive'  (string)
  Block.major = 8  (0x8)  (int)
  Block.minor = 6  (0x6)  (int)
  Block.device = '/dev/sda6'  (string)
  StorageAccess.accessible = true  (bool)
  StorageAccess.filePath = '/'  (string)
  StorageAccess.ignored = false  (bool)
  StorageVolume.ignored = false  (bool)
  StorageVolume.usage = 'FileSystem'  (0x2)  (enum)
  StorageVolume.fsType = 'ext4'  (string)
  StorageVolume.label = ''  (string)
  StorageVolume.uuid = '86efd6d8-2751-47e1-a6d7-2c7bf04f974b'  (string)
  

Bug#762950: Bug most likely in solid

2015-03-28 Thread Austin Clements
Quoth Scott Kitterman on Mar 28 at  5:36 pm:
 On Sunday, March 15, 2015 12:43:51 PM Scott Kitterman wrote:
  I've poked around in core/libs/database/collectionmanager.cpp and it appears
  that the digikam code tries to do the right thing and the most likely issue
  is something about how solid handles these cases, so reassigning.
 
 For the cases in question for this bug, please try running:
 
 $ solid-hardware list

Here's the output for me:

$ solid-hardware list
udi = '/org/kde/solid/udev/sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input4'
udi = 
'/org/kde/solid/udev/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0D:00/input/input2'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:16.3/tty/ttyS0'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:19.0/net/eth0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1a.0/usb1/1-1/1-1.6/1-1.6:1.0/video4linux/video0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/audio'
udi = '/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/dsp'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/hwC0D0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/pcmC0D0c'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D0/pcmC0D0p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/hwC0D3'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D3p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D7p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/hdaudioC0D3/pcmC0D8p'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/mixer'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1b.0/sound/card0/controlC0'
udi = 
'/org/kde/solid/udev/sys/devices/pci:00/:00:1c.1/:03:00.0/net/wlan0'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS1'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS2'
udi = '/org/kde/solid/udev/sys/devices/platform/serial8250/tty/ttyS3'
udi = '/org/kde/solid/udev/sys/devices/platform/thinkpad_acpi/sound/card29'
udi = 
'/org/kde/solid/udev/sys/devices/platform/thinkpad_acpi/sound/card29/controlC29'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu0'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu1'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu2'
udi = '/org/kde/solid/udev/sys/devices/system/cpu/cpu3'
udi = '/org/kde/solid/udev/sys/devices/virtual/net/docker0'
udi = '/org/kde/solid/udev/sys/devices/virtual/net/lo'
udi = '/org/kde/solid/udev/sys/devices/virtual/sound/timer'
udi = '/org/freedesktop/UDisks2/block_devices/loop7'
udi = '/org/freedesktop/UDisks2/block_devices/loop6'
udi = '/org/freedesktop/UDisks2/block_devices/loop5'
udi = '/org/freedesktop/UDisks2/block_devices/loop4'
udi = '/org/freedesktop/UDisks2/block_devices/loop3'
udi = '/org/freedesktop/UDisks2/block_devices/loop2'
udi = '/org/freedesktop/UDisks2/block_devices/loop1'
udi = '/org/freedesktop/UDisks2/block_devices/loop0'
udi = '/org/freedesktop/UDisks2/block_devices/sda5'
udi = '/org/freedesktop/UDisks2/block_devices/sda3'
udi = '/org/freedesktop/UDisks2/block_devices/sda2'
udi = '/org/freedesktop/UDisks2/block_devices/sda1'
udi = '/org/freedesktop/UDisks2/block_devices/sda'
udi = '/org/freedesktop/UDisks2/block_devices/sda6'
udi = '/org/freedesktop/UDisks2/drives/INTEL_SSDSA2BW160G3L_BTPR217001GM160DGN'
udi = '/org/freedesktop/UPower'
udi = '/org/freedesktop/UPower/devices/line_power_AC'
udi = '/org/freedesktop/UPower/devices/battery_BAT0'
udi = '/org/kde/fstab'

This is the same regardless of whether or not Docker is running.

 Then find the UID of the mount point in question and run:
 
 $ solid-hardware details '$UID'

My digikam album lives on sda6. Here's the output *without* Docker
running

$ sudo /etc/init.d/docker stop
$ solid-hardware details /org/freedesktop/UDisks2/block_devices/sda6
udi = '/org/freedesktop/UDisks2/block_devices/sda6'
  parent = 
'/org/freedesktop/UDisks2/drives/INTEL_SSDSA2BW160G3L_BTPR217001GM160DGN'  
(string)
  vendor = ''  (string)
  product = 'INTEL SSDSA2BW160G3L'  (string)
  description = '116.9 GiB Hard Drive'  (string)
  Block.major = 8  (0x8)  (int)
  Block.minor = 6  (0x6)  (int)
  Block.device = '/dev/sda6'  (string)
  StorageAccess.accessible = true  (bool)
  StorageAccess.filePath = '/'  (string)
  StorageAccess.ignored = false  (bool)
  StorageVolume.ignored = false  (bool)
  StorageVolume.usage = 'FileSystem'  (0x2)  (enum)
  StorageVolume.fsType = 'ext4'  (string)
  StorageVolume.label = ''  (string)
  StorageVolume.uuid = '86efd6d8-2751-47e1-a6d7-2c7bf04f974b'  (string)
  

[PATCH v2 8/8] emacs: Support cid: references with shr renderer

2015-01-24 Thread Austin Clements
shr has really nice support for inline image rendering, but previously
we only had the hooks for w3m cid: references.
---
 emacs/notmuch-show.el | 45 +
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index 34dcedd..66350d4 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -767,14 +767,43 @@ (defun 
notmuch-show-get-mime-type-of-application/octet-stream (part)
  nil

 (defun notmuch-show-insert-part-text/html (msg part content-type nth depth 
button)
-  ;; text/html handler to work around bugs in renderers and our
-  ;; invisibile parts code. In particular w3m sets up a keymap which
-  ;; "leaks" outside the invisible region and causes strange effects
-  ;; in notmuch. We set mm-inline-text-html-with-w3m-keymap to nil to
-  ;; tell w3m not to set a keymap (so the normal notmuch-show-mode-map
-  ;; remains).
-  (let ((mm-inline-text-html-with-w3m-keymap nil))
-(notmuch-show-insert-part-*/* msg part content-type nth depth button)))
+  (if (eq mm-text-html-renderer 'shr)
+  ;; It's easier to drive shr ourselves than to work around the
+  ;; goofy things `mm-shr' does (like irreversibly taking over
+  ;; content ID handling).
+  (notmuch-show--insert-part-text/html-shr msg part)
+;; Otherwise, let message-mode do the heavy lifting
+;;
+;; w3m sets up a keymap which "leaks" outside the invisible region
+;; and causes strange effects in notmuch. We set
+;; mm-inline-text-html-with-w3m-keymap to nil to tell w3m not to
+;; set a keymap (so the normal notmuch-show-mode-map remains).
+(let ((mm-inline-text-html-with-w3m-keymap nil))
+  (notmuch-show-insert-part-*/* msg part content-type nth depth button
+
+;; These functions are used by notmuch-show--insert-part-text/html-shr
+(declare-function libxml-parse-html-region "xml.c")
+(declare-function shr-insert-document "shr")
+
+(defun notmuch-show--insert-part-text/html-shr (msg part)
+  ;; Make sure shr is loaded before we start let-binding its globals
+  (require 'shr)
+  (let ((dom (let ((process-crypto notmuch-show-process-crypto))
+  (with-temp-buffer
+(insert (notmuch-get-bodypart-text msg part process-crypto))
+(libxml-parse-html-region (point-min) (point-max)
+   (shr-content-function
+(lambda (url)
+  ;; shr strips the "cid:" part of URL, but doesn't
+  ;; URL-decode it (see RFC 2392).
+  (let ((cid (url-unhex-string url)))
+(first (notmuch-show--get-cid-content cid)
+   ;; Block all external images to prevent privacy leaks and
+   ;; potential attacks.  FIXME: If we block an image, offer a
+   ;; button to load external images.
+   (shr-blocked-images "."))
+(shr-insert-document dom)
+t))

 (defun notmuch-show-insert-part-*/* (msg part content-type nth depth button)
   ;; This handler _must_ succeed - it is the handler of last resort.
-- 
2.1.3



[PATCH v2 7/8] emacs: Rewrite content ID handling

2015-01-24 Thread Austin Clements
Besides generally cleaning up the code and separating the general
content ID handling from the w3m-specific code, this fixes several
problems.

Foremost is that, previously, the code roughly assumed that referenced
parts would be in the same multipart/related as the reference.
According to RFC 2392, nothing could be further from the truth:
content IDs are supposed to be globally unique and globally
addressable.  This is nonsense, but this patch at least fixes things
so content IDs can be anywhere in the same message.

As a side-effect of the above, this handles multipart/alternate
content-IDs more in line with RFC 2046 section 5.1.2 (not that I've
ever seen this in the wild).  This also properly URL-decodes cid:
URLs, as per RFC 2392 (the previous code did not), and applies crypto
settings from the show buffer (the previous code used the global
crypto settings).
---
 emacs/notmuch-show.el | 120 +++---
 1 file changed, 74 insertions(+), 46 deletions(-)

diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index 11eac5f..34dcedd 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -525,6 +525,73 @@ (defun notmuch-show-toggle-part-invisibility ( 
button)
  (overlay-put overlay 'invisible (not show))
  t)

+;; Part content ID handling
+
+(defvar notmuch-show--cids nil
+  "Alist from raw content ID to (MSG PART).")
+(make-variable-buffer-local 'notmuch-show--cids)
+
+(defun notmuch-show--register-cids (msg part)
+  "Register content-IDs in PART and all of PART's sub-parts."
+  (let ((content-id (plist-get part :content-id)))
+(when content-id
+  ;; Note that content-IDs are globally unique, except when they
+  ;; aren't: RFC 2046 section 5.1.4 permits children of a
+  ;; multipart/alternative to have the same content-ID, in which
+  ;; case the MUA is supposed to pick the best one it can render.
+  ;; We simply add the content-ID to the beginning of our alist;
+  ;; so if this happens, we'll take the last (and "best")
+  ;; alternative (even if we can't render it).
+  (push (list content-id msg part) notmuch-show--cids)))
+  ;; Recurse on sub-parts
+  (let ((ctype (notmuch-split-content-type
+   (downcase (plist-get part :content-type)
+(cond ((equal (first ctype) "multipart")
+  (mapc (apply-partially #'notmuch-show--register-cids msg)
+(plist-get part :content)))
+ ((equal ctype '("message" "rfc822"))
+  (notmuch-show--register-cids
+   msg
+   (first (plist-get (first (plist-get part :content)) :body)))
+
+(defun notmuch-show--get-cid-content (cid)
+  "Return a list (CID-content content-type) or nil.
+
+This will only find parts from messages that have been inserted
+into the current buffer.  CID must be a raw content ID, without
+enclosing angle brackets, a cid: prefix, or URL encoding.  This
+will return nil if the CID is unknown or cannot be retrieved."
+  (let ((descriptor (cdr (assoc cid notmuch-show--cids
+(when descriptor
+  (let* ((msg (first descriptor))
+(part (second descriptor))
+;; Request caching for this content, as some messages
+;; reference the same cid: part many times (hundreds!).
+(content (notmuch-get-bodypart-binary
+  msg part notmuch-show-process-crypto 'cache))
+(content-type (plist-get part :content-type)))
+   (list content content-type)
+
+(defun notmuch-show-setup-w3m ()
+  "Instruct w3m how to retrieve content from a \"related\" part of a message."
+  (interactive)
+  (if (boundp 'w3m-cid-retrieve-function-alist)
+(unless (assq 'notmuch-show-mode w3m-cid-retrieve-function-alist)
+  (push (cons 'notmuch-show-mode #'notmuch-show--cid-w3m-retrieve)
+   w3m-cid-retrieve-function-alist)))
+  (setq mm-inline-text-html-with-images t))
+
+(defvar w3m-current-buffer) ;; From `w3m.el'.
+(defun notmuch-show--cid-w3m-retrieve (url  args)
+  ;; url includes the cid: prefix and is URL encoded (see RFC 2392).
+  (let* ((cid (url-unhex-string (substring url 4)))
+(content-and-type
+ (with-current-buffer w3m-current-buffer
+   (notmuch-show--get-cid-content cid
+(when content-and-type
+  (insert (first content-and-type))
+  (second content-and-type
+
 ;; MIME part renderers

 (defun notmuch-show-multipart/*-to-list (part)
@@ -549,56 +616,11 @@ (defun notmuch-show-insert-part-multipart/alternative 
(msg part content-type nth
   (indent-rigidly start (point) 1)))
   t)

-(defun notmuch-show-setup-w3m ()
-  "Instruct w3m how to retrieve content from a \"related\" part of a message."
-  (interactive)
-  (if (boundp 'w3m-cid-retrieve-function-alist)
-(unless (assq 'notmuch-show-mode w3m-cid-retrieve-function-alist)
-  (push (cons 'notmuch-show-mode 'notmuch-show-w3m-cid-retrieve)
-   w3m-cid-retrieve-function-alist)))
-  (setq 

[PATCH v2 6/8] emacs: Use generalized content caching in w3m CID code

2015-01-24 Thread Austin Clements
Previously this did its own caching, but this is now supported by more
generally by `notmuch-get-bodypart-binary'.
---
 emacs/notmuch-show.el | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index f29428a..11eac5f 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -562,15 +562,14 @@ (defvar w3m-current-buffer) ;; From `w3m.el'.
 (defvar notmuch-show-w3m-cid-store nil)
 (make-variable-buffer-local 'notmuch-show-w3m-cid-store)

-(defun notmuch-show-w3m-cid-store-internal (content-id msg part content)
-  (push (list content-id msg part content)
-   notmuch-show-w3m-cid-store))
+(defun notmuch-show-w3m-cid-store-internal (content-id msg part)
+  (push (list content-id msg part) notmuch-show-w3m-cid-store))

 (defun notmuch-show-w3m-cid-store (msg part)
   (let ((content-id (plist-get part :content-id)))
 (when content-id
   (notmuch-show-w3m-cid-store-internal (concat "cid:" content-id)
-  msg part nil
+  msg part

 (defun notmuch-show-w3m-cid-retrieve (url  args)
   (let ((matching-part (with-current-buffer w3m-current-buffer
@@ -578,18 +577,12 @@ (defun notmuch-show-w3m-cid-retrieve (url  args)
 (if matching-part
(let* ((msg (nth 1 matching-part))
   (part (nth 2 matching-part))
-  (content (nth 3 matching-part))
   (content-type (plist-get part :content-type)))
- ;; If we don't already have the content, get it and cache
- ;; it, as some messages reference the same cid: part many
- ;; times (hundreds!), which results in many calls to
- ;; `notmuch part'.
- (unless content
-   (setq content (notmuch-get-bodypart-binary
-  msg part notmuch-show-process-crypto))
-   (with-current-buffer w3m-current-buffer
- (notmuch-show-w3m-cid-store-internal url msg part content)))
- (insert content)
+ ;; Request content caching, as some messages reference the
+ ;; same cid: part many times (hundreds!), which results in
+ ;; many calls to `notmuch show'.
+ (insert (notmuch-get-bodypart-binary
+  msg part notmuch-show-process-crypto 'cache))
  content-type)
   nil)))

-- 
2.1.3



[PATCH v2 5/8] emacs: Support caching in notmuch-get-bodypart-{binary, text}

2015-01-24 Thread Austin Clements
(The actual code change here is small, but requires re-indenting
existing code.)
---
 emacs/notmuch-lib.el | 64 
 1 file changed, 40 insertions(+), 24 deletions(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index 3154725..f8e5165 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -529,39 +529,53 @@ (defun notmuch-parts-filter-by-type (parts type)
(lambda (part) (notmuch-match-content-type (plist-get part :content-type) 
type))
parts))

-(defun notmuch-get-bodypart-binary (msg part process-crypto)
+(defun notmuch-get-bodypart-binary (msg part process-crypto  cache)
   "Return the unprocessed content of PART in MSG as a unibyte string.

 This returns the \"raw\" content of the given part after content
 transfer decoding, but with no further processing (see the
 discussion of --format=raw in man notmuch-show).  In particular,
-this does no charset conversion."
-  (let ((args `("show" "--format=raw"
-   ,(format "--part=%d" (plist-get part :id))
-   ,@(when process-crypto '("--decrypt"))
-   ,(notmuch-id-to-query (plist-get msg :id)
-(with-temp-buffer
-  ;; Emacs internally uses a UTF-8-like multibyte string
-  ;; representation by default (regardless of the coding system,
-  ;; which only affects how it goes from outside data to this
-  ;; internal representation).  This *almost* never matters.
-  ;; Annoyingly, it does matter if we use this data in an image
-  ;; descriptor, since Emacs will use its internal data buffer
-  ;; directly and this multibyte representation corrupts binary
-  ;; image formats.  Since the caller is asking for binary data, a
-  ;; unibyte string is a more appropriate representation anyway.
-  (set-buffer-multibyte nil)
-  (let ((coding-system-for-read 'no-conversion))
-   (apply #'call-process notmuch-command nil '(t nil) nil args)
-   (buffer-string)
-
-(defun notmuch-get-bodypart-text (msg part process-crypto)
+this does no charset conversion.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already)."
+  (let ((data (plist-get part :binary-content)))
+(when (not data)
+  (let ((args `("show" "--format=raw"
+   ,(format "--part=%d" (plist-get part :id))
+   ,@(when process-crypto '("--decrypt"))
+   ,(notmuch-id-to-query (plist-get msg :id)
+   (with-temp-buffer
+ ;; Emacs internally uses a UTF-8-like multibyte string
+ ;; representation by default (regardless of the coding
+ ;; system, which only affects how it goes from outside data
+ ;; to this internal representation).  This *almost* never
+ ;; matters.  Annoyingly, it does matter if we use this data
+ ;; in an image descriptor, since Emacs will use its internal
+ ;; data buffer directly and this multibyte representation
+ ;; corrupts binary image formats.  Since the caller is
+ ;; asking for binary data, a unibyte string is a more
+ ;; appropriate representation anyway.
+ (set-buffer-multibyte nil)
+ (let ((coding-system-for-read 'no-conversion))
+   (apply #'call-process notmuch-command nil '(t nil) nil args)
+   (setq data (buffer-string)
+  (when cache
+   ;; Cheat.  part is non-nil, and `plist-put' always modifies
+   ;; the list in place if it's non-nil.
+   (plist-put part :binary-content data)))
+data))
+
+(defun notmuch-get-bodypart-text (msg part process-crypto  cache)
   "Return the text content of PART in MSG.

 This returns the content of the given part as a multibyte Lisp
 string after performing content transfer decoding and any
 necessary charset decoding.  It is an error to use this for
-non-text/* parts."
+non-text/* parts.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already)."
   (let ((content (plist-get part :content)))
 (when (not content)
   ;; Use show --format=sexp to fetch decoded content
@@ -572,7 +586,9 @@ (defun notmuch-get-bodypart-text (msg part process-crypto)
 (npart (apply #'notmuch-call-notmuch-sexp args)))
(setq content (plist-get npart :content))
(when (not content)
- (error "Internal error: No :content from %S" args
+ (error "Internal error: No :content from %S" args)))
+  (when cache
+   (plist-put part :content content)))
 content))

 ;; Workaround: The call to `mm-display-part' below triggers a bug in
-- 
2.1.3



[PATCH v2 4/8] emacs: Return unibyte strings for binary part data

2015-01-24 Thread Austin Clements
Unibyte strings are meant for representing binary data.  In practice,
using unibyte versus multibyte strings affects *almost* nothing.  It
does happen to matter if we use the binary data in an image descriptor
(which is, helpfully, not documented anywhere and getting it wrong
results in opaque errors like "Not a PNG image: ").
---
 emacs/notmuch-lib.el | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index a0a95d8..3154725 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -530,7 +530,7 @@ (defun notmuch-parts-filter-by-type (parts type)
parts))

 (defun notmuch-get-bodypart-binary (msg part process-crypto)
-  "Return the unprocessed content of PART in MSG.
+  "Return the unprocessed content of PART in MSG as a unibyte string.

 This returns the \"raw\" content of the given part after content
 transfer decoding, but with no further processing (see the
@@ -541,6 +541,16 @@ (defun notmuch-get-bodypart-binary (msg part 
process-crypto)
,@(when process-crypto '("--decrypt"))
,(notmuch-id-to-query (plist-get msg :id)
 (with-temp-buffer
+  ;; Emacs internally uses a UTF-8-like multibyte string
+  ;; representation by default (regardless of the coding system,
+  ;; which only affects how it goes from outside data to this
+  ;; internal representation).  This *almost* never matters.
+  ;; Annoyingly, it does matter if we use this data in an image
+  ;; descriptor, since Emacs will use its internal data buffer
+  ;; directly and this multibyte representation corrupts binary
+  ;; image formats.  Since the caller is asking for binary data, a
+  ;; unibyte string is a more appropriate representation anyway.
+  (set-buffer-multibyte nil)
   (let ((coding-system-for-read 'no-conversion))
(apply #'call-process notmuch-command nil '(t nil) nil args)
(buffer-string)
-- 
2.1.3



[PATCH v2 2/8] emacs: Create an API for fetching parts as undecoded binary

2015-01-24 Thread Austin Clements
The new function, `notmuch-get-bodypart-binary', replaces
`notmuch-get-bodypart-internal'.  Whereas the old function was really
meant for internal use in `notmuch-get-bodypart-content', it was used
in a few other places.  Since the difference between
`notmuch-get-bodypart-content' and `notmuch-get-bodypart-internal' was
unclear, these other uses were always confusing and potentially
inconsistent.  The new call clearly requests the part as undecoded
binary.

This is step 1 of 2 in separating `notmuch-get-bodypart-content' into
two APIs for retrieving either undecoded binary or decoded text.
---
 emacs/notmuch-lib.el  | 28 ++--
 emacs/notmuch-show.el | 22 +-
 2 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index fd25f7c..d4b6684 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -529,25 +529,25 @@ (defun notmuch-parts-filter-by-type (parts type)
(lambda (part) (notmuch-match-content-type (plist-get part :content-type) 
type))
parts))

-;; Helper for parts which are generally not included in the default
-;; SEXP output.
-(defun notmuch-get-bodypart-internal (query part-number process-crypto)
-  (let ((args '("show" "--format=raw"))
-   (part-arg (format "--part=%s" part-number)))
-(setq args (append args (list part-arg)))
-(if process-crypto
-   (setq args (append args '("--decrypt"
-(setq args (append args (list query)))
+(defun notmuch-get-bodypart-binary (msg part process-crypto)
+  "Return the unprocessed content of PART in MSG.
+
+This returns the \"raw\" content of the given part after content
+transfer decoding, but with no further processing (see the
+discussion of --format=raw in man notmuch-show).  In particular,
+this does no charset conversion."
+  (let ((args `("show" "--format=raw"
+   ,(format "--part=%d" (plist-get part :id))
+   ,@(when process-crypto '("--decrypt"))
+   ,(notmuch-id-to-query (plist-get msg :id)
 (with-temp-buffer
   (let ((coding-system-for-read 'no-conversion))
-   (progn
- (apply 'call-process (append (list notmuch-command nil (list t nil) 
nil) args))
- (buffer-string))
+   (apply #'call-process notmuch-command nil '(t nil) nil args)
+   (buffer-string)

 (defun notmuch-get-bodypart-content (msg part process-crypto)
   (or (plist-get part :content)
-  (notmuch-get-bodypart-internal (notmuch-id-to-query (plist-get msg :id))
-(plist-get part :id) process-crypto)))
+  (notmuch-get-bodypart-binary msg part process-crypto)))

 ;; Workaround: The call to `mm-display-part' below triggers a bug in
 ;; Emacs 24 if it attempts to use the shr renderer to display an HTML
diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index df2389e..b3e339e 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -579,16 +579,14 @@ (defun notmuch-show-w3m-cid-retrieve (url  args)
(let* ((msg (nth 1 matching-part))
   (part (nth 2 matching-part))
   (content (nth 3 matching-part))
-  (message-id (plist-get msg :id))
-  (part-number (plist-get part :id))
   (content-type (plist-get part :content-type)))
  ;; If we don't already have the content, get it and cache
  ;; it, as some messages reference the same cid: part many
  ;; times (hundreds!), which results in many calls to
  ;; `notmuch part'.
  (unless content
-   (setq content (notmuch-get-bodypart-internal (notmuch-id-to-query 
message-id)
- part-number 
notmuch-show-process-crypto))
+   (setq content (notmuch-get-bodypart-binary
+  msg part notmuch-show-process-crypto))
(with-current-buffer w3m-current-buffer
  (notmuch-show-w3m-cid-store-internal url msg part content)))
  (insert content)
@@ -2162,15 +2160,14 @@ (defun notmuch-show-stash-git-send-email ( 
no-in-reply-to)

 ;; Interactive part functions and their helpers

-(defun notmuch-show-generate-part-buffer (message-id nth)
+(defun notmuch-show-generate-part-buffer (msg part)
   "Return a temporary buffer containing the specified part's content."
   (let ((buf (generate-new-buffer " *notmuch-part*"))
(process-crypto notmuch-show-process-crypto))
 (with-current-buffer buf
-  (setq notmuch-show-process-crypto process-crypto)
-  ;; Always acquires the part via `notmuch part', even if it is
-  ;; available in the SEXP output.
-  (insert (notmuch-get-bodypart-internal message-id nth 
notmuch-show-process-crypto)))
+  ;; This is always used in the content of mm handles, which
+  ;; expect undecoded, binary part content.
+  (insert (notmuch-get-bodypart-binary msg part process-crypto)))
 buf))

 (defun 

[PATCH v2 1/8] emacs: Track full message and part descriptor in w3m CID store

2015-01-24 Thread Austin Clements
This will simplify later changes.
---
 emacs/notmuch-show.el | 33 ++---
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index 87b4881..df2389e 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -562,35 +562,26 @@ (defvar w3m-current-buffer) ;; From `w3m.el'.
 (defvar notmuch-show-w3m-cid-store nil)
 (make-variable-buffer-local 'notmuch-show-w3m-cid-store)

-(defun notmuch-show-w3m-cid-store-internal (content-id
-   message-id
-   part-number
-   content-type
-   content)
-  (push (list content-id
- message-id
- part-number
- content-type
- content)
+(defun notmuch-show-w3m-cid-store-internal (content-id msg part content)
+  (push (list content-id msg part content)
notmuch-show-w3m-cid-store))

 (defun notmuch-show-w3m-cid-store (msg part)
   (let ((content-id (plist-get part :content-id)))
 (when content-id
   (notmuch-show-w3m-cid-store-internal (concat "cid:" content-id)
-  (plist-get msg :id)
-  (plist-get part :id)
-  (plist-get part :content-type)
-  nil
+  msg part nil

 (defun notmuch-show-w3m-cid-retrieve (url  args)
   (let ((matching-part (with-current-buffer w3m-current-buffer
 (assoc url notmuch-show-w3m-cid-store
 (if matching-part
-   (let ((message-id (nth 1 matching-part))
- (part-number (nth 2 matching-part))
- (content-type (nth 3 matching-part))
- (content (nth 4 matching-part)))
+   (let* ((msg (nth 1 matching-part))
+  (part (nth 2 matching-part))
+  (content (nth 3 matching-part))
+  (message-id (plist-get msg :id))
+  (part-number (plist-get part :id))
+  (content-type (plist-get part :content-type)))
  ;; If we don't already have the content, get it and cache
  ;; it, as some messages reference the same cid: part many
  ;; times (hundreds!), which results in many calls to
@@ -599,11 +590,7 @@ (defun notmuch-show-w3m-cid-retrieve (url  args)
(setq content (notmuch-get-bodypart-internal (notmuch-id-to-query 
message-id)
  part-number 
notmuch-show-process-crypto))
(with-current-buffer w3m-current-buffer
- (notmuch-show-w3m-cid-store-internal url
-  message-id
-  part-number
-  content-type
-  content)))
+ (notmuch-show-w3m-cid-store-internal url msg part content)))
  (insert content)
  content-type)
   nil)))
-- 
2.1.3



[PATCH v2 0/8] Improve charset and cid: handling

2015-01-24 Thread Austin Clements
This is v2 of id:1398105468-14317-1-git-send-email-amdragon at mit.edu.
This improves some comments/documentation, fixes a bug that caused
cryptographic processing to not happen on HTML parts, and addresses
some byte compiler warnings on Emacs 23.  This version has also been
rebased against the several months of changes that happened on master
since v1 (which, remarkably, was trivial).

The diff from v1 is below.

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index 83cbf2f..f8e5165 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -535,7 +535,10 @@ (defun notmuch-get-bodypart-binary (msg part 
process-crypto  cache)
 This returns the \"raw\" content of the given part after content
 transfer decoding, but with no further processing (see the
 discussion of --format=raw in man notmuch-show).  In particular,
-this does no charset conversion."
+this does no charset conversion.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already)."
   (let ((data (plist-get part :binary-content)))
 (when (not data)
   (let ((args `("show" "--format=raw"
@@ -558,6 +561,8 @@ (defun notmuch-get-bodypart-binary (msg part process-crypto 
 cache)
(apply #'call-process notmuch-command nil '(t nil) nil args)
(setq data (buffer-string)
   (when cache
+   ;; Cheat.  part is non-nil, and `plist-put' always modifies
+   ;; the list in place if it's non-nil.
(plist-put part :binary-content data)))
 data))

@@ -567,7 +572,10 @@ (defun notmuch-get-bodypart-text (msg part process-crypto 
 cache)
 This returns the content of the given part as a multibyte Lisp
 string after performing content transfer decoding and any
 necessary charset decoding.  It is an error to use this for
-non-text/* parts."
+non-text/* parts.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already)."
   (let ((content (plist-get part :content)))
 (when (not content)
   ;; Use show --format=sexp to fetch decoded content
diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index d190711..66350d4 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -781,10 +781,14 @@ (defun notmuch-show-insert-part-text/html (msg part 
content-type nth depth butto
 (let ((mm-inline-text-html-with-w3m-keymap nil))
   (notmuch-show-insert-part-*/* msg part content-type nth depth button

+;; These functions are used by notmuch-show--insert-part-text/html-shr
+(declare-function libxml-parse-html-region "xml.c")
+(declare-function shr-insert-document "shr")
+
 (defun notmuch-show--insert-part-text/html-shr (msg part)
   ;; Make sure shr is loaded before we start let-binding its globals
   (require 'shr)
-  (let ((dom (let (process-crypto notmuch-show-process-crypto)
+  (let ((dom (let ((process-crypto notmuch-show-process-crypto))
   (with-temp-buffer
 (insert (notmuch-get-bodypart-text msg part process-crypto))
 (libxml-parse-html-region (point-min) (point-max)




[PATCH 08/11] emacs: Support caching in notmuch-get-bodypart-{binary, text}

2015-01-24 Thread Austin Clements
On Fri, 25 Apr 2014, Mark Walters  wrote:
> On Thu, 24 Apr 2014, Austin Clements  wrote:
>> Quoth Mark Walters on Apr 24 at 11:46 am:
>>> 
>>> On Mon, 21 Apr 2014, Austin Clements  wrote:
>>> > (The actual code change here is small, but requires re-indenting
>>> > existing code.)
>>> > ---
>>> >  emacs/notmuch-lib.el | 52 
>>> > ++--
>>> >  1 file changed, 30 insertions(+), 22 deletions(-)
>>> >
>>> > diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
>>> > index fc67b14..fee8512 100644
>>> > --- a/emacs/notmuch-lib.el
>>> > +++ b/emacs/notmuch-lib.el
>>> > @@ -503,33 +503,39 @@ (defun notmuch-parts-filter-by-type (parts type)
>>> > (lambda (part) (notmuch-match-content-type (plist-get part 
>>> > :content-type) type))
>>> > parts))
>>> >  
>>> > -(defun notmuch-get-bodypart-binary (msg part process-crypto)
>>> > +(defun notmuch-get-bodypart-binary (msg part process-crypto  
>>> > cache)
>>> >"Return the unprocessed content of PART in MSG as a unibyte string.
>>> >  
>>> >  This returns the \"raw\" content of the given part after content
>>> >  transfer decoding, but with no further processing (see the
>>> >  discussion of --format=raw in man notmuch-show).  In particular,
>>> >  this does no charset conversion."
>>> > -  (let ((args `("show" "--format=raw"
>>> > - ,(format "--part=%d" (plist-get part :id))
>>> > - ,@(when process-crypto '("--decrypt"))
>>> > - ,(notmuch-id-to-query (plist-get msg :id)
>>> > -(with-temp-buffer
>>> > -  ;; Emacs internally uses a UTF-8-like multibyte string
>>> > -  ;; representation by default (regardless of the coding system,
>>> > -  ;; which only affects how it goes from outside data to this
>>> > -  ;; internal representation).  This *almost* never matters.
>>> > -  ;; Annoyingly, it does matter if we use this data in an image
>>> > -  ;; descriptor, since Emacs will use its internal data buffer
>>> > -  ;; directly and this multibyte representation corrupts binary
>>> > -  ;; image formats.  Since the caller is asking for binary data, a
>>> > -  ;; unibyte string is a more appropriate representation anyway.
>>> > -  (set-buffer-multibyte nil)
>>> > -  (let ((coding-system-for-read 'no-conversion))
>>> > - (apply #'call-process notmuch-command nil '(t nil) nil args)
>>> > - (buffer-string)
>>> > -
>>> > -(defun notmuch-get-bodypart-text (msg part process-crypto)
>>> > +  (let ((data (plist-get part :binary-content)))
>>> > +(when (not data)
>>> > +  (let ((args `("show" "--format=raw"
>>> > + ,(format "--part=%d" (plist-get part :id))
>>> > + ,@(when process-crypto '("--decrypt"))
>>> > + ,(notmuch-id-to-query (plist-get msg :id)
>>> > + (with-temp-buffer
>>> > +   ;; Emacs internally uses a UTF-8-like multibyte string
>>> > +   ;; representation by default (regardless of the coding
>>> > +   ;; system, which only affects how it goes from outside data
>>> > +   ;; to this internal representation).  This *almost* never
>>> > +   ;; matters.  Annoyingly, it does matter if we use this data
>>> > +   ;; in an image descriptor, since Emacs will use its internal
>>> > +   ;; data buffer directly and this multibyte representation
>>> > +   ;; corrupts binary image formats.  Since the caller is
>>> > +   ;; asking for binary data, a unibyte string is a more
>>> > +   ;; appropriate representation anyway.
>>> > +   (set-buffer-multibyte nil)
>>> > +   (let ((coding-system-for-read 'no-conversion))
>>> > + (apply #'call-process notmuch-command nil '(t nil) nil args)
>>> > + (setq data (buffer-string)
>>> > +  (when cache
>>> > + (plist-put part :binary-content data)))
>>> > +data))
>>> 
>>> I am a little puzzled by this but that could be lack of familiarity with
>>> elisp. As far as I can see plist-put will sometimes modify the original
>>> plist and sometimes return a new plist. If the latt

[PATCH 00/11] Improve charset and cid: handling

2015-01-24 Thread Austin Clements
I added declare-functions for both of these, which should take care of
the Emacs 23 warnings and be more robust on Emacs 24.  We can't reach
the function that calls these unless shr is actually available, but the
byte compiler doesn't know that.

On Sat, 26 Apr 2014, Mark Walters  wrote:
> Aside from the minor comments I mentioned in previous emails and one
> more comment below this looks good. 
>
> The extra comment is that on emacs23 I get the following when compiling:
>
> In end of data:
> notmuch-show.el:2188:1:Warning: the following functions are not known to be
> defined: libxml-parse-html-region, shr-insert-document
>
> Finally, I have not really tested it as I mainly use emacs23
>
> Best wishes
>
> Mark
>
>
>
>
> On Mon, 21 Apr 2014, Austin Clements  wrote:
>> I set out to quickly add support for cid: links in the shr renderer
>> and wound up making our charset handling more robust and rewriting our
>> content-ID handling.  The test introduced in patch 2 passes in all but
>> one really obscure case, but only because of many unwritten and
>> potentially fragile assumptions that Emacs and the CLI make about each
>> other.
>>
>> The first three patches could reasonably go in to 0.18.  The rest of
>> this series is certainly post-0.18, but I didn't want to lose track of
>> it.
>>
>> This series comes in three stages.  Each depends on the earlier ones,
>> but each prefix makes sense on its own and could be pushed without the
>> later stages.
>>
>> Patch 1 is a simple clean up patch.
>>
>> Patches 2 through 7 robust-ify our charset handling in Emacs, mostly
>> by splitting the broken `notmuch-get-bodypart-content' API into
>> `notmuch-get-bodypart-binary' and `notmuch-get-bodypart-text' so a
>> caller can explicitly convey their requirements.
>>
>> The remaining patches improve our content-ID handling and add support
>> for cid: links for shr.
>>
>> ___
>> notmuch mailing list
>> notmuch at notmuchmail.org
>> http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH 11/11] emacs: Support cid: references with shr renderer

2015-01-24 Thread Austin Clements
On Thu, 01 May 2014, David Edmondson  wrote:
> On Mon, Apr 21 2014, Austin Clements wrote:
>> +(defun notmuch-show--insert-part-text/html-shr (msg part)
>> +  ;; Make sure shr is loaded before we start let-binding its globals
>> +  (require 'shr)
>> +  (let ((dom (let (process-crypto notmuch-show-process-crypto)
>
> Missing brackets? (You let-bind both `process-crypto' and
> `notmuch-show-process-crypto' as nil.)

'Doh.  Thanks.


[PATCH 06/11] emacs: Remove broken `notmuch-get-bodypart-content' API

2015-01-24 Thread Austin Clements
On Fri, 11 Jul 2014, David Bremner  wrote:
> Austin Clements  writes:
>
>> +This returns the content of the given part as a multibyte Lisp
>
> What does "multibyte" mean here? utf8? current encoding?

Elisp has two kinds of stings: "unibyte strings" and "multibyte
strings".

  
https://www.gnu.org/software/emacs/manual/html_node/elisp/Non_002dASCII-in-Strings.html

You can think of unibyte strings as binary data; they're just vectors of
bytes without any particular encoding semantics (though when you use a
unibyte string you can endow it with encoding).  Multibyte strings,
however, are text; they're vectors of Unicode code points.

>> +string after performing content transfer decoding and any
>> +necessary charset decoding.  It is an error to use this for
>> +non-text/* parts."
>> +  (let ((content (plist-get part :content)))
>> +(when (not content)
>> +  ;; Use show --format=sexp to fetch decoded content
>> +  (let* ((args `("show" "--format=sexp" "--include-html"
>> + ,(format "--part=%s" (plist-get part :id))
>> + ,@(when process-crypto '("--decrypt"))
>> + ,(notmuch-id-to-query (plist-get msg :id
>> + (npart (apply #'notmuch-call-notmuch-sexp args)))
>> +(setq content (plist-get npart :content))
>> +(when (not content)
>> +  (error "Internal error: No :content from %S" args
>> +content))
>
> I'm a bit curious at the lack of setting "coding-system-for-read" here.
> Are we assuming the user has their environment set up correctly? Not so
> much a criticism as being nervous about everything coding-system
> related.

That is interesting.  coding-system-for-read should really go in
notmuch-call-notmuch-sexp, but I worry that, while *almost* all strings
the CLI outputs are UTF-8, not quite all of them are.  For example, we
output filenames exactly at the OS reports the bytes to us (which is
necessary, in a sense, because POSIX enforces no particular encoding on
file names, but still really unfortunate).

We could set coding-system-for-read, but a full solution needs more
cooperation from the CLI.  Possibly the right answer, at least for the
sexp format, is to do our own UTF-8 to "\u" escapes for strings that
are known to be UTF-8 and leave the raw bytes for the few that aren't.
Then we would set the coding-system-for-read to 'no-conversion and I
think everything would Just Work.

That doesn't help for JSON, which is supposed to be all UTF-8 all the
time.  I can think of solutions there, but they're all ugly and involve
things like encoding filenames as base64 when they aren't valid UTF-8.

So...  I don't think I'm going to do anything about this at this moment.

> I didn't see anything else to object to in this patch or the previous
> one.


[PATCH 04/11] emacs: Track full message and part descriptor in w3m CID store

2015-01-24 Thread Austin Clements
On Thu, 10 Jul 2014, David Bremner  wrote:
> Austin Clements  writes:
>
>> This will simplify later changes.
>
> I'd have preferred the whitespace changes as a seperate patch, but OK.

Not sure which whitespace changes you're referring to.  Everything in
this diff is actual (minor) code changes.


Re: [PATCH 08/11] emacs: Support caching in notmuch-get-bodypart-{binary, text}

2015-01-24 Thread Austin Clements
On Fri, 25 Apr 2014, Mark Walters markwalters1...@gmail.com wrote:
 On Thu, 24 Apr 2014, Austin Clements amdra...@mit.edu wrote:
 Quoth Mark Walters on Apr 24 at 11:46 am:
 
 On Mon, 21 Apr 2014, Austin Clements amdra...@mit.edu wrote:
  (The actual code change here is small, but requires re-indenting
  existing code.)
  ---
   emacs/notmuch-lib.el | 52 
  ++--
   1 file changed, 30 insertions(+), 22 deletions(-)
 
  diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
  index fc67b14..fee8512 100644
  --- a/emacs/notmuch-lib.el
  +++ b/emacs/notmuch-lib.el
  @@ -503,33 +503,39 @@ (defun notmuch-parts-filter-by-type (parts type)
  (lambda (part) (notmuch-match-content-type (plist-get part 
  :content-type) type))
  parts))
   
  -(defun notmuch-get-bodypart-binary (msg part process-crypto)
  +(defun notmuch-get-bodypart-binary (msg part process-crypto optional 
  cache)
 Return the unprocessed content of PART in MSG as a unibyte string.
   
   This returns the \raw\ content of the given part after content
   transfer decoding, but with no further processing (see the
   discussion of --format=raw in man notmuch-show).  In particular,
   this does no charset conversion.
  -  (let ((args `(show --format=raw
  - ,(format --part=%d (plist-get part :id))
  - ,@(when process-crypto '(--decrypt))
  - ,(notmuch-id-to-query (plist-get msg :id)
  -(with-temp-buffer
  -  ;; Emacs internally uses a UTF-8-like multibyte string
  -  ;; representation by default (regardless of the coding system,
  -  ;; which only affects how it goes from outside data to this
  -  ;; internal representation).  This *almost* never matters.
  -  ;; Annoyingly, it does matter if we use this data in an image
  -  ;; descriptor, since Emacs will use its internal data buffer
  -  ;; directly and this multibyte representation corrupts binary
  -  ;; image formats.  Since the caller is asking for binary data, a
  -  ;; unibyte string is a more appropriate representation anyway.
  -  (set-buffer-multibyte nil)
  -  (let ((coding-system-for-read 'no-conversion))
  - (apply #'call-process notmuch-command nil '(t nil) nil args)
  - (buffer-string)
  -
  -(defun notmuch-get-bodypart-text (msg part process-crypto)
  +  (let ((data (plist-get part :binary-content)))
  +(when (not data)
  +  (let ((args `(show --format=raw
  + ,(format --part=%d (plist-get part :id))
  + ,@(when process-crypto '(--decrypt))
  + ,(notmuch-id-to-query (plist-get msg :id)
  + (with-temp-buffer
  +   ;; Emacs internally uses a UTF-8-like multibyte string
  +   ;; representation by default (regardless of the coding
  +   ;; system, which only affects how it goes from outside data
  +   ;; to this internal representation).  This *almost* never
  +   ;; matters.  Annoyingly, it does matter if we use this data
  +   ;; in an image descriptor, since Emacs will use its internal
  +   ;; data buffer directly and this multibyte representation
  +   ;; corrupts binary image formats.  Since the caller is
  +   ;; asking for binary data, a unibyte string is a more
  +   ;; appropriate representation anyway.
  +   (set-buffer-multibyte nil)
  +   (let ((coding-system-for-read 'no-conversion))
  + (apply #'call-process notmuch-command nil '(t nil) nil args)
  + (setq data (buffer-string)
  +  (when cache
  + (plist-put part :binary-content data)))
  +data))
 
 I am a little puzzled by this but that could be lack of familiarity with
 elisp. As far as I can see plist-put will sometimes modify the original
 plist and sometimes return a new plist. If the latter happens then I
 think it works out as if we hadn't cached anything as the part passed to
 the function is unmodified. That might not matter in this case (though I
 find the lack of determinism disturbing).
 
 Or is something else going on?

 No, your familiarity with Elisp serves you well.  I'm completely
 cheating here.  According to the specification of plist-put, it's
 allowed to return a new list but in reality this only happens when the
 original plist is nil.  We lean on this already all over the
 notmuch-emacs code, but maybe that doesn't excuse me adding one more
 cheat.

 I could add a comment here explaining what's going on, I could
 manually do the list insertion in a way that's guaranteed to mutate it
 in place, or I could add a nil :binary-content property when parts are
 created (since plist-put is guaranteed to mutate existing keys in
 place).

 I think a comment is fine. 

Done.

 (Incidentally what is the best way of telling if emacs has changed an
 object or returned a new one for other commands? Something like (setq
 oldobject object) (setq object (operation-on object)) (if (eq object
 oldobject) ... ))

If `eq' returns t, it definitely returned the original object, though it
may or may not have modified

Re: [PATCH 04/11] emacs: Track full message and part descriptor in w3m CID store

2015-01-24 Thread Austin Clements
On Thu, 10 Jul 2014, David Bremner da...@tethera.net wrote:
 Austin Clements amdra...@mit.edu writes:

 This will simplify later changes.

 I'd have preferred the whitespace changes as a seperate patch, but OK.

Not sure which whitespace changes you're referring to.  Everything in
this diff is actual (minor) code changes.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 11/11] emacs: Support cid: references with shr renderer

2015-01-24 Thread Austin Clements
On Thu, 01 May 2014, David Edmondson d...@dme.org wrote:
 On Mon, Apr 21 2014, Austin Clements wrote:
 +(defun notmuch-show--insert-part-text/html-shr (msg part)
 +  ;; Make sure shr is loaded before we start let-binding its globals
 +  (require 'shr)
 +  (let ((dom (let (process-crypto notmuch-show-process-crypto)

 Missing brackets? (You let-bind both `process-crypto' and
 `notmuch-show-process-crypto' as nil.)

'Doh.  Thanks.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 8/8] emacs: Support cid: references with shr renderer

2015-01-24 Thread Austin Clements
shr has really nice support for inline image rendering, but previously
we only had the hooks for w3m cid: references.
---
 emacs/notmuch-show.el | 45 +
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index 34dcedd..66350d4 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -767,14 +767,43 @@ (defun 
notmuch-show-get-mime-type-of-application/octet-stream (part)
  nil
 
 (defun notmuch-show-insert-part-text/html (msg part content-type nth depth 
button)
-  ;; text/html handler to work around bugs in renderers and our
-  ;; invisibile parts code. In particular w3m sets up a keymap which
-  ;; leaks outside the invisible region and causes strange effects
-  ;; in notmuch. We set mm-inline-text-html-with-w3m-keymap to nil to
-  ;; tell w3m not to set a keymap (so the normal notmuch-show-mode-map
-  ;; remains).
-  (let ((mm-inline-text-html-with-w3m-keymap nil))
-(notmuch-show-insert-part-*/* msg part content-type nth depth button)))
+  (if (eq mm-text-html-renderer 'shr)
+  ;; It's easier to drive shr ourselves than to work around the
+  ;; goofy things `mm-shr' does (like irreversibly taking over
+  ;; content ID handling).
+  (notmuch-show--insert-part-text/html-shr msg part)
+;; Otherwise, let message-mode do the heavy lifting
+;;
+;; w3m sets up a keymap which leaks outside the invisible region
+;; and causes strange effects in notmuch. We set
+;; mm-inline-text-html-with-w3m-keymap to nil to tell w3m not to
+;; set a keymap (so the normal notmuch-show-mode-map remains).
+(let ((mm-inline-text-html-with-w3m-keymap nil))
+  (notmuch-show-insert-part-*/* msg part content-type nth depth button
+
+;; These functions are used by notmuch-show--insert-part-text/html-shr
+(declare-function libxml-parse-html-region xml.c)
+(declare-function shr-insert-document shr)
+
+(defun notmuch-show--insert-part-text/html-shr (msg part)
+  ;; Make sure shr is loaded before we start let-binding its globals
+  (require 'shr)
+  (let ((dom (let ((process-crypto notmuch-show-process-crypto))
+  (with-temp-buffer
+(insert (notmuch-get-bodypart-text msg part process-crypto))
+(libxml-parse-html-region (point-min) (point-max)
+   (shr-content-function
+(lambda (url)
+  ;; shr strips the cid: part of URL, but doesn't
+  ;; URL-decode it (see RFC 2392).
+  (let ((cid (url-unhex-string url)))
+(first (notmuch-show--get-cid-content cid)
+   ;; Block all external images to prevent privacy leaks and
+   ;; potential attacks.  FIXME: If we block an image, offer a
+   ;; button to load external images.
+   (shr-blocked-images .))
+(shr-insert-document dom)
+t))
 
 (defun notmuch-show-insert-part-*/* (msg part content-type nth depth button)
   ;; This handler _must_ succeed - it is the handler of last resort.
-- 
2.1.3

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 2/8] emacs: Create an API for fetching parts as undecoded binary

2015-01-24 Thread Austin Clements
The new function, `notmuch-get-bodypart-binary', replaces
`notmuch-get-bodypart-internal'.  Whereas the old function was really
meant for internal use in `notmuch-get-bodypart-content', it was used
in a few other places.  Since the difference between
`notmuch-get-bodypart-content' and `notmuch-get-bodypart-internal' was
unclear, these other uses were always confusing and potentially
inconsistent.  The new call clearly requests the part as undecoded
binary.

This is step 1 of 2 in separating `notmuch-get-bodypart-content' into
two APIs for retrieving either undecoded binary or decoded text.
---
 emacs/notmuch-lib.el  | 28 ++--
 emacs/notmuch-show.el | 22 +-
 2 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index fd25f7c..d4b6684 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -529,25 +529,25 @@ (defun notmuch-parts-filter-by-type (parts type)
(lambda (part) (notmuch-match-content-type (plist-get part :content-type) 
type))
parts))
 
-;; Helper for parts which are generally not included in the default
-;; SEXP output.
-(defun notmuch-get-bodypart-internal (query part-number process-crypto)
-  (let ((args '(show --format=raw))
-   (part-arg (format --part=%s part-number)))
-(setq args (append args (list part-arg)))
-(if process-crypto
-   (setq args (append args '(--decrypt
-(setq args (append args (list query)))
+(defun notmuch-get-bodypart-binary (msg part process-crypto)
+  Return the unprocessed content of PART in MSG.
+
+This returns the \raw\ content of the given part after content
+transfer decoding, but with no further processing (see the
+discussion of --format=raw in man notmuch-show).  In particular,
+this does no charset conversion.
+  (let ((args `(show --format=raw
+   ,(format --part=%d (plist-get part :id))
+   ,@(when process-crypto '(--decrypt))
+   ,(notmuch-id-to-query (plist-get msg :id)
 (with-temp-buffer
   (let ((coding-system-for-read 'no-conversion))
-   (progn
- (apply 'call-process (append (list notmuch-command nil (list t nil) 
nil) args))
- (buffer-string))
+   (apply #'call-process notmuch-command nil '(t nil) nil args)
+   (buffer-string)
 
 (defun notmuch-get-bodypart-content (msg part process-crypto)
   (or (plist-get part :content)
-  (notmuch-get-bodypart-internal (notmuch-id-to-query (plist-get msg :id))
-(plist-get part :id) process-crypto)))
+  (notmuch-get-bodypart-binary msg part process-crypto)))
 
 ;; Workaround: The call to `mm-display-part' below triggers a bug in
 ;; Emacs 24 if it attempts to use the shr renderer to display an HTML
diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index df2389e..b3e339e 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -579,16 +579,14 @@ (defun notmuch-show-w3m-cid-retrieve (url rest args)
(let* ((msg (nth 1 matching-part))
   (part (nth 2 matching-part))
   (content (nth 3 matching-part))
-  (message-id (plist-get msg :id))
-  (part-number (plist-get part :id))
   (content-type (plist-get part :content-type)))
  ;; If we don't already have the content, get it and cache
  ;; it, as some messages reference the same cid: part many
  ;; times (hundreds!), which results in many calls to
  ;; `notmuch part'.
  (unless content
-   (setq content (notmuch-get-bodypart-internal (notmuch-id-to-query 
message-id)
- part-number 
notmuch-show-process-crypto))
+   (setq content (notmuch-get-bodypart-binary
+  msg part notmuch-show-process-crypto))
(with-current-buffer w3m-current-buffer
  (notmuch-show-w3m-cid-store-internal url msg part content)))
  (insert content)
@@ -2162,15 +2160,14 @@ (defun notmuch-show-stash-git-send-email (optional 
no-in-reply-to)
 
 ;; Interactive part functions and their helpers
 
-(defun notmuch-show-generate-part-buffer (message-id nth)
+(defun notmuch-show-generate-part-buffer (msg part)
   Return a temporary buffer containing the specified part's content.
   (let ((buf (generate-new-buffer  *notmuch-part*))
(process-crypto notmuch-show-process-crypto))
 (with-current-buffer buf
-  (setq notmuch-show-process-crypto process-crypto)
-  ;; Always acquires the part via `notmuch part', even if it is
-  ;; available in the SEXP output.
-  (insert (notmuch-get-bodypart-internal message-id nth 
notmuch-show-process-crypto)))
+  ;; This is always used in the content of mm handles, which
+  ;; expect undecoded, binary part content.
+  (insert (notmuch-get-bodypart-binary msg part process-crypto)))
 buf))
 
 (defun 

[PATCH v2 3/8] emacs: Remove broken `notmuch-get-bodypart-content' API

2015-01-24 Thread Austin Clements
`notmuch-get-bodypart-content' could do two very different things,
depending on conditions: for text/* parts other than text/html, it
would return the part content as a multibyte Lisp string *after*
charset conversion, while for other parts (including text/html), it
would return binary part content without charset conversion.

This commit completes the split of `notmuch-get-bodypart-content' into
two different and explicit APIs: `notmuch-get-bodypart-binary' and
`notmuch-get-bodypart-text'.  It updates all callers to use one or the
other depending on what's appropriate.
---
 emacs/notmuch-lib.el  | 37 -
 emacs/notmuch-show.el |  8 
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index d4b6684..a0a95d8 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -545,9 +545,25 @@ (defun notmuch-get-bodypart-binary (msg part 
process-crypto)
(apply #'call-process notmuch-command nil '(t nil) nil args)
(buffer-string)
 
-(defun notmuch-get-bodypart-content (msg part process-crypto)
-  (or (plist-get part :content)
-  (notmuch-get-bodypart-binary msg part process-crypto)))
+(defun notmuch-get-bodypart-text (msg part process-crypto)
+  Return the text content of PART in MSG.
+
+This returns the content of the given part as a multibyte Lisp
+string after performing content transfer decoding and any
+necessary charset decoding.  It is an error to use this for
+non-text/* parts.
+  (let ((content (plist-get part :content)))
+(when (not content)
+  ;; Use show --format=sexp to fetch decoded content
+  (let* ((args `(show --format=sexp --include-html
+,(format --part=%s (plist-get part :id))
+,@(when process-crypto '(--decrypt))
+,(notmuch-id-to-query (plist-get msg :id
+(npart (apply #'notmuch-call-notmuch-sexp args)))
+   (setq content (plist-get npart :content))
+   (when (not content)
+ (error Internal error: No :content from %S args
+content))
 
 ;; Workaround: The call to `mm-display-part' below triggers a bug in
 ;; Emacs 24 if it attempts to use the shr renderer to display an HTML
@@ -568,18 +584,21 @@ (defun notmuch-mm-display-part-inline (msg part 
content-type process-crypto)
 current buffer, if possible.
   (let ((display-buffer (current-buffer)))
 (with-temp-buffer
-  ;; In case there is :content, the content string is already converted
-  ;; into emacs internal format. `gnus-decoded' is a fake charset,
-  ;; which means no further decoding (to be done by mm- functions).
-  (let* ((charset (if (plist-member part :content)
- 'gnus-decoded
+  ;; In case we already have :content, use it and tell mm-* that
+  ;; it's already been charset-decoded by using the fake
+  ;; `gnus-decoded' charset.  Otherwise, we'll fetch the binary
+  ;; part content and let mm-* decode it.
+  (let* ((have-content (plist-member part :content))
+(charset (if have-content 'gnus-decoded
(plist-get part :content-charset)))
 (handle (mm-make-handle (current-buffer) `(,content-type (charset 
. ,charset)
;; If the user wants the part inlined, insert the content and
;; test whether we are able to inline it (which includes both
;; capability and suitability tests).
(when (mm-inlined-p handle)
- (insert (notmuch-get-bodypart-content msg part process-crypto))
+ (if have-content
+ (insert (notmuch-get-bodypart-text msg part process-crypto))
+   (insert (notmuch-get-bodypart-binary msg part process-crypto)))
  (when (mm-inlinable-p handle)
(set-buffer display-buffer)
(mm-display-part handle)
diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index b3e339e..f29428a 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -702,7 +702,7 @@ (defun notmuch-show-insert-part-text/plain (msg part 
content-type nth depth butt
   (let ((start (if button
   (button-start button)
 (point
-(insert (notmuch-get-bodypart-content msg part 
notmuch-show-process-crypto))
+(insert (notmuch-get-bodypart-text msg part notmuch-show-process-crypto))
 (save-excursion
   (save-restriction
(narrow-to-region start (point-max))
@@ -711,9 +711,9 @@ (defun notmuch-show-insert-part-text/plain (msg part 
content-type nth depth butt
 
 (defun notmuch-show-insert-part-text/calendar (msg part content-type nth depth 
button)
   (insert (with-temp-buffer
-   (insert (notmuch-get-bodypart-content msg part 
notmuch-show-process-crypto))
-   ;; notmuch-get-bodypart-content provides raw, non-converted
-   ;; data. Replace CRLF with LF before icalendar can use it.
+   (insert (notmuch-get-bodypart-text msg part 

[PATCH v2 4/8] emacs: Return unibyte strings for binary part data

2015-01-24 Thread Austin Clements
Unibyte strings are meant for representing binary data.  In practice,
using unibyte versus multibyte strings affects *almost* nothing.  It
does happen to matter if we use the binary data in an image descriptor
(which is, helpfully, not documented anywhere and getting it wrong
results in opaque errors like Not a PNG image: giant binary spew
that is, in fact, a PNG image).
---
 emacs/notmuch-lib.el | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index a0a95d8..3154725 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -530,7 +530,7 @@ (defun notmuch-parts-filter-by-type (parts type)
parts))
 
 (defun notmuch-get-bodypart-binary (msg part process-crypto)
-  Return the unprocessed content of PART in MSG.
+  Return the unprocessed content of PART in MSG as a unibyte string.
 
 This returns the \raw\ content of the given part after content
 transfer decoding, but with no further processing (see the
@@ -541,6 +541,16 @@ (defun notmuch-get-bodypart-binary (msg part 
process-crypto)
,@(when process-crypto '(--decrypt))
,(notmuch-id-to-query (plist-get msg :id)
 (with-temp-buffer
+  ;; Emacs internally uses a UTF-8-like multibyte string
+  ;; representation by default (regardless of the coding system,
+  ;; which only affects how it goes from outside data to this
+  ;; internal representation).  This *almost* never matters.
+  ;; Annoyingly, it does matter if we use this data in an image
+  ;; descriptor, since Emacs will use its internal data buffer
+  ;; directly and this multibyte representation corrupts binary
+  ;; image formats.  Since the caller is asking for binary data, a
+  ;; unibyte string is a more appropriate representation anyway.
+  (set-buffer-multibyte nil)
   (let ((coding-system-for-read 'no-conversion))
(apply #'call-process notmuch-command nil '(t nil) nil args)
(buffer-string)
-- 
2.1.3

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 0/8] Improve charset and cid: handling

2015-01-24 Thread Austin Clements
This is v2 of id:1398105468-14317-1-git-send-email-amdra...@mit.edu.
This improves some comments/documentation, fixes a bug that caused
cryptographic processing to not happen on HTML parts, and addresses
some byte compiler warnings on Emacs 23.  This version has also been
rebased against the several months of changes that happened on master
since v1 (which, remarkably, was trivial).

The diff from v1 is below.

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index 83cbf2f..f8e5165 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -535,7 +535,10 @@ (defun notmuch-get-bodypart-binary (msg part 
process-crypto optional cache)
 This returns the \raw\ content of the given part after content
 transfer decoding, but with no further processing (see the
 discussion of --format=raw in man notmuch-show).  In particular,
-this does no charset conversion.
+this does no charset conversion.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already).
   (let ((data (plist-get part :binary-content)))
 (when (not data)
   (let ((args `(show --format=raw
@@ -558,6 +561,8 @@ (defun notmuch-get-bodypart-binary (msg part process-crypto 
optional cache)
(apply #'call-process notmuch-command nil '(t nil) nil args)
(setq data (buffer-string)
   (when cache
+   ;; Cheat.  part is non-nil, and `plist-put' always modifies
+   ;; the list in place if it's non-nil.
(plist-put part :binary-content data)))
 data))
 
@@ -567,7 +572,10 @@ (defun notmuch-get-bodypart-text (msg part process-crypto 
optional cache)
 This returns the content of the given part as a multibyte Lisp
 string after performing content transfer decoding and any
 necessary charset decoding.  It is an error to use this for
-non-text/* parts.
+non-text/* parts.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already).
   (let ((content (plist-get part :content)))
 (when (not content)
   ;; Use show --format=sexp to fetch decoded content
diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index d190711..66350d4 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -781,10 +781,14 @@ (defun notmuch-show-insert-part-text/html (msg part 
content-type nth depth butto
 (let ((mm-inline-text-html-with-w3m-keymap nil))
   (notmuch-show-insert-part-*/* msg part content-type nth depth button
 
+;; These functions are used by notmuch-show--insert-part-text/html-shr
+(declare-function libxml-parse-html-region xml.c)
+(declare-function shr-insert-document shr)
+
 (defun notmuch-show--insert-part-text/html-shr (msg part)
   ;; Make sure shr is loaded before we start let-binding its globals
   (require 'shr)
-  (let ((dom (let (process-crypto notmuch-show-process-crypto)
+  (let ((dom (let ((process-crypto notmuch-show-process-crypto))
   (with-temp-buffer
 (insert (notmuch-get-bodypart-text msg part process-crypto))
 (libxml-parse-html-region (point-min) (point-max)


___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 6/8] emacs: Use generalized content caching in w3m CID code

2015-01-24 Thread Austin Clements
Previously this did its own caching, but this is now supported by more
generally by `notmuch-get-bodypart-binary'.
---
 emacs/notmuch-show.el | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index f29428a..11eac5f 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -562,15 +562,14 @@ (defvar w3m-current-buffer) ;; From `w3m.el'.
 (defvar notmuch-show-w3m-cid-store nil)
 (make-variable-buffer-local 'notmuch-show-w3m-cid-store)
 
-(defun notmuch-show-w3m-cid-store-internal (content-id msg part content)
-  (push (list content-id msg part content)
-   notmuch-show-w3m-cid-store))
+(defun notmuch-show-w3m-cid-store-internal (content-id msg part)
+  (push (list content-id msg part) notmuch-show-w3m-cid-store))
 
 (defun notmuch-show-w3m-cid-store (msg part)
   (let ((content-id (plist-get part :content-id)))
 (when content-id
   (notmuch-show-w3m-cid-store-internal (concat cid: content-id)
-  msg part nil
+  msg part
 
 (defun notmuch-show-w3m-cid-retrieve (url rest args)
   (let ((matching-part (with-current-buffer w3m-current-buffer
@@ -578,18 +577,12 @@ (defun notmuch-show-w3m-cid-retrieve (url rest args)
 (if matching-part
(let* ((msg (nth 1 matching-part))
   (part (nth 2 matching-part))
-  (content (nth 3 matching-part))
   (content-type (plist-get part :content-type)))
- ;; If we don't already have the content, get it and cache
- ;; it, as some messages reference the same cid: part many
- ;; times (hundreds!), which results in many calls to
- ;; `notmuch part'.
- (unless content
-   (setq content (notmuch-get-bodypart-binary
-  msg part notmuch-show-process-crypto))
-   (with-current-buffer w3m-current-buffer
- (notmuch-show-w3m-cid-store-internal url msg part content)))
- (insert content)
+ ;; Request content caching, as some messages reference the
+ ;; same cid: part many times (hundreds!), which results in
+ ;; many calls to `notmuch show'.
+ (insert (notmuch-get-bodypart-binary
+  msg part notmuch-show-process-crypto 'cache))
  content-type)
   nil)))
 
-- 
2.1.3

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 1/8] emacs: Track full message and part descriptor in w3m CID store

2015-01-24 Thread Austin Clements
This will simplify later changes.
---
 emacs/notmuch-show.el | 33 ++---
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
index 87b4881..df2389e 100644
--- a/emacs/notmuch-show.el
+++ b/emacs/notmuch-show.el
@@ -562,35 +562,26 @@ (defvar w3m-current-buffer) ;; From `w3m.el'.
 (defvar notmuch-show-w3m-cid-store nil)
 (make-variable-buffer-local 'notmuch-show-w3m-cid-store)
 
-(defun notmuch-show-w3m-cid-store-internal (content-id
-   message-id
-   part-number
-   content-type
-   content)
-  (push (list content-id
- message-id
- part-number
- content-type
- content)
+(defun notmuch-show-w3m-cid-store-internal (content-id msg part content)
+  (push (list content-id msg part content)
notmuch-show-w3m-cid-store))
 
 (defun notmuch-show-w3m-cid-store (msg part)
   (let ((content-id (plist-get part :content-id)))
 (when content-id
   (notmuch-show-w3m-cid-store-internal (concat cid: content-id)
-  (plist-get msg :id)
-  (plist-get part :id)
-  (plist-get part :content-type)
-  nil
+  msg part nil
 
 (defun notmuch-show-w3m-cid-retrieve (url rest args)
   (let ((matching-part (with-current-buffer w3m-current-buffer
 (assoc url notmuch-show-w3m-cid-store
 (if matching-part
-   (let ((message-id (nth 1 matching-part))
- (part-number (nth 2 matching-part))
- (content-type (nth 3 matching-part))
- (content (nth 4 matching-part)))
+   (let* ((msg (nth 1 matching-part))
+  (part (nth 2 matching-part))
+  (content (nth 3 matching-part))
+  (message-id (plist-get msg :id))
+  (part-number (plist-get part :id))
+  (content-type (plist-get part :content-type)))
  ;; If we don't already have the content, get it and cache
  ;; it, as some messages reference the same cid: part many
  ;; times (hundreds!), which results in many calls to
@@ -599,11 +590,7 @@ (defun notmuch-show-w3m-cid-retrieve (url rest args)
(setq content (notmuch-get-bodypart-internal (notmuch-id-to-query 
message-id)
  part-number 
notmuch-show-process-crypto))
(with-current-buffer w3m-current-buffer
- (notmuch-show-w3m-cid-store-internal url
-  message-id
-  part-number
-  content-type
-  content)))
+ (notmuch-show-w3m-cid-store-internal url msg part content)))
  (insert content)
  content-type)
   nil)))
-- 
2.1.3

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 5/8] emacs: Support caching in notmuch-get-bodypart-{binary, text}

2015-01-24 Thread Austin Clements
(The actual code change here is small, but requires re-indenting
existing code.)
---
 emacs/notmuch-lib.el | 64 
 1 file changed, 40 insertions(+), 24 deletions(-)

diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index 3154725..f8e5165 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -529,39 +529,53 @@ (defun notmuch-parts-filter-by-type (parts type)
(lambda (part) (notmuch-match-content-type (plist-get part :content-type) 
type))
parts))
 
-(defun notmuch-get-bodypart-binary (msg part process-crypto)
+(defun notmuch-get-bodypart-binary (msg part process-crypto optional cache)
   Return the unprocessed content of PART in MSG as a unibyte string.
 
 This returns the \raw\ content of the given part after content
 transfer decoding, but with no further processing (see the
 discussion of --format=raw in man notmuch-show).  In particular,
-this does no charset conversion.
-  (let ((args `(show --format=raw
-   ,(format --part=%d (plist-get part :id))
-   ,@(when process-crypto '(--decrypt))
-   ,(notmuch-id-to-query (plist-get msg :id)
-(with-temp-buffer
-  ;; Emacs internally uses a UTF-8-like multibyte string
-  ;; representation by default (regardless of the coding system,
-  ;; which only affects how it goes from outside data to this
-  ;; internal representation).  This *almost* never matters.
-  ;; Annoyingly, it does matter if we use this data in an image
-  ;; descriptor, since Emacs will use its internal data buffer
-  ;; directly and this multibyte representation corrupts binary
-  ;; image formats.  Since the caller is asking for binary data, a
-  ;; unibyte string is a more appropriate representation anyway.
-  (set-buffer-multibyte nil)
-  (let ((coding-system-for-read 'no-conversion))
-   (apply #'call-process notmuch-command nil '(t nil) nil args)
-   (buffer-string)
-
-(defun notmuch-get-bodypart-text (msg part process-crypto)
+this does no charset conversion.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already).
+  (let ((data (plist-get part :binary-content)))
+(when (not data)
+  (let ((args `(show --format=raw
+   ,(format --part=%d (plist-get part :id))
+   ,@(when process-crypto '(--decrypt))
+   ,(notmuch-id-to-query (plist-get msg :id)
+   (with-temp-buffer
+ ;; Emacs internally uses a UTF-8-like multibyte string
+ ;; representation by default (regardless of the coding
+ ;; system, which only affects how it goes from outside data
+ ;; to this internal representation).  This *almost* never
+ ;; matters.  Annoyingly, it does matter if we use this data
+ ;; in an image descriptor, since Emacs will use its internal
+ ;; data buffer directly and this multibyte representation
+ ;; corrupts binary image formats.  Since the caller is
+ ;; asking for binary data, a unibyte string is a more
+ ;; appropriate representation anyway.
+ (set-buffer-multibyte nil)
+ (let ((coding-system-for-read 'no-conversion))
+   (apply #'call-process notmuch-command nil '(t nil) nil args)
+   (setq data (buffer-string)
+  (when cache
+   ;; Cheat.  part is non-nil, and `plist-put' always modifies
+   ;; the list in place if it's non-nil.
+   (plist-put part :binary-content data)))
+data))
+
+(defun notmuch-get-bodypart-text (msg part process-crypto optional cache)
   Return the text content of PART in MSG.
 
 This returns the content of the given part as a multibyte Lisp
 string after performing content transfer decoding and any
 necessary charset decoding.  It is an error to use this for
-non-text/* parts.
+non-text/* parts.
+
+If CACHE is non-nil, the content of this part will be saved in
+MSG (if it isn't already).
   (let ((content (plist-get part :content)))
 (when (not content)
   ;; Use show --format=sexp to fetch decoded content
@@ -572,7 +586,9 @@ (defun notmuch-get-bodypart-text (msg part process-crypto)
 (npart (apply #'notmuch-call-notmuch-sexp args)))
(setq content (plist-get npart :content))
(when (not content)
- (error Internal error: No :content from %S args
+ (error Internal error: No :content from %S args)))
+  (when cache
+   (plist-put part :content content)))
 content))
 
 ;; Workaround: The call to `mm-display-part' below triggers a bug in
-- 
2.1.3

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 06/11] emacs: Remove broken `notmuch-get-bodypart-content' API

2015-01-24 Thread Austin Clements
On Fri, 11 Jul 2014, David Bremner da...@tethera.net wrote:
 Austin Clements amdra...@mit.edu writes:

 +This returns the content of the given part as a multibyte Lisp

 What does multibyte mean here? utf8? current encoding?

Elisp has two kinds of stings: unibyte strings and multibyte
strings.

  
https://www.gnu.org/software/emacs/manual/html_node/elisp/Non_002dASCII-in-Strings.html

You can think of unibyte strings as binary data; they're just vectors of
bytes without any particular encoding semantics (though when you use a
unibyte string you can endow it with encoding).  Multibyte strings,
however, are text; they're vectors of Unicode code points.

 +string after performing content transfer decoding and any
 +necessary charset decoding.  It is an error to use this for
 +non-text/* parts.
 +  (let ((content (plist-get part :content)))
 +(when (not content)
 +  ;; Use show --format=sexp to fetch decoded content
 +  (let* ((args `(show --format=sexp --include-html
 + ,(format --part=%s (plist-get part :id))
 + ,@(when process-crypto '(--decrypt))
 + ,(notmuch-id-to-query (plist-get msg :id
 + (npart (apply #'notmuch-call-notmuch-sexp args)))
 +(setq content (plist-get npart :content))
 +(when (not content)
 +  (error Internal error: No :content from %S args
 +content))

 I'm a bit curious at the lack of setting coding-system-for-read here.
 Are we assuming the user has their environment set up correctly? Not so
 much a criticism as being nervous about everything coding-system
 related.

That is interesting.  coding-system-for-read should really go in
notmuch-call-notmuch-sexp, but I worry that, while *almost* all strings
the CLI outputs are UTF-8, not quite all of them are.  For example, we
output filenames exactly at the OS reports the bytes to us (which is
necessary, in a sense, because POSIX enforces no particular encoding on
file names, but still really unfortunate).

We could set coding-system-for-read, but a full solution needs more
cooperation from the CLI.  Possibly the right answer, at least for the
sexp format, is to do our own UTF-8 to \u escapes for strings that
are known to be UTF-8 and leave the raw bytes for the few that aren't.
Then we would set the coding-system-for-read to 'no-conversion and I
think everything would Just Work.

That doesn't help for JSON, which is supposed to be all UTF-8 all the
time.  I can think of solutions there, but they're all ugly and involve
things like encoding filenames as base64 when they aren't valid UTF-8.

So...  I don't think I'm going to do anything about this at this moment.

 I didn't see anything else to object to in this patch or the previous
 one.
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH 00/11] Improve charset and cid: handling

2015-01-24 Thread Austin Clements
I added declare-functions for both of these, which should take care of
the Emacs 23 warnings and be more robust on Emacs 24.  We can't reach
the function that calls these unless shr is actually available, but the
byte compiler doesn't know that.

On Sat, 26 Apr 2014, Mark Walters markwalters1...@gmail.com wrote:
 Aside from the minor comments I mentioned in previous emails and one
 more comment below this looks good. 

 The extra comment is that on emacs23 I get the following when compiling:

 In end of data:
 notmuch-show.el:2188:1:Warning: the following functions are not known to be
 defined: libxml-parse-html-region, shr-insert-document

 Finally, I have not really tested it as I mainly use emacs23

 Best wishes

 Mark




 On Mon, 21 Apr 2014, Austin Clements amdra...@mit.edu wrote:
 I set out to quickly add support for cid: links in the shr renderer
 and wound up making our charset handling more robust and rewriting our
 content-ID handling.  The test introduced in patch 2 passes in all but
 one really obscure case, but only because of many unwritten and
 potentially fragile assumptions that Emacs and the CLI make about each
 other.

 The first three patches could reasonably go in to 0.18.  The rest of
 this series is certainly post-0.18, but I didn't want to lose track of
 it.

 This series comes in three stages.  Each depends on the earlier ones,
 but each prefix makes sense on its own and could be pushed without the
 later stages.

 Patch 1 is a simple clean up patch.

 Patches 2 through 7 robust-ify our charset handling in Emacs, mostly
 by splitting the broken `notmuch-get-bodypart-content' API into
 `notmuch-get-bodypart-binary' and `notmuch-get-bodypart-text' so a
 caller can explicitly convey their requirements.

 The remaining patches improve our content-ID handling and add support
 for cid: links for shr.

 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


privacy problem: text/html parts pull in network resources

2015-01-21 Thread Austin Clements
Quoth Daniel Kahn Gillmor on Jan 21 at  4:36 pm:
> On Wed 2015-01-21 16:14:07 -0500, Austin Clements wrote:
> > I have a fix for this on shr buried deep in an old patch series that I
> > never got back to: id:1398105468-14317-12-git-send-email-amdragon at mit.edu
> >
> > For shr, the key is to set shr-blocked-images to ".".
> 
> I've just done this, but it doesn't seem to help.
> 
> > However, IIRC, in the current notmuch message rendering pipeline, mm
> > overrides this variable with something computed from
> > gnus-blocked-images.  That said, I'm not sure why gnus-blocked-images
> > isn't *already* taking care of this, but that's probably the place to
> > start digging.
> 
> gnus-blocked-images is set for me to the function
> gnus-block-private-groups, but i don't know what that is (the function
> is undocumented afaict).  Setting gnus-blocked-images to a regexp of "."
> seems to work for me, though.

In notmuch, mm will wind up calling (gnus-block-private-groups nil).
Unfortunately, gnus apparently considers nil to be a news group rather
than a "private group" (gnus speak for email, I think), so
gnus-block-private-groups returns nil (meaning *don't* block images)
rather than ".".

Probably notmuch should override the gnus-blocked-images variable,
since the default value is simply wrong for notmuch.  Maybe something
along the lines of the following should go around our text/html
handler?

  (let ((gnus-blocked-images
 (if (eq gnus-blocked-images 'gnus-block-private-groups)
 ;; mm uses gnus-blocked-images to control image loading.
 ;; However, the default value of gnus-blocked-images
 ;; doesn't work for notmuch because
 ;; gnus-block-private-groups depends on gnus variables we
 ;; don't set.  Override it to disallow network image
 ;; loading.
 "."
   ;; Use the user's customized value.
   gnus-blocked-images)))
...)

Long live abstraction!


privacy problem: text/html parts pull in network resources

2015-01-21 Thread Austin Clements
I have a fix for this on shr buried deep in an old patch series that I
never got back to: id:1398105468-14317-12-git-send-email-amdragon at mit.edu

For shr, the key is to set shr-blocked-images to ".".  However, IIRC,
in the current notmuch message rendering pipeline, mm overrides this
variable with something computed from gnus-blocked-images.  That said,
I'm not sure why gnus-blocked-images isn't *already* taking care of
this, but that's probably the place to start digging.

Quoth Daniel Kahn Gillmor on Jan 21 at  4:00 pm:
> If i send a message with a text/html part (either it's only text/html,
> or all parts are rendered, or it's multipart/alternative with only a
> text/html subpart) and that HTML has  src="http://example.org/test.png"/> in it, then notmuch will make a
> network request for that image.
> 
> This is a privacy disaster, because it enables an e-mail sender to use
> "web bugs" to tell when a given notmuch user has opened their e-mail.
> 
> It's also a bit of a consistency/storage/indexing disaster because it
> means that what you see when you open a given message will change
> depending on the network environment you're in when you open it.
> 
> It's also potentially a security problem because it means that anyone in
> control of the remote server (or the network between you and the remote
> server if the image isn't sourced over https) can feed arbitrary data
> into whatever emacs image rendering library is being used.  (granted,
> this is not a unique problem because this can already be done by the
> original message sender with a multipart/mixed message, but it's an
> additional exposure of attack surface)
> 
> I just raised this on #notmuch, and i don't have the time or the
> knowledge to look into it now, but i think the defaults here need to be
> to avoid network access entirely unless the user explicitly requests it.
> 
>--dkg



> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch



Re: privacy problem: text/html parts pull in network resources

2015-01-21 Thread Austin Clements
I have a fix for this on shr buried deep in an old patch series that I
never got back to: id:1398105468-14317-12-git-send-email-amdra...@mit.edu

For shr, the key is to set shr-blocked-images to ..  However, IIRC,
in the current notmuch message rendering pipeline, mm overrides this
variable with something computed from gnus-blocked-images.  That said,
I'm not sure why gnus-blocked-images isn't *already* taking care of
this, but that's probably the place to start digging.

Quoth Daniel Kahn Gillmor on Jan 21 at  4:00 pm:
 If i send a message with a text/html part (either it's only text/html,
 or all parts are rendered, or it's multipart/alternative with only a
 text/html subpart) and that HTML has img
 src=http://example.org/test.png/ in it, then notmuch will make a
 network request for that image.
 
 This is a privacy disaster, because it enables an e-mail sender to use
 web bugs to tell when a given notmuch user has opened their e-mail.
 
 It's also a bit of a consistency/storage/indexing disaster because it
 means that what you see when you open a given message will change
 depending on the network environment you're in when you open it.
 
 It's also potentially a security problem because it means that anyone in
 control of the remote server (or the network between you and the remote
 server if the image isn't sourced over https) can feed arbitrary data
 into whatever emacs image rendering library is being used.  (granted,
 this is not a unique problem because this can already be done by the
 original message sender with a multipart/mixed message, but it's an
 additional exposure of attack surface)
 
 I just raised this on #notmuch, and i don't have the time or the
 knowledge to look into it now, but i think the defaults here need to be
 to avoid network access entirely unless the user explicitly requests it.
 
--dkg



 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: privacy problem: text/html parts pull in network resources

2015-01-21 Thread Austin Clements
Quoth Daniel Kahn Gillmor on Jan 21 at  4:36 pm:
 On Wed 2015-01-21 16:14:07 -0500, Austin Clements wrote:
  I have a fix for this on shr buried deep in an old patch series that I
  never got back to: id:1398105468-14317-12-git-send-email-amdra...@mit.edu
 
  For shr, the key is to set shr-blocked-images to ..
 
 I've just done this, but it doesn't seem to help.
 
  However, IIRC, in the current notmuch message rendering pipeline, mm
  overrides this variable with something computed from
  gnus-blocked-images.  That said, I'm not sure why gnus-blocked-images
  isn't *already* taking care of this, but that's probably the place to
  start digging.
 
 gnus-blocked-images is set for me to the function
 gnus-block-private-groups, but i don't know what that is (the function
 is undocumented afaict).  Setting gnus-blocked-images to a regexp of .
 seems to work for me, though.

In notmuch, mm will wind up calling (gnus-block-private-groups nil).
Unfortunately, gnus apparently considers nil to be a news group rather
than a private group (gnus speak for email, I think), so
gnus-block-private-groups returns nil (meaning *don't* block images)
rather than ..

Probably notmuch should override the gnus-blocked-images variable,
since the default value is simply wrong for notmuch.  Maybe something
along the lines of the following should go around our text/html
handler?

  (let ((gnus-blocked-images
 (if (eq gnus-blocked-images 'gnus-block-private-groups)
 ;; mm uses gnus-blocked-images to control image loading.
 ;; However, the default value of gnus-blocked-images
 ;; doesn't work for notmuch because
 ;; gnus-block-private-groups depends on gnus variables we
 ;; don't set.  Override it to disallow network image
 ;; loading.
 .
   ;; Use the user's customized value.
   gnus-blocked-images)))
...)

Long live abstraction!
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[WIP PATCH 4/4] lib: Add "lastmod:" queries for filtering by last modification

2015-01-15 Thread Austin Clements
Quoth David Bremner on Jan 15 at 10:08 pm:
> Austin Clements  writes:
> 
> > From: Austin Clements 
> >
> > XXX Includes reference to notmuch search --db-revision, which doesn't
> > exist.
> 
> What would --db-revision do if it was implimented? Did you want to pass
> a UUID in?

Yes, exactly.  And have the CLI abort if the UUID didn't match
(meaning the lastmod values are no longer in the same sequence).

> d


[PATCH v2 2/5] Add the NOTMUCH_FEATURE_INDEXED_MIMETYPES database feature

2015-01-15 Thread Austin Clements
Just one nit. Otherwise this patch LGTM.

On January 15, 2015 12:20:08 PM EST, Jani Nikula  wrote:
>
>Austin, would you mind having a look at this one please?
>
>Thanks,
>Jani.
>
>On Wed, 14 Jan 2015, Todd  wrote:
>> ---
>>  lib/database-private.h | 15 ---
>>  lib/database.cc| 10 --
>>  2 files changed, 20 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/database-private.h b/lib/database-private.h
>> index 15e03cc..6d6fa2c 100644
>> --- a/lib/database-private.h
>> +++ b/lib/database-private.h
>> @@ -92,6 +92,14 @@ enum _notmuch_features {
>>   *
>>   * Introduced: version 3. */
>>  NOTMUCH_FEATURE_GHOSTS = 1 << 4,
>> +
>> +
>> +/* If set, then the database was created after the introduction
>of
>> + * indexed mime types. If unset, then the database may contain a
>> + * mixture of messages with indexed and non-indexed mime types.
>> + *
>> + * Introduced: version 3. */
>> +NOTMUCH_FEATURE_INDEXED_MIMETYPES = 1 << 5,
>>  };
>>  
>>  /* In C++, a named enum is its own type, so define bitwise operators
>> @@ -161,9 +169,10 @@ struct _notmuch_database {
>>  
>>  /* Current database features.  If any of these are missing from a
>>   * database, request an upgrade.
>> - * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because
>> - * upgrade doesn't currently introduce the feature (though brand new
>> - * databases will have it). */
>> + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES and
>> + * NOTMUCH_FEATURE_INDEXED_MIMETYPES are not included because
>upgrade
>> + * doesn't currently introduce the features (though brand new
>databases
>> + * will have it). */
>>  #define NOTMUCH_FEATURES_CURRENT \
>>  (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
>>   NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)
>> diff --git a/lib/database.cc b/lib/database.cc
>> index 3601f9d..2de60f8 100644
>> --- a/lib/database.cc
>> +++ b/lib/database.cc
>> @@ -304,6 +304,11 @@ static const struct {
>>"exact folder:/path: search", "rw" },
>>  { NOTMUCH_FEATURE_GHOSTS,
>>"mail documents for missing messages", "w"},
>> +/* Knowledge of the index mime-types are not required for
>reading
>> + * a database because a reader will just be unable to query
>> + * them. */
>> +{ NOTMUCH_FEATURE_INDEXED_MIMETYPES,
>> +  "mime-types in database", "w"},

I would label this "indexed MIME types" to be closer to the enum and because 
"MIME" is an acronym and hence should be capitalized.

>>  };
>>  
>>  const char *
>> @@ -646,9 +651,10 @@ notmuch_database_create (const char *path,
>notmuch_database_t **database)
>>  if (status)
>>  goto DONE;
>>  
>> -/* Upgrade doesn't add this feature to existing databases, but
>new
>> - * databases have it. */
>> +/* Upgrade doesn't add these feature to existing databases, but
>> + * new databases have them. */
>>  notmuch->features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES;
>> +notmuch->features |= NOTMUCH_FEATURE_INDEXED_MIMETYPES;
>>  
>>  status = notmuch_database_upgrade (notmuch, NULL, NULL);
>>  if (status) {
>> -- 
>> 1.9.1
>>
>> ___
>> notmuch mailing list
>> notmuch at notmuchmail.org
>> http://notmuchmail.org/mailman/listinfo/notmuch



Re: [PATCH v2 2/5] Add the NOTMUCH_FEATURE_INDEXED_MIMETYPES database feature

2015-01-15 Thread Austin Clements
Just one nit. Otherwise this patch LGTM.

On January 15, 2015 12:20:08 PM EST, Jani Nikula j...@nikula.org wrote:

Austin, would you mind having a look at this one please?

Thanks,
Jani.

On Wed, 14 Jan 2015, Todd t...@electricoding.com wrote:
 ---
  lib/database-private.h | 15 ---
  lib/database.cc| 10 --
  2 files changed, 20 insertions(+), 5 deletions(-)

 diff --git a/lib/database-private.h b/lib/database-private.h
 index 15e03cc..6d6fa2c 100644
 --- a/lib/database-private.h
 +++ b/lib/database-private.h
 @@ -92,6 +92,14 @@ enum _notmuch_features {
   *
   * Introduced: version 3. */
  NOTMUCH_FEATURE_GHOSTS = 1  4,
 +
 +
 +/* If set, then the database was created after the introduction
of
 + * indexed mime types. If unset, then the database may contain a
 + * mixture of messages with indexed and non-indexed mime types.
 + *
 + * Introduced: version 3. */
 +NOTMUCH_FEATURE_INDEXED_MIMETYPES = 1  5,
  };
  
  /* In C++, a named enum is its own type, so define bitwise operators
 @@ -161,9 +169,10 @@ struct _notmuch_database {
  
  /* Current database features.  If any of these are missing from a
   * database, request an upgrade.
 - * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because
 - * upgrade doesn't currently introduce the feature (though brand new
 - * databases will have it). */
 + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES and
 + * NOTMUCH_FEATURE_INDEXED_MIMETYPES are not included because
upgrade
 + * doesn't currently introduce the features (though brand new
databases
 + * will have it). */
  #define NOTMUCH_FEATURES_CURRENT \
  (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
   NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)
 diff --git a/lib/database.cc b/lib/database.cc
 index 3601f9d..2de60f8 100644
 --- a/lib/database.cc
 +++ b/lib/database.cc
 @@ -304,6 +304,11 @@ static const struct {
exact folder:/path: search, rw },
  { NOTMUCH_FEATURE_GHOSTS,
mail documents for missing messages, w},
 +/* Knowledge of the index mime-types are not required for
reading
 + * a database because a reader will just be unable to query
 + * them. */
 +{ NOTMUCH_FEATURE_INDEXED_MIMETYPES,
 +  mime-types in database, w},

I would label this indexed MIME types to be closer to the enum and because 
MIME is an acronym and hence should be capitalized.

  };
  
  const char *
 @@ -646,9 +651,10 @@ notmuch_database_create (const char *path,
notmuch_database_t **database)
  if (status)
  goto DONE;
  
 -/* Upgrade doesn't add this feature to existing databases, but
new
 - * databases have it. */
 +/* Upgrade doesn't add these feature to existing databases, but
 + * new databases have them. */
  notmuch-features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES;
 +notmuch-features |= NOTMUCH_FEATURE_INDEXED_MIMETYPES;
  
  status = notmuch_database_upgrade (notmuch, NULL, NULL);
  if (status) {
 -- 
 1.9.1

 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [WIP PATCH 4/4] lib: Add lastmod: queries for filtering by last modification

2015-01-15 Thread Austin Clements
Quoth David Bremner on Jan 15 at 10:08 pm:
 Austin Clements acleme...@csail.mit.edu writes:
 
  From: Austin Clements amdra...@mit.edu
 
  XXX Includes reference to notmuch search --db-revision, which doesn't
  exist.
 
 What would --db-revision do if it was implimented? Did you want to pass
 a UUID in?

Yes, exactly.  And have the CLI abort if the UUID didn't match
(meaning the lastmod values are no longer in the same sequence).

 d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Negating searches in notmuch running via emacs?

2015-01-06 Thread Austin Clements
Hi David.

In general, "-" will negate a query term.  However, Xapian's query
syntax (to which notmuch is beholden) ignores "-" at the beginning of
a query (and immediately after an open paren).  Hence, if you want to
negate the first term in a query, you'll have to use "not" instead.

Hope that helps.

Quoth David Creelman on Jan 06 at  8:46 am:
> Hi Carl (et al),
> Thanks for notmuch. It's making my email setup a bit more efficient. I used 
> to be a filter to differnt
> mbox kind of person, but I'm a changed man now !!
> 
> Just wondered, is it possible to negate a search in notmuch for emacs?
> 
> I have a bunch of stuff that is in my inbox that I don't want to see and have 
> tried doing things like
> 
>   -googlealerts-noreply
>   -from:googlealerts-noreply
> 
> But it looks like the '-' is ignored
> 
> Is there a way of getting filtering? It looks like it's possible from the 
> command line application (with -
> and + on search terms).
> 
> Regards,
> David
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


Re: Negating searches in notmuch running via emacs?

2015-01-06 Thread Austin Clements
Hi David.

In general, - will negate a query term.  However, Xapian's query
syntax (to which notmuch is beholden) ignores - at the beginning of
a query (and immediately after an open paren).  Hence, if you want to
negate the first term in a query, you'll have to use not instead.

Hope that helps.

Quoth David Creelman on Jan 06 at  8:46 am:
 Hi Carl (et al),
 Thanks for notmuch. It's making my email setup a bit more efficient. I used 
 to be a filter to differnt
 mbox kind of person, but I'm a changed man now !!
 
 Just wondered, is it possible to negate a search in notmuch for emacs?
 
 I have a bunch of stuff that is in my inbox that I don't want to see and have 
 tried doing things like
 
   -googlealerts-noreply
   -from:googlealerts-noreply
 
 But it looks like the '-' is ignored
 
 Is there a way of getting filtering? It looks like it's possible from the 
 command line application (with -
 and + on search terms).
 
 Regards,
 David
 ___
 notmuch mailing list
 notmuch@notmuchmail.org
 http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] NEWS: Database version 3, API improvements, and ghost messages

2014-11-10 Thread Austin Clements
---
 NEWS | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/NEWS b/NEWS
index b30ed1b..7a121e4 100644
--- a/NEWS
+++ b/NEWS
@@ -36,9 +36,39 @@ Improved `q` binding in notmuch buffers
 Library changes
 ---

+Introduced database version 3 with support for "database features."
+
+  Features are independent aspects of the database schema.
+  Representing these independently of the database version number will
+  let us evolve the database format faster and more incrementally,
+  while maintaining better forwards and backwards compatibility.
+
+Library users are no longer required to call `notmuch_database_upgrade`
+
+  Previously, library users were required to call
+  `notmuch_database_needs_upgrade` and `notmuch_database_upgrade`
+  before using a writable database.  Even the CLI didn't get this
+  right, and it is no longer required.  Now, individual APIs may
+  return `NOTMUCH_STATUS_UPGRADE_REQUIRED` if the database format is
+  too out of date for that API.
+
+Library users can now abort an atomic section by closing the database
+
+  Previously there was no supported way to abort an atomic section.
+  Callers can now simply close the database, and any outstanding
+  atomic section will be aborted.
+
 Add return status to notmuch_database_close and
 notmuch_database_destroy

+Bug fixes and performance improvements for thread linking
+
+  The database now represents missing-but-referenced messages ("ghost
+  messages") similarly to how it represents regular messages.  This
+  enables an improved thread linking algorithm that performs better
+  and fixes a bug that sometimes prevented notmuch from linking
+  messages into the same thread.
+
 nmbug
 -

-- 
2.1.1



[PATCH] NEWS: Database version 3, API improvements, and ghost messages

2014-11-10 Thread Austin Clements
---
 NEWS | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/NEWS b/NEWS
index b30ed1b..7a121e4 100644
--- a/NEWS
+++ b/NEWS
@@ -36,9 +36,39 @@ Improved `q` binding in notmuch buffers
 Library changes
 ---
 
+Introduced database version 3 with support for database features.
+
+  Features are independent aspects of the database schema.
+  Representing these independently of the database version number will
+  let us evolve the database format faster and more incrementally,
+  while maintaining better forwards and backwards compatibility.
+
+Library users are no longer required to call `notmuch_database_upgrade`
+
+  Previously, library users were required to call
+  `notmuch_database_needs_upgrade` and `notmuch_database_upgrade`
+  before using a writable database.  Even the CLI didn't get this
+  right, and it is no longer required.  Now, individual APIs may
+  return `NOTMUCH_STATUS_UPGRADE_REQUIRED` if the database format is
+  too out of date for that API.
+
+Library users can now abort an atomic section by closing the database
+
+  Previously there was no supported way to abort an atomic section.
+  Callers can now simply close the database, and any outstanding
+  atomic section will be aborted.
+
 Add return status to notmuch_database_close and
 notmuch_database_destroy
 
+Bug fixes and performance improvements for thread linking
+
+  The database now represents missing-but-referenced messages (ghost
+  messages) similarly to how it represents regular messages.  This
+  enables an improved thread linking algorithm that performs better
+  and fixes a bug that sometimes prevented notmuch from linking
+  messages into the same thread.
+
 nmbug
 -
 
-- 
2.1.1

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3.1 3/9] lib: Introduce macros for bit operations

2014-10-24 Thread Austin Clements
These macros help clarify basic bit-twiddling code and are written to
be robust against C undefined behavior of shift operators.
---
 lib/message.cc|  6 +++---
 lib/notmuch-private.h | 11 +++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 38bc929..55d2ff6 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -869,7 +869,7 @@ notmuch_bool_t
 notmuch_message_get_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag)
 {
-return message->flags & (1 << flag);
+return NOTMUCH_TEST_BIT (message->flags, flag);
 }

 void
@@ -877,9 +877,9 @@ notmuch_message_set_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag, notmuch_bool_t enable)
 {
 if (enable)
-   message->flags |= (1 << flag);
+   NOTMUCH_SET_BIT (>flags, flag);
 else
-   message->flags &= ~(1 << flag);
+   NOTMUCH_CLEAR_BIT (>flags, flag);
 }

 time_t
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 36cc12b..b86897c 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -63,6 +63,17 @@ NOTMUCH_BEGIN_DECLS
 #define STRNCMP_LITERAL(var, literal) \
 strncmp ((var), (literal), sizeof (literal) - 1)

+/* Robust bit test/set/reset macros */
+#define NOTMUCH_TEST_BIT(val, bit) \
+(((bit) < 0 || (bit) >= CHAR_BIT * sizeof (unsigned long long)) ? 0
\
+ : !!((val) & (1ull << (bit
+#define NOTMUCH_SET_BIT(valp, bit) \
+(((bit) < 0 || (bit) >= CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) |= (1ull << (bit
+#define NOTMUCH_CLEAR_BIT(valp,  bit) \
+(((bit) < 0 || (bit) >= CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) &= ~(1ull << (bit
+
 #define unused(x) x __attribute__ ((unused))

 #ifdef __cplusplus
-- 
2.1.0



[PATCH v1 1/3] search: Seperately report matching and non-matching authors.

2014-10-24 Thread Austin Clements
Quoth Mark Walters on Oct 24 at 10:23 am:
> 
> Hi
> 
> I definitely like the idea: some comments below.

Agreed.

> On Fri, 24 Oct 2014, David Edmondson  wrote:
> > In addition to the :authors attribute of each search result, include
> > :authors_matched and :authors_non_matched attributes. Both attributes
> > are always included. If there are no non-matching authors, the
> > :authors_non_matched attribute is set to the empty string.
> 
> What about having both authors_matched and authors_not_matched as lists
> of authors (ie one string for each author)? Then emacs, for example,
> wouldn't try and parse the string back into authors before
> splitting. And authors_non_matched could be an empty list when
> appropriate which seems more natural than the empty string.
> 
> All the above is based on what a client might want in the output rather
> than what is easy or sensible to implement in the C code.

I was going to suggest exactly the same thing.

And I think there's a fairly easy way to do it in C code that will
also prevent library interface bloat: instead of introducing new
library APIs to get at this information, just use the existing
notmuch_thread_get_messages API and construct the matched and
non-matched lists in the CLI.  Doing it in the CLI wouldn't require
the library to export yet another string list structure, which is
always a huge pain (thanks C!), and wouldn't introduce more "helper"
functions into the library API.

I think the only disadvantage to doing this would be some duplication
of the {matched_,}authors_{array,hash} logic into the CLI, but those
are only there to support notmuch_thread_get_authors, which I think is
a huge hack and should go away in the long term.  (And, by doing this
list building in the CLI, we avoid further embellishing the
unnecessary "get thread authors" API).


Re: [PATCH v1 1/3] search: Seperately report matching and non-matching authors.

2014-10-24 Thread Austin Clements
Quoth Mark Walters on Oct 24 at 10:23 am:
 
 Hi
 
 I definitely like the idea: some comments below.

Agreed.

 On Fri, 24 Oct 2014, David Edmondson d...@dme.org wrote:
  In addition to the :authors attribute of each search result, include
  :authors_matched and :authors_non_matched attributes. Both attributes
  are always included. If there are no non-matching authors, the
  :authors_non_matched attribute is set to the empty string.
 
 What about having both authors_matched and authors_not_matched as lists
 of authors (ie one string for each author)? Then emacs, for example,
 wouldn't try and parse the string back into authors before
 splitting. And authors_non_matched could be an empty list when
 appropriate which seems more natural than the empty string.
 
 All the above is based on what a client might want in the output rather
 than what is easy or sensible to implement in the C code.

I was going to suggest exactly the same thing.

And I think there's a fairly easy way to do it in C code that will
also prevent library interface bloat: instead of introducing new
library APIs to get at this information, just use the existing
notmuch_thread_get_messages API and construct the matched and
non-matched lists in the CLI.  Doing it in the CLI wouldn't require
the library to export yet another string list structure, which is
always a huge pain (thanks C!), and wouldn't introduce more helper
functions into the library API.

I think the only disadvantage to doing this would be some duplication
of the {matched_,}authors_{array,hash} logic into the CLI, but those
are only there to support notmuch_thread_get_authors, which I think is
a huge hack and should go away in the long term.  (And, by doing this
list building in the CLI, we avoid further embellishing the
unnecessary get thread authors API).
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3.1 3/9] lib: Introduce macros for bit operations

2014-10-24 Thread Austin Clements
These macros help clarify basic bit-twiddling code and are written to
be robust against C undefined behavior of shift operators.
---
 lib/message.cc|  6 +++---
 lib/notmuch-private.h | 11 +++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 38bc929..55d2ff6 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -869,7 +869,7 @@ notmuch_bool_t
 notmuch_message_get_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag)
 {
-return message-flags  (1  flag);
+return NOTMUCH_TEST_BIT (message-flags, flag);
 }
 
 void
@@ -877,9 +877,9 @@ notmuch_message_set_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag, notmuch_bool_t enable)
 {
 if (enable)
-   message-flags |= (1  flag);
+   NOTMUCH_SET_BIT (message-flags, flag);
 else
-   message-flags = ~(1  flag);
+   NOTMUCH_CLEAR_BIT (message-flags, flag);
 }
 
 time_t
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 36cc12b..b86897c 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -63,6 +63,17 @@ NOTMUCH_BEGIN_DECLS
 #define STRNCMP_LITERAL(var, literal) \
 strncmp ((var), (literal), sizeof (literal) - 1)
 
+/* Robust bit test/set/reset macros */
+#define NOTMUCH_TEST_BIT(val, bit) \
+(((bit)  0 || (bit) = CHAR_BIT * sizeof (unsigned long long)) ? 0
\
+ : !!((val)  (1ull  (bit
+#define NOTMUCH_SET_BIT(valp, bit) \
+(((bit)  0 || (bit) = CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) |= (1ull  (bit
+#define NOTMUCH_CLEAR_BIT(valp,  bit) \
+(((bit)  0 || (bit) = CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) = ~(1ull  (bit
+
 #define unused(x) x __attribute__ ((unused))
 
 #ifdef __cplusplus
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 9/9] lib: Remove unnecessary thread linking steps when using ghost messages

2014-10-23 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

Previously, it was necessary to link new messages to children to work
around some (though not all) problems with the old metadata-based
approach to stored thread IDs.  With ghost messages, this is no longer
necessary, so don't bother with child linking when ghost messages are
in use.
---
 lib/database.cc | 29 +
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index b673718..3601f9d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -2136,11 +2136,11 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
  * reference 'message'.
  *
  * In all cases, we assign to the current message the first thread ID
- * found (through either parent or child). We will also merge any
- * existing, distinct threads where this message belongs to both,
- * (which is not uncommon when messages are processed out of order).
+ * found. We will also merge any existing, distinct threads where this
+ * message belongs to both, (which is not uncommon when messages are
+ * processed out of order).
  *
- * Finally, if no thread ID has been found through parent or child, we
+ * Finally, if no thread ID has been found through referenced messages, we
  * call _notmuch_message_generate_thread_id to generate a new thread
  * ID. This should only happen for new, top-level messages, (no
  * References or In-Reply-To header in this message, and no previously
@@ -2172,10 +2172,23 @@ _notmuch_database_link_message (notmuch_database_t 
*notmuch,
 if (status)
goto DONE;

-status = _notmuch_database_link_message_to_children (notmuch, message,
-_id);
-if (status)
-   goto DONE;
+if (! (notmuch->features & NOTMUCH_FEATURE_GHOSTS)) {
+   /* In general, it shouldn't be necessary to link children,
+* since the earlier indexing of those children will have
+* stored a thread ID for the missing parent.  However, prior
+* to ghost messages, these stored thread IDs were NOT
+* rewritten during thread merging (and there was no
+* performant way to do so), so if indexed children were
+* pulled into a different thread ID by a merge, it was
+* necessary to pull them *back* into the stored thread ID of
+* the parent.  With ghost messages, we just rewrite the
+* stored thread IDs during merging, so this workaround isn't
+* necessary. */
+   status = _notmuch_database_link_message_to_children (notmuch, message,
+_id);
+   if (status)
+   goto DONE;
+}

 /* If not part of any existing thread, generate a new thread ID. */
 if (thread_id == NULL) {
-- 
2.1.0



[PATCH v3 8/9] test: Test upgrade to ghost messages feature

2014-10-23 Thread Austin Clements
---
 test/T530-upgrade.sh | 21 +
 1 file changed, 21 insertions(+)

diff --git a/test/T530-upgrade.sh b/test/T530-upgrade.sh
index c4c4ac8..6b42a69 100755
--- a/test/T530-upgrade.sh
+++ b/test/T530-upgrade.sh
@@ -116,4 +116,25 @@ MAIL_DIR/bar/new/21:2,
 MAIL_DIR/bar/new/22:2,
 MAIL_DIR/cur/51:2,"

+# Ghost messages are difficult to test since they're nearly invisible.
+# However, if the upgrade works correctly, the ghost message should
+# retain the right thread ID even if all of the original messages in
+# the thread are deleted.  That's what we test.  This won't detect if
+# the upgrade just plain didn't happen, but it should detect if
+# something went wrong.
+test_begin_subtest "ghost message retains thread ID"
+# Upgrade database
+notmuch new
+# Get thread ID of real message
+thread=$(notmuch search --output=threads id:4EFC743A.3060609 at april.org)
+# Delete all real messages in that thread
+rm $(notmuch search --output=files $thread)
+notmuch new
+# "Deliver" ghost message
+add_message '[subject]=Ghost' '[id]=4EFC3931.6030007 at april.org'
+# If the ghost upgrade worked, the new message should be attached to
+# the existing thread ID.
+nthread=$(notmuch search --output=threads id:4EFC3931.6030007 at april.org)
+test_expect_equal "$thread" "$nthread"
+
 test_done
-- 
2.1.0



[PATCH v3 7/9] lib: Enable ghost messages feature

2014-10-23 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

This fixes the broken thread order test.
---
 lib/database-private.h| 2 +-
 test/T260-thread-order.sh | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index e2e4bc8..15e03cc 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -166,7 +166,7 @@ struct _notmuch_database {
  * databases will have it). */
 #define NOTMUCH_FEATURES_CURRENT \
 (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
- NOTMUCH_FEATURE_BOOL_FOLDER)
+ NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)

 /* Return the list of terms from the given iterator matching a prefix.
  * The prefix will be stripped from the strings in the returned list.
diff --git a/test/T260-thread-order.sh b/test/T260-thread-order.sh
index b435d79..99f5833 100755
--- a/test/T260-thread-order.sh
+++ b/test/T260-thread-order.sh
@@ -30,7 +30,6 @@ expected=$(for ((i = 0; i < $nthreads; i++)); do
 test_expect_equal "$output" "$expected"

 test_begin_subtest "Messages with all parents get linked in all delivery 
orders"
-test_subtest_known_broken
 # Here we do the same thing as the previous test, but each message
 # references all of its parents.  Since every message references the
 # root of the thread, each thread should always be fully joined.  This
-- 
2.1.0



[PATCH v3 6/9] lib: Implement upgrade to ghost messages feature

2014-10-23 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

Somehow this is the first upgrade pass that actually does *any* error
checking, so this also adds the bit of necessary infrastructure to
handle that.
---
 lib/database.cc | 66 +++--
 1 file changed, 64 insertions(+), 2 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 92a92d9..b673718 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1231,6 +1231,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
 notmuch_bool_t timer_is_active = FALSE;
 enum _notmuch_features target_features, new_features;
 notmuch_status_t status;
+notmuch_private_status_t private_status;
 unsigned int count = 0, total = 0;

 status = _notmuch_database_ensure_writable (notmuch);
@@ -1275,6 +1276,13 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
for (t = db->allterms_begin ("XTIMESTAMP"); t != t_end; t++)
++total;
 }
+if (new_features & NOTMUCH_FEATURE_GHOSTS) {
+   /* The ghost message upgrade converts all thread_id_*
+* metadata values into ghost message documents. */
+   t_end = db->metadata_keys_end ("thread_id_");
+   for (t = db->metadata_keys_begin ("thread_id_"); t != t_end; ++t)
+   ++total;
+}

 /* Perform the upgrade in a transaction. */
 db->begin_transaction (true);
@@ -1378,10 +1386,64 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
}
 }

+/* Perform metadata upgrades. */
+
+/* Prior to NOTMUCH_FEATURE_GHOSTS, thread IDs for missing
+ * messages were stored as database metadata. Change these to
+ * ghost messages.
+ */
+if (new_features & NOTMUCH_FEATURE_GHOSTS) {
+   notmuch_message_t *message;
+   std::string message_id, thread_id;
+
+   t_end = db->metadata_keys_end (NOTMUCH_METADATA_THREAD_ID_PREFIX);
+   for (t = db->metadata_keys_begin (NOTMUCH_METADATA_THREAD_ID_PREFIX);
+t != t_end; ++t) {
+   if (do_progress_notify) {
+   progress_notify (closure, (double) count / total);
+   do_progress_notify = 0;
+   }
+
+   message_id = (*t).substr (
+   strlen (NOTMUCH_METADATA_THREAD_ID_PREFIX));
+   thread_id = db->get_metadata (*t);
+
+   /* Create ghost message */
+   message = _notmuch_message_create_for_message_id (
+   notmuch, message_id.c_str (), _status);
+   if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
+   /* Document already exists; ignore the stored thread ID */
+   } else if (private_status ==
+  NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
+   private_status = _notmuch_message_initialize_ghost (
+   message, thread_id.c_str ());
+   if (! private_status)
+   _notmuch_message_sync (message);
+   }
+
+   if (private_status) {
+   fprintf (stderr,
+"Upgrade failed while creating ghost messages.\n");
+   status = COERCE_STATUS (private_status, "Unexpected status from 
_notmuch_message_initialize_ghost");
+   goto DONE;
+   }
+
+   /* Clear saved metadata thread ID */
+   db->set_metadata (*t, "");
+
+   ++count;
+   }
+}
+
+status = NOTMUCH_STATUS_SUCCESS;
 db->set_metadata ("features", _print_features (local, notmuch->features));
 db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));

-db->commit_transaction ();
+ DONE:
+if (status == NOTMUCH_STATUS_SUCCESS)
+   db->commit_transaction ();
+else
+   db->cancel_transaction ();

 if (timer_is_active) {
/* Now stop the timer. */
@@ -1397,7 +1459,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
 }

 talloc_free (local);
-return NOTMUCH_STATUS_SUCCESS;
+return status;
 }

 notmuch_status_t
-- 
2.1.0



[PATCH v3 5/9] lib: Implement ghost-based thread linking

2014-10-23 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

This updates the thread linking code to use ghost messages instead of
user metadata to link messages into threads.

In contrast with the old approach, this is actually correct.
Previously, thread merging updated only the thread IDs of message
documents, not thread IDs stored in user metadata.  As originally
diagnosed by Mark Walters [1] and as demonstrated by the broken
T260-thread-order test, this can cause notmuch to fail to link
messages even though they're in the same thread.  In principle the old
approach could have been fixed by updating the user metadata thread
IDs as well, but these are not indexed and hence this would have
required a full scan of all stored thread IDs.  Ghost messages solve
this problem naturally by reusing the exact same thread ID and message
ID representation and indexing as regular messages.

Furthermore, thanks to this greater symmetry, ghost messages are also
algorithmically simpler.  We continue to support the old user metadata
format, so this patch can't delete any code, but when we do remove
support for the old format, several functions can simply be deleted.

[1] id:8738h7kv2q.fsf at qmul.ac.uk
---
 lib/database.cc | 99 +++--
 1 file changed, 83 insertions(+), 16 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index c641bcd..92a92d9 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1752,6 +1752,12 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
message_id);
 }

+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret);
+
 /* Find the thread ID to which the message with 'message_id' belongs.
  *
  * Note: 'thread_id_ret' must not be NULL!
@@ -1760,9 +1766,9 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
  *
  * Note: If there is no message in the database with the given
  * 'message_id' then a new thread_id will be allocated for this
- * message and stored in the database metadata, (where this same
+ * message ID and stored in the database metadata so that the
  * thread ID can be looked up if the message is added to the database
- * later).
+ * later.
  */
 static notmuch_status_t
 _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,
@@ -1770,6 +1776,49 @@ _resolve_message_id_to_thread_id (notmuch_database_t 
*notmuch,
  const char *message_id,
  const char **thread_id_ret)
 {
+notmuch_private_status_t status;
+notmuch_message_t *message;
+
+if (! (notmuch->features & NOTMUCH_FEATURE_GHOSTS))
+   return _resolve_message_id_to_thread_id_old (notmuch, ctx, message_id,
+thread_id_ret);
+
+/* Look for this message (regular or ghost) */
+message = _notmuch_message_create_for_message_id (
+   notmuch, message_id, );
+if (status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
+   /* Message exists */
+   *thread_id_ret = talloc_steal (
+   ctx, notmuch_message_get_thread_id (message));
+} else if (status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
+   /* Message did not exist.  Give it a fresh thread ID and
+* populate this message as a ghost message. */
+   *thread_id_ret = talloc_strdup (
+   ctx, _notmuch_database_generate_thread_id (notmuch));
+   if (! *thread_id_ret) {
+   status = NOTMUCH_PRIVATE_STATUS_OUT_OF_MEMORY;
+   } else {
+   status = _notmuch_message_initialize_ghost (message, 
*thread_id_ret);
+   if (status == 0)
+   /* Commit the new ghost message */
+   _notmuch_message_sync (message);
+   }
+} else {
+   /* Create failed. Fall through. */
+}
+
+notmuch_message_destroy (message);
+
+return COERCE_STATUS (status, "Error creating ghost message");
+}
+
+/* Pre-ghost messages _resolve_message_id_to_thread_id */
+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret)
+{
 notmuch_status_t status;
 notmuch_message_t *message;
 string thread_id_string;
@@ -2007,13 +2056,16 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
 }
 }

-/* Given a (mostly empty) 'message' and its corresponding
+/* Given a blank or ghost 'message' and its corresponding
  * 'message_file' link it to existing threads in the database.
  *
- * The first check is in the metadata of the database to see if we
- * have pre-allocated a thread_id in advance for this message

[PATCH v3 4/9] lib: Internal support for querying and creating ghost messages

2014-10-23 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

This updates the message abstraction to support ghost messages: it
adds a message flag that distinguishes regular messages from ghost
messages, and an internal function for initializing a newly created
(blank) message as a ghost message.
---
 lib/message.cc| 52 +--
 lib/notmuch-private.h |  4 
 lib/notmuch.h |  9 -
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 55d2ff6..a7a13cc 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -39,6 +39,9 @@ struct visible _notmuch_message {
 notmuch_message_file_t *message_file;
 notmuch_message_list_t *replies;
 unsigned long flags;
+/* For flags that are initialized on-demand, lazy_flags indicates
+ * if each flag has been initialized. */
+unsigned long lazy_flags;

 Xapian::Document doc;
 Xapian::termcount termpos;
@@ -99,6 +102,7 @@ _notmuch_message_create_for_document (const void 
*talloc_owner,

 message->frozen = 0;
 message->flags = 0;
+message->lazy_flags = 0;

 /* Each of these will be lazily created as needed. */
 message->message_id = NULL;
@@ -192,7 +196,7 @@ _notmuch_message_create (const void *talloc_owner,
  *
  * There is already a document with message ID 'message_id' in the
  * database. The returned message can be used to query/modify the
- * document.
+ * document. The message may be a ghost message.
  *
  *   NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND:
  *
@@ -305,6 +309,7 @@ _notmuch_message_ensure_metadata (notmuch_message_t 
*message)
 const char *thread_prefix = _find_prefix ("thread"),
*tag_prefix = _find_prefix ("tag"),
*id_prefix = _find_prefix ("id"),
+   *type_prefix = _find_prefix ("type"),
*filename_prefix = _find_prefix ("file-direntry"),
*replyto_prefix = _find_prefix ("replyto");

@@ -337,10 +342,25 @@ _notmuch_message_ensure_metadata (notmuch_message_t 
*message)
message->message_id =
_notmuch_message_get_term (message, i, end, id_prefix);

+/* Get document type */
+assert (strcmp (id_prefix, type_prefix) < 0);
+if (! NOTMUCH_TEST_BIT (message->lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST)) {
+   i.skip_to (type_prefix);
+   /* "T" is the prefix "type" fields.  See
+* BOOLEAN_PREFIX_INTERNAL. */
+   if (*i == "Tmail")
+   NOTMUCH_CLEAR_BIT (>flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+   else if (*i == "Tghost")
+   NOTMUCH_SET_BIT (>flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+   else
+   INTERNAL_ERROR ("Message without type term");
+   NOTMUCH_SET_BIT (>lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+}
+
 /* Get filename list.  Here we get only the terms.  We lazily
  * expand them to full file names when needed in
  * _notmuch_message_ensure_filename_list. */
-assert (strcmp (id_prefix, filename_prefix) < 0);
+assert (strcmp (type_prefix, filename_prefix) < 0);
 if (!message->filename_term_list && !message->filename_list)
message->filename_term_list =
_notmuch_database_get_terms_with_prefix (message, i, end,
@@ -371,6 +391,11 @@ _notmuch_message_invalidate_metadata (notmuch_message_t 
*message,
message->tag_list = NULL;
 }

+if (strcmp ("type", prefix_name) == 0) {
+   NOTMUCH_CLEAR_BIT (>flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+   NOTMUCH_CLEAR_BIT (>lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+}
+
 if (strcmp ("file-direntry", prefix_name) == 0) {
talloc_free (message->filename_term_list);
talloc_free (message->filename_list);
@@ -869,6 +894,10 @@ notmuch_bool_t
 notmuch_message_get_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag)
 {
+if (flag == NOTMUCH_MESSAGE_FLAG_GHOST &&
+   ! NOTMUCH_TEST_BIT (message->lazy_flags, flag))
+   _notmuch_message_ensure_metadata (message);
+
 return NOTMUCH_TEST_BIT (message->flags, flag);
 }

@@ -880,6 +909,7 @@ notmuch_message_set_flag (notmuch_message_t *message,
NOTMUCH_SET_BIT (>flags, flag);
 else
NOTMUCH_CLEAR_BIT (>flags, flag);
+NOTMUCH_SET_BIT (>lazy_flags, flag);
 }

 time_t
@@ -989,6 +1019,24 @@ _notmuch_message_delete (notmuch_message_t *message)
 return NOTMUCH_STATUS_SUCCESS;
 }

+/* Transform a blank message into a ghost message.  The caller must
+ * _notmuch_message_sync the message. */
+notmuch_private_status_t
+_notmuch_message_initialize_ghost (notmuch_message_t *message,
+  const char *thread_id)
+{
+notmuch_private_status_t status;
+
+status = _notmuch_message_add_term (message, "type", 

[PATCH v3 3/9] lib: Introduce macros for bit operations

2014-10-23 Thread Austin Clements
These macros help clarify basic bit-twiddling code and are written to
be robust against C undefined behavior of shift operators.
---
 lib/message.cc|  6 +++---
 lib/notmuch-private.h | 11 +++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 38bc929..55d2ff6 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -869,7 +869,7 @@ notmuch_bool_t
 notmuch_message_get_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag)
 {
-return message->flags & (1 << flag);
+return NOTMUCH_TEST_BIT (message->flags, flag);
 }

 void
@@ -877,9 +877,9 @@ notmuch_message_set_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag, notmuch_bool_t enable)
 {
 if (enable)
-   message->flags |= (1 << flag);
+   NOTMUCH_SET_BIT (>flags, flag);
 else
-   message->flags &= ~(1 << flag);
+   NOTMUCH_CLEAR_BIT (>flags, flag);
 }

 time_t
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 36cc12b..7250291 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -63,6 +63,17 @@ NOTMUCH_BEGIN_DECLS
 #define STRNCMP_LITERAL(var, literal) \
 strncmp ((var), (literal), sizeof (literal) - 1)

+/* Robust bit test/set/reset macros */
+#define NOTMUCH_TEST_BIT(val, bit) \
+((bit < 0 || bit >= CHAR_BIT * sizeof (unsigned long long)) ? 0\
+ : !!((val) & (1ull << bit)))
+#define NOTMUCH_SET_BIT(valp, bit) \
+((bit < 0 || bit >= CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) |= (1ull << bit)))
+#define NOTMUCH_CLEAR_BIT(valp,  bit) \
+((bit < 0 || bit >= CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) &= ~(1ull << bit)))
+
 #define unused(x) x __attribute__ ((unused))

 #ifdef __cplusplus
-- 
2.1.0



[PATCH v3 2/9] lib: Update database schema doc for ghost messages

2014-10-23 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

This describes the structure of ghost mail documents.  Ghost messages
are not yet implemented.
---
 lib/database.cc | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 8fd7fad..c641bcd 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -50,8 +50,8 @@ typedef struct {

 /* Here's the current schema for our database (for NOTMUCH_DATABASE_VERSION):
  *
- * We currently have two different types of documents (mail and
- * directory) and also some metadata.
+ * We currently have three different types of documents (mail, ghost,
+ * and directory) and also some metadata.
  *
  * Mail document
  * -
@@ -109,6 +109,15 @@ typedef struct {
  *
  * The data portion of a mail document is empty.
  *
+ * Ghost mail document [if NOTMUCH_FEATURE_GHOSTS]
+ * ---
+ * A ghost mail document is like a mail document, but where we don't
+ * have the message content.  These are used to track thread reference
+ * information for messages we haven't received.
+ *
+ * A ghost mail document has type: ghost; id and thread fields that
+ * are identical to the mail document fields; and a MESSAGE_ID value.
+ *
  * Directory document
  * --
  * A directory document is used by a client of the notmuch library to
@@ -172,6 +181,13 @@ typedef struct {
  * generated is 1 and the value will be
  * incremented for each thread ID.
  *
+ * Obsolete metadata
+ * -
+ *
+ * If ! NOTMUCH_FEATURE_GHOSTS, there are no ghost mail documents.
+ * Instead, the database has the following additional database
+ * metadata:
+ *
  * thread_id_* A pre-allocated thread ID for a particular
  * message. This is actually an arbitrarily large
  * family of metadata name. Any particular name is
-- 
2.1.0



[PATCH v3 1/9] lib: Add a ghost messages database feature

2014-10-23 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

This will be implemented over the next several patches.  The feature
is not yet "enabled" (this does not add it to
NOTMUCH_FEATURES_CURRENT).
---
 lib/database-private.h | 7 +++
 lib/database.cc| 2 ++
 2 files changed, 9 insertions(+)

diff --git a/lib/database-private.h b/lib/database-private.h
index ca0751c..e2e4bc8 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -85,6 +85,13 @@ enum _notmuch_features {
  *
  * Introduced: version 2. */
 NOTMUCH_FEATURE_BOOL_FOLDER = 1 << 3,
+
+/* If set, missing messages are stored in ghost mail documents.
+ * If unset, thread IDs of ghost messages are stored as database
+ * metadata instead of in ghost documents.
+ *
+ * Introduced: version 3. */
+NOTMUCH_FEATURE_GHOSTS = 1 << 4,
 };

 /* In C++, a named enum is its own type, so define bitwise operators
diff --git a/lib/database.cc b/lib/database.cc
index 1c6ffc5..8fd7fad 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -286,6 +286,8 @@ static const struct {
   "from/subject/message-ID in database", "w" },
 { NOTMUCH_FEATURE_BOOL_FOLDER,
   "exact folder:/path: search", "rw" },
+{ NOTMUCH_FEATURE_GHOSTS,
+  "mail documents for missing messages", "w"},
 };

 const char *
-- 
2.1.0



[PATCH v3 0/9] Add ghost messages and fix thread linking

2014-10-23 Thread Austin Clements
This is v3 of
id:1412637438-4821-1-git-send-email-aclements at csail.mit.edu.
This fixes some comments, as suggested by Mark.  There are no
code changes relative to v2.  The diff from v2 is below.

diff --git a/lib/database.cc b/lib/database.cc
index 6e51a72..3601f9d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -2121,10 +2121,13 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
 /* Given a blank or ghost 'message' and its corresponding
  * 'message_file' link it to existing threads in the database.
  *
- * The first check is in the metadata of the database to see if we
- * have pre-allocated a thread_id in advance for this message, (which
- * would have happened if a message was previously added that
- * referenced this one).
+ * First, if is_ghost, this retrieves the thread ID already stored in
+ * the message (which will be the case if a message was previously
+ * added that referenced this one).  If the message is blank
+ * (!is_ghost), it doesn't have a thread ID yet (we'll generate one
+ * later in this function).  If the database does not support ghost
+ * messages, this checks for a thread ID stored in database metadata
+ * for this message ID.
  *
  * Second, we look at 'message_file' and its link-relevant headers
  * (References and In-Reply-To) for message IDs.
@@ -2132,12 +2135,12 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
  * Finally, we look in the database for existing message that
  * reference 'message'.
  *
- * In all cases, we assign to the current message the first thread_id
- * found (through either parent or child). We will also merge any
- * existing, distinct threads where this message belongs to both,
- * (which is not uncommon when messages are processed out of order).
+ * In all cases, we assign to the current message the first thread ID
+ * found. We will also merge any existing, distinct threads where this
+ * message belongs to both, (which is not uncommon when messages are
+ * processed out of order).
  *
- * Finally, if no thread ID has been found through parent or child, we
+ * Finally, if no thread ID has been found through referenced messages, we
  * call _notmuch_message_generate_thread_id to generate a new thread
  * ID. This should only happen for new, top-level messages, (no
  * References or In-Reply-To header in this message, and no previously



[PATCH v3 0/9] Add ghost messages and fix thread linking

2014-10-23 Thread Austin Clements
This is v3 of
id:1412637438-4821-1-git-send-email-acleme...@csail.mit.edu.
This fixes some comments, as suggested by Mark.  There are no
code changes relative to v2.  The diff from v2 is below.

diff --git a/lib/database.cc b/lib/database.cc
index 6e51a72..3601f9d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -2121,10 +2121,13 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
 /* Given a blank or ghost 'message' and its corresponding
  * 'message_file' link it to existing threads in the database.
  *
- * The first check is in the metadata of the database to see if we
- * have pre-allocated a thread_id in advance for this message, (which
- * would have happened if a message was previously added that
- * referenced this one).
+ * First, if is_ghost, this retrieves the thread ID already stored in
+ * the message (which will be the case if a message was previously
+ * added that referenced this one).  If the message is blank
+ * (!is_ghost), it doesn't have a thread ID yet (we'll generate one
+ * later in this function).  If the database does not support ghost
+ * messages, this checks for a thread ID stored in database metadata
+ * for this message ID.
  *
  * Second, we look at 'message_file' and its link-relevant headers
  * (References and In-Reply-To) for message IDs.
@@ -2132,12 +2135,12 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
  * Finally, we look in the database for existing message that
  * reference 'message'.
  *
- * In all cases, we assign to the current message the first thread_id
- * found (through either parent or child). We will also merge any
- * existing, distinct threads where this message belongs to both,
- * (which is not uncommon when messages are processed out of order).
+ * In all cases, we assign to the current message the first thread ID
+ * found. We will also merge any existing, distinct threads where this
+ * message belongs to both, (which is not uncommon when messages are
+ * processed out of order).
  *
- * Finally, if no thread ID has been found through parent or child, we
+ * Finally, if no thread ID has been found through referenced messages, we
  * call _notmuch_message_generate_thread_id to generate a new thread
  * ID. This should only happen for new, top-level messages, (no
  * References or In-Reply-To header in this message, and no previously

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 3/9] lib: Introduce macros for bit operations

2014-10-23 Thread Austin Clements
These macros help clarify basic bit-twiddling code and are written to
be robust against C undefined behavior of shift operators.
---
 lib/message.cc|  6 +++---
 lib/notmuch-private.h | 11 +++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 38bc929..55d2ff6 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -869,7 +869,7 @@ notmuch_bool_t
 notmuch_message_get_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag)
 {
-return message-flags  (1  flag);
+return NOTMUCH_TEST_BIT (message-flags, flag);
 }
 
 void
@@ -877,9 +877,9 @@ notmuch_message_set_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag, notmuch_bool_t enable)
 {
 if (enable)
-   message-flags |= (1  flag);
+   NOTMUCH_SET_BIT (message-flags, flag);
 else
-   message-flags = ~(1  flag);
+   NOTMUCH_CLEAR_BIT (message-flags, flag);
 }
 
 time_t
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 36cc12b..7250291 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -63,6 +63,17 @@ NOTMUCH_BEGIN_DECLS
 #define STRNCMP_LITERAL(var, literal) \
 strncmp ((var), (literal), sizeof (literal) - 1)
 
+/* Robust bit test/set/reset macros */
+#define NOTMUCH_TEST_BIT(val, bit) \
+((bit  0 || bit = CHAR_BIT * sizeof (unsigned long long)) ? 0\
+ : !!((val)  (1ull  bit)))
+#define NOTMUCH_SET_BIT(valp, bit) \
+((bit  0 || bit = CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) |= (1ull  bit)))
+#define NOTMUCH_CLEAR_BIT(valp,  bit) \
+((bit  0 || bit = CHAR_BIT * sizeof (unsigned long long)) ? *(valp) \
+ : (*(valp) = ~(1ull  bit)))
+
 #define unused(x) x __attribute__ ((unused))
 
 #ifdef __cplusplus
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 2/9] lib: Update database schema doc for ghost messages

2014-10-23 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This describes the structure of ghost mail documents.  Ghost messages
are not yet implemented.
---
 lib/database.cc | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 8fd7fad..c641bcd 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -50,8 +50,8 @@ typedef struct {
 
 /* Here's the current schema for our database (for NOTMUCH_DATABASE_VERSION):
  *
- * We currently have two different types of documents (mail and
- * directory) and also some metadata.
+ * We currently have three different types of documents (mail, ghost,
+ * and directory) and also some metadata.
  *
  * Mail document
  * -
@@ -109,6 +109,15 @@ typedef struct {
  *
  * The data portion of a mail document is empty.
  *
+ * Ghost mail document [if NOTMUCH_FEATURE_GHOSTS]
+ * ---
+ * A ghost mail document is like a mail document, but where we don't
+ * have the message content.  These are used to track thread reference
+ * information for messages we haven't received.
+ *
+ * A ghost mail document has type: ghost; id and thread fields that
+ * are identical to the mail document fields; and a MESSAGE_ID value.
+ *
  * Directory document
  * --
  * A directory document is used by a client of the notmuch library to
@@ -172,6 +181,13 @@ typedef struct {
  * generated is 1 and the value will be
  * incremented for each thread ID.
  *
+ * Obsolete metadata
+ * -
+ *
+ * If ! NOTMUCH_FEATURE_GHOSTS, there are no ghost mail documents.
+ * Instead, the database has the following additional database
+ * metadata:
+ *
  * thread_id_* A pre-allocated thread ID for a particular
  * message. This is actually an arbitrarily large
  * family of metadata name. Any particular name is
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 1/9] lib: Add a ghost messages database feature

2014-10-23 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This will be implemented over the next several patches.  The feature
is not yet enabled (this does not add it to
NOTMUCH_FEATURES_CURRENT).
---
 lib/database-private.h | 7 +++
 lib/database.cc| 2 ++
 2 files changed, 9 insertions(+)

diff --git a/lib/database-private.h b/lib/database-private.h
index ca0751c..e2e4bc8 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -85,6 +85,13 @@ enum _notmuch_features {
  *
  * Introduced: version 2. */
 NOTMUCH_FEATURE_BOOL_FOLDER = 1  3,
+
+/* If set, missing messages are stored in ghost mail documents.
+ * If unset, thread IDs of ghost messages are stored as database
+ * metadata instead of in ghost documents.
+ *
+ * Introduced: version 3. */
+NOTMUCH_FEATURE_GHOSTS = 1  4,
 };
 
 /* In C++, a named enum is its own type, so define bitwise operators
diff --git a/lib/database.cc b/lib/database.cc
index 1c6ffc5..8fd7fad 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -286,6 +286,8 @@ static const struct {
   from/subject/message-ID in database, w },
 { NOTMUCH_FEATURE_BOOL_FOLDER,
   exact folder:/path: search, rw },
+{ NOTMUCH_FEATURE_GHOSTS,
+  mail documents for missing messages, w},
 };
 
 const char *
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 6/9] lib: Implement upgrade to ghost messages feature

2014-10-23 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

Somehow this is the first upgrade pass that actually does *any* error
checking, so this also adds the bit of necessary infrastructure to
handle that.
---
 lib/database.cc | 66 +++--
 1 file changed, 64 insertions(+), 2 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 92a92d9..b673718 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1231,6 +1231,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
 notmuch_bool_t timer_is_active = FALSE;
 enum _notmuch_features target_features, new_features;
 notmuch_status_t status;
+notmuch_private_status_t private_status;
 unsigned int count = 0, total = 0;
 
 status = _notmuch_database_ensure_writable (notmuch);
@@ -1275,6 +1276,13 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
for (t = db-allterms_begin (XTIMESTAMP); t != t_end; t++)
++total;
 }
+if (new_features  NOTMUCH_FEATURE_GHOSTS) {
+   /* The ghost message upgrade converts all thread_id_*
+* metadata values into ghost message documents. */
+   t_end = db-metadata_keys_end (thread_id_);
+   for (t = db-metadata_keys_begin (thread_id_); t != t_end; ++t)
+   ++total;
+}
 
 /* Perform the upgrade in a transaction. */
 db-begin_transaction (true);
@@ -1378,10 +1386,64 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
}
 }
 
+/* Perform metadata upgrades. */
+
+/* Prior to NOTMUCH_FEATURE_GHOSTS, thread IDs for missing
+ * messages were stored as database metadata. Change these to
+ * ghost messages.
+ */
+if (new_features  NOTMUCH_FEATURE_GHOSTS) {
+   notmuch_message_t *message;
+   std::string message_id, thread_id;
+
+   t_end = db-metadata_keys_end (NOTMUCH_METADATA_THREAD_ID_PREFIX);
+   for (t = db-metadata_keys_begin (NOTMUCH_METADATA_THREAD_ID_PREFIX);
+t != t_end; ++t) {
+   if (do_progress_notify) {
+   progress_notify (closure, (double) count / total);
+   do_progress_notify = 0;
+   }
+
+   message_id = (*t).substr (
+   strlen (NOTMUCH_METADATA_THREAD_ID_PREFIX));
+   thread_id = db-get_metadata (*t);
+
+   /* Create ghost message */
+   message = _notmuch_message_create_for_message_id (
+   notmuch, message_id.c_str (), private_status);
+   if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
+   /* Document already exists; ignore the stored thread ID */
+   } else if (private_status ==
+  NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
+   private_status = _notmuch_message_initialize_ghost (
+   message, thread_id.c_str ());
+   if (! private_status)
+   _notmuch_message_sync (message);
+   }
+
+   if (private_status) {
+   fprintf (stderr,
+Upgrade failed while creating ghost messages.\n);
+   status = COERCE_STATUS (private_status, Unexpected status from 
_notmuch_message_initialize_ghost);
+   goto DONE;
+   }
+
+   /* Clear saved metadata thread ID */
+   db-set_metadata (*t, );
+
+   ++count;
+   }
+}
+
+status = NOTMUCH_STATUS_SUCCESS;
 db-set_metadata (features, _print_features (local, notmuch-features));
 db-set_metadata (version, STRINGIFY (NOTMUCH_DATABASE_VERSION));
 
-db-commit_transaction ();
+ DONE:
+if (status == NOTMUCH_STATUS_SUCCESS)
+   db-commit_transaction ();
+else
+   db-cancel_transaction ();
 
 if (timer_is_active) {
/* Now stop the timer. */
@@ -1397,7 +1459,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
 }
 
 talloc_free (local);
-return NOTMUCH_STATUS_SUCCESS;
+return status;
 }
 
 notmuch_status_t
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 8/9] test: Test upgrade to ghost messages feature

2014-10-23 Thread Austin Clements
---
 test/T530-upgrade.sh | 21 +
 1 file changed, 21 insertions(+)

diff --git a/test/T530-upgrade.sh b/test/T530-upgrade.sh
index c4c4ac8..6b42a69 100755
--- a/test/T530-upgrade.sh
+++ b/test/T530-upgrade.sh
@@ -116,4 +116,25 @@ MAIL_DIR/bar/new/21:2,
 MAIL_DIR/bar/new/22:2,
 MAIL_DIR/cur/51:2,
 
+# Ghost messages are difficult to test since they're nearly invisible.
+# However, if the upgrade works correctly, the ghost message should
+# retain the right thread ID even if all of the original messages in
+# the thread are deleted.  That's what we test.  This won't detect if
+# the upgrade just plain didn't happen, but it should detect if
+# something went wrong.
+test_begin_subtest ghost message retains thread ID
+# Upgrade database
+notmuch new
+# Get thread ID of real message
+thread=$(notmuch search --output=threads id:4efc743a.3060...@april.org)
+# Delete all real messages in that thread
+rm $(notmuch search --output=files $thread)
+notmuch new
+# Deliver ghost message
+add_message '[subject]=Ghost' '[id]=4efc3931.6030...@april.org'
+# If the ghost upgrade worked, the new message should be attached to
+# the existing thread ID.
+nthread=$(notmuch search --output=threads id:4efc3931.6030...@april.org)
+test_expect_equal $thread $nthread
+
 test_done
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 4/9] lib: Internal support for querying and creating ghost messages

2014-10-23 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This updates the message abstraction to support ghost messages: it
adds a message flag that distinguishes regular messages from ghost
messages, and an internal function for initializing a newly created
(blank) message as a ghost message.
---
 lib/message.cc| 52 +--
 lib/notmuch-private.h |  4 
 lib/notmuch.h |  9 -
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 55d2ff6..a7a13cc 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -39,6 +39,9 @@ struct visible _notmuch_message {
 notmuch_message_file_t *message_file;
 notmuch_message_list_t *replies;
 unsigned long flags;
+/* For flags that are initialized on-demand, lazy_flags indicates
+ * if each flag has been initialized. */
+unsigned long lazy_flags;
 
 Xapian::Document doc;
 Xapian::termcount termpos;
@@ -99,6 +102,7 @@ _notmuch_message_create_for_document (const void 
*talloc_owner,
 
 message-frozen = 0;
 message-flags = 0;
+message-lazy_flags = 0;
 
 /* Each of these will be lazily created as needed. */
 message-message_id = NULL;
@@ -192,7 +196,7 @@ _notmuch_message_create (const void *talloc_owner,
  *
  * There is already a document with message ID 'message_id' in the
  * database. The returned message can be used to query/modify the
- * document.
+ * document. The message may be a ghost message.
  *
  *   NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND:
  *
@@ -305,6 +309,7 @@ _notmuch_message_ensure_metadata (notmuch_message_t 
*message)
 const char *thread_prefix = _find_prefix (thread),
*tag_prefix = _find_prefix (tag),
*id_prefix = _find_prefix (id),
+   *type_prefix = _find_prefix (type),
*filename_prefix = _find_prefix (file-direntry),
*replyto_prefix = _find_prefix (replyto);
 
@@ -337,10 +342,25 @@ _notmuch_message_ensure_metadata (notmuch_message_t 
*message)
message-message_id =
_notmuch_message_get_term (message, i, end, id_prefix);
 
+/* Get document type */
+assert (strcmp (id_prefix, type_prefix)  0);
+if (! NOTMUCH_TEST_BIT (message-lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST)) {
+   i.skip_to (type_prefix);
+   /* T is the prefix type fields.  See
+* BOOLEAN_PREFIX_INTERNAL. */
+   if (*i == Tmail)
+   NOTMUCH_CLEAR_BIT (message-flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+   else if (*i == Tghost)
+   NOTMUCH_SET_BIT (message-flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+   else
+   INTERNAL_ERROR (Message without type term);
+   NOTMUCH_SET_BIT (message-lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+}
+
 /* Get filename list.  Here we get only the terms.  We lazily
  * expand them to full file names when needed in
  * _notmuch_message_ensure_filename_list. */
-assert (strcmp (id_prefix, filename_prefix)  0);
+assert (strcmp (type_prefix, filename_prefix)  0);
 if (!message-filename_term_list  !message-filename_list)
message-filename_term_list =
_notmuch_database_get_terms_with_prefix (message, i, end,
@@ -371,6 +391,11 @@ _notmuch_message_invalidate_metadata (notmuch_message_t 
*message,
message-tag_list = NULL;
 }
 
+if (strcmp (type, prefix_name) == 0) {
+   NOTMUCH_CLEAR_BIT (message-flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+   NOTMUCH_CLEAR_BIT (message-lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST);
+}
+
 if (strcmp (file-direntry, prefix_name) == 0) {
talloc_free (message-filename_term_list);
talloc_free (message-filename_list);
@@ -869,6 +894,10 @@ notmuch_bool_t
 notmuch_message_get_flag (notmuch_message_t *message,
  notmuch_message_flag_t flag)
 {
+if (flag == NOTMUCH_MESSAGE_FLAG_GHOST 
+   ! NOTMUCH_TEST_BIT (message-lazy_flags, flag))
+   _notmuch_message_ensure_metadata (message);
+
 return NOTMUCH_TEST_BIT (message-flags, flag);
 }
 
@@ -880,6 +909,7 @@ notmuch_message_set_flag (notmuch_message_t *message,
NOTMUCH_SET_BIT (message-flags, flag);
 else
NOTMUCH_CLEAR_BIT (message-flags, flag);
+NOTMUCH_SET_BIT (message-lazy_flags, flag);
 }
 
 time_t
@@ -989,6 +1019,24 @@ _notmuch_message_delete (notmuch_message_t *message)
 return NOTMUCH_STATUS_SUCCESS;
 }
 
+/* Transform a blank message into a ghost message.  The caller must
+ * _notmuch_message_sync the message. */
+notmuch_private_status_t
+_notmuch_message_initialize_ghost (notmuch_message_t *message,
+  const char *thread_id)
+{
+notmuch_private_status_t status;
+
+status = _notmuch_message_add_term (message, type, ghost);
+if (status)
+   return status;
+status = _notmuch_message_add_term (message, thread, thread_id);
+if (status)
+   return status;
+
+return NOTMUCH_PRIVATE_STATUS_SUCCESS;
+}
+
 /* Ensure

[PATCH v3 5/9] lib: Implement ghost-based thread linking

2014-10-23 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This updates the thread linking code to use ghost messages instead of
user metadata to link messages into threads.

In contrast with the old approach, this is actually correct.
Previously, thread merging updated only the thread IDs of message
documents, not thread IDs stored in user metadata.  As originally
diagnosed by Mark Walters [1] and as demonstrated by the broken
T260-thread-order test, this can cause notmuch to fail to link
messages even though they're in the same thread.  In principle the old
approach could have been fixed by updating the user metadata thread
IDs as well, but these are not indexed and hence this would have
required a full scan of all stored thread IDs.  Ghost messages solve
this problem naturally by reusing the exact same thread ID and message
ID representation and indexing as regular messages.

Furthermore, thanks to this greater symmetry, ghost messages are also
algorithmically simpler.  We continue to support the old user metadata
format, so this patch can't delete any code, but when we do remove
support for the old format, several functions can simply be deleted.

[1] id:8738h7kv2q@qmul.ac.uk
---
 lib/database.cc | 99 +++--
 1 file changed, 83 insertions(+), 16 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index c641bcd..92a92d9 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1752,6 +1752,12 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
message_id);
 }
 
+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret);
+
 /* Find the thread ID to which the message with 'message_id' belongs.
  *
  * Note: 'thread_id_ret' must not be NULL!
@@ -1760,9 +1766,9 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
  *
  * Note: If there is no message in the database with the given
  * 'message_id' then a new thread_id will be allocated for this
- * message and stored in the database metadata, (where this same
+ * message ID and stored in the database metadata so that the
  * thread ID can be looked up if the message is added to the database
- * later).
+ * later.
  */
 static notmuch_status_t
 _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,
@@ -1770,6 +1776,49 @@ _resolve_message_id_to_thread_id (notmuch_database_t 
*notmuch,
  const char *message_id,
  const char **thread_id_ret)
 {
+notmuch_private_status_t status;
+notmuch_message_t *message;
+
+if (! (notmuch-features  NOTMUCH_FEATURE_GHOSTS))
+   return _resolve_message_id_to_thread_id_old (notmuch, ctx, message_id,
+thread_id_ret);
+
+/* Look for this message (regular or ghost) */
+message = _notmuch_message_create_for_message_id (
+   notmuch, message_id, status);
+if (status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
+   /* Message exists */
+   *thread_id_ret = talloc_steal (
+   ctx, notmuch_message_get_thread_id (message));
+} else if (status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
+   /* Message did not exist.  Give it a fresh thread ID and
+* populate this message as a ghost message. */
+   *thread_id_ret = talloc_strdup (
+   ctx, _notmuch_database_generate_thread_id (notmuch));
+   if (! *thread_id_ret) {
+   status = NOTMUCH_PRIVATE_STATUS_OUT_OF_MEMORY;
+   } else {
+   status = _notmuch_message_initialize_ghost (message, 
*thread_id_ret);
+   if (status == 0)
+   /* Commit the new ghost message */
+   _notmuch_message_sync (message);
+   }
+} else {
+   /* Create failed. Fall through. */
+}
+
+notmuch_message_destroy (message);
+
+return COERCE_STATUS (status, Error creating ghost message);
+}
+
+/* Pre-ghost messages _resolve_message_id_to_thread_id */
+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret)
+{
 notmuch_status_t status;
 notmuch_message_t *message;
 string thread_id_string;
@@ -2007,13 +2056,16 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
 }
 }
 
-/* Given a (mostly empty) 'message' and its corresponding
+/* Given a blank or ghost 'message' and its corresponding
  * 'message_file' link it to existing threads in the database.
  *
- * The first check is in the metadata of the database to see if we
- * have pre-allocated a thread_id in advance for this message, (which
- * would have

[PATCH v3 7/9] lib: Enable ghost messages feature

2014-10-23 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This fixes the broken thread order test.
---
 lib/database-private.h| 2 +-
 test/T260-thread-order.sh | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index e2e4bc8..15e03cc 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -166,7 +166,7 @@ struct _notmuch_database {
  * databases will have it). */
 #define NOTMUCH_FEATURES_CURRENT \
 (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
- NOTMUCH_FEATURE_BOOL_FOLDER)
+ NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)
 
 /* Return the list of terms from the given iterator matching a prefix.
  * The prefix will be stripped from the strings in the returned list.
diff --git a/test/T260-thread-order.sh b/test/T260-thread-order.sh
index b435d79..99f5833 100755
--- a/test/T260-thread-order.sh
+++ b/test/T260-thread-order.sh
@@ -30,7 +30,6 @@ expected=$(for ((i = 0; i  $nthreads; i++)); do
 test_expect_equal $output $expected
 
 test_begin_subtest Messages with all parents get linked in all delivery 
orders
-test_subtest_known_broken
 # Here we do the same thing as the previous test, but each message
 # references all of its parents.  Since every message references the
 # root of the thread, each thread should always be fully joined.  This
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 9/9] lib: Remove unnecessary thread linking steps when using ghost messages

2014-10-23 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

Previously, it was necessary to link new messages to children to work
around some (though not all) problems with the old metadata-based
approach to stored thread IDs.  With ghost messages, this is no longer
necessary, so don't bother with child linking when ghost messages are
in use.
---
 lib/database.cc | 29 +
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index b673718..3601f9d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -2136,11 +2136,11 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
  * reference 'message'.
  *
  * In all cases, we assign to the current message the first thread ID
- * found (through either parent or child). We will also merge any
- * existing, distinct threads where this message belongs to both,
- * (which is not uncommon when messages are processed out of order).
+ * found. We will also merge any existing, distinct threads where this
+ * message belongs to both, (which is not uncommon when messages are
+ * processed out of order).
  *
- * Finally, if no thread ID has been found through parent or child, we
+ * Finally, if no thread ID has been found through referenced messages, we
  * call _notmuch_message_generate_thread_id to generate a new thread
  * ID. This should only happen for new, top-level messages, (no
  * References or In-Reply-To header in this message, and no previously
@@ -2172,10 +2172,23 @@ _notmuch_database_link_message (notmuch_database_t 
*notmuch,
 if (status)
goto DONE;
 
-status = _notmuch_database_link_message_to_children (notmuch, message,
-thread_id);
-if (status)
-   goto DONE;
+if (! (notmuch-features  NOTMUCH_FEATURE_GHOSTS)) {
+   /* In general, it shouldn't be necessary to link children,
+* since the earlier indexing of those children will have
+* stored a thread ID for the missing parent.  However, prior
+* to ghost messages, these stored thread IDs were NOT
+* rewritten during thread merging (and there was no
+* performant way to do so), so if indexed children were
+* pulled into a different thread ID by a merge, it was
+* necessary to pull them *back* into the stored thread ID of
+* the parent.  With ghost messages, we just rewrite the
+* stored thread IDs during merging, so this workaround isn't
+* necessary. */
+   status = _notmuch_database_link_message_to_children (notmuch, message,
+thread_id);
+   if (status)
+   goto DONE;
+}
 
 /* If not part of any existing thread, generate a new thread ID. */
 if (thread_id == NULL) {
-- 
2.1.0

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 12/12] lib: Remove unnecessary thread linking steps when using ghost messages

2014-10-21 Thread Austin Clements
Quoth Mark Walters on Oct 22 at 12:17 am:
> On Tue, 07 Oct 2014, Austin Clements  wrote:
> > From: Austin Clements 
> >
> > Previously, it was necessary to link new messages to children to work
> > around some (though not all) problems with the old metadata-based
> > approach to stored thread IDs.  With ghost messages, this is no longer
> > necessary, so don't bother with child linking when ghost messages are
> > in use.
> > ---
> >  lib/database.cc | 21 +
> >  1 file changed, 17 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index 1316529..6e51a72 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -2169,10 +2169,23 @@ _notmuch_database_link_message (notmuch_database_t 
> > *notmuch,
> >  if (status)
> > goto DONE;
> >  
> > -status = _notmuch_database_link_message_to_children (notmuch, message,
> > -_id);
> > -if (status)
> > -   goto DONE;
> > +if (! (notmuch->features & NOTMUCH_FEATURE_GHOSTS)) {
> > +   /* In general, it shouldn't be necessary to link children,
> > +* since the earlier indexing of those children will have
> > +* stored a thread ID for the missing parent.  However, prior
> > +* to ghost messages, these stored thread IDs were NOT
> > +* rewritten during thread merging (and there was no
> > +* performant way to do so), so if indexed children were
> > +* pulled into a different thread ID by a merge, it was
> > +* necessary to pull them *back* into the stored thread ID of
> > +* the parent.  With ghost messages, we just rewrite the
> > +* stored thread IDs during merging, so this workaround isn't
> > +* necessary. */
> > +   status = _notmuch_database_link_message_to_children (notmuch, message,
> > +_id);
> > +   if (status)
> > +   goto DONE;
> > +}
> 
> Ok so this basically answers my earlier comment. It might be worth
> updating the big comment at the start of the function to match the new
> code though.

Like so?

diff --git a/lib/database.cc b/lib/database.cc
index d063ec8..3601f9d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -2136,11 +2136,11 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
  * reference 'message'.
  *
  * In all cases, we assign to the current message the first thread ID
- * found (through either parent or child). We will also merge any
- * existing, distinct threads where this message belongs to both,
- * (which is not uncommon when messages are processed out of order).
+ * found. We will also merge any existing, distinct threads where this
+ * message belongs to both, (which is not uncommon when messages are
+ * processed out of order).
  *
- * Finally, if no thread ID has been found through parent or child, we
+ * Finally, if no thread ID has been found through referenced messages, we
  * call _notmuch_message_generate_thread_id to generate a new thread
  * ID. This should only happen for new, top-level messages, (no
  * References or In-Reply-To header in this message, and no previously


> Best wishes
> 
> Mark
> 
> >  
> >  /* If not part of any existing thread, generate a new thread ID. */
> >  if (thread_id == NULL) {
> >
> > ___
> > notmuch mailing list
> > notmuch at notmuchmail.org
> > http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v2 08/12] lib: Implement ghost-based thread linking

2014-10-21 Thread Austin Clements
Quoth Mark Walters on Oct 22 at 12:10 am:
> On Tue, 07 Oct 2014, Austin Clements  wrote:
> > From: Austin Clements 
> >
> > This updates the thread linking code to use ghost messages instead of
> > user metadata to link messages into threads.
> >
> > In contrast with the old approach, this is actually correct.
> > Previously, thread merging updated only the thread IDs of message
> > documents, not thread IDs stored in user metadata.  As originally
> > diagnosed by Mark Walters [1] and as demonstrated by the broken
> > T260-thread-order test, this can cause notmuch to fail to link
> > messages even though they're in the same thread.  In principle the old
> > approach could have been fixed by updating the user metadata thread
> > IDs as well, but these are not indexed and hence this would have
> > required a full scan of all stored thread IDs.  Ghost messages solve
> > this problem naturally by reusing the exact same thread ID and message
> > ID representation and indexing as regular messages.
> >
> > Furthermore, thanks to this greater symmetry, ghost messages are also
> > algorithmically simpler.  We continue to support the old user metadata
> > format, so this patch can't delete any code, but when we do remove
> > support for the old format, several functions can simply be deleted.
> >
> > [1] id:8738h7kv2q.fsf at qmul.ac.uk
> > ---
> >  lib/database.cc | 86 
> > +
> >  1 file changed, 75 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index c641bcd..fdcc526 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -1752,6 +1752,12 @@ _get_metadata_thread_id_key (void *ctx, const char 
> > *message_id)
> > message_id);
> >  }
> >  
> > +static notmuch_status_t
> > +_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
> > + void *ctx,
> > + const char *message_id,
> > + const char **thread_id_ret);
> > +
> >  /* Find the thread ID to which the message with 'message_id' belongs.
> >   *
> >   * Note: 'thread_id_ret' must not be NULL!
> > @@ -1760,9 +1766,9 @@ _get_metadata_thread_id_key (void *ctx, const char 
> > *message_id)
> >   *
> >   * Note: If there is no message in the database with the given
> >   * 'message_id' then a new thread_id will be allocated for this
> > - * message and stored in the database metadata, (where this same
> > + * message ID and stored in the database metadata so that the
> >   * thread ID can be looked up if the message is added to the database
> > - * later).
> > + * later.
> >   */
> >  static notmuch_status_t
> >  _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,
> > @@ -1770,6 +1776,49 @@ _resolve_message_id_to_thread_id (notmuch_database_t 
> > *notmuch,
> >   const char *message_id,
> >   const char **thread_id_ret)
> >  {
> > +notmuch_private_status_t status;
> > +notmuch_message_t *message;
> > +
> > +if (! (notmuch->features & NOTMUCH_FEATURE_GHOSTS))
> > +   return _resolve_message_id_to_thread_id_old (notmuch, ctx, message_id,
> > +thread_id_ret);
> > +
> > +/* Look for this message (regular or ghost) */
> > +message = _notmuch_message_create_for_message_id (
> > +   notmuch, message_id, );
> > +if (status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
> > +   /* Message exists */
> > +   *thread_id_ret = talloc_steal (
> > +   ctx, notmuch_message_get_thread_id (message));
> > +} else if (status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
> > +   /* Message did not exist.  Give it a fresh thread ID and
> > +* populate this message as a ghost message. */
> > +   *thread_id_ret = talloc_strdup (
> > +   ctx, _notmuch_database_generate_thread_id (notmuch));
> > +   if (! *thread_id_ret) {
> > +   status = NOTMUCH_PRIVATE_STATUS_OUT_OF_MEMORY;
> > +   } else {
> > +   status = _notmuch_message_initialize_ghost (message, 
> > *thread_id_ret);
> > +   if (status == 0)
> > +   /* Commit the new ghost message */
> > +   _notmuch_message_sync (message);
> > +   }
> > +} else {
> > +   /* Create failed. Fall through. */
> > +}
> > +
> > +notmuch_m

[PATCH v2 07/12] lib: Internal support for querying and creating ghost messages

2014-10-21 Thread Austin Clements
Quoth Mark Walters on Oct 22 at 12:05 am:
> 
> Hi 
> 
> I am slowly working my way through this series: only two trivial queries
> so far.
> 
> On Tue, 07 Oct 2014, Austin Clements  wrote:
> > From: Austin Clements 
> >
> > This updates the message abstraction to support ghost messages: it
> > adds a message flag that distinguishes regular messages from ghost
> > messages, and an internal function for initializing a newly created
> > (blank) message as a ghost message.
> > ---
> >  lib/message.cc| 52 
> > +--
> >  lib/notmuch-private.h |  4 
> >  lib/notmuch.h |  9 -
> >  3 files changed, 62 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/message.cc b/lib/message.cc
> > index 55d2ff6..a7a13cc 100644
> > --- a/lib/message.cc
> > +++ b/lib/message.cc
> > @@ -39,6 +39,9 @@ struct visible _notmuch_message {
> >  notmuch_message_file_t *message_file;
> >  notmuch_message_list_t *replies;
> >  unsigned long flags;
> > +/* For flags that are initialized on-demand, lazy_flags indicates
> > + * if each flag has been initialized. */
> > +unsigned long lazy_flags;
> 
> I wonder if valid_flags might be better here as, as far as I can see,
> the reason for these is so we can invalidate a flag more than an
> optimisation (which is what I thought the lazy implied).

I do think of this as an optimization.  If we were to compute the
value of this flag when a message was created (and keep it
up-to-date), there would be no need for lazy_flags.  But, unlike the
other flags, computing this is expensive.

> >  
> >  Xapian::Document doc;
> >  Xapian::termcount termpos;
> > @@ -99,6 +102,7 @@ _notmuch_message_create_for_document (const void 
> > *talloc_owner,
> >  
> >  message->frozen = 0;
> >  message->flags = 0;
> > +message->lazy_flags = 0;
> >  
> >  /* Each of these will be lazily created as needed. */
> >  message->message_id = NULL;
> > @@ -192,7 +196,7 @@ _notmuch_message_create (const void *talloc_owner,
> >   *
> >   * There is already a document with message ID 'message_id' in the
> >   * database. The returned message can be used to query/modify the
> > - * document.
> > + * document. The message may be a ghost message.
> >   *
> >   *   NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND:
> >   *
> > @@ -305,6 +309,7 @@ _notmuch_message_ensure_metadata (notmuch_message_t 
> > *message)
> >  const char *thread_prefix = _find_prefix ("thread"),
> > *tag_prefix = _find_prefix ("tag"),
> > *id_prefix = _find_prefix ("id"),
> > +   *type_prefix = _find_prefix ("type"),
> > *filename_prefix = _find_prefix ("file-direntry"),
> > *replyto_prefix = _find_prefix ("replyto");
> >  
> > @@ -337,10 +342,25 @@ _notmuch_message_ensure_metadata (notmuch_message_t 
> > *message)
> > message->message_id =
> > _notmuch_message_get_term (message, i, end, id_prefix);
> >  
> > +/* Get document type */
> > +assert (strcmp (id_prefix, type_prefix) < 0);
> > +if (! NOTMUCH_TEST_BIT (message->lazy_flags, 
> > NOTMUCH_MESSAGE_FLAG_GHOST)) {
> > +   i.skip_to (type_prefix);
> > +   /* "T" is the prefix "type" fields.  See
> > +* BOOLEAN_PREFIX_INTERNAL. */
> > +   if (*i == "Tmail")
> > +   NOTMUCH_CLEAR_BIT (>flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +   else if (*i == "Tghost")
> > +   NOTMUCH_SET_BIT (>flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +   else
> > +   INTERNAL_ERROR ("Message without type term");
> > +   NOTMUCH_SET_BIT (>lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +}
> > +
> >  /* Get filename list.  Here we get only the terms.  We lazily
> >   * expand them to full file names when needed in
> >   * _notmuch_message_ensure_filename_list. */
> > -assert (strcmp (id_prefix, filename_prefix) < 0);
> > +assert (strcmp (type_prefix, filename_prefix) < 0);
> >  if (!message->filename_term_list && !message->filename_list)
> > message->filename_term_list =
> > _notmuch_database_get_terms_with_prefix (message, i, end,
> > @@ -371,6 +391,11 @@ _notmuch_message_invalidate_metadata 
> > (notmuch_message_t *message,
> > message->tag_list = NULL;
> >  }
> >  
> >

Re: [PATCH v2 08/12] lib: Implement ghost-based thread linking

2014-10-21 Thread Austin Clements
Quoth Mark Walters on Oct 22 at 12:10 am:
 On Tue, 07 Oct 2014, Austin Clements acleme...@csail.mit.edu wrote:
  From: Austin Clements amdra...@mit.edu
 
  This updates the thread linking code to use ghost messages instead of
  user metadata to link messages into threads.
 
  In contrast with the old approach, this is actually correct.
  Previously, thread merging updated only the thread IDs of message
  documents, not thread IDs stored in user metadata.  As originally
  diagnosed by Mark Walters [1] and as demonstrated by the broken
  T260-thread-order test, this can cause notmuch to fail to link
  messages even though they're in the same thread.  In principle the old
  approach could have been fixed by updating the user metadata thread
  IDs as well, but these are not indexed and hence this would have
  required a full scan of all stored thread IDs.  Ghost messages solve
  this problem naturally by reusing the exact same thread ID and message
  ID representation and indexing as regular messages.
 
  Furthermore, thanks to this greater symmetry, ghost messages are also
  algorithmically simpler.  We continue to support the old user metadata
  format, so this patch can't delete any code, but when we do remove
  support for the old format, several functions can simply be deleted.
 
  [1] id:8738h7kv2q@qmul.ac.uk
  ---
   lib/database.cc | 86 
  +
   1 file changed, 75 insertions(+), 11 deletions(-)
 
  diff --git a/lib/database.cc b/lib/database.cc
  index c641bcd..fdcc526 100644
  --- a/lib/database.cc
  +++ b/lib/database.cc
  @@ -1752,6 +1752,12 @@ _get_metadata_thread_id_key (void *ctx, const char 
  *message_id)
  message_id);
   }
   
  +static notmuch_status_t
  +_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
  + void *ctx,
  + const char *message_id,
  + const char **thread_id_ret);
  +
   /* Find the thread ID to which the message with 'message_id' belongs.
*
* Note: 'thread_id_ret' must not be NULL!
  @@ -1760,9 +1766,9 @@ _get_metadata_thread_id_key (void *ctx, const char 
  *message_id)
*
* Note: If there is no message in the database with the given
* 'message_id' then a new thread_id will be allocated for this
  - * message and stored in the database metadata, (where this same
  + * message ID and stored in the database metadata so that the
* thread ID can be looked up if the message is added to the database
  - * later).
  + * later.
*/
   static notmuch_status_t
   _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,
  @@ -1770,6 +1776,49 @@ _resolve_message_id_to_thread_id (notmuch_database_t 
  *notmuch,
const char *message_id,
const char **thread_id_ret)
   {
  +notmuch_private_status_t status;
  +notmuch_message_t *message;
  +
  +if (! (notmuch-features  NOTMUCH_FEATURE_GHOSTS))
  +   return _resolve_message_id_to_thread_id_old (notmuch, ctx, message_id,
  +thread_id_ret);
  +
  +/* Look for this message (regular or ghost) */
  +message = _notmuch_message_create_for_message_id (
  +   notmuch, message_id, status);
  +if (status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
  +   /* Message exists */
  +   *thread_id_ret = talloc_steal (
  +   ctx, notmuch_message_get_thread_id (message));
  +} else if (status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
  +   /* Message did not exist.  Give it a fresh thread ID and
  +* populate this message as a ghost message. */
  +   *thread_id_ret = talloc_strdup (
  +   ctx, _notmuch_database_generate_thread_id (notmuch));
  +   if (! *thread_id_ret) {
  +   status = NOTMUCH_PRIVATE_STATUS_OUT_OF_MEMORY;
  +   } else {
  +   status = _notmuch_message_initialize_ghost (message, 
  *thread_id_ret);
  +   if (status == 0)
  +   /* Commit the new ghost message */
  +   _notmuch_message_sync (message);
  +   }
  +} else {
  +   /* Create failed. Fall through. */
  +}
  +
  +notmuch_message_destroy (message);
  +
  +return COERCE_STATUS (status, Error creating ghost message);
  +}
  +
  +/* Pre-ghost messages _resolve_message_id_to_thread_id */
  +static notmuch_status_t
  +_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
  + void *ctx,
  + const char *message_id,
  + const char **thread_id_ret)
  +{
   notmuch_status_t status;
   notmuch_message_t *message;
   string thread_id_string;
  @@ -2007,7 +2056,7 @@ _consume_metadata_thread_id (void *ctx, 
  notmuch_database_t *notmuch,
   }
   }
   
  -/* Given a (mostly empty) 'message' and its corresponding
  +/* Given a blank or ghost 'message' and its

Re: [PATCH v2 12/12] lib: Remove unnecessary thread linking steps when using ghost messages

2014-10-21 Thread Austin Clements
Quoth Mark Walters on Oct 22 at 12:17 am:
 On Tue, 07 Oct 2014, Austin Clements acleme...@csail.mit.edu wrote:
  From: Austin Clements amdra...@mit.edu
 
  Previously, it was necessary to link new messages to children to work
  around some (though not all) problems with the old metadata-based
  approach to stored thread IDs.  With ghost messages, this is no longer
  necessary, so don't bother with child linking when ghost messages are
  in use.
  ---
   lib/database.cc | 21 +
   1 file changed, 17 insertions(+), 4 deletions(-)
 
  diff --git a/lib/database.cc b/lib/database.cc
  index 1316529..6e51a72 100644
  --- a/lib/database.cc
  +++ b/lib/database.cc
  @@ -2169,10 +2169,23 @@ _notmuch_database_link_message (notmuch_database_t 
  *notmuch,
   if (status)
  goto DONE;
   
  -status = _notmuch_database_link_message_to_children (notmuch, message,
  -thread_id);
  -if (status)
  -   goto DONE;
  +if (! (notmuch-features  NOTMUCH_FEATURE_GHOSTS)) {
  +   /* In general, it shouldn't be necessary to link children,
  +* since the earlier indexing of those children will have
  +* stored a thread ID for the missing parent.  However, prior
  +* to ghost messages, these stored thread IDs were NOT
  +* rewritten during thread merging (and there was no
  +* performant way to do so), so if indexed children were
  +* pulled into a different thread ID by a merge, it was
  +* necessary to pull them *back* into the stored thread ID of
  +* the parent.  With ghost messages, we just rewrite the
  +* stored thread IDs during merging, so this workaround isn't
  +* necessary. */
  +   status = _notmuch_database_link_message_to_children (notmuch, message,
  +thread_id);
  +   if (status)
  +   goto DONE;
  +}
 
 Ok so this basically answers my earlier comment. It might be worth
 updating the big comment at the start of the function to match the new
 code though.

Like so?

diff --git a/lib/database.cc b/lib/database.cc
index d063ec8..3601f9d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -2136,11 +2136,11 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
  * reference 'message'.
  *
  * In all cases, we assign to the current message the first thread ID
- * found (through either parent or child). We will also merge any
- * existing, distinct threads where this message belongs to both,
- * (which is not uncommon when messages are processed out of order).
+ * found. We will also merge any existing, distinct threads where this
+ * message belongs to both, (which is not uncommon when messages are
+ * processed out of order).
  *
- * Finally, if no thread ID has been found through parent or child, we
+ * Finally, if no thread ID has been found through referenced messages, we
  * call _notmuch_message_generate_thread_id to generate a new thread
  * ID. This should only happen for new, top-level messages, (no
  * References or In-Reply-To header in this message, and no previously


 Best wishes
 
 Mark
 
   
   /* If not part of any existing thread, generate a new thread ID. */
   if (thread_id == NULL) {
 
  ___
  notmuch mailing list
  notmuch@notmuchmail.org
  http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[WIP PATCH 4/4] lib: Add "lastmod:" queries for filtering by last modification

2014-10-13 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

XXX Includes reference to notmuch search --db-revision, which doesn't
exist.
---
 doc/man7/notmuch-search-terms.rst | 8 
 lib/database-private.h| 1 +
 lib/database.cc   | 4 
 3 files changed, 13 insertions(+)

diff --git a/doc/man7/notmuch-search-terms.rst 
b/doc/man7/notmuch-search-terms.rst
index 1acdaa0..df76e39 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -52,6 +52,8 @@ indicate user-supplied values):

 -  date:..

+-  lastmod:..
+
 The **from:** prefix is used to match the name or address of the sender
 of an email message.

@@ -118,6 +120,12 @@ The time range can also be specified using timestamps with 
a syntax of:
 Each timestamp is a number representing the number of seconds since
 1970-01-01 00:00:00 UTC.

+The **lastmod:** prefix can be used to restrict the result by the
+database revision number of when messages were last modified (tags
+were added/removed or filenames changed).  This is usually used in
+conjunction with the **--db-revision** argument to **notmuch search**
+to find messages that have changed since an earlier query.
+
 In addition to individual terms, multiple terms can be combined with
 Boolean operators ( **and**, **or**, **not** , etc.). Each term in the
 query will be implicitly connected by a logical AND if no explicit
diff --git a/lib/database-private.h b/lib/database-private.h
index 0977229..cbca1de 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -163,6 +163,7 @@ struct _notmuch_database {
 Xapian::TermGenerator *term_gen;
 Xapian::ValueRangeProcessor *value_range_processor;
 Xapian::ValueRangeProcessor *date_range_processor;
+Xapian::ValueRangeProcessor *last_mod_range_processor;
 };

 /* Prior to database version 3, features were implied by the database
diff --git a/lib/database.cc b/lib/database.cc
index 9bec170..f9aa45d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -913,6 +913,7 @@ notmuch_database_open (const char *path,
notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor 
(NOTMUCH_VALUE_TIMESTAMP);
notmuch->date_range_processor = new ParseTimeValueRangeProcessor 
(NOTMUCH_VALUE_TIMESTAMP);
+   notmuch->last_mod_range_processor = new 
Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_LAST_MOD, "lastmod:");

notmuch->query_parser->set_default_op (Xapian::Query::OP_AND);
notmuch->query_parser->set_database (*notmuch->xapian_db);
@@ -920,6 +921,7 @@ notmuch_database_open (const char *path,
notmuch->query_parser->set_stemming_strategy 
(Xapian::QueryParser::STEM_SOME);
notmuch->query_parser->add_valuerangeprocessor 
(notmuch->value_range_processor);
notmuch->query_parser->add_valuerangeprocessor 
(notmuch->date_range_processor);
+   notmuch->query_parser->add_valuerangeprocessor 
(notmuch->last_mod_range_processor);

for (i = 0; i < ARRAY_SIZE (BOOLEAN_PREFIX_EXTERNAL); i++) {
prefix_t *prefix = _PREFIX_EXTERNAL[i];
@@ -991,6 +993,8 @@ notmuch_database_close (notmuch_database_t *notmuch)
 notmuch->value_range_processor = NULL;
 delete notmuch->date_range_processor;
 notmuch->date_range_processor = NULL;
+delete notmuch->last_mod_range_processor;
+notmuch->last_mod_range_processor = NULL;

 return status;
 }
-- 
2.1.0



[WIP PATCH 3/4] lib: API to retrieve database revision and UUID

2014-10-13 Thread Austin Clements
This exposes the committed database revision to library users along
with a UUID that can be used to detect when revision numbers are no
longer comparable (e.g., because the database has been replaced).
---
 lib/database-private.h |  1 +
 lib/database.cc| 11 +++
 lib/notmuch.h  | 18 ++
 3 files changed, 30 insertions(+)

diff --git a/lib/database-private.h b/lib/database-private.h
index 465065d..0977229 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -157,6 +157,7 @@ struct _notmuch_database {
  * under a higher revision number, which can be generated with
  * notmuch_database_new_revision. */
 unsigned long revision;
+const char *uuid;

 Xapian::QueryParser *query_parser;
 Xapian::TermGenerator *term_gen;
diff --git a/lib/database.cc b/lib/database.cc
index 45d32ab..9bec170 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -905,6 +905,8 @@ notmuch_database_open (const char *path,
notmuch->revision = 0;
else
notmuch->revision = Xapian::sortable_unserialise (last_mod);
+   notmuch->uuid = talloc_strdup (
+   notmuch, notmuch->xapian_db->get_uuid ().c_str ());

notmuch->query_parser = new Xapian::QueryParser;
notmuch->term_gen = new Xapian::TermGenerator;
@@ -1562,6 +1564,15 @@ DONE:
 return NOTMUCH_STATUS_SUCCESS;
 }

+unsigned long
+notmuch_database_get_revisison (notmuch_database_t *notmuch,
+   const char **uuid)
+{
+if (*uuid)
+   *uuid = notmuch->uuid;
+return notmuch->revision;
+}
+
 /* We allow the user to use arbitrarily long paths for directories. But
  * we have a term-length limit. So if we exceed that, we'll use the
  * SHA-1 of the path for the database term.
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 92594b9..898f7b9 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -433,6 +433,24 @@ notmuch_status_t
 notmuch_database_end_atomic (notmuch_database_t *notmuch);

 /**
+ * Return the committed database revision and UUID.
+ *
+ * The database revision number increases monotonically with each
+ * commit to the database.  Hence, all messages and message changes
+ * committed to the database (that is, visible to readers) have a last
+ * modification revision <= the committed database revision.  Any
+ * messages committed in the future will be assigned a modification
+ * revision > the committed database revision.
+ *
+ * The UUID is a NUL-terminated opaque string that uniquely identifies
+ * this database.  Two revision numbers are only comparable if they
+ * have the same database UUID.
+ */
+unsigned long
+notmuch_database_get_revisison (notmuch_database_t *notmuch,
+   const char **uuid);
+
+/**
  * Retrieve a directory object from the database for 'path'.
  *
  * Here, 'path' should be a path relative to the path of 'database'
-- 
2.1.0



[WIP PATCH 2/4] lib: Add per-message last modification tracking

2014-10-13 Thread Austin Clements
From: Austin Clements <amdra...@mit.edu>

This adds a new document value that stores the revision of the last
modification to message metadata, where the revision number increases
monotonically with each database commit.

An alternative would be to store the wall-clock time of the last
modification of each message.  In principle this is simpler and has
the advantage that any process can determine the current timestamp
without support from libnotmuch.  However, even assuming a computer's
clock never goes backward and ignoring clock skew in networked
environments, this has a fatal flaw.  Xapian uses (optimistic)
snapshot isolation, which means reads can be concurrent with writes.
Given this, consider the following time line with a write and two read
transactions:

   write  |-X-A--|
   read 1   |---B---|
   read 2  |---|

The write transaction modifies message X and records the wall-clock
time of the modification at A.  The writer hangs around for a while
and later commits its change.  Read 1 is concurrent with the write, so
it doesn't see the change to X.  It does some query and records the
wall-clock time of its results at B.  Transaction read 2 later starts
after the write commits and queries for changes since wall-clock time
B (say the reads are performing an incremental backup).  Even though
read 1 could not see the change to X, read 2 is told (correctly) that
X has not changed since B, the time of the last read.  In fact, X
changed before wall-clock time A, but the change was not visible until
*after* wall-clock time B, so read 2 misses the change to X.

This is tricky to solve in full-blown snapshot isolation, but because
Xapian serializes writes, we can use a simple, monotonically
increasing database revision number.  Furthermore, maintaining this
revision number requires no more IO than a wall-clock time solution
because Xapian already maintains statistics on the upper (and lower)
bound of each value stream.
---
 lib/database-private.h | 15 ++-
 lib/database.cc| 49 +++--
 lib/message.cc | 22 ++
 lib/notmuch-private.h  | 10 +-
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index 15e03cc..465065d 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -92,6 +92,12 @@ enum _notmuch_features {
  *
  * Introduced: version 3. */
 NOTMUCH_FEATURE_GHOSTS = 1 << 4,
+
+/* If set, messages store the revision number of the last
+ * modification in NOTMUCH_VALUE_LAST_MOD.
+ *
+ * Introduced: version 3. */
+NOTMUCH_FEATURE_LAST_MOD = 1 << 5,
 };

 /* In C++, a named enum is its own type, so define bitwise operators
@@ -137,6 +143,8 @@ struct _notmuch_database {

 notmuch_database_mode_t mode;
 int atomic_nesting;
+/* TRUE if changes have been made in this atomic section */
+notmuch_bool_t atomic_dirty;
 Xapian::Database *xapian_db;

 /* Bit mask of features used by this database.  This is a
@@ -145,6 +153,10 @@ struct _notmuch_database {

 unsigned int last_doc_id;
 uint64_t last_thread_id;
+/* Highest committed revision number.  Modifications are recorded
+ * under a higher revision number, which can be generated with
+ * notmuch_database_new_revision. */
+unsigned long revision;

 Xapian::QueryParser *query_parser;
 Xapian::TermGenerator *term_gen;
@@ -166,7 +178,8 @@ struct _notmuch_database {
  * databases will have it). */
 #define NOTMUCH_FEATURES_CURRENT \
 (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
- NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)
+ NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS | \
+ NOTMUCH_FEATURE_LAST_MOD)

 /* Return the list of terms from the given iterator matching a prefix.
  * The prefix will be stripped from the strings in the returned list.
diff --git a/lib/database.cc b/lib/database.cc
index 6e51a72..45d32ab 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -101,6 +101,9 @@ typedef struct {
  *
  * SUBJECT:The value of the "Subject" header
  *
+ * LAST_MOD:   The revision number as of the last tag or
+ * filename change.
+ *
  * In addition, terms from the content of the message are added with
  * "from", "to", "attachment", and "subject" prefixes for use by the
  * user in searching. Similarly, terms from the path of the mail
@@ -304,6 +307,8 @@ static const struct {
   "exact folder:/path: search", "rw" },
 { NOTMUCH_FEATURE_GHOSTS,
   "mail documents for missing messages", "w"},
+{ NOTMUCH_FEATURE_LAST_MOD,
+  "modification tracking", "w"},
 };

 const char *
@@ -678,6 +683,23 @@ _notmuch_database_ensure_writable (notmuch_dat

  1   2   3   4   5   6   7   8   9   10   >