Yeah, you are right.
I find it useful to have a common "thread-id" but this is optional.
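
For reference, here is a rough sketch of the simplified version you describe, with the whole reference chain as the key and no separate thread_id. The `emit` stub, the `rows` array, and the sample documents are only assumptions so the snippet runs outside CouchDB. I also split on any whitespace instead of a single space, since I assume folded References headers can contain more than one space.

```javascript
// Stand-in for CouchDB's emit(); it collects rows so the sketch runs on its own.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: the key is the full reference chain, ending with this message's own id.
function mapThread(doc) {
  if (doc.type == "email") {
    var thread = [];
    if (doc.header.references) {
      // Ancestor message-ids, oldest first, separated by whitespace.
      thread = doc.header.references.trim().split(/\s+/);
    }
    thread.push(doc.header["message-id"]);
    emit(thread, doc.header.subject);
  }
}

// Hypothetical sample documents, not taken from a real mailbox.
[
  { type: "email",
    header: { "message-id": "<a@example>", subject: "Hello" } },
  { type: "email",
    header: { "message-id": "<b@example>", references: "<a@example>",
              subject: "Re: Hello" } }
].forEach(mapThread);
```

Since the first element of every key is the root message-id, querying a reduce view with group_level=1 should still give one row per thread, so the explicit thread_id is not strictly needed.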
//Thomas
2008/11/27 Chris Anderson <[EMAIL PROTECTED]>
> On Thu, Nov 27, 2008 at 6:06 AM, Thomas Kerpe
> <[EMAIL PROTECTED]> wrote:
> >
> > function(doc) {
> >   if (doc.type == "email") {
> >     var thread = [];
> >     if (doc.header.references) {
> >       // ancestor message-ids, oldest first
> >       thread = doc.header.references.split(" ");
> >     }
> >     thread.push(doc.header['message-id']);
> >     // the first id in the chain identifies the thread
> >     var thread_id = thread[0];
> >     emit([thread_id, thread], doc.header.subject);
> >   }
> > }
> >
> > //Thomas
> >
>
> I'd like to dig into this deeper. I'm not sure I understand what the
> second element in the emitted key is useful for. Could you just
> emit(thread, doc.header.subject) and skip the explicit thread_id?
> Group reduce should interact nicely with that.
>
> I think this problem is similar to that of storing document versions
> (like for a wiki). Each version must link to its parent, but also to
> the original version. I'm not familiar enough with email but
> doc.header.references sounds like it can do the job.
>
> I suppose the analog in versioned docs would be keeping a list of the
> chain of parents in each new doc (and not just the head of the chain
> and immediate parent.)
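
A rough sketch of how that analogy might look as a map function. The `version` type, the `ancestors` and `title` fields, the `emit` stub, and the sample documents are all hypothetical; they are only there to show the shape of the key.

```javascript
// Stand-in for CouchDB's emit(); it collects rows so the sketch runs on its own.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: the key is the chain of ancestor ids followed by this version's id.
function mapVersion(doc) {
  if (doc.type == "version") {
    // doc.ancestors is assumed to list ids from the original version
    // down to the immediate parent; the original version has none.
    var chain = (doc.ancestors || []).concat([doc._id]);
    emit(chain, doc.title);
  }
}

// Hypothetical sample documents.
[
  { _id: "page-v1", type: "version", title: "Draft" },
  { _id: "page-v3", type: "version",
    ancestors: ["page-v1", "page-v2"], title: "Third" }
].forEach(mapVersion);
```

As with the email keys, the first element of each key is the original version, so all versions of one page share a key prefix.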
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>