Re: [Mailman-Users] Search by Message-ID, preserving Cc for direct recipients

2013-05-15 Thread Jed Brown
Mark Sapiro m...@msapiro.net writes:

 The Message-ID of the post is in the HTML page containing the post, but
 it is only in an In-Reply-To= fragment of a mailto: URL that isn't
 indexed in htdig. Also, it's URL encoded so <, > and @ are %3C, %3E and
 %40 respectively. The actual Message-ID: headers are in the periodic
 *.txt files.
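
To illustrate the encoding described above, here is a minimal Python sketch that percent-encodes a Message-ID the way it would appear inside a mailto: URL. The function names and the sample Message-ID are hypothetical and are not part of Mailman or htdig.

```python
from urllib.parse import quote, unquote


def encode_message_id(message_id: str) -> str:
    """Percent-encode a Message-ID, including '<', '>' and '@'."""
    # safe="" means no reserved character is left unescaped.
    return quote(message_id, safe="")


def decode_message_id(encoded: str) -> str:
    """Reverse encode_message_id()."""
    return unquote(encoded)


# Hypothetical Message-ID, used only for illustration.
example = "<87abc123.fsf@example.org>"
print(encode_message_id(example))
```

A search for the encoded form would only find anything if the indexer actually indexes the mailto: fragment, which is the limitation described above.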

 This leads to a few possibilities such as teaching htdig to index the
 .txt files (may be tricky, I just spent a couple of minutes looking at
 this and didn't see it), changing the noindex start and end tags in the
 list's archives/private/LIST/htdig/LIST.conf file so that everything in
 the HTML files including the URL encoded Message-ID is indexed or
 writing a separate CGI search script to search the .txt files for the
 Message-ID.
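
The last possibility could start from something like the following Python sketch, which scans the periodic .txt files for a Message-ID: header. It is an illustration under assumptions, not an existing Mailman script: the function names are hypothetical, the archive directory is passed in by the caller, and a line in a message body that happens to look like a header would also match.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, Tuple

# Matches a line such as "Message-ID: <id@host>" (header name is case-insensitive).
HEADER = re.compile(r"^Message-ID:\s*(\S+)\s*$", re.IGNORECASE)


def find_in_lines(lines: Iterable[str], message_id: str) -> Iterator[int]:
    """Yield 1-based line numbers whose Message-ID header equals message_id."""
    wanted = message_id.strip().strip("<>")  # accept the ID with or without brackets
    for number, line in enumerate(lines, start=1):
        match = HEADER.match(line)
        if match and match.group(1).strip("<>") == wanted:
            yield number


def search_archive(directory: str, message_id: str) -> Iterator[Tuple[str, int]]:
    """Yield (file name, line number) for every match in directory/*.txt."""
    for path in sorted(Path(directory).glob("*.txt")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number in find_in_lines(handle, message_id):
                yield (str(path), number)
```

A CGI wrapper would still have to map each hit to the URL of the corresponding HTML page, which this sketch does not attempt.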

 Or, use mail-archive.com which is probably simplest.

Okay, thanks.  I'll talk with the others here and decide what to do.

 I've learned a lot in the last 7 years ;)

 The reason is to keep the Cc: list from growing excessively long in long
 threads involving many people (see the subsequent post(s) in that thread).

Yeah, I saw that, but I don't care how long the Cc list gets.  I would
rather allow people to filter aggressively and not worry about missing
posts that may be relevant to them.  It's common on other lists
(evidently not those managed by mailman; vger.kernel.org is a
high-profile example) to always Cc, by convention, everyone that is
likely to be interested.  Asking recipients to write rules in terms of
thread ancestry isn't sufficient either: when we later do more work that
is somehow related, we might start a new thread and Cc everyone from
prior threads that were related.  If the list chronically drops Cc, it
can be hard to figure out everyone that should be Cc'd in a new topic.

Anyway, can I interpret your response as being that mailman always drops
Cc and there is no configuration option?
--
Mailman-Users mailing list Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org


Re: [Mailman-Users] Search by Message-ID, preserving Cc for direct recipients

2013-05-15 Thread Jed Brown
Mark Sapiro m...@msapiro.net writes:

 On 05/15/2013 12:47 PM, Jed Brown wrote:
 
 Anyway, can I interpret your response as being that mailman always drops
 Cc and there is no configuration option?


 I guess that depends on what you call a configuration option.

 You could put this in mm_cfg.py

 GLOBAL_PIPELINE.remove('AvoidDuplicates')

 That would just remove the Handler so every list member that is a direct
 recipient would receive both the list and the direct copy regardless of
 her avoid duplicates setting, 

That's a side-effect that we don't want.

 or you could apply the attached patch to
 Mailman/Handlers/AvoidDuplicates.py, or you could patch the module but
 name the patched module say Mailman/Handlers/MyAvoidDuplicates.py and
 put

 GLOBAL_PIPELINE.insert(GLOBAL_PIPELINE.index('AvoidDuplicates'),
 'MyAvoidDuplicates')
 GLOBAL_PIPELINE.remove('AvoidDuplicates')

 in mm_cfg.py. 
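
The insert-then-remove pair quoted above replaces the handler while keeping its position in the pipeline. A minimal sketch of that effect on a plain Python list follows; the swap_handler helper is hypothetical, and the short list is only a stand-in for the real GLOBAL_PIPELINE, which contains more handlers.

```python
def swap_handler(pipeline, old, new):
    """Replace handler `old` with `new` in place, keeping its position."""
    # Insert the new handler just before the old one, then drop the old one.
    pipeline.insert(pipeline.index(old), new)
    pipeline.remove(old)
    return pipeline


# Abbreviated stand-in for GLOBAL_PIPELINE (assumption: not the full list).
pipeline = ['CalcRecips', 'AvoidDuplicates', 'Cleanse']
swap_handler(pipeline, 'AvoidDuplicates', 'MyAvoidDuplicates')
print(pipeline)  # ['CalcRecips', 'MyAvoidDuplicates', 'Cleanse']
```

Doing the swap by index rather than appending matters because the handlers run in list order.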

This looks reasonable.

I think this is sufficiently useful to justify supporting without
patching, but this patch isn't hard to carry.  Thanks for your detailed
answer.


[Mailman-Users] Search by Message-ID, preserving Cc for direct recipients

2013-05-14 Thread Jed Brown
I would like to be able to search the archives of a mailman list using
the Message-ID, ideally using a stable URL like

  http://mid.gmane.org/${message_id}
  http://mail-archive.com/search?l=mid&q=${message_id}

but preferably on our own host as we're not currently mirrored and would
rather link to our own archives when referencing an old discussion on
the list.  Our current archives (e.g., [1]) are searched using htdig,
but it doesn't seem to support query by Message-ID.  Your wiki page [2]
also suggests Swish, MnoGoSearch, and Namazu.  Can any of these search
by Message-ID, or is our best bet to get indexed by mail-archive.com and
direct people there?

Second question: Why are direct recipients dropped from the Cc header of
the copy sent via the list?  This seems partially addressed in the
archives [3], but I think it's important for high-volume lists when
people filter conversations based on whether they are a direct
recipient.  Is there an option somewhere to keep Cc headers intact
without changing other behavior?
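
As an illustration of the kind of filter meant here, the following Python sketch checks whether an address appears in the To or Cc header of a message. The function name and the addresses are hypothetical. Such a rule only works on the list copy if the list leaves the Cc header intact.

```python
from email import message_from_string
from email.utils import getaddresses


def is_direct_recipient(raw_message: str, address: str) -> bool:
    """Return True if address appears in the To or Cc header."""
    msg = message_from_string(raw_message)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    # getaddresses() yields (display name, address) pairs.
    addresses = {addr.lower() for _name, addr in getaddresses(fields)}
    return address.lower() in addresses


# Hypothetical message, used only for illustration.
raw = (
    "To: some-list@example.org\n"
    "Cc: A Developer <dev@example.org>\n"
    "Subject: example\n"
    "\n"
    "body\n"
)
print(is_direct_recipient(raw, "dev@example.org"))
```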

[1] http://lists.mcs.anl.gov/pipermail/petsc-dev/
[2] http://wiki.list.org/display/DOC/How+do+I+make+the+archives+searchable
[3] http://mail.python.org/pipermail/mailman-developers/2006-May/018777.html