Currently, addresses in bodies are obfuscated very lightly,
using numerical entity references (view source on a recent message 
to see this). You are welcome to place more advanced mangling on your
list server if you can -- that's a better place for it anyway. Perhaps
one day mail-archive will support more serious mangling as well, but not
soon. (I am currently sweating over some hardware upgrade issues.)
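
For anyone wondering what "numerical entity references" means in practice, here is a rough Python sketch of the idea. It is not the code mail-archive actually runs; the address pattern and the function names are made up for illustration, and the pattern is deliberately simple.

```python
import re

# Simplified address pattern (an assumption for this sketch, not a full
# RFC-compliant matcher): local part, "@", then one or more dotted labels.
ADDRESS_RE = re.compile(r"[0-9A-Za-z._-]+@(?:[0-9A-Za-z_-]+\.)+[0-9A-Za-z_-]+")


def to_numeric_entities(text):
    """Replace every character with its decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def obfuscate_addresses(body):
    """Entity-encode each address found in body; leave other text alone."""
    return ADDRESS_RE.sub(lambda m: to_numeric_entities(m.group(0)), body)


if __name__ == "__main__":
    print(obfuscate_addresses("Write to nobody@nowhere.com for details."))
```

A browser renders the result as the original address, so readers see nothing unusual, while a harvester that scans the raw HTML without decoding entities will miss it. That is why this counts as only light protection.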

As for imports, it tends to work for maybe one in four gurus who
try, and is not something I really have time to help with. So don't
be too confident. :)

Good luck,
Jeff

On Sat, 2002-06-01 at 10:14, Rich Franzen wrote:
> Hi.  I am a member of four related mail lists.  There is no search
> capability, although they do archive the mail in monthly files available via
> FTP.  The Mail Archive FAQ explains how to transfer those archives, and I
> pretty much understand it.
> 
> I plan to propose to the members of those lists that we take advantage of
> Mail Archive capabilities.  Your service does not require such permission,
> but these lists operate in a collegial fashion with many decisions made by
> consensus.
> 
> One concern they would have would be privacy and e-mail address mining. 
> Reviewing several existing lists, I see that e-mail addresses are removed
> from message headers, although they are left intact within the message
> bodies.
> 
> Can lists at the Mail Archive be set up so that the bodies are filtered,
> mangling the e-mail addresses?  I have written a shell script that uses
> 'sed' to slightly mangle each e-mail address it finds.  Here is an example:
> 
>         [EMAIL PROTECTED]
> becomes:
>         nobody at nowhere.com
> or (original paranoid form):
>         ftp://nobody.att.nowhere.com
> 
> The script can be modified to mangle all the files in a directory which
> match a desired pattern.  Or perhaps the 'sed' line itself might be useable
> from within MHonArc or some secondary filtering mechanism.  For the curious,
> here is that line:
> 
> sed -e
> 's/\([0-9A-Za-z._-]\)@\(\([0-9A-Za-z_-]\{1,\}\.\)\{1,\}[0-9A-Za-z_-]\)/\1
>  at \2/g' $filename >mangled/$filename
> 
> (The Yahoo mail client broke it into 3 lines, but it is supposed to be just
> one.)  Yeah, it's ugly as hell, but it works.
> 
> (FAQ note:  Your FAQ says that the GeoCrawler service easily accepts
> existing archives.  I followed the link, but there appeared to be no info
> describing how to do so.  They don't even explain how to add new lists. 
> Maybe they are not accepting new lists right now.  The Mail Archive seems
> like a great choice, though, even if it might be a one-time pain to use your
> Perl program to bounce the e-mail within the existing archive files.)
> 
> 
> =====
> -- Rich
> -- http://rocq.home.att.net
> ==
> 
> _______________________________________________
> Gossip mailing list
> [EMAIL PROTECTED]
> http://jab.org/cgi-bin/mailman/listinfo/gossip
> 


