We've (LucidWorks) got full indexing and search of the Mahout mail archives at http://find.searchub.org. We could probably add in IRC pretty easily if you want.

-Grant

On Mar 22, 2014, at 2:06 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

> I put up a parser for the IRC history logs here
> https://github.com/andrewmusselman/util/blob/master/irc-parser.sh
>
> I'd like to write one for the user list too to figure out the most common
> problems/questions so we can focus effort on repairs to bugs and docs.
>
> But the mail archives at
> https://mail-archives.apache.org/mod_mbox/mahout-user/ are dynamic, loaded
> in through JavaScript, so parsing them isn't that straightforward.
>
> Is it possible to get the mbox files directly?

--------------------------------------------
Grant Ingersoll | @gsingers
http://www.lucidworks.com