It seems that in September, after your crunch is over and Vincenzo returns
from vacation, the two of you should merge your changes (yours sound
parameterizable), and maybe get the result into CVS.

If you want to send me a JAR and instructions, I've already got reposting
from mbox working, although I am finding some real-world issues, e.g.,
roughly half of the messages in the target set have

  To: [EMAIL PROTECTED]

instead of

  To: <[EMAIL PROTECTED]>

which is rejected by the MailAddress class.  None of them appear to be user
messages; all of them seem to be bounce notices from places like
CompuServe and apps like CC Mail Server.  I'm curious to know from John Webb
or Steven Short whether they are seeing similar problems with their mailing
list managers.

It is possible that the To: field is broken, but the RCPT TO was properly
formatted.  RFC 822 had some bad examples, although RFC 821 was clear from
the start, and RFC 2822 is clear.  We might want to account for this in our
Fetch services.
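
As a rough sketch of what "accounting for this" could look like, the helper
below wraps a bare To: value in angle brackets before it is handed to the
address parser.  The class and method names are hypothetical, not existing
James code, and it assumes a single address per header and that the
angle-bracket form is the one the parser accepts, as described above.  It
does no validation of the address itself.

```java
/**
 * Hypothetical helper (not part of James): normalizes a single-address
 * To: header value to the angle-bracket form.
 */
final class ToHeaderNormalizer {
    private ToHeaderNormalizer() {}

    /**
     * Returns the address in angle brackets. If the value already contains
     * an angle-bracketed address (optionally preceded by a display name),
     * only that bracketed part is returned; otherwise the trimmed value is
     * wrapped as-is.
     */
    static String toAngleForm(String headerValue) {
        String value = headerValue.trim();
        int open = value.lastIndexOf('<');
        int close = value.lastIndexOf('>');
        if (open >= 0 && close > open) {
            // Already bracketed: keep "<...>" and drop any display name.
            return value.substring(open, close + 1);
        }
        // Bare address: wrap it. The address is assumed to be well formed.
        return "<" + value + ">";
    }
}
```

A fetch service could apply this only when the original value fails to
parse, so that messages with well-formed headers are left untouched.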

        --- Noel

-----Original Message-----
From: Danny Angus [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 26, 2003 11:23
To: Noel J. Bergman


> I can build the current James v2 SAR.  Are you using anything other than
> what Vincenzo has on his site, or should I just use the download from his
> site?

I've got a different take on Chris Means' submission.  I've not tried
Vincenzo's, but in theory it should be much the same.

Mine is optimised to keep the corpus size low by ignoring tokens shorter
than 4 chars or longer than 15, and ignoring tokens with probabilities in
the range 4-6 (neutral).
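
A minimal sketch of that kind of filter, for illustration only and not the
actual code: the class name, constants and method names are made up, and it
assumes that "4-6" means probabilities from 0.4 to 0.6 on a 0..1 scale, with
both ends treated as neutral.

```java
/**
 * Illustrative token filter (hypothetical names): decides whether a token
 * is worth keeping in the corpus.
 */
final class TokenFilter {
    static final int MIN_LENGTH = 4;   // tokens shorter than this are ignored
    static final int MAX_LENGTH = 15;  // tokens longer than this are ignored

    // Assumption: the "4-6" neutral band read as 0.4 to 0.6, inclusive.
    static final double NEUTRAL_LOW = 0.4;
    static final double NEUTRAL_HIGH = 0.6;

    private TokenFilter() {}

    /** True if the token length is within MIN_LENGTH..MAX_LENGTH. */
    static boolean hasUsableLength(String token) {
        int length = token.length();
        return length >= MIN_LENGTH && length <= MAX_LENGTH;
    }

    /** True if the probability falls in the assumed neutral band. */
    static boolean isNeutral(double probability) {
        return probability >= NEUTRAL_LOW && probability <= NEUTRAL_HIGH;
    }

    /** True if the token should be stored: usable length and not neutral. */
    static boolean keep(String token, double probability) {
        return hasUsableLength(token) && !isNeutral(probability);
    }
}
```

The length check can run at tokenizing time, while the neutral check only
makes sense once probabilities have been computed for the corpus.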

d.

