On 2024-06-25 at 17:38:28 UTC-0400 (Tue, 25 Jun 2024 17:38:28 -0400)
Mark London <m...@psfc.mit.edu>
is rumored to have said:

Bill - Thanks for the response.  As an aside, it would be nice (though impossible?) for a spam filter to be more suspicious of email coming from a new address, one that appears in neither my Sent folder nor my Inbox. FWIW. - Mark

Matija's mention of AWL/TxRep is correct here. While some people find it a nuisance when it turns one false positive (FP) into an ongoing series, I think it is worth enabling for most sites.
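
For anyone wanting to try that, TxRep (the modern replacement for AWL) just needs to be loaded and switched on. A minimal sketch, assuming a 4.x install (3.4.x spells the AWL option use_auto_whitelist):

    # in a .pre file, e.g. local.pre
    loadplugin Mail::SpamAssassin::Plugin::TxRep

    # in local.cf
    use_txrep            1
    # TxRep supersedes AWL, so make sure AWL is off
    use_auto_welcomelist 0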

However, if you do enable either of those tools, you should have a mechanism for feeding FPs both into a sitewide Bayes DB and into the AWL/TxRep DB, using the blocklist/welcomelist options of the spamassassin script.
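
Something along these lines works, assuming 4.x option names (older releases use the --*-whitelist spellings); the message file and sender address here are hypothetical:

    # teach the Bayes DB (sitewide, if so configured) that the message was ham
    sa-learn --ham /path/to/misclassified.eml

    # reset/seed the sender's reputation in the AWL/TxRep DB
    spamassassin --add-addr-to-welcomelist=sender@example.com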



On 6/25/2024 11:21 AM, Bill Cole wrote:
Mark London <m...@psfc.mit.edu>
is rumored to have said:

I received a spam email with the text below that wasn't caught by SpamAssassin (at least mine). The text actually looks like something that was generated using ChatGPT. In any event, I put the text through ChatGPT and asked if it looked like spam. At the bottom of this email is its analysis. I've not been fully reading this group. Has there been any work to allow SpamAssassin to use AI?

"Artificial intelligence" does not exist. It is a misnomer.

Large language models like ChatGPT have a provenance problem. There's no way to know exactly why the model "says" anything. In a single paragraph, ChatGPT is capable of making completely and directly inconsistent assertions. The only way to explain that is that, despite appearances, a request to answer the ham/spam question generates text with no semantic connection to the original message, but which reads like an explanation.

SpamAssassin's code and rules all come from ASF committers, and the scores are determined by examining the scan results from contributors and optimizing them against the default threshold of 5.0. Every scan of a message results in a list of hits against documented rules. The results can be analyzed and understood.
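
For example, a tagged message carries a header shaped like this (the scores here are invented for illustration, but each test name maps to a documented rule):

    X-Spam-Status: Yes, score=7.5 required=5.0 tests=BAYES_99,
            BAYES_999,DKIM_INVALID,URIBL_BLOCKED autolearn=no

You can look up every one of those names and see exactly why the message scored what it did.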

We know that ChatGPT and other LLMs that are publicly available have been trained on data to which they had no license. There is no way to remove any particular ingested data. There's no way to know where any particular LLM will have problems and no way to fix those problems. This all puts them outside of the boundaries we have as an ASF project. However, we do have a plugin architecture, so it is possible for 3rd parties to create a plugin for LLM integration.
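
To illustrate what that plugin route involves: a plugin is an ordinary Perl module subclassing Mail::SpamAssassin::Plugin that registers eval tests for rules to call. A bare-bones sketch follows; the package name, rule name, and placeholder check are hypothetical, not an existing integration:

    package Mail::SpamAssassin::Plugin::LLMCheck;  # hypothetical name
    use strict;
    use warnings;
    use Mail::SpamAssassin::Plugin;
    our @ISA = qw(Mail::SpamAssassin::Plugin);

    sub new {
      my ($class, $mailsa) = @_;
      my $self = $class->SUPER::new($mailsa);
      bless($self, $class);
      # expose an eval test that *.cf rules can reference
      $self->register_eval_rule("llmcheck_verdict");
      return $self;
    }

    sub llmcheck_verdict {
      my ($self, $pms) = @_;
      # a real plugin would hand the message to an external
      # classifier here and return 1 (hit) or 0 (no hit)
      return 0;
    }

    1;

and the matching config:

    loadplugin Mail::SpamAssassin::Plugin::LLMCheck llmcheck.pm
    header   LLM_SUSPECT eval:llmcheck_verdict()
    describe LLM_SUSPECT Message flagged by external classifier
    score    LLM_SUSPECT 0.1

Anyone going that route carries the provenance problem with them, of course: the plugin hook is there, but the ASF project itself won't ship such an integration.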




--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo@toad.social and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
