Sv: Re: ChatGPT > Spamassassin? :)

2024-06-25 Thread Anders Gustafsson
That is the way some email clients, such as GroupWise, do it. There is an 
option to put all mail from new senders in the junk folder for perusal. 
OTOH, simple greylisting should do much the same thing. But yes, an option 
to deduct points from previously unseen senders would be useful.

-- 
Kind regards

Anders Gustafsson, engineer
anders.gustafs...@pedago.fi  |  Support +358 18 12060  |  Direct +358 9 315 45 121  |  Mobile +358 40506 7099

Pedago interaktiv ab, Nygatan 7 B , AX-22100 MARIEHAMN, ÅLAND, FINLAND



>>> Mark London  2024-06-26 00:38 >>>
Bill - Thanks for the response.  As an aside, it would be nice (though 
impossible?) for a spam filter to be more suspicious of emails coming 
from a new email address that is not in my Sent folder or my Inbox. 
FWIW. - Mark

On 6/25/2024 11:21 AM, Bill Cole wrote:
> Mark London 
> is rumored to have said:
>
>> I received a spam email with the text below that wasn't caught by 
>> SpamAssassin (at least mine). The text actually looks like something 
>> that was generated using ChatGPT. In any event, I put the text 
>> through ChatGPT and asked if it looked like spam. At the bottom of 
>> this email is its analysis. I've not been fully reading this 
>> group. Has there been any work to allow SpamAssassin to use AI?
>
> "Artificial intelligence" does not exist. It is a misnomer.
>
> Large language models like ChatGPT have a provenance problem. There's 
> no way to know why exactly the model "says" anything. In a single 
> paragraph, ChatGPT is capable of making completely and directly 
> inconsistent assertions. The only way to explain that is that despite 
> appearances, a request to answer the ham/spam question generates text 
> with no semantic connection to the original, but which seems like an 
> explanation.
>
> SpamAssassin's code and rules all come from ASF committers, and the 
> scores are determined by examining the scan results from contributors 
> and optimizing them to a threshold of 5.0. Every scan of a message 
> results in a list of hits against documented rules. The results can be 
> analyzed and understood.
>
> We know that ChatGPT and other LLMs that are publicly available have 
> been trained on data to which they had no license. There is no way to 
> remove any particular ingested data. There's no way to know where any 
> particular LLM will have problems and no way to fix those problems. 
> This all puts them outside of the boundaries we have as an ASF 
> project. However, we do have a plugin architecture, so it is possible 
> for 3rd parties to create a plugin for LLM integration.
>
>



Re: ChatGPT > Spamassassin? :)

2024-06-25 Thread Bill Cole

On 2024-06-25 at 17:38:28 UTC-0400 (Tue, 25 Jun 2024 17:38:28 -0400)
Mark London 
is rumored to have said:

Bill - Thanks for the response.  As an aside, it would be nice 
(though impossible?) for a spam filter to be more suspicious of emails 
coming from a new email address that is not in my Sent folder or my 
Inbox. FWIW. - Mark


Matija's mention of AWL/TxRep is correct here. While some people find it 
a nuisance when it turns a single FP into an ongoing series, I think it is 
worth enabling for most sites.


However, if you do enable either of those tools, you should have a 
mechanism for feeding FPs into both a sitewide Bayes DB and the 
AWL/TxRep DB, using the blocklist/welcomelist options of the 
spamassassin script.
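
For illustration, a minimal sketch of such a feedback step, assuming a saved 
false-positive message and the SpamAssassin 4.x option spellings (3.4.x uses 
the older --add-to-whitelist names); the path is a placeholder, and the 
commands should be run as the user that owns the sitewide databases:

  # train the sitewide Bayes DB on the false positive
  sa-learn --ham /var/spool/fp/message.eml

  # feed the message's sender into the AWL/TxRep reputation DB as wanted mail
  spamassassin --add-to-welcomelist < /var/spool/fp/message.eml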





On 6/25/2024 11:21 AM, Bill Cole wrote:

Mark London 
is rumored to have said:

I received a spam email with the text below that wasn't caught by 
SpamAssassin (at least mine). The text actually looks like 
something that was generated using ChatGPT. In any event, I put 
the text through ChatGPT and asked if it looked like spam. At the 
bottom of this email is its analysis. I've not been fully 
reading this group. Has there been any work to allow SpamAssassin 
to use AI?


"Artificial intelligence" does not exist. It is a misnomer.

Large language models like ChatGPT have a provenance problem. There's 
no way to know why exactly the model "says" anything. In a single 
paragraph, ChatGPT is capable of making completely and directly 
inconsistent assertions. The only way to explain that is that despite 
appearances, a request to answer the ham/spam question generates 
text with no semantic connection to the original, but which seems 
like an explanation.


SpamAssassin's code and rules all come from ASF committers, and the 
scores are determined by examining the scan results from contributors 
and optimizing them to a threshold of 5.0. Every scan of a message 
results in a list of hits against documented rules. The results can 
be analyzed and understood.


We know that ChatGPT and other LLMs that are publicly available have 
been trained on data to which they had no license. There is no way to 
remove any particular ingested data. There's no way to know where any 
particular LLM will have problems and no way to fix those problems. 
This all puts them outside of the boundaries we have as an ASF 
project. However, we do have a plugin architecture, so it is possible 
for 3rd parties to create a plugin for LLM integration.






--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo@toad.social and many *@billmail.scconsult.com 
addresses)

Not Currently Available For Hire


Re: ChatGPT > Spamassassin? :)

2024-06-25 Thread Matija Nalis


On Tue, Jun 25, 2024 at 05:38:28PM -0400, Mark London wrote:
> Bill - Thanks for the response.  As an aside, it would be nice (though
> impossible?) for a spam filter to be more suspicious of emails coming from a
> new email address that is not in my Sent folder or my Inbox. FWIW. - Mark

Something similar is accomplished by the TxRep (or, to a lesser degree,
AWL) SpamAssassin plugin.

Ideally you should use it together with mailbox learning (spam/ham training);
see https://cwiki.apache.org/confluence/display/SPAMASSASSIN/TxRep
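
For reference, a minimal sketch of enabling TxRep (the loadplugin line ships 
commented out in one of the *.pre files; directive names follow the TxRep 
documentation):

  # in the shipped *.pre file (uncomment) or your own .pre file:
  loadplugin Mail::SpamAssassin::Plugin::TxRep

  # in local.cf: turn TxRep on and disable the older AWL so both don't run
  use_txrep 1
  use_auto_whitelist 0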


-- 
Opinions above are GNU-copylefted.


Re: ChatGPT > Spamassassin? :)

2024-06-25 Thread Mark London
Bill - Thanks for the response.  As an aside, it would be nice (though 
impossible?) for a spam filter to be more suspicious of emails coming 
from a new email address that is not in my Sent folder or my Inbox. 
FWIW. - Mark


On 6/25/2024 11:21 AM, Bill Cole wrote:

Mark London 
is rumored to have said:

I received a spam email with the text below that wasn't caught by 
SpamAssassin (at least mine). The text actually looks like something 
that was generated using ChatGPT. In any event, I put the text 
through ChatGPT and asked if it looked like spam. At the bottom of 
this email is its analysis. I've not been fully reading this 
group. Has there been any work to allow SpamAssassin to use AI?


"Artificial intelligence" does not exist. It is a misnomer.

Large language models like ChatGPT have a provenance problem. There's 
no way to know why exactly the model "says" anything. In a single 
paragraph, ChatGPT is capable of making completely and directly 
inconsistent assertions. The only way to explain that is that despite 
appearances, a request to answer the ham/spam question generates text 
with no semantic connection to the original, but which seems like an 
explanation.


SpamAssassin's code and rules all come from ASF committers, and the 
scores are determined by examining the scan results from contributors 
and optimizing them to a threshold of 5.0. Every scan of a message 
results in a list of hits against documented rules. The results can be 
analyzed and understood.


We know that ChatGPT and other LLMs that are publicly available have 
been trained on data to which they had no license. There is no way to 
remove any particular ingested data. There's no way to know where any 
particular LLM will have problems and no way to fix those problems. 
This all puts them outside of the boundaries we have as an ASF 
project. However, we do have a plugin architecture, so it is possible 
for 3rd parties to create a plugin for LLM integration.







Re: ChatGPT > Spamassassin? :) -- move along, this is not the reply you are looking for.

2024-06-25 Thread Grant Taylor via users

On 6/25/24 12:21 PM, Adam Bowen wrote:
I asked a well-known chatbot: What would Bill Cole say if he was asked 
about integrating AI into spamassassin?


LOL

I needed that laugh.

Thank you Adam.




--
Grant. . . .
unix || die



Re: ChatGPT > Spamassassin? :)

2024-06-25 Thread Adam Bowen

On 25/06/2024 16:21, Bill Cole wrote:

On 2024-06-24 at 17:18:11 UTC-0400 (Mon, 24 Jun 2024 17:18:11 -0400)
Mark London 
is rumored to have said:

I received a spam email with the text below that wasn't caught by 
SpamAssassin (at least mine). The text actually looks like something 
that was generated using ChatGPT. In any event, I put the text 
through ChatGPT and asked if it looked like spam. At the bottom of 
this email is its analysis. I've not been fully reading this 
group. Has there been any work to allow SpamAssassin to use AI?


"Artificial intelligence" does not exist. It is a misnomer.

Large language models like ChatGPT have a provenance problem. There's 
no way to know why exactly the model "says" anything. In a single 
paragraph, ChatGPT is capable of making completely and directly 
inconsistent assertions. The only way to explain that is that despite 
appearances, a request to answer the ham/spam question generates text 
with no semantic connection to the original, but which seems like an 
explanation.


SpamAssassin's code and rules all come from ASF committers, and the 
scores are determined by examining the scan results from contributors 
and optimizing them to a threshold of 5.0. Every scan of a message 
results in a list of hits against documented rules. The results can be 
analyzed and understood.


We know that ChatGPT and other LLMs that are publicly available have 
been trained on data to which they had no license. There is no way to 
remove any particular ingested data. There's no way to know where any 
particular LLM will have problems and no way to fix those problems. 
This all puts them outside of the boundaries we have as an ASF 
project. However, we do have a plugin architecture, so it is possible 
for 3rd parties to create a plugin for LLM integration.


I asked a well-known chatbot: What would Bill Cole say if he was asked 
about integrating AI into spamassassin?


Bill Cole, the maintainer of SpamAssassin, is likely to be 
enthusiastic about the potential of AI to improve spam detection.


Here's what he might say:

* Improved Spam Filtering: AI could be used to analyze spam emails 
with greater accuracy than traditional rule-based methods. 
SpamAssassin currently relies on matching email content to predefined 
rules, but AI could identify spam based on subtler cues like phrasing, 
sender behavior, and content structure.


* Adapting to Evolving Threats: Spammers constantly develop new 
tactics to bypass filters. AI can learn and adapt to these evolving 
threats much faster than manually updating rules.


* Mitigating False Positives: AI could help reduce the number of 
legitimate emails mistakenly flagged as spam (false positives). This 
is a major concern for SpamAssassin users.


However, Bill Cole might also mention some challenges:

* Training Data Bias: The effectiveness of AI depends on the quality 
and balance of training data. Biases in the training data could lead 
to the AI disproportionately flagging certain types of emails as spam.
* Transparency and Explainability: It can be difficult to understand 
how AI systems arrive at their decisions. This lack of transparency 
could make it hard to debug errors or identify potential biases.
* Computational Cost: Training and running AI models can require 
significant computational resources, which might not be feasible for 
all SpamAssassin users.


Overall, Bill Cole would likely be cautiously optimistic about the 
potential of AI to improve SpamAssassin. He would acknowledge the 
potential benefits but also highlight the challenges that need to be 
addressed.



:)


Re: ChatGPT > Spamassassin? :)

2024-06-25 Thread Bill Cole

On 2024-06-24 at 17:18:11 UTC-0400 (Mon, 24 Jun 2024 17:18:11 -0400)
Mark London 
is rumored to have said:

I received a spam email with the text below that wasn't caught by 
SpamAssassin (at least mine). The text actually looks like something 
that was generated using ChatGPT. In any event, I put the text 
through ChatGPT and asked if it looked like spam. At the bottom of 
this email is its analysis. I've not been fully reading this 
group. Has there been any work to allow SpamAssassin to use AI?


"Artificial intelligence" does not exist. It is a misnomer.

Large language models like ChatGPT have a provenance problem. There's no 
way to know why exactly the model "says" anything. In a single 
paragraph, ChatGPT is capable of making completely and directly 
inconsistent assertions. The only way to explain that is that despite 
appearances, a request to answer the ham/spam question generates text 
with no semantic connection to the original, but which seems like an 
explanation.


SpamAssassin's code and rules all come from ASF committers, and the 
scores are determined by examining the scan results from contributors 
and optimizing them to a threshold of 5.0. Every scan of a message 
results in a list of hits against documented rules. The results can be 
analyzed and understood.


We know that ChatGPT and other LLMs that are publicly available have 
been trained on data to which they had no license. There is no way to 
remove any particular ingested data. There's no way to know where any 
particular LLM will have problems and no way to fix those problems. This 
all puts them outside of the boundaries we have as an ASF project. 
However, we do have a plugin architecture, so it is possible for 3rd 
parties to create a plugin for LLM integration.
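
For anyone curious what such a third-party plugin would involve, here is a 
minimal, hypothetical skeleton following the documented 
Mail::SpamAssassin::Plugin pattern; the plugin and rule names are 
placeholders, and the call out to any external classifier is deliberately 
omitted:

  package MyLLMCheck;   # hypothetical name
  use strict;
  use warnings;
  use Mail::SpamAssassin::Plugin;
  our @ISA = qw(Mail::SpamAssassin::Plugin);

  sub new {
    my ($class, $mailsa) = @_;
    $class = ref($class) || $class;
    my $self = $class->SUPER::new($mailsa);
    bless($self, $class);
    # expose an eval test that rules in local.cf can call
    $self->register_eval_rule("check_llm_verdict");
    return $self;
  }

  sub check_llm_verdict {
    my ($self, $pms) = @_;
    # A call to some external classifier would go here.
    # Return 1 to make the rule hit, 0 otherwise.
    return 0;
  }

  1;

It would then be wired up with ordinary configuration, e.g.:

  loadplugin MyLLMCheck /etc/mail/spamassassin/MyLLMCheck.pm
  header   LLM_SUSPECT  eval:check_llm_verdict()
  describe LLM_SUSPECT  Message flagged by an external classifier
  score    LLM_SUSPECT  0.5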



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo@toad.social and many *@billmail.scconsult.com 
addresses)

Not Currently Available For Hire


Re: ChatGPT > Spamassassin? :)

2024-06-25 Thread John Hardin

On Mon, 24 Jun 2024, Mark London wrote:

I received a spam email with the text below that wasn't caught by 
SpamAssassin (at least mine). The text actually looks like something that 
was generated using ChatGPT. In any event, I put the text through ChatGPT 
and asked if it looked like spam. At the bottom of this email is its 
analysis. I've not been fully reading this group. Has there been any work 
to allow SpamAssassin to use AI? Thanks. - Mark


In a very limited manner. There is code in the repo that allows you to set 
up ham and spam corpora, scan the spam corpus to pick out common 
phrases, filter them against the ham corpus, and then create rules based on 
the phrases and (IIRC) combinations of them.
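
(As a toy illustration of the end product only, not the project's actual 
tooling: a phrase mined that way would end up as an ordinary body rule, e.g.

  body     ADV_FEE_PHRASE_EXAMPLE   /transfer the sum of .{0,30}million/i
  describe ADV_FEE_PHRASE_EXAMPLE   Phrase commonly seen in advance-fee spam
  score    ADV_FEE_PHRASE_EXAMPLE   1.0

with the rule name, pattern and score here made up for the example.)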


This was being used to generate dynamic fraud rulesets (the "sought" 
rules, still somewhat present as the ADVANCE_FEE rules, which I occasionally 
update manually) until Justin Mason left the project. It has been 
languishing since then, as he was providing the resources (infra and 
maintenance) to run it for those rules. I was feeding those corpora for a 
long time.


Take a look in the repo at the stuff under:

  https://svn.apache.org/viewvc/spamassassin/trunk/masses/rule-dev/

  https://svn.apache.org/viewvc/spamassassin/trunk/masses/evolve_metarule/

I don't know whether the project would be willing to set up infra to 
revive dynamic advance-fee fraud (or more general) rule generation, but 
it would be possible if someone were willing to bring that code up to date 
and figure out what is needed, and if corpus providers were available.



--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Are you a mildly tech-literate politico horrified by the level of
  ignorance demonstrated by lawmakers gearing up to regulate online
  technology they don't even begin to grasp? Cool. Now you have a
  tiny glimpse into a day in the life of a gun owner.   -- Sean Davis
---
 9 days until the 248th anniversary of the Declaration of Independence