Re: why does SA without autolearn need bayes read-write?
On Sat, 31 Jan 2015 16:46:28 +0100 Reindl Harald wrote:

> according to the documentation *it is* a bug:

That's just a wiki entry.

> http://wiki.apache.org/spamassassin/SiteWideBayesSetup
>
> Please note this directory needs to be RWX for all users that
> SpamAssassin will be executed as, or R-X if autolearning and
> automatic expiry are disabled
Re: why does SA without autolearn need bayes read-write?
according to the documentation *it is* a bug:

http://wiki.apache.org/spamassassin/SiteWideBayesSetup

Please note this directory needs to be RWX for all users that SpamAssassin will be executed as, or R-X if autolearning and automatic expiry are disabled

bayes_auto_expire 0
bayes_auto_learn 0
bayes_learn_during_report 0

rsyslog.conf for now masks it:

:msg, contains, "bayes db update ignored: Read-only file system" stop

On 29.01.2015 at 18:34, RW wrote:
> On Wed, 28 Jan 2015 15:58:56 +0100 Reindl Harald wrote:
>> * first: it is a bug to write/lock when auto_expire / auto_learn is off
>
> As I said, it's not a bug. The updates are done in case you want to expire later with sa-learn --force-expire.
>
> Auto-expiry means performing the expiry automatically when the database goes over its configured token limit. Most people don't do this because the expiry is then done during a classification which can cause a timeout.
>
> Setting "auto_expire 0" is not a way of telling SA that you aren't going to expire the database.
>
> On Wed, 28 Jan 2015 01:03:37 +0100 Reindl Harald wrote:
>> ... even if we decide to kill spam samples older than x months it needs to be done properly to the 50% spam / 50% ham ratio which is the reason the bayes works that good
>
> The ratio doesn't matter; it's a myth that it should be 50:50 or match the ratio in your mail. What's important is that you learn enough ham and enough spam, and that the training is correct and sufficiently representative.
>
> It is preferable that there isn't a big mismatch between the ham/spam ratio in the corpus as a whole and in recently added mail as that can skew the probabilities of new tokens.
>
>> compared with autolearning setups where everyone i have seen in the past 8 years became worse each month until classify most ham as spam and let through the real crap
>
> It works for some, but when it fails it's not because the ratio of spam to ham is wrong, it's because of a combination of mistraining, inadequate ham and poor choices in what's learned
Re: why does SA without autolearn need bayes read-write?
On Wed, 28 Jan 2015 15:58:56 +0100 Reindl Harald wrote:

> * first: it is a bug to write/lock when auto_expire / auto_learn is
> off

As I said, it's not a bug. The updates are done in case you want to expire later with sa-learn --force-expire.

Auto-expiry means performing the expiry automatically when the database goes over its configured token limit. Most people don't do this because the expiry is then done during a classification which can cause a timeout.

Setting "auto_expire 0" is not a way of telling SA that you aren't going to expire the database.

On Wed, 28 Jan 2015 01:03:37 +0100 Reindl Harald wrote:

> ... even if we decide to kill spam samples older than x
> months it needs to be done properly to the 50% spam / 50% ham
> ratio which is the reason the bayes works that good

The ratio doesn't matter; it's a myth that it should be 50:50 or match the ratio in your mail. What's important is that you learn enough ham and enough spam, and that the training is correct and sufficiently representative.

It is preferable that there isn't a big mismatch between the ham/spam ratio in the corpus as a whole and in recently added mail as that can skew the probabilities of new tokens.

> compared with
> autolearning setups where everyone i have seen in the past 8 years
> became worse each month until classify most ham as spam and let
> through the real crap

It works for some, but when it fails it's not because the ratio of spam to ham is wrong, it's because of a combination of mistraining, inadequate ham and poor choices in what's learned.
Re: why does SA without autolearn need bayes read-write?
On 29.01.2015 at 16:23, John Hardin wrote:
> On Thu, 29 Jan 2015, Reindl Harald wrote:
>> On 29.01.2015 at 10:18, Matus UHLAR - fantomas wrote:
>>> On 28.01.15 01:03, Reindl Harald wrote:
>>>> if understand you correctly we agree that there is no reason /var
>>>> can't be mounted read-only?
>>>
>>> I do not agree. The whole point of /var is to contain varying data and mounting it read-only defeats the whole purpose of /var.
>>
>> i am not talking about a own partition i am talking about a *systemd namespace* and the intention *not* have anything below /var writeable for a network facing service
>
> "no reason /var can't be mounted read-only" does *not* suggest that

* the initial post makes it pretty clear
* it was even quoted by fantomas first reply on this thread
* i made that clear multiple times

-------- Forwarded Message --------
Subject: Re: why does SA without autolearn need bayes read-write?
Date: Wed, 28 Jan 2015 15:04:26 +0100
From: Reindl Harald
To: users@spamassassin.apache.org

no need for mount own partitions on recent linux systems
that's what namespaces are for and systemd has easy interfaces

-------- Forwarded Message --------
Subject: Re: why does SA without autolearn need bayes read-write?
Date: Tue, 27 Jan 2015 13:44:33 +0100
From: Matus UHLAR - fantomas
To: users@spamassassin.apache.org

On 27.01.15 03:01, Reindl Harald wrote:
> with "bayes_auto_learn 0" there is no reason to lock the bayes
> database and the spamd-service should be happy with
> "ReadOnlyDirectories=/var/lib"

the bayes database contains not only tokens, but also timestamps used for expiration. That's why you need to write to them.

-------- Forwarded Message --------
Subject: why does SA without autolearn need bayes read-write?
Date: Tue, 27 Jan 2015 03:01:10 +0100
From: Reindl Harald
To: Mailing-List spamassassin

IMHO that is a bug

with "bayes_auto_learn 0" there is no reason to lock the bayes database and the spamd-service should be happy with "ReadOnlyDirectories=/var/lib"

training and sa-update is done on a shell independent of network aware services

Jan 27 02:52:58 testserver spamd[2794]: bayes: cannot write to /var/lib/spamass-milter/.spamassassin/bayes_journal, bayes db update ignored: Read-only file system
Jan 27 02:52:58 testserver spamd[2794]: spamd: clean message (0.5/5.5) for sa-milt:189 in 0.5 seconds, 804 bytes.
Jan 27 02:52:58 testserver spamd[2794]: spamd: result: . 0 - ALL_TRUSTED,BAYES_50,T_RP_MATCHES_RCVD scantime=0.5,size=804,user=sa-milt,uid=189,required_score=5.5,rhost=localhost,raddr=127.0.0.1,rport=20782,mid=<54c6ef78.8090...@testserver.rhsoft.net>,bayes=0.40,autolearn=disabled
Re: why does SA without autolearn need bayes read-write?
On Thu, 29 Jan 2015, Reindl Harald wrote:

> On 29.01.2015 at 10:18, Matus UHLAR - fantomas wrote:
>> On 28.01.15 01:03, Reindl Harald wrote:
>>> if understand you correctly we agree that there is no reason /var
>>> can't be mounted read-only?
>>
>> I do not agree. The whole point of /var is to contain varying data and mounting it read-only defeats the whole purpose of /var.
>
> i am not talking about a own partition i am talking about a *systemd namespace* and the intention *not* have anything below /var writeable for a network facing service

"no reason /var can't be mounted read-only" does *not* suggest that.

--
John Hardin KA7OHZ  http://www.impsec.org/~jhardin/
Re: why does SA without autolearn need bayes read-write?
On 29.01.2015 at 10:18, Matus UHLAR - fantomas wrote:
> On 28.01.15 01:03, Reindl Harald wrote:
>> if understand you correctly we agree that there is no reason /var can't be mounted read-only?
>
> I do not agree. The whole point of /var is to contain varying data and mounting it read-only defeats the whole purpose of /var.

i am not talking about a own partition i am talking about a *systemd namespace* and the intention *not* have anything below /var writeable for a network facing service

frankly - can we stop to discuss left and right? i asked for not touch bayes from the spamd service for good reasons, know the setup and there are well considered reasons why every piece is like it is - if it's not possible - can it made possible and is someone willing to implement it for money and how much money - that's it

> I see following possibilities for you:
>
> - move BAYES to a database of any kind

for sure not, the bayes is build with a script and rsynced to other machines which have to work *independent* from each other and so there is no point in setup a database with replication, failovers and a lot of time-invest when things can be simple

> - set up SA to learn to journal, and use overlayfs for the journal
>   (remember to set bayes_journal_max_size big enough), dropping it or
>   syncing periodically

it is big enough

use_learner 1
use_bayes 1
use_bayes_rules 1
bayes_use_hapaxes 1
bayes_expiry_max_db_size 250
bayes_auto_expire 0
bayes_auto_learn 0
bayes_learn_during_report 0
bayes_learn_to_journal 1

>> the intention of this *global bayes* is *not* to learn or expire
>> anything - the implemented "remove from bayes" method is just remove
>> the message from the corpus folder and type "sa-learn.sh rebuild"
>
> I believe it's much more effective to expire old tokens
> that are not appearing in mail than to purge old mail
> from DB, when you don't know if the tokens
> are still used or not.
>
> I'm afraid you got the expire issue wrong...

i got nothing wrong

it doesn't matter if tokens are not used for two months, 10 years experience shows they re-appear sooner or later and i don't invest hundreds of work-hours to collect thousands of mail samples to have token expire automatically

the bayes works *perfectly* and frankly as started with SA a large part of the spam bayes was built by years old archive data
Re: why does SA without autolearn need bayes read-write?
On 27.01.15 18:49, Reindl Harald wrote:
> the intention of this *global bayes* is *not* to learn or expire anything - the implemented "remove from bayes" method is just remove the message from the corpus folder and type "sa-learn.sh rebuild"

I believe it's much more effective to expire old tokens that are not appearing in mail than to purge old mail from DB, when you don't know if the tokens are still used or not.

I'm afraid you got the expire issue wrong...

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Re: why does SA without autolearn need bayes read-write?
On 28.01.15 01:03, Reindl Harald wrote:
> if understand you correctly we agree that there is no reason /var can't be mounted read-only?

I do not agree. The whole point of /var is to contain varying data and mounting it read-only defeats the whole purpose of /var.

I see following possibilities for you:

- move BAYES to a database of any kind
- set up SA to learn to journal, and use overlayfs for the journal (remember to set bayes_journal_max_size big enough), dropping it or syncing periodically

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
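The journal option suggested above maps to two local.cf settings. What follows is only a minimal sketch: the size value is an illustrative assumption, not a figure from this thread, and the overlayfs mount itself is not shown.

```
# local.cf (sketch; the size value is an assumption)
# write token updates to bayes_journal instead of straight into bayes_toks
bayes_learn_to_journal  1
# the journal is synced into the database once it grows past this many
# bytes (the shipped default is 102400)
bayes_journal_max_size  1024000
```

With this in place only the journal file takes writes during scanning, which is what would make putting it on a separate writable layer feasible.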
Re: why does SA without autolearn need bayes read-write?
On 01/28/2015 05:00 PM, Reindl Harald wrote: BTW it is not visible which users are core-developers on this list and which are not - until now i thought you are as example I am part of the dev team or as you say "core-developers" which doesn't mean I have to be a Perl monger. There's other tasks...
Re: why does SA without autolearn need bayes read-write?
On 28.01.2015 at 16:52, Axb wrote:
> On 01/28/2015 04:38 PM, Reindl Harald wrote:
>> is AFAIK relevant in context of sa-learn to not re-train the same messages again and again - and it has it's own bugs because for a few messages it contains random parts of the message itself, fire sa-learn on the whole corpus would add these messages each time to "bayes_toks"
>>
>> see two example snippets below
>> hence it is that large here
>>
>> -rw--- 1 sa-milt sa-milt 5,4K 2015-01-28 16:34 bayes_journal
>> -rw--- 1 sa-milt sa-milt 1,3M 2015-01-28 16:12 bayes_seen
>> -rw--- 1 sa-milt sa-milt 40M 2015-01-28 16:33 bayes_toks
>> -rw--- 1 sa-milt sa-milt 98 2014-08-21 17:47 user_prefs
>
> something here does NOT make sense
> 1.3 MB of seen against 40MB tokens.
>
> someone please correct me if I'm wrong:
> afaik, this probably means you've deleted bayes_seen so bayes has lost it's record of what it has processed so it will relearn stuff you already fed it.

no, i explained what happens in the part you stripped from the quote - it contains randomly complete message parts independent how often i delete *any file* in the userhome and rebuild from scratch

if i delete "bayes_seen" then it happens by a complete reset with sa-learn.sh using sa-learn to *rebuild from scratch* based on the forever stored raw-mails in the folders "ham" and "spam"
Re: why does SA without autolearn need bayes read-write?
On 28.01.2015 at 16:39, Axb wrote:
> On 01/28/2015 03:58 PM, Reindl Harald wrote:
>> * third: if you would be a smart upstream in case of a company admin asking for a change instead "write a patch" you could make a offer talking about money to include the change in the next upstream version - we sponsored changes and maintenance of projects like DBMail, Netatalk and others multiple times in the last years - just because instead pertly responses a friendly "i am not that much interested but i guess the amount of time will be xx hours for xx € per hour and so i am open"
>
> I'm not a usable Perl programmer either but I put my cash where my mouth is and have $pon$ored several major features in SA.

so would i sponsor things if it's worth

> Instead of complaining, ranting and/or being frustrated, it's way more productive to open a feature request in bugzilla and sweet talk one of the main devs to add your enhancement (if they consider it worthy) or get someone to code it for you so you can use it in your deployment/SA fork.
>
> I promise you, it works.

the ranting and being frustrated comes from the hostile manner on that list as response to *every question* starting with the reply to my first post last year in the style "go away, we don't care about milter" up to "you are outright lying" until i proved that my observations are right

frankly the intention of writing first a mail to a mailing-list before make a feature request is to find out if there's a way to change behavior via configuration and all the "creep away"-style responses don't do anything good for starting "sweet talk" nor do they give the feeling any feature request is welcome at all

another reason for first writing a mail to a list is my own developer experience where users hundreds of times asked for things and i was able to change some behavior regression free while the user is still on the phone - not every change needs bureaucracy

BTW it is not visible which users are core-developers on this list and which are not - until now i thought you are as example
Re: why does SA without autolearn need bayes read-write?
On 01/28/2015 04:38 PM, Reindl Harald wrote:
> is AFAIK relevant in context of sa-learn to not re-train the same messages again and again - and it has it's own bugs because for a few messages it contains random parts of the message itself, fire sa-learn on the whole corpus would add these messages each time to "bayes_toks"
>
> see two example snippets below
> hence it is that large here
>
> -rw--- 1 sa-milt sa-milt 5,4K 2015-01-28 16:34 bayes_journal
> -rw--- 1 sa-milt sa-milt 1,3M 2015-01-28 16:12 bayes_seen
> -rw--- 1 sa-milt sa-milt 40M 2015-01-28 16:33 bayes_toks
> -rw--- 1 sa-milt sa-milt 98 2014-08-21 17:47 user_prefs

something here does NOT make sense
1.3 MB of seen against 40MB tokens.

someone please correct me if I'm wrong:
afaik, this probably means you've deleted bayes_seen so bayes has lost it's record of what it has processed so it will relearn stuff you already fed it.

Also, a 40MB tokens DB file will not exactly help your speed.
if you don't want to use Redis then at least use SDBM which is way faster.

local.cf:
bayes_store_module Mail::SpamAssassin::BayesStore::SDBM

and restore/relearn your corpus
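The backend switch suggested above could be sketched as follows. The backup file name is hypothetical, and the two sa-learn steps are shown as comments because they are run by hand as the user that owns the Bayes database, not placed in local.cf.

```
# 1. before switching, dump the existing database to a text file:
#      sa-learn --backup > bayes-backup.txt    (file name is hypothetical)

# 2. local.cf: select the SDBM storage module
bayes_store_module Mail::SpamAssassin::BayesStore::SDBM

# 3. after switching, load the dump into the new backend:
#      sa-learn --restore bayes-backup.txt
```

For a corpus that is kept on disk, as in this thread, retraining from the stored ham and spam folders is an alternative to step 3.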
Re: why does SA without autolearn need bayes read-write?
On 01/28/2015 03:58 PM, Reindl Harald wrote:
> * third: if you would be a smart upstream in case of a company admin asking for a change instead "write a patch" you could make a offer talking about money to include the change in the next upstream version - we sponsored changes and maintenance of projects like DBMail, Netatalk and others multiple times in the last years - just because instead pertly responses a friendly "i am not that much interested but i guess the amount of time will be xx hours for xx € per hour and so i am open"

I'm not a usable Perl programmer either but I put my cash where my mouth is and have $pon$ored several major features in SA.

Instead of complaining, ranting and/or being frustrated, it's way more productive to open a feature request in bugzilla and sweet talk one of the main devs to add your enhancement (if they consider it worthy) or get someone to code it for you so you can use it in your deployment/SA fork.

I promise you, it works.

EOT
Re: why does SA without autolearn need bayes read-write?
On 28.01.2015 at 16:24, Axb wrote:
> On 01/28/2015 03:58 PM, Reindl Harald wrote:
>> * first: it is a bug to write/lock when auto_expire / auto_learn is off
>
> bayes_seen

is AFAIK relevant in context of sa-learn to not re-train the same messages again and again - and it has it's own bugs because for a few messages it contains random parts of the message itself, fire sa-learn on the whole corpus would add these messages each time to "bayes_toks"

see two example snippets below
hence it is that large here

-rw--- 1 sa-milt sa-milt 5,4K 2015-01-28 16:34 bayes_journal
-rw--- 1 sa-milt sa-milt 1,3M 2015-01-28 16:12 bayes_seen
-rw--- 1 sa-milt sa-milt 40M 2015-01-28 16:33 bayes_toks
-rw--- 1 sa-milt sa-milt 98 2014-08-21 17:47 user_prefs

_
^G^H^G^F^F<9A>^F<98>^Fb^F`^F*^F(^F^E^E^E^E<82>^E<80>^EJ^EH^E^R^E^P^E^D^D^D^Dj^Dh^D2^D0^D^C^C^C^C<8A>^C<88>^CR^CP^C^Z^C^X^CCORATION: underline =7D bgColor=3D=23ff> color=3D=23= ff>For Immediate Release color^Ah^Afe6ea55025493eb288d63d54543b277d5d112c74@sa_generated^Ah^Af9b8d0a0253cba315ff4852870be8fc1bad03318@sa_generated^Ah^Af89aef32ae61c7084c4043b1234f13c1e0da74c1 @sa_generated^Ah^Af6cf9fe43d4279181f91b88d2be31914290f664f@sa_generated^Ah^Aed8cc17c1c67d46bbb3ad34dd8cba7d4daa80249@sa_generated^Ah^Adeeec278351d9105bd116971465e502cc35becbc@sa_generated^Ah^Ad9d0a11654680fe56d3 7f10f4fcc4b7205e768a9@sa_generated^Ah^
_
iletilmesi gereken bir duyuru, haber ya da kampanya s�z konusu oldu�unda tasar�m, bask�, ajans, arama, da��t�m vb. zaman kay�plar� ya�amadan do�rudan hedef kitlenize ula�t�rabilirsiniz. Pratiktir : Mesaj�n t�keticiye ula�mas� neredeyse kesindir. Ula�t���nda ya da iletildi�inde rapor al�nabilir. Etklidir : G�nderdi�iniz toplu mesajlar ile ilgili an�nda geri d�n�� alabilirsiniz
_
Toplu mesajla�ma di�er reklam ara�lar�na g�re olduk�a ekonomiktir.
Re: why does SA without autolearn need bayes read-write?
On 01/28/2015 03:58 PM, Reindl Harald wrote:
> * first: it is a bug to write/lock when auto_expire / auto_learn is off

bayes_seen
Re: why does SA without autolearn need bayes read-write?
On Wed, 2015-01-28 at 15:04 +0100, Reindl Harald wrote:

> no need for mount own partitions on recent linux systems
> that's what namespaces are for and systemd has easy interfaces

Fair enough: I thought you were talking about some sort of site-wide read-only mount, but using systemd to limit the read-only access to SA is nice.

Martin
Re: why does SA without autolearn need bayes read-write?
On 28.01.2015 at 15:46, Axb wrote:
> On 01/28/2015 03:18 PM, Kevin A. McGrail wrote:
>> On 1/28/2015 9:04 AM, Reindl Harald wrote:
>>> my main point is that i don't want the locking IO when nothing other than the self developed maintenance scripts for the bayes has a business to write anything there - it should be only read and in the best case from each spamc-forker only opened once in his lifetime for best performance
>>
>> A) I have a feeling using Redis will provide the fastest performance either way...
>
> afaik, Redis requires "bayes_auto_expire 1" but one can set a huge TTL for "bayes_token_ttl" & "bayes_seen_ttl"
>
> Of course, Redis also cause I/O when it dumps to disk but in all the SA noise I don't understand why Reindl is so scared of the Bayes file based I/O

i am scared about the read-only-fs warnings cluttering the logs where there is no business to write anything

> Using modern hardware, the DB file type is slower than any I/O but then... Lets assume he's scared of speed coz he does scans during the smtp sessions AND he's using the default DB backend instead of the faster SDBM (or Redis) :)

i avoid additional complexity and dependencies for damned good reasons and the last time i did not so in case of prosody (jabber server) and used sqlite instead plaintext defaults it took me a lot of wasted time after a dist-upgrade

> but then, he'll supply the patch. BAZINGA!

what a hostile reaction to reports

* first: it is a bug to write/lock when auto_expire / auto_learn is off

* second: i am not a perl developer

* third: if you would be a smart upstream in case of a company admin asking for a change instead "write a patch" you could make a offer talking about money to include the change in the next upstream version - we sponsored changes and maintenance of projects like DBMail, Netatalk and others multiple times in the last years - just because instead pertly responses a friendly "i am not that much interested but i guess the amount of time will be xx hours for xx € per hour and so i am open"

point 3 is BTW the reason why DBMail 3.x still has the native autoreply feature - so you developers should consider acting somehow less hostile and more smart in context of user-requests and make even money with it
Re: why does SA without autolearn need bayes read-write?
On 01/28/2015 03:18 PM, Kevin A. McGrail wrote:
> On 1/28/2015 9:04 AM, Reindl Harald wrote:
>> my main point is that i don't want the locking IO when nothing other than the self developed maintenance scripts for the bayes has a business to write anything there - it should be only read and in the best case from each spamc-forker only opened once in his lifetime for best performance
>
> A) I have a feeling using Redis will provide the fastest performance either way...

afaik, Redis requires "bayes_auto_expire 1" but one can set a huge TTL for "bayes_token_ttl" & "bayes_seen_ttl"

Of course, Redis also cause I/O when it dumps to disk but in all the SA noise I don't understand why Reindl is so scared of the Bayes file based I/O.

Using modern hardware, the DB file type is slower than any I/O but then... Lets assume he's scared of speed coz he does scans during the smtp sessions AND he's using the default DB backend instead of the faster SDBM (or Redis) :)

but then, he'll supply the patch. BAZINGA!
Re: why does SA without autolearn need bayes read-write?
On 1/28/2015 9:04 AM, Reindl Harald wrote:
> my main point is that i don't want the locking IO when nothing other than the self developed maintenance scripts for the bayes has a business to write anything there - it should be only read and in the best case from each spamc-forker only opened once in his lifetime for best performance

A) I have a feeling using Redis will provide the fastest performance either way...

B) Feel free to submit a patch for the feature request

Regards,
KAM
Re: why does SA without autolearn need bayes read-write?
On 28.01.2015 at 12:11, Martin Gregorie wrote:
> On Tue, 2015-01-27 at 16:40 -0800, John Hardin wrote:
>> On Wed, 28 Jan 2015, Reindl Harald wrote:
>>> if understand you correctly we agree that there is no reason /var can't be mounted read-only?
>>
>> Other than the historical practice that /var is intended to contain varying data, and that implies read/write...
>
> Years ago I moved my Apache and my PostgreSQL installations from /var to /home. Both are happy in their new location, so I can't see why the same trick wouldn't work equally well for MySQL. Pick any place you want, e.g. its own partition, then you can mount it read-only and know you can't upset anything else by accident. I suspect that HR has done exactly that and symlinked the read-only partition into /var, which is another way to achieving the same end.
>
> The main reasons I moved Apache and PostgreSQL to /home was so I could back them up more easily and because /home has its own partition to make Fedora reinstalls/upgrades easier

no need for mount own partitions on recent linux systems
that's what namespaces are for and systemd has easy interfaces

my main point is that i don't want the locking IO when nothing other than the self developed maintenance scripts for the bayes has a business to write anything there - it should be only read and in the best case from each spamc-forker only opened once in his lifetime for best performance

[root@testserver:~]$ cat /etc/systemd/system/spamassassin.service
[Unit]
Description=Spamassassin Daemon
After=network.service systemd-networkd.service network-online.target
Before=postfix.service

[Service]
Environment="TMPDIR=/tmp"
PermissionsStartOnly=true
ExecStartPre=/usr/bin/find /var/lib/spamassassin/ -type d -exec /bin/chmod 0755 "{}" \;
ExecStartPre=/usr/bin/find /var/lib/spamassassin/ -type f -exec /bin/chmod 0644 "{}" \;
ExecStart=/usr/bin/spamd -c -H --max-children=10 --min-children=1 --min-spare=1 --max-spare=3 --port=10028
ExecReload=/usr/bin/kill -HUP $MAINPID
Environment="LANG=en_GB.UTF-8"
User=sa-milt
Group=sa-milt
Nice=15
StandardOutput=null
StandardError=null
SyslogFacility=mail
Restart=always
RestartSec=1

PrivateTmp=yes
PrivateDevices=yes
NoNewPrivileges=yes
CapabilityBoundingSet=~CAP_AUDIT_CONTROL CAP_AUDIT_WRITE CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_MODULE CAP_SYS_PTRACE

ReadOnlyDirectories=/etc
ReadOnlyDirectories=/usr
ReadOnlyDirectories=/var/lib

InaccessibleDirectories=-/var/lib/spamassassin-milter/training
InaccessibleDirectories=-/boot
InaccessibleDirectories=-/home
InaccessibleDirectories=-/media
InaccessibleDirectories=-/root
InaccessibleDirectories=-/etc/dbus-1
InaccessibleDirectories=-/etc/modprobe.d
InaccessibleDirectories=-/etc/modules-load.d
InaccessibleDirectories=-/etc/postfix
InaccessibleDirectories=-/etc/ssh
InaccessibleDirectories=-/etc/sysctl.d
InaccessibleDirectories=-/run/console
InaccessibleDirectories=-/run/dbus
InaccessibleDirectories=-/run/lock
InaccessibleDirectories=-/run/mount
InaccessibleDirectories=-/run/systemd/generator
InaccessibleDirectories=-/run/systemd/system
InaccessibleDirectories=-/run/systemd/users
InaccessibleDirectories=-/run/udev
InaccessibleDirectories=-/run/user
InaccessibleDirectories=-/usr/lib64/dbus-1
InaccessibleDirectories=-/usr/lib64/xtables
InaccessibleDirectories=-/usr/lib/dracut
InaccessibleDirectories=-/usr/libexec/iptables
InaccessibleDirectories=-/usr/libexec/openssh
InaccessibleDirectories=-/usr/libexec/postfix
InaccessibleDirectories=-/usr/lib/grub
InaccessibleDirectories=-/usr/lib/kernel
InaccessibleDirectories=-/usr/lib/modprobe.d
InaccessibleDirectories=-/usr/lib/modules
InaccessibleDirectories=-/usr/lib/modules-load.d
InaccessibleDirectories=-/usr/lib/rpm
InaccessibleDirectories=-/usr/lib/sysctl.d
InaccessibleDirectories=-/usr/lib/udev
InaccessibleDirectories=-/usr/local/scripts
InaccessibleDirectories=-/var/db
InaccessibleDirectories=-/var/lib/dbus
InaccessibleDirectories=-/var/lib/dnf
InaccessibleDirectories=-/var/lib/rpm
InaccessibleDirectories=-/var/lib/systemd
InaccessibleDirectories=-/var/lib/yum
InaccessibleDirectories=-/var/spool
Re: why does SA without autolearn need bayes read-write?
On Tue, 2015-01-27 at 16:40 -0800, John Hardin wrote:
> On Wed, 28 Jan 2015, Reindl Harald wrote:
>
> > if understand you correctly we agree that there is no reason /var can't be
> > mounted read-only?
>
> Other than the historical practice that /var is intended to contain
> varying data, and that implies read/write...

Years ago I moved my Apache and my PostgreSQL installations from /var to /home. Both are happy in their new location, so I can't see why the same trick wouldn't work equally well for MySQL. Pick any place you want, e.g. its own partition, then you can mount it read-only and know you can't upset anything else by accident. I suspect that HR has done exactly that and symlinked the read-only partition into /var, which is another way to achieving the same end.

The main reasons I moved Apache and PostgreSQL to /home was so I could back them up more easily and because /home has its own partition to make Fedora reinstalls/upgrades easier.

Martin
Re: why does SA without autolearn need bayes read-write?
On Wed, 28 Jan 2015, Reindl Harald wrote:

>> Setting bayes_auto_expire 0 doesn't imply the database is not going to be expired. The recommended way to expire is to turn off auto-expiry and expire from cron.
>
> don't understand that completely
>
> * bayes_auto_expire 0
> * which cronjob would expire

The one you write to run the expiry. No such cron job is provided with base SA by default, though it's possible distro packagers may add one.

> * hopefully not sa-update

Nope.

>> That said, it's not really essential to have atime updates. Without them the tokens would still have reasonably sensible timestamps derived from the received headers of the mail used in training. It wouldn't break expiry if they could be turned off
>
> if understand you correctly we agree that there is no reason /var can't be mounted read-only?

Other than the historical practice that /var is intended to contain varying data, and that implies read/write...

--
John Hardin KA7OHZ  http://www.impsec.org/~jhardin/
Re: why does SA without autolearn need bayes read-write?
On 28.01.2015 at 00:55, RW wrote:
> On Tue, 27 Jan 2015 18:49:23 +0100 Reindl Harald wrote:
>> On 27.01.2015 at 17:28, Matus UHLAR - fantomas wrote:
>>>> nobody expires or updates anything in a hand-maintained bayes
>>>
>>> the one you might use, but not without timestamps
>>
>> the intention of this *global bayes* is *not* to learn or expire anything - the implemented "remove from bayes" method is just remove the message from the corpus folder and type "sa-learn.sh rebuild"
>>
>> when i say "*_auto_*" then i mean that and hence the desired result is not write anything, just don't touch the bayes-db in normal operations and don't waste disk IO
>>
>> bayes_auto_expire 0
>
> Setting bayes_auto_expire 0 doesn't imply the database is not going to be expired. The recommended way to expire is to turn off auto-expiry and expire from cron.

don't understand that completely

* bayes_auto_expire 0
* which cronjob would expire
* hopefully not sa-update

i really mean it serious that this bayes has no record which has to expire at any point of time until i say so for the specific message

that's just because it was a hard work to get around 16 train messages over 5 months, most if not all are unlikely to become obsolete

and even if we decide to kill spam samples older than x months it needs to be done properly to keep the 50% spam / 50% ham ratio which is the reason the bayes works that good compared with autolearning setups where everyone i have seen in the past 8 years became worse each month until classify most ham as spam and let through the real crap

> That said, it's not really essential to have atime updates. Without them the tokens would still have reasonably sensible timestamps derived from the received headers of the mail used in training. It wouldn't break expiry if they could be turned off

if understand you correctly we agree that there is no reason /var can't be mounted read-only?
Re: why does SA without autolearn need bayes read-write?
On Tue, 27 Jan 2015 18:49:23 +0100 Reindl Harald wrote:
> On 27.01.2015 at 17:28, Matus UHLAR - fantomas wrote:
>>> nobody expires or updates anything in a hand-maintained bayes
>>
>> the one you might use, but not without timestamps
>
> the intention of this *global bayes* is *not* to learn or expire
> anything - the implemented "remove from bayes" method is just remove
> the message from the corpus folder and type "sa-learn.sh rebuild"
>
> when i say "*_auto_*" then i mean that and hence the desired result
> is not write anything, just don't touch the bayes-db in normal
> operations and don't waste disk IO
>
> bayes_auto_expire 0

Setting bayes_auto_expire 0 doesn't imply the database is not going to be expired. The recommended way to expire is to turn off auto-expiry and expire from cron.

That said, it's not really essential to have atime updates. Without them the tokens would still have reasonably sensible timestamps derived from the received headers of the mail used in training. It wouldn't break expiry if they could be turned off.
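A minimal sketch of the "expire from cron" approach described above, assuming sa-learn is on the PATH of the user that owns the Bayes database. The function name expire_bayes, the SA_LEARN variable and the "run" argument are made up for illustration; only the --force-expire option comes from the thread.

```shell
#!/bin/sh
# Hypothetical wrapper meant to be called from a cron job.
# SA_LEARN is an illustrative override, not a SpamAssassin setting.
SA_LEARN="${SA_LEARN:-sa-learn}"

expire_bayes() {
    # --force-expire requests an expiry pass even with bayes_auto_expire 0
    "$SA_LEARN" --force-expire
}

# Only act when the script is invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
    expire_bayes
fi
```

A crontab entry along the lines of "30 3 * * * /path/to/bayes-expire.sh run" (path and schedule are placeholders) would then do the expiry outside of message classification. The job still needs write access to the Bayes database.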
Re: why does SA without autolearn need bayes read-write?
On 27.01.2015 at 17:28, Matus UHLAR - fantomas wrote:
>> On 27.01.2015 at 13:44, Matus UHLAR - fantomas wrote:
>>> On 27.01.15 03:01, Reindl Harald wrote:
>>>> with "bayes_auto_learn 0" there is no reason to lock the bayes
>>>> database and the spamd-service should be happy with
>>>> "ReadOnlyDirectories=/var/lib"
>>>
>>> the bayes database contains not only tokens, but also timestamps
>>> used for expiration. That's why you need to write to them
>
> On 27.01.15 14:23, Reindl Harald wrote:
>> which expiration?
>> nobody expires or updates anything in a hand-maintained bayes
>
> the one you might use, but not without timestamps

the intention of this *global bayes* is *not* to learn or expire anything - the implemented "remove from bayes" method is just remove the message from the corpus folder and type "sa-learn.sh rebuild"

when i say "*_auto_*" then i mean that and hence the desired result is to not write anything, just don't touch the bayes-db in normal operations and don't waste disk IO

bayes_auto_expire 0
bayes_auto_learn 0

the complete bayes configuration:

use_learner 1
use_bayes 1
use_bayes_rules 1
bayes_use_hapaxes 1
bayes_expiry_max_db_size 250
bayes_auto_expire 0
bayes_auto_learn 0
bayes_learn_during_report 0
bayes_learn_to_journal 1
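The thread never shows sa-learn.sh, so the following is only a guess at what a corpus-driven "rebuild" could look like: wipe the database and relearn it from the ham and spam folders. The function name, the variables and the placeholder paths are hypothetical; --clear, --ham, --spam and --sync are standard sa-learn options.

```shell
#!/bin/sh
# Hypothetical rebuild helper; NOT the sa-learn.sh mentioned in the thread.
SA_LEARN="${SA_LEARN:-sa-learn}"
HAM_DIR="${HAM_DIR:-/path/to/corpus/ham}"     # placeholder path
SPAM_DIR="${SPAM_DIR:-/path/to/corpus/spam}"  # placeholder path

rebuild_bayes() {
    # Chain with && so a failing step stops the rebuild and gets noticed.
    "$SA_LEARN" --clear &&
    "$SA_LEARN" --ham "$HAM_DIR" &&
    "$SA_LEARN" --spam "$SPAM_DIR" &&
    "$SA_LEARN" --sync
}

# Only act when the script is invoked with the argument "rebuild".
if [ "${1:-}" = "rebuild" ]; then
    rebuild_bayes
fi
```

With this kind of setup, removing a message from the corpus folder and rerunning the rebuild is the only time the database has to be writable by the training user; it says nothing about what spamd itself writes during scanning.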
Re: why does SA without autolearn need bayes read-write?
> On 27.01.2015 at 13:44, Matus UHLAR - fantomas wrote:
>> On 27.01.15 03:01, Reindl Harald wrote:
>>> with "bayes_auto_learn 0" there is no reason to lock the bayes
>>> database and the spamd-service should be happy with
>>> "ReadOnlyDirectories=/var/lib"
>>
>> the bayes database contains not only tokens, but also timestamps
>> used for expiration. That's why you need to write to them

On 27.01.15 14:23, Reindl Harald wrote:
> which expiration?
> nobody expires or updates anything in a hand-maintained bayes

the one you might use, but not without timestamps.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Re: why does SA without autolearn need bayes read-write?
On 27.01.2015 at 14:33, Axb wrote:
> On 01/27/2015 02:23 PM, Reindl Harald wrote:
>> On 27.01.2015 at 13:44, Matus UHLAR - fantomas wrote:
>>> On 27.01.15 03:01, Reindl Harald wrote:
>>>> with "bayes_auto_learn 0" there is no reason to lock the bayes
>>>> database and the spamd-service should be happy with
>>>> "ReadOnlyDirectories=/var/lib"
>>>
>>> the bayes database contains not only tokens, but also timestamps
>>> used for expiration. That's why you need to write to them
>>
>> which expiration?
>> nobody expires or updates anything in a hand-maintained bayes
>>
>> use_bayes 1
>> bayes_auto_expire 0
>> bayes_auto_learn 0
>
> would this help?
>
> use_learner 0

no, it leads to bayes not being used at all

result: . -2 - ALL_TRUSTED scantime=0.1
result: . 0 - ALL_TRUSTED,BAYES_50 scantime=0.5

not tested further whether it also has an impact on sa-learn on the shell - IMHO when "bayes_auto_expire", "bayes_auto_learn" and "bayes_learn_during_report" are all 0 there should be no locking, not only because of the permissions but also because of wasted disk IO

this is the configuration that leads to bayes not being used at all; "use_learner 0" was the only difference to production

use_learner 0
use_bayes 1
use_bayes_rules 1
bayes_use_hapaxes 1
bayes_expiry_max_db_size 250
bayes_auto_expire 0
bayes_auto_learn 0
bayes_learn_during_report 0
bayes_learn_to_journal 1
Re: why does SA without autolearn need bayes read-write?
On Tue, 27 Jan 2015, Reindl Harald wrote:
> nobody expires or updates anything in a hand-maintained bayes

Just a message from nobody (important) apparently

==John ff
Re: why does SA without autolearn need bayes read-write?
On 01/27/2015 02:23 PM, Reindl Harald wrote:
> On 27.01.2015 at 13:44, Matus UHLAR - fantomas wrote:
>> On 27.01.15 03:01, Reindl Harald wrote:
>>> with "bayes_auto_learn 0" there is no reason to lock the bayes
>>> database and the spamd-service should be happy with
>>> "ReadOnlyDirectories=/var/lib"
>>
>> the bayes database contains not only tokens, but also timestamps
>> used for expiration. That's why you need to write to them
>
> which expiration?
> nobody expires or updates anything in a hand-maintained bayes
>
> use_bayes 1
> bayes_auto_expire 0
> bayes_auto_learn 0

would this help?

use_learner 0
Re: why does SA without autolearn need bayes read-write?
On 27.01.2015 at 13:44, Matus UHLAR - fantomas wrote:
> On 27.01.15 03:01, Reindl Harald wrote:
>> with "bayes_auto_learn 0" there is no reason to lock the bayes
>> database and the spamd-service should be happy with
>> "ReadOnlyDirectories=/var/lib"
>
> the bayes database contains not only tokens, but also timestamps
> used for expiration. That's why you need to write to them

which expiration?
nobody expires or updates anything in a hand-maintained bayes

use_bayes 1
bayes_auto_expire 0
bayes_auto_learn 0
Re: why does SA without autolearn need bayes read-write?
Matus UHLAR - fantomas wrote on 2015-01-27 13:44:
> On 27.01.15 03:01, Reindl Harald wrote:
>> with "bayes_auto_learn 0" there is no reason to lock the bayes
>> database and the spamd-service should be happy with
>> "ReadOnlyDirectories=/var/lib"
>
> the bayes database contains not only tokens, but also timestamps
> used for expiration. That's why you need to write to them.

an extension in spamassassin would need to support a separate DBI:foo for WRITEONLY and another DBI:bar for READONLY; sqlgrey can do it, but spamassassin not yet

this can be extended to all DBI: databases used in spamassassin, should be fairly simple to make that work
Re: why does SA without autolearn need bayes read-write?
On 27.01.15 03:01, Reindl Harald wrote:
> with "bayes_auto_learn 0" there is no reason to lock the bayes
> database and the spamd-service should be happy with
> "ReadOnlyDirectories=/var/lib"

the bayes database contains not only tokens, but also timestamps used for expiration. That's why you need to write to them.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/