> > Why would human created clear text non-tokenized white list entries
> > grow big?
> >
> Maybe my wording was/is not correct. I don't mean that a human
> created list would grow big. I mean that human readable (clear text)
> white list entries use more space then a hashed white list entry
> (tokenized).

I'm not an expert, but I don't understand why it would use that much more space than a token.
Tokens are currently char(20). An email address would be char(100) at most, but usually it is no longer than char(60).

For a long address:
# echo "[EMAIL PROTECTED]" | wc -c
44
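As a rough back-of-the-envelope check of those sizes, here is a small shell sketch. It assumes 1,000 entries and simply multiplies by the column widths mentioned above (char(20) for tokens, char(100) as the worst case for addresses). It ignores row and index overhead, and the function name estimate_bytes is only illustrative, not anything from DSPAM.

```shell
#!/bin/sh
# Rough storage estimate only: ignores row and index overhead (assumption).
entries=1000        # assumed number of white list entries
token_width=20      # char(20), the current token column width
addr_width=100      # char(100), worst-case clear text address width

# Print the total bytes for a number of entries ($1) and a column width ($2).
estimate_bytes() {
    echo $(( $1 * $2 ))
}

echo "tokens:    $(estimate_bytes "$entries" "$token_width") bytes"
echo "addresses: $(estimate_bytes "$entries" "$addr_width") bytes"
```

On these assumptions the clear text list needs about 80 KB more per 1,000 entries than the tokenized one, and less if addresses are stored at their real length or truncated.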
char(100) is bigger than char(20), but I don't think that will be a problem for DSPAM's load. It will take a little more space, but is that really so bad?

If it is a problem, we could truncate the address to keep the space usage small.
# echo "[EMAIL PROTECTED]" | cut -c1-40
[EMAIL PROTECTED]

>
>
> > Or, are you saying that human created clear text white list
> > entries would be tokenized entries?
> >
> No. They could but the problem is that tokenized entries can not easy
> be converted back to a human readable format.
>
>
> > In that case I don't see why the
> > entries would need to be tokenized.
> >
> Less space usage, smaller indizes, faster processing, etc...

Faster processing... If you don't have to compute tokens, you win back time that can be spent matching the (truncated) entries in the DB ;)

>
>
> > In any case how big is big?
> >
> Depends on the number of entries. For example having 1'000 white list
> entries in tokenized form could use 1'000 time the size of an integer
> or long. Having the same 1'000 white list entries in clear text would
> use more storage and memory space.
>
>
> > I"m not familar with CSS.
> >
> It's well documented in CRM114. But that's not the issue. The main
> issue I see is that we as the DSPAM community can not just ignore the
> fact that we have more then one storage engine used in DSPAM. Adding
> a new and vital function to just a bunch of the engines is not very
> consistent. I know that we do already have such stuff (for example
> the preference extension) but if we can avoid splitting
> functionality, then we should aim for that goal (that's my own
> personal opinion).
>
>
> > Perhaps for whitelisting there could be a choice (a dspam.conf
> > option?) to enable/seek per user whitelist information from file or
> > DB. Then document both in the dspam.conf and docs that only file
> > can be used for whitelist managment if CSS is the backend DB in use.
> >
> This sounds like a plan :)
>

-- 
.`'`.   BONNETOT Jean-Daniel
: ': :
`.
` .`    PRIVIANET
 `'`    Sys & Net Admin
