> > Why would human created clear text non-tokenized white list entries
> > grow big?
> >
> Maybe my wording was/is not correct. I don't mean that a human
> created list would grow big. I mean that human readable (clear text)
> white list entries use more space than a hashed white list entry
> (tokenized).
I'm not an expert, but I don't understand why it would use more space
than a token. Tokens are now char(20); an email address would be at most
char(100), but usually it's no longer than char(60).
For a long address:
# echo "[EMAIL PROTECTED]" | wc -c
44

char(100) is bigger than char(20), but I don't think that will be a
problem for the load on DSPAM. It will take a little more space, but is
that really so bad?
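
As a rough back-of-the-envelope check, here is a small sketch. It is
only an estimate: the 1'000 entries figure is an assumed list size, and
it assumes every address takes the full char(100), which is the worst
case. Variable names are made up for illustration.

```shell
# Rough worst-case storage estimate; all figures are illustrative assumptions.
entries=1000        # assumed number of white list entries
token_bytes=20      # char(20) token column
addr_bytes=100      # char(100) clear-text address column (worst case)

token_total=$((entries * token_bytes))
addr_total=$((entries * addr_bytes))
extra=$((addr_total - token_total))

echo "tokenized:  ${token_total} bytes"
echo "clear text: ${addr_total} bytes"
echo "difference: ${extra} bytes"
```

Under these assumptions the clear-text list costs about 80 kB more per
1'000 entries, and less in practice since most addresses are shorter.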

So, if this is a problem, we could truncate the address to keep the
space usage small.
# echo "[EMAIL PROTECTED]" | cut -c1-40
[EMAIL PROTECTED]

> 
> 
> > Or, are you saying that human created clear text white list 
> > entries would be tokenized entries?
> >
> No. They could, but the problem is that tokenized entries cannot easily
> be converted back to a human readable format.
> 
> 
> > In that case I don't see why the 
> > entries would need to be tokenized. 
> > 
> Less space usage, smaller indexes, faster processing, etc...
Faster processing...
Without computing tokens, you save time that can be spent matching
(truncated) entries in the DB ;)

> 
> 
> > In any case how big is big?
> >
> Depends on the number of entries. For example having 1'000 white list
> entries in tokenized form could use 1'000 times the size of an integer
> or long. Having the same 1'000 white list entries in clear text would
> use more storage and memory space.
> 
>  
> > I'm not familiar with CSS.
> > 
> It's well documented in CRM114. But that's not the issue. The main
> issue I see is that we as the DSPAM community can not just ignore the
> fact that we have more than one storage engine used in DSPAM. Adding
> a new and vital function to just some of the engines is not very
> consistent. I know that we do already have such stuff (for example
> the preference extension) but if we can avoid splitting
> functionality, then we should aim for that goal (that's my own
> personal opinion).
> 
> 
> > Perhaps for whitelisting there could be a choice (a dspam.conf
> > option?) to enable/seek per user whitelist information from file or
> > DB.  Then document both in the dspam.conf and docs that only file
> > can be used for whitelist management if CSS is the backend DB in use.
> > 
> This sounds like a plan :)
> 
--
 .`'`.   BONNETOT Jean-Daniel
:  ': :  
`. ` .`  PRIVIANET
  `'`    Sys & Net Admin

