[EMAIL PROTECTED] (Tom Fawcett) wrote:
>Ken Williams <[EMAIL PROTECTED]> wrote:
>> [EMAIL PROTECTED] (Tom Fawcett) wrote:
>> >It would be nice to have a numeric discretization module as well so
>> >these would work with mixed numerics and text, but that's probably
>> >asking for too much...
>>
>> I'm not sure what the term 'discretization' means - is it a conversion
>> of numerics to some other form, or a lumping, or something like that?
>
>Yes, it's basically taking a continuous-valued attribute and creating
>appropriate bins for the value. So instead of C in [1,100] you have
>C_prime in {low,medium,high}. This is necessary for techniques like
>Naive Bayes, which can't handle continuous attributes naturally.
>Figuring out the number of bins and their ranges is the trick. I guess
>there are some straightforward entropy-based methods that are pretty
>easy to write. I'll implement one when I get some, um, spare time.

Cool. Keep us apprised.

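For anyone else on the list who is new to the term, here is a rough
sketch of the binning described above. It is in Python purely for
illustration, and it is not anyone's actual module: the function names
(entropy, best_cut, equal_width_bin) are made up for this example.
equal_width_bin shows the naive C in [1,100] to {low,medium,high}
mapping, and best_cut shows one simple entropy-based idea: pick the
single cut point that leaves the class labels in the two resulting
bins as pure as possible.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_cut(values, labels):
    """Return the cut point minimizing the weighted class entropy of the
    two bins it creates, or None if all values are identical.

    Hypothetical helper: a single binary split only.
    """
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    best_point, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no room for a cut between two equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            # Place the cut halfway between the two neighbouring values.
            best_point, best_score = (lo + hi) / 2, score
    return best_point


def equal_width_bin(value, low, high, names=("low", "medium", "high")):
    """Map a value in [low, high] onto len(names) equal-width bins."""
    width = (high - low) / len(names)
    index = int((value - low) / width)
    # Clamp so that value == high (or an out-of-range value) still
    # lands in a valid bin.
    return names[min(max(index, 0), len(names) - 1)]


if __name__ == "__main__":
    ages = [10, 20, 30, 70, 80, 90]
    classes = ["no", "no", "no", "yes", "yes", "yes"]
    print(best_cut(ages, classes))
    print([equal_width_bin(v, 1, 100) for v in (1, 50, 100)])
```

A fuller entropy-based method would apply best_cut recursively to each
side and use some stopping rule to decide the number of bins, which is
the tricky part mentioned above.
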
>> By the way, the best place to discuss this work is on the perl-AI list,
>> at [EMAIL PROTECTED] . That's where I'm trying to coax discussions to
>> take place.
>
>OK, I've joined it.

I've cc'd this message there too.

------------------- -------------------
Ken Williams Last Bastion of Euclidity
[EMAIL PROTECTED] The Math Forum