>
> I just added a snippet that works with structured arrays to
> http://scipy-central.org/:
>
>
> http://scipy-central.org/item/35/1/convert-categorical-data-in-a-structure-numpy-array-to-boolean-fields
>
Great Warren, that's exactly what I was looking for. Now there's the
question: should something along the lines of that snippet be included in
sklearn.preprocessing?
If so, then a couple more questions remain. Does scikit-learn support
structured arrays, or do those need to be converted to 2-d arrays? Is it
important for some of the models that the booleans be represented as
floats rather than as booleans? If so, then the default type for
`bool_dtype` in the snippet should be a floating point type.
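If 2-d float arrays do turn out to be what the models need, the conversion could look roughly like the sketch below. This is only an illustration: `structured_to_2d` and its `categorical_fields` argument are names I made up, not part of Warren's snippet or of scikit-learn, and it assumes every non-categorical field is numeric.

```python
import numpy as np


def structured_to_2d(data, categorical_fields, dtype=np.float64):
    """Expand a structured array into a plain 2-d array (hypothetical helper).

    Each field named in `categorical_fields` becomes one 0/1 indicator
    column per distinct value; every other field is cast to `dtype`.
    """
    columns = []
    names = []
    for name in data.dtype.names:
        values = data[name]
        if name in categorical_fields:
            # One indicator column per category, stored as floats.
            for category in np.unique(values):
                columns.append((values == category).astype(dtype))
                names.append("%s=%s" % (name, category))
        else:
            # Assumed to be numeric already.
            columns.append(values.astype(dtype))
            names.append(name)
    return np.column_stack(columns), names


data = np.array(
    [(1.5, "red"), (2.0, "blue"), (0.5, "red")],
    dtype=[("size", "f8"), ("color", "U5")],
)
X, names = structured_to_2d(data, ["color"])
print(names)
print(X)
```

Returning the generated column names alongside the array keeps track of which indicator column came from which category, since the field names are lost once the data is a plain 2-d array.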
*Another (separate) idea: automatic detection and conversion of
categorical attributes*

We could try to create a function that takes an arbitrary matrix of feature
vectors, and automatically converts the fields that appear to be
categorical into boolean fields. Of course, we won't be able to write a
function that always knows which fields are categorical and which are
numeric, but we could have default values that get it right most of the
time.
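To make the idea a bit more concrete, here is one possible heuristic, sketched under my own assumptions: treat a column as categorical when all of its values are whole numbers and it has only a few distinct values. The function name `guess_categorical_columns` and the default `max_categories=10` are arbitrary choices for illustration, not an existing API.

```python
import numpy as np


def guess_categorical_columns(X, max_categories=10):
    """Guess which columns of a 2-d numeric array are categorical.

    Heuristic (an assumption, it will misfire on some data): a column is
    flagged when every value is a whole number and the number of distinct
    values is at most `max_categories`.
    """
    X = np.asarray(X, dtype=float)
    guessed = []
    for j in range(X.shape[1]):
        column = X[:, j]
        integer_valued = np.all(column == np.round(column))
        n_distinct = len(np.unique(column))
        if integer_valued and n_distinct <= max_categories:
            guessed.append(j)
    return guessed


X = np.array([
    [0, 1.5, 3],
    [1, 2.7, 3],
    [2, 0.1, 7],
    [0, 4.2, 7],
])
print(guess_categorical_columns(X))
```

A user who knows the data better could override the guess, for instance by lowering `max_categories` or by passing the categorical column indices explicitly to whatever does the conversion.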
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general