>
> I just added a snippet that works with structured arrays to
> http://scipy-central.org/:
>
>
> http://scipy-central.org/item/35/1/convert-categorical-data-in-a-structure-numpy-array-to-boolean-fields
>

Great, Warren, that's exactly what I was looking for.  Now there's the
question: should something along the lines of that snippet be included in
sklearn.preprocessing?

If so, then a couple more questions remain.  Does scikit-learn support
structured arrays, or do those need to be converted to 2-D arrays?  Is it
important for some of the models that the booleans be represented as
floats rather than as booleans?  If so, then the default type for
`bool_dtype` in the snippet should be a floating point type.
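
To make the two questions concrete, here is a rough sketch (not Warren's
snippet, just my own illustration) of expanding one categorical field of a
structured array into indicator fields and then flattening the result into
a plain 2-D float array.  The function name `expand_categorical` and the
sample data are made up for this example; only `bool_dtype` is borrowed
from the snippet, and I default it to a float type here purely as an
assumption.

```python
import numpy as np


def expand_categorical(arr, field, bool_dtype=np.float64):
    """Replace `field` in the structured array `arr` with one indicator
    field per distinct category (hypothetical helper, for illustration)."""
    categories = np.unique(arr[field])  # sorted distinct values
    # Keep every other field as-is, then append one field per category.
    kept = [(name, arr.dtype[name]) for name in arr.dtype.names if name != field]
    added = [("%s_%s" % (field, c), bool_dtype) for c in categories]
    out = np.empty(arr.shape, dtype=np.dtype(kept + added))
    for name, _ in kept:
        out[name] = arr[name]
    for (name, _), c in zip(added, categories):
        # The boolean mask is cast to bool_dtype on assignment.
        out[name] = (arr[field] == c)
    return out


# Made-up sample data: one numeric field and one categorical field.
data = np.array(
    [(1.5, "red"), (2.0, "blue"), (0.5, "red")],
    dtype=[("size", "f8"), ("color", "U5")],
)
expanded = expand_categorical(data, "color")

# Once every field is numeric, the structured array can be stacked
# column by column into an ordinary 2-D array.
X = np.column_stack([expanded[name] for name in expanded.dtype.names])
```

With a float `bool_dtype`, the final `column_stack` yields a homogeneous
float array without any further casting, which is one argument for making
a floating point type the default.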


*Another (separate) idea: automatic detection and conversion of
categorical attributes*

We could try to create a function that takes an arbitrary matrix of feature
vectors, and automatically converts the fields that appear to be
categorical into boolean fields.   Of course, we won't be able to write a
function that always knows which fields are categorical and which are
numeric, but we could have default values that get it right most of the
time.
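
As a very rough sketch of what such defaults might look like (plain
Python, names and thresholds entirely hypothetical): treat a column as
categorical if it contains strings, or if all of its values are
integer-valued and it has no more than `max_unique` distinct values.

```python
def guess_categorical_columns(rows, max_unique=10):
    """Return the indices of columns that look categorical.

    Heuristic only (an assumption for illustration): string columns are
    categorical; numeric columns are categorical when every value is
    integer-valued and there are at most `max_unique` distinct values.
    """
    categorical = []
    for j, column in enumerate(zip(*rows)):  # iterate over columns
        distinct = set(column)
        if any(isinstance(v, str) for v in distinct):
            categorical.append(j)
        elif (len(distinct) <= max_unique
              and all(float(v).is_integer() for v in distinct)):
            categorical.append(j)
    return categorical


# Made-up feature vectors: a measurement, a small integer code, a label.
rows = [
    (5.1, 3, "setosa"),
    (4.9, 1, "setosa"),
    (6.3, 2, "virginica"),
    (5.8, 3, "versicolor"),
]
guessed = guess_categorical_columns(rows)
```

This will misfire on, say, a genuine count feature that happens to take
few distinct values, so the thresholds would need to be overridable by the
caller.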
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
