Hi all,
I have a very large pandas DataFrame. Below is a sample:
Id  description
1 switvch for air conditioner transformer..............
2 control tfrmr...........
3 coling pad.................
4 DRLG machine
5 hair smothing kit...............
For further processing, I will construct a document-term matrix from the
above data using scikit-learn's CountVectorizer:
from sklearn.feature_extraction.text import CountVectorizer
countvec = CountVectorizer()
documenttermmatrix = countvec.fit_transform(dataset['description'])
I have to correct the misspelled features in the description column.
Replacing each wrongly spelled word with the correctly spelled word across
such a large dataset takes a very long time.
So I thought of correcting the features using the feature list from the
count vectorizer, obtained with:
features_names = countvec.get_feature_names()
Is it possible to rename features using the above list and then use the
result in the classification process?
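To make the idea concrete, here is a rough sketch of what I have in mind,
not something I know to be the recommended approach. It assumes a hand-made
dictionary (the name "corrections" and its entries are hypothetical) that maps
each misspelled feature to its correct spelling, and it merges columns that end
up with the same corrected name by multiplying with a 0/1 mapping matrix.
Newer scikit-learn versions expose get_feature_names_out() instead of
get_feature_names(), so the sketch tries both.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer

# Small stand-in for dataset['description']
docs = [
    "switvch for air conditioner transformer",
    "control tfrmr",
    "coling pad",
    "DRLG machine",
    "hair smothing kit",
]

# Hypothetical, hand-made mapping: misspelled feature -> corrected feature
corrections = {
    "switvch": "switch",
    "tfrmr": "transformer",
    "coling": "cooling",
    "drlg": "drilling",
    "smothing": "smoothing",
}

countvec = CountVectorizer()
dtm = countvec.fit_transform(docs)

# Use whichever accessor this scikit-learn version provides
get_names = getattr(countvec, "get_feature_names_out", None) or countvec.get_feature_names
old_names = list(get_names())

# Corrected vocabulary; several old names may collapse into one new name
new_names = sorted({corrections.get(name, name) for name in old_names})
new_index = {name: j for j, name in enumerate(new_names)}

# 0/1 matrix of shape (n_old_features, n_new_features):
# row i has a single 1 in the column of the corrected name of old feature i
rows = np.arange(len(old_names))
cols = np.array([new_index[corrections.get(name, name)] for name in old_names])
mapping = sparse.csr_matrix(
    (np.ones(len(old_names), dtype=dtm.dtype), (rows, cols)),
    shape=(len(old_names), len(new_names)),
)

# Counts of merged features are summed, e.g. "tfrmr" + "transformer"
corrected_dtm = dtm @ mapping
```

If this is a valid approach, I assume the classifier would be trained on
corrected_dtm, and any new text would have to go through
countvec.transform(...) followed by the same multiplication with mapping so
that the columns line up.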
Thanks
Ranjana