Hello,

I have no formal education in predictive modelling, so I hope my
questions are not too naive and that my terminology is correct
enough to be understood.

I'm trying to implement simple text categorization of phrases of a few
words (the specific application is categorizing bank transactions
by payee name). Following the documentation, I easily implemented a
solution based on the TF-IDF vectorizer (TfidfVectorizer) and C-Support
Vector Classification (SVC). However, for some input phrases the
predicted category is not very good.
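
For concreteness, here is a minimal sketch of the kind of setup described
above. The payee names and categories are made up for illustration, and the
linear kernel and C=1.0 are assumptions, not necessarily the settings
actually used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical payee names and categories, for illustration only.
payees = [
    "ACME SUPERMARKET 0042",
    "CITY SUPERMARKET",
    "SHELL FUEL STATION",
    "BP FUEL STATION 17",
    "NETFLIX.COM",
    "SPOTIFY SUBSCRIPTION",
]
categories = [
    "groceries", "groceries",
    "fuel", "fuel",
    "subscriptions", "subscriptions",
]

# TF-IDF features fed into a C-Support Vector classifier.
# kernel="linear" and C=1.0 are assumed values, not tuned ones.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear", C=1.0))
model.fit(payees, categories)

# Predict the category of a payee name not seen during training.
predicted = model.predict(["SUPERMARKET EXPRESS"])
print(predicted)
```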

I have a couple of (probably very basic) questions:

- are my choices of algorithm well suited to this problem? Is there
something else I could experiment with to see whether I get better
results?

- is there a way to obtain the likelihood (confidence) of each
prediction, so that I could flag "bad" predictions for further
inspection? I haven't found an (easy) way to do that in the documentation.
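
To make the second question concrete, the sketch below shows the kind of
flagging meant here. It rests on assumptions: it swaps in LinearSVC
(one-vs-rest) instead of SVC, because its decision_function returns one
signed score per class, and it treats a top score below a hypothetical
threshold of 0.0 as "no class claims this sample". These scores are
distances to the decision boundary, not calibrated probabilities, and the
data is again made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical payee names and categories, for illustration only.
payees = [
    "ACME SUPERMARKET 0042",
    "CITY SUPERMARKET",
    "SHELL FUEL STATION",
    "BP FUEL STATION 17",
    "NETFLIX.COM",
    "SPOTIFY SUBSCRIPTION",
]
categories = [
    "groceries", "groceries",
    "fuel", "fuel",
    "subscriptions", "subscriptions",
]

# LinearSVC is swapped in here (an assumption, not the original SVC setup):
# it trains one one-vs-rest classifier per category.
model = make_pipeline(TfidfVectorizer(), LinearSVC(C=1.0))
model.fit(payees, categories)

new_payees = ["CITY SUPERMARKET 0042", "UNKNOWN MERCHANT XYZ"]
predictions = model.predict(new_payees)

# One signed score per class; shape is (n_samples, n_classes).
scores = model.decision_function(new_payees)
top_scores = scores.max(axis=1)

# Hypothetical threshold: a top score below 0 means that none of the
# one-vs-rest classifiers placed the sample on its positive side.
REVIEW_THRESHOLD = 0.0
needs_review = top_scores < REVIEW_THRESHOLD

for payee, label, score, flag in zip(
    new_payees, predictions, top_scores, needs_review
):
    status = "REVIEW" if flag else "ok"
    print(f"{payee!r}: {label} (top score {score:.2f}) [{status}]")
```

A threshold like this would need tuning on held-out data; the value above
is only a placeholder.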

Thank you in advance.

Cheers,
Dan

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
