Hi all, I just posted part 2 of a series on extracting text features for machine learning…
http://www.scaleunlimited.com/2013/07/21/text-feature-selection-for-machine-learning-part-2/

This covers the next step in generating good text-based features. I've also added some information on alternative approaches to text parsing (NLP vs. Solr tokenization).

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr