Re: Top 5 high freq words - UpdateProcessorChain or DIH Script?

2012-07-09 Thread Erick Erickson
I think the second way is probably the most robust, and it's surprisingly uncomplicated. You wouldn't really be using copyField in that case; you would just add the words to the proper field in the document. Anything you do outside of the update chain would suffer from having to be kept in sync with the

Top 5 high freq words - UpdateProcessorChain or DIH Script?

2012-07-08 Thread Pranav Prakash
Hi, I want to store the top 5 high-frequency non-stopword words. I use DIH to import data. I have two approaches: 1. Use DIH JavaScript to find the top 5 most frequent words and put them in a copy field. The copy field will then stem them and remove stop words based on the appropriate tokenizers.
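
For illustration only, the counting step that both approaches need can be sketched independently of Solr. This is a rough sketch in Python, not DIH or update-chain code: the function name top_words, the tiny STOPWORDS set, and the simple regex tokenizer are all assumptions made for the example. It also does no stemming, so it would not match the output of a Solr analysis chain exactly.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real setup would more likely
# reuse the stopword file already configured for the Solr field.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def top_words(text, stopwords=STOPWORDS, n=5):
    """Return up to n of the most frequent non-stopword tokens in text."""
    # Naive tokenizer: lowercase, then keep runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


if __name__ == "__main__":
    print(top_words("the cat and the dog and the cat"))
```

Whichever place this logic ends up living, the resulting list would then be added as the values of a dedicated field on the document being indexed.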