Hi, I want to store the top 5 highest-frequency non-stopword words. I use DIH to import data, and I see two approaches:
1. Use DIH JavaScript to find the top 5 most frequent words and put them in a copy field. The copy field will then stem them and remove stop words, based on the appropriate tokenizers and filters.

2. Write a custom function that does the same and add it to the UpdateRequestProcessor chain.

Which of the two would be better suited? The first approach seems simpler to me, but the issue is that I won't have access to stop words, synonyms, etc. at DIH time. With the second approach, if I add the function to the UpdateRequestProcessor chain and insert it after StopWordsFilterFactory and DuplicateRemoveFilterFactory, would that be a good way of doing this?

--
*Pranav Prakash*
"temet nosce"
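
P.S. To make approach 1 concrete, here is a rough sketch of what I have in mind for the DIH script. It is only an illustration under assumptions: the stop word list is a small hard-coded stand-in (the real stopwords file is not available to the script, which is exactly my concern), the column name `body` and the target field `top_words` are hypothetical, and the tokenization is a naive split on non-alphanumeric characters with no stemming. It relies on the DIH ScriptTransformer passing each row as a map with `get` and `put`.

```javascript
// Hypothetical stand-in stop word list; the analyzer's own stop words
// are not visible to the script at DIH time.
var STOP_WORDS = { the: true, a: true, an: true, and: true, of: true,
                   to: true, 'in': true, is: true };

function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Return up to n most frequent non-stop words in text, most frequent
// first; ties are broken alphabetically so the result is deterministic.
function topWords(text, n) {
  var counts = {};
  var tokens = String(text).toLowerCase().split(/[^a-z0-9]+/);
  for (var i = 0; i < tokens.length; i++) {
    var t = tokens[i];
    if (t === '' || has(STOP_WORDS, t)) continue;
    counts[t] = has(counts, t) ? counts[t] + 1 : 1;
  }
  var words = [];
  for (var w in counts) {
    if (has(counts, w)) words.push(w);
  }
  words.sort(function (a, b) {
    if (counts[b] !== counts[a]) return counts[b] - counts[a];
    return a < b ? -1 : (a > b ? 1 : 0);
  });
  return words.slice(0, n);
}

// Transformer entry point: reads the hypothetical 'body' column and
// stores the result in the hypothetical 'top_words' field.
function addTopWords(row) {
  var body = row.get('body');
  if (body != null) {
    row.put('top_words', topWords(body, 5).join(' '));
  }
  return row;
}
```

Since the counting happens before any analysis, the words stored here are raw, unstemmed tokens, so two inflections of the same word are counted separately.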