Hi all,
We have an internal Java library for keyword extraction that seems like a 
good candidate to turn into an integrated Elasticsearch function. 
I want to be able to access the output of this library from search results 
and similar places, but first I wanted to sanity-check my approach and find 
out whether I should be looking at a custom analyzer or something else 
instead.

Given a string field, its type would become a multi-field with {name} and 
{name}.keywords (keywords/phrases) as subfields. A plugin would handle the 
keywords subfield: it would run the field's string through the library and 
return a list of strings, like:
"my_data":"Jack and Jill went up the hill, Jack fell down and bumped his 
crown, and Jill came tumbling after."
"my_data.keywords":["Jack", "Jack fell"]
That's a trivial example, of course, and the algorithm is more complex than 
the standard stopword filtering.
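
To make the intended data shape concrete, here is a minimal sketch in plain 
Java. It does not use any Elasticsearch plugin API; KeywordExtractor, 
expand, and the stub extractor are hypothetical names standing in for our 
internal library, and the stub just returns a fixed list.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeywordFieldSketch {

    /** Hypothetical stand-in for the internal keyword-extraction library. */
    public interface KeywordExtractor {
        List<String> extract(String text);
    }

    /**
     * Expands one string field into the two values described above:
     * the original text under {name} and the extracted list under {name}.keywords.
     */
    public static Map<String, Object> expand(String name, String text,
                                             KeywordExtractor extractor) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put(name, text);
        fields.put(name + ".keywords", extractor.extract(text));
        return fields;
    }

    public static void main(String[] args) {
        // Placeholder extractor (assumption): the real library call would go here.
        KeywordExtractor stub = text -> Arrays.asList("Jack", "Jack fell");
        String text = "Jack and Jill went up the hill, Jack fell down and bumped "
                + "his crown, and Jill came tumbling after.";
        Map<String, Object> fields = expand("my_data", text, stub);
        System.out.println(fields.get("my_data.keywords"));
    }
}
```

The point is only that the keywords value is a list of whole strings 
(including multi-word phrases), not a stream of individual tokens.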

The goal is to expose the my_data.keywords field as an actual list of 
values, as above, so that we can use it for things like facets down the 
line. 
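
As a rough illustration of why the list form matters for faceting, the 
sketch below counts keyword values across documents in plain Java. It is 
not Elasticsearch code; KeywordFacetSketch and countKeywords are 
hypothetical names, and the counting is only similar in spirit to what a 
facet over the keywords field would return.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KeywordFacetSketch {

    /**
     * Counts how many documents contain each keyword value.
     * Each inner list is the keywords list of one document.
     */
    public static Map<String, Integer> countKeywords(List<List<String>> keywordsPerDoc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> keywords : keywordsPerDoc) {
            // De-duplicate within a document so each document counts once per keyword.
            for (String keyword : new LinkedHashSet<>(keywords)) {
                counts.merge(keyword, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("Jack", "Jack fell"),
                Arrays.asList("Jack", "Jill"));
        System.out.println(countKeywords(docs));
    }
}
```

Whole phrases such as "Jack fell" are treated as single values here, which 
is the behaviour the list representation is meant to preserve.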

So is a custom type plugin the right way to go here, or should I be looking 
at developing a more complex analyzer/tokenizer/stopword combo?
