Thanks for the feedback, Ramdev.

> What I noticed in my aggregation results is a lot of Stopwords (a, an,
> the, at, and, etc.) being included as significant terms.
>

Terms like these shouldn't need any special treatment. If they are
appearing as suggestions then I expect one of the following statements
to be true:

1) You have a very small number of docs in the result set representing the 
"foreground" sample. The significant terms aggregation needs a reasonable 
number of docs in a sample to draw any real conclusions.
2) You have query criteria that do not identify a result set with any 
sense of cohesion, e.g. a query for random docs.
3) You have changed the set of stopwords in use in your index - what 
previously never used to appear at all is now suddenly common or 
vice-versa. 
4) You are querying across mixed indices or doc-types (one with stop-words, 
one without) and we fail to tune-out the stopwords as part of the results 
merging process because one small index reports them back as commonplace 
while another large index has them as missing or rare. In the merged stats 
they therefore appear to be highly correlated with your query request.
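
As a rough illustration of scenario 4, here is a minimal sketch of how 
summing per-index counts can make a stopword look significant. All counts 
are made up for the example, and the scoring function is a simplified 
JLH-style score (absolute change times relative change), not the actual 
Elasticsearch implementation.

```python
def jlh_score(fg_hits, fg_size, bg_hits, bg_size):
    """Simplified JLH-style score: (fg% - bg%) * (fg% / bg%).

    Returns 0.0 when the term is no more common in the foreground
    than in the background, or when a rate cannot be computed.
    """
    if fg_size == 0 or bg_size == 0 or bg_hits == 0:
        return 0.0
    fg_rate = fg_hits / fg_size
    bg_rate = bg_hits / bg_size
    if fg_rate <= bg_rate:
        return 0.0
    return (fg_rate - bg_rate) * (fg_rate / bg_rate)


# Hypothetical counts for the term "the".
# Small index without stopword removal: "the" is in almost every doc.
small = dict(fg_hits=95, fg_size=100, bg_hits=950, bg_size=1_000)
# Large index with stopword removal: "the" never appears.
large = dict(fg_hits=0, fg_size=100, bg_hits=0, bg_size=1_000_000)

# Merging results simply adds the per-index counts together.
merged = {key: small[key] + large[key] for key in small}

small_score = jlh_score(**small)    # equally common in fg and bg
merged_score = jlh_score(**merged)  # fg rate dwarfs the diluted bg rate

print("small index alone:", small_score)
print("merged indices:   ", merged_score)
```

Within the small index alone the term is equally common in the foreground 
and background, so it scores zero. After merging, the large index dilutes 
the background rate far more than the foreground rate, and the stopword 
appears strongly correlated with the query.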

Please let me know if none of these scenarios explain your results.

 

> Another possible enhancement would be to get phrase significance (a 
> multi-term significance instead of a single term). 
>


I outline some of the possibilities in creating phrases from significant 
terms, starting 51 mins into this recent video: 
https://skillsmatter.com/skillscasts/5175-revealing-the-uncommonly-common-with-elasticsearch
 

>
> Cheers and Thanks for all the fish
>

You're welcome, and thanks again for the feedback.
Mark 
