Hi:
   I have been using the Significant Terms aggregation in release 1.1.0,
and it works for me. The blog post about this feature,
http://www.elasticsearch.org/blog/significant-terms-aggregation/, was
extremely helpful. Since this feature is in the experimental stage and the
authors have requested feedback, and I do not know where feedback on
specific features should go, I am resorting to posting on this group.

I had posted on a different thread about accessing the TF-IDF scores for
terms so that I could investigate ways to enhance my queries. This led me
to look at the experimental Significant Terms aggregation. It does what it
says quite well, and I am glad this functionality exists. However, I would
like to suggest some possible enhancements:

What I noticed in my aggregation results is that a lot of stop words (a,
an, the, at, and, etc.) are included as significant terms. It would help to
be able to supply a stop word list so that these words are left out of the
significant term calculations. (The significance is calculated based on how
many times a term appears in the query result vs. how many times it appears
in the whole index.) For common stop words, this calculation is going to
make them very significant.
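
To illustrate the kind of stop word filtering I have in mind, here is a
minimal Python sketch. It is not Elasticsearch's actual scoring heuristic;
it only compares the share of matching documents that contain a term with
the share of all documents that contain it, as described above. The
function names, the `stopwords` parameter, the stop word list, and the
sample documents are all made up for illustration.

```python
from collections import Counter

# Hypothetical stop word list, for illustration only.
STOPWORDS = frozenset({"a", "an", "the", "at", "and"})


def doc_frequencies(docs, stopwords=frozenset()):
    """Count how many documents contain each term, skipping stop words."""
    counts = Counter()
    for doc in docs:
        # set() so that a term is counted at most once per document
        counts.update(set(doc.lower().split()) - stopwords)
    return counts


def significant_terms(foreground, background, stopwords=frozenset()):
    """Rank terms by (share of foreground docs) / (share of background docs).

    A simplified stand-in, not the heuristic Elasticsearch uses.
    Assumes both document lists are non-empty.
    """
    fg = doc_frequencies(foreground, stopwords)
    bg = doc_frequencies(background)
    scores = {}
    for term, fg_count in fg.items():
        if bg[term] == 0:
            continue  # term never seen in the background set; skip it
        fg_rate = fg_count / len(foreground)
        bg_rate = bg[term] / len(background)
        if fg_rate > bg_rate:
            scores[term] = fg_rate / bg_rate
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample data: the "index" and the subset matching a query for "bike".
index = [
    "the bike was stolen at the station",
    "the bike theft was reported",
    "the weather is nice",
    "a cat sat on a mat",
]
matches = index[:2]

print(significant_terms(matches, index))             # stop words included
print(significant_terms(matches, index, STOPWORDS))  # stop words filtered out
```

Filtering on the foreground side only is enough here, because a term that
never enters the foreground counts can never end up in the ranked list.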

Another possible enhancement would be phrase significance: computing
significance for multi-term phrases instead of only single terms.

In the blog post, a similar effect is obtained by highlighting the terms
that are identified as significant. But it would be nice to be able to
determine that just by looking at the buckets.


Cheers and Thanks for all the fish


Ramdev
