Arthur, I am assuming that you will define a query/rule for each tag, so in your case yes, that would be the way to define the percolator queries.
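As a rough sketch of what one registration per tag could look like with the ES 1.0 percolator, the script below prints one curl command per tag instead of sending anything. The index name my-index1 comes from Arthur's example; the "body" field name, the sample tags, and the helper names tag_query and register_cmd are assumptions for illustration, and a real rule would likely be richer than a single match query.

```shell
#!/bin/sh
# Dry run: print one registration command per tag; nothing is sent to a cluster.
# Assumptions: documents have a text field named "body" (hypothetical),
# and each tag's rule is a plain match query on that field.
# Only handles simple tag names (no quotes or spaces).

# JSON body of the percolator query for one tag.
tag_query() {
  printf '{"query":{"match":{"body":"%s"}}}' "$1"
}

# The curl command that would register that query under the tag's name.
register_cmd() {
  printf "curl -XPUT 'localhost:9200/my-index1/.percolator/%s' -d '%s'\n" "$1" "$(tag_query "$1")"
}

# Sample tags; in practice this list would come from wherever the tags are kept.
for tag in football art; do
  register_cmd "$tag"
done
```

Once the printed commands look right, piping the output to sh would send them, so a thousand tags is one loop rather than a thousand hand-written calls.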
A couple of things you might want to be aware of:

1) Percolation is CPU-intensive.
2) The fewer queries you percolate against, the better. So when you call the percolate API, see if you can also pass in query criteria to limit the queries to percolate against.

On Wednesday, January 22, 2014 5:12:54 AM UTC-5, Arthur Denning wrote:
>
> Hey Binh, thanks a lot, it is really nice to hear from someone with
> practical experience on this. Is it correct to say that if I had a
> thousand tags, I would need to make a thousand calls like
>
> curl -XPUT 'localhost:9200/my-index1/.percolator/tagname1'
>
> to register each tag? In your implementation, are there any pitfalls or
> nice tricks worth noting?
>
> On Wednesday, January 22, 2014 8:27:03 AM UTC+8, Binh Ly wrote:
>>
>> Arthur,
>>
>> You should be able to use filters in your percolator queries, so for
>> example you can use a term/terms filter. Also, in ES 1.0 you can shard
>> the percolator query index so that percolation can distribute that load
>> around for better scalability. The best way is to experiment with it:
>> http://www.elasticsearch.org/downloads/1-0-0-RC1.
>>
>> I actually worked for a company that did content classification this
>> way, and the percolator was a perfect fit for that use case.
>>
>> On Tuesday, January 21, 2014 10:01:36 AM UTC-5, Arthur Denning wrote:
>>>
>>> I am considering using the percolator API to classify documents,
>>> namely by registering queries like "football" and "art" with the
>>> percolator, so that when new documents are added the percolator
>>> returns the right tags. My concern is: suppose there are thousands of
>>> tags to be identified this way, would it be a performance nightmare?
>>> Are there thousands of queries implicitly running behind the scenes?
>>>
>>> And what would be the recommended way to tackle this kind of
>>> classification problem in Elasticsearch?
>>>
>>> It seems that Lucene has a classification API. Is it already
>>> integrated elsewhere in Elasticsearch? Is there any roadmap concerning
>>> its implementation?
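To illustrate the advice in the reply about limiting the queries percolated against: if I recall the 1.0 docs correctly, extra fields stored next to a registered query can be matched by a "filter" on the percolate request, so only the matching queries are run. The sketch below prints the two requests instead of sending them. The "group" metadata field, the "body" field, the "doc" type name, and the helper names are all made up for the example.

```shell
#!/bin/sh
# Dry run: prints the requests; nothing is sent to a cluster.
# Assumptions: "group" is a hypothetical metadata field stored with each query,
# "body" a hypothetical text field, and "doc" a hypothetical type name.

# Percolator query for a tag, stored with a metadata field to filter on later.
tag_query_with_group() {
  printf '{"query":{"match":{"body":"%s"}},"group":"%s"}' "$1" "$2"
}

# Percolate request: the document plus a term filter that narrows
# which registered queries are evaluated.
percolate_body() {
  printf '{"doc":{"body":"%s"},"filter":{"term":{"group":"%s"}}}' "$1" "$2"
}

# Register the "football" tag in the "sports" group.
printf "curl -XPUT 'localhost:9200/my-index1/.percolator/football' -d '%s'\n" \
  "$(tag_query_with_group football sports)"

# Percolate a document against the "sports" queries only.
printf "curl -XGET 'localhost:9200/my-index1/doc/_percolate' -d '%s'\n" \
  "$(percolate_body 'a late football goal' sports)"
```

Whether this helps depends on knowing a coarse group for each incoming document up front; without that, every registered query still has to be checked.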