On Fri, 12 Aug 2011 00:02:09 +0200, Sanne Grinovero <sa...@hibernate.org> wrote:
> I just read the document (nice doc! where did you find it?)

It was attached to the original Lucene issue related to faceting.

> The commit requirements of this taxonomy index look like a mess, and
> it also concerns me that it's totally impossible to remove stuff.

Yeah, there are quite a few rules about when to commit in relation to the main index writer. Good that there is Hibernate Search, which can handle this for the user :-)

Personally I am surprised that they introduced this new taxonomy index. Funnily enough, the actual indexed Documents also contain category (faceting) information, hence the need for the DocumentBuilder. I am sure there are good reasons for introducing this new index, but I am surprised nevertheless.

> Yes generally the architecture supports it (as far as how we linked
> all components), but both the backend and the ReaderProvider would
> need a custom implementation; while it looks like the ReaderProvider
> needs an additional API method, I think we can avoid it on the
> backend.

I want to expose as little as possible of the underlying Lucene functionality. For power users we might want to offer some way to access the TaxonomyIndex/Reader directly. Not sure yet.

We will also need to extend the annotation side. Our approach allows faceting on any un-tokenized field. In the Lucene case we need to know for which fields we have to create faceting information. We could do this with an additional optional parameter to @Field, or we could introduce a new @Faceted (or something like this) annotation. Obviously Lucene goes a step further with its category paths than our current faceting approach, but we don't have to extend our faceting DSL right away.

> Also, do you know what kind of data structure TaxonomyWriter
> and TaxonomyReader expect? We'll need clustering for that too, hopefully it's
> similar to a Map, or reuses the Directory API.

For clustering purposes I think we have to look at CategoryPath and how to serialize it.
It should be just a bunch of strings, but I haven't seen the code yet.

It would have been nice to get this stuff into Search 4 as well, but of course it depends on when the next version of Lucene (either 3.4 or 4) becomes available. A Hibernate Search 4 bundled with Hibernate Core 4 and Lucene 4 would have been cool, but I don't think the timing will work out :-)

--Hardy

> 2011/8/11 Hardy Ferentschik <hibern...@ferentschik.de>:
>> Hi,
>>
>> I was just reading the docs for the new Lucene faceting which makes use
>> of a new index called taxonomy index. If we are going to use Lucene
>> capabilities we have to make sure we can plug this into our current
>> architecture.
>>
>> Reading the docs I can see quite a few similarities between our
>> terminology and theirs. That's good. However, the Lucene approach takes
>> it much further.
>>
>> We might get a new candidate for serialization as well - CategoryPath.
>>
>> I uploaded the faceting API documentation to our shared dropbox
>> directory. Have a look in case you are interested.
>>
>> --hardy

_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev