Hi All,

I think this question is appropriate for the Mahout mailing list, but if not, any pointers in the right direction or advice would be welcome.

We have a taxonomy-based navigation system where the items in the navigation tree are made up of tag-based queries (instead of natural-language words), which are matched against content items tagged in a similar way. So we have a taxonomy tree with queries:

Id   Label        Query
001  Fruit        (fid:123 OR fid:675) AND -fid:(324 OR 678)
...
002  Round
003  Apple
004  Orange
006  Star
007  Star fruit
....

Content pool:

"Interesting article on fruit" -> tagged with (123, 234, 675)
"The mighty orange!" -> tagged with (123, 324, 678)

Hopefully you get the picture.

Now we bake these queries into our Solr index, so instead of running the Fruit query at search time we have pre-done it and just search for items in the index that have id 001. The reasons for doing this are not really important, but we have written an indexer for the purpose. Content items are also multi-surfacing, so an item could appear at 001, 004 and 007.

Although the indexer is OK at doing this pre-bake job, it's not very fast, and as the content and tree grow it gets slower.

Now for the actual question! Is there an ML model that can quickly classify/identify where a new (or retagged) piece of content fits onto the tree? The queries on the leaf nodes can also change (less often), so a quick process to reclassify what is in scope for that leaf would be useful.

The reason I want this is that it would be great to have real-time feedback for an author applying tags to a document, showing where it fits in the site. Once I get this working I would love to add suggested tags or weighting based on content items with contextual similarity.

I think it was Grant who was talking about a Solr external field that could be used to hook this together, or maybe I am mistaken.

Hope this makes sense. Thanks in advance for your help/advice.

Regards,
Dave
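
P.S. In case it helps make the setup concrete, here is a minimal Python sketch of the decision the pre-bake step makes for each content item. It is a simplification and not our actual indexer: NodeQuery, matches and classify are hypothetical names, and the Fruit query is reduced to "has any of these tags and none of those". Only node 001 is included because it is the only node whose query is spelled out above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeQuery:
    """Hypothetical, simplified tag query for one taxonomy node.

    An item matches if it carries at least one tag from `include`
    and no tag from `exclude`.
    """
    node_id: str
    label: str
    include: frozenset
    exclude: frozenset

    def matches(self, tags):
        tags = set(tags)
        return bool(tags & self.include) and not (tags & self.exclude)


def classify(tags, nodes):
    """Return the ids of every node whose query the tag set satisfies."""
    return [node.node_id for node in nodes if node.matches(tags)]


# (fid:123 OR fid:675) AND -fid:(324 OR 678)
FRUIT = NodeQuery("001", "Fruit", frozenset({123, 675}), frozenset({324, 678}))
NODES = [FRUIT]

# "Interesting article on fruit"
print(classify({123, 234, 675}, NODES))
# "The mighty orange!" (carries the excluded tags 324 and 678)
print(classify({123, 324, 678}, NODES))
```

With the example content above, the first item lands on node 001 and the second does not, because it carries both excluded tags. What I am after is something that gives this answer quickly for a new or retagged item without re-running every query in the tree.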
