Hi Jay,
Thanks for the offer. Yes, it would be useful if you described your
Solr tests. Just start a new paragraph (like the "Test with 58k text
files..." one). I also tested Solr before; it is interesting but has
no support for parallel processing yet.
This wiki page is a draft summarising the different potential solutions
for large-scale full-text indexing in Invenio. It's a work-in-progress
description and the performance results are only indicative. However,
all the tests were run on the same machine, so the relative speeds are
still informative. I've added basic hardware specs.
Feel free to also express your views on the requirements for
full-text search, should they differ from the CDS/INSPIRE ones. I heard
you use different stemming. Full-text indexing for Invenio is my main
task at the moment and I'm more than happy to work with your team to
make sure that we don't duplicate effort and that the solution covers
all needs.
Best Regards,
Jan
Jay Luker wrote:
Hi all,
Since we at ADS are starting to experiment with Solr I wondered if it
might be helpful if I added some embellishments to the content at
https://twiki.cern.ch/twiki/bin/view/CDS/TalkFullTextIndex. What is
the goal of that page exactly?
Also, I can't help pointing out that the performance numbers at the
bottom aren't very informative due to a lack of context (hardware
specs, etc.)