Hi Jay,

Thanks for the offer. Yes, it would be useful if you described your Solr tests. Just start a new paragraph (like the "Test with 58k text files..." one). I tested Solr earlier as well; it is interesting, but it has no support for parallel processing yet. This wiki page is a draft summarising the different potential solutions for large-scale full-text indexing in Invenio. It's a work-in-progress description and the performance results are only indicative. That said, all the tests were run on the same machine, so the relative speeds are still informative. I've added basic hardware specs.

Feel free to also express your view on the requirements for full-text search, should they differ from those of CDS/INSPIRE. I heard you use different stemming. Full-text indexing for Invenio is my main task at the moment, and I'm more than happy to work with your team to make sure that we don't duplicate effort and that we end up with a solution covering all needs.

Best Regards,
Jan


Jay Luker wrote:
Hi all,

Since we at ADS are starting to experiment with Solr I wondered if it
might be helpful if I added some embellishments to the content at
https://twiki.cern.ch/twiki/bin/view/CDS/TalkFullTextIndex. What is
the goal of that page exactly?

Also, I can't help pointing out that the performance numbers at the
bottom aren't very informative due to a lack of context (hardware
specs, etc.)

