Hi everyone!

I've been working on extricating the extensions we've made to Lucene here at MySpace that provide significant performance improvements, specifically around searching multiple indexes in parallel, and would like to know if there are any acceptance criteria for submitting these to contrib. Things such as:

1) Target framework?
2) Unit tests - VS or NUnit?
3) Solutions, projects, or just the individual code files?

One of the things I'm packaging up is a revised ParallelMultiSearcher, as I refer to the current implementation as ParallelResourceHog. While the current implementation, which is a direct port of the Java version, may perform fine in Java, in .NET it is sub-par. Here are some initial numbers from using 10 indexes with identical schemas:

1) MultiSearcher (75 req/sec): the indexes are searched serially, one after another.
2) ParallelMultiSearcher (8 req/sec): the indexes are searched in parallel, however there is a LOT of contention and thread creation.
3) WarpSearcher (86 req/sec): this is my parallel search implementation, which adds resource pooling through object re-use and uses ThreadPool threads.

The best part is that I'm not even done with it yet - there are still more items that can be pooled and contention that can be lifted to improve performance even more.

We've been given direction here at MySpace to contribute these types of things back to the community, something I am very excited about; however, I want to ensure I follow any guidelines necessary to do so.

Cheers,
Michael

Michael Garski
Sr. Search Architect
310.969.7435 (office)
310.251.6355 (mobile)
www.myspace.com/michaelgarski
