Hi all, I've had a rotten day today because of Solr. I want to share my experience and perhaps see if we can do something to fix this particular situation in the future.
Solr currently has two ways to get grouped results (so far!). You can either use Result Grouping or you can use the Collapsing Query Parser.

Result Grouping seems like the obvious way to go. It's well documented, the parameters are clear, it doesn't use a bunch of weird syntax (i.e., {!collapse blah=foo}), and it uses the feature name from SQL (so it comes up in Google). OTOH, if you use faceting with result grouping, which I imagine many people do, you get terrible performance. In our case, big queries went from subsecond to 10-120 seconds. Insanely bad.

The Collapsing Query Parser looks like a good way forward for us, and we'll be investigating that, but it uses the Expand component, which our library doesn't support, to say nothing of the truly bizarre syntax. So switching will take a fair amount of effort.

I'm curious if there is anything we can do to clean up this situation. What I'd really like to do is:

1. Put a HUGE warning on the Result Grouping docs directing people away from the feature if they plan to use faceting (or perhaps directing them away no matter what?).

2. Work towards eliminating one or the other of these features. They're nearly completely compatible, except for their syntax and performance. The Collapsing Query Parser apparently was only written because Result Grouping had such bad performance. In other words, it doesn't exist to provide unique features; it exists to be faster than the old way. Maybe we can get rid of one or the other of these, taking the best parts from each (syntax from Result Grouping, and performance from the Collapsing Query Parser)?

Thanks,

Mike

PS: For some extra context, I want to share some other reasons this is frustrating:

1. I just spent a week upgrading a third-party library so it would support grouped results, and another week implementing the feature in our code with tests and everything. That was a waste.

2. It's hard to notice performance issues until after you deploy to a big data environment. This creates a bad situation for users until you detect it and revert the new features.

3. The documentation *could* say something about the fact that a new feature was developed to provide better performance for grouping. It could say that using facets with groups is an anti-feature. It says neither.

I only mention these because, like others, I've had a real rough time with Solr (again), and these are the kinds of seemingly small things that could have made all the difference.
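
For anyone comparing the two approaches, here is a rough sketch of what the two request styles look like side by side. It only builds query strings and does not contact a server. The field names (cluster_id, court) and the helper function names are made up for illustration, and the sketch does not claim that the two requests return identical results or facet counts.

```python
from urllib.parse import urlencode


def result_grouping_params(query, group_field, facet_field, limit=3):
    """Request parameters for Result Grouping combined with faceting.

    The field names passed in are hypothetical; substitute your own schema.
    """
    return {
        "q": query,
        "group": "true",             # turn on Result Grouping
        "group.field": group_field,  # field to group documents by
        "group.limit": limit,        # documents returned per group
        "facet": "true",
        "facet.field": facet_field,
    }


def collapse_expand_params(query, group_field, facet_field, rows=3):
    """Roughly comparable request using the Collapsing Query Parser.

    The collapse filter keeps one document per group, and the Expand
    component returns the other members of each group in a separate section.
    """
    return {
        "q": query,
        "fq": "{!collapse field=%s}" % group_field,  # local-params syntax
        "expand": "true",      # turn on the Expand component
        "expand.rows": rows,   # documents returned per expanded group
        "facet": "true",
        "facet.field": facet_field,
    }


if __name__ == "__main__":
    # Hypothetical field names, shown only to make the query strings concrete.
    print(urlencode(result_grouping_params("*:*", "cluster_id", "court")))
    print(urlencode(collapse_expand_params("*:*", "cluster_id", "court")))
```

Note that a client library has to understand the separate expanded section of the response for the second style to be useful, which is where the library support problem mentioned above comes in.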