I have been using Apache Solr for many years in a live environment that
services queries at 3K rpm (unless there is a campaign in progress) and updates
from 3K to 10K rpm. The schema is quite robust with each record potentially
having 90 fields populated. The system stores 900 million records and is hosted
on several powerful server instances.
Back in 2015 I attended a session at Lucidworks called “Solr Unleashed”. When
I described the system that I was building I recall the presenter saying, “Good
luck with that.” We have had very good luck.
When I first started generating Solr query strings I did it with StringBuilder.
That became problematic when I wanted to change a boost or a constant score for
a query term group that had been generated previously. So, I eventually wrote
some Java classes to provide an object structure that I could manipulate and
navigate. It has been very helpful, and recently I revamped my query generation
and was glad I had objects to work with instead of strings.
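
To illustrate the StringBuilder problem described above, here is a minimal
sketch. It is not code from the production system; the class and method names
(StringBuilderPainSketch, appendGroup, replaceFirstBoost) are hypothetical and
exist only for this example. Once a group has been appended, its boost is just
characters in the buffer, so changing it means searching for text and splicing.

```java
// Hypothetical illustration only; none of these names come from the fork.
final class StringBuilderPainSketch {

    // Append a boosted group of terms for one field, the way a
    // StringBuilder-based generator might.
    static void appendGroup(StringBuilder sb, String field, float boost, String... values) {
        sb.append("( ");
        for (String value : values) {
            sb.append(field).append(':').append(value).append(' ');
        }
        sb.append(")^").append(boost);
    }

    // Change a boost after the fact by locating its text in the buffer.
    // This is fragile: it hits the first textual match, which may belong to
    // a different group that happens to carry the same boost.
    static void replaceFirstBoost(StringBuilder sb, float oldBoost, float newBoost) {
        String oldSuffix = ")^" + oldBoost;
        int at = sb.indexOf(oldSuffix);
        if (at < 0) {
            throw new IllegalArgumentException("boost not found: " + oldBoost);
        }
        sb.replace(at, at + oldSuffix.length(), ")^" + newBoost);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        appendGroup(sb, "foo", 1.4f, "bar", "baz");
        sb.append(' ');
        appendGroup(sb, "qux", 0.5f, "quux");
        replaceFirstBoost(sb, 1.4f, 2.0f);
        System.out.println(sb); // ( foo:bar foo:baz )^2.0 ( qux:quux )^0.5
    }
}
```

With an object structure, the boost stays a field on the group until the query
string is rendered, so no text search is needed.
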
My employer has often encouraged the development staff to participate in the
Open Source community and they are supportive of sharing this query generation
functionality.
A link to my fork of Apache Solr, which contains the code, is included below.
I have some questions.
1) Do these Java classes provide functionality that the community would like to
have? Perhaps the same or similar functionality is already available.
2) I guessed at where in the project structure to put the functionality. Maybe
it belongs in SolrJ, in Lucene, or somewhere else.
The main working Java class is called QueryTermGrouper.
QueryTermGrouper aggregates QueryTerms and other QueryTermGroupers to form
complex queries that can be used in a standard Solr query.
Example:
QueryTermGrouper grouper = new QueryTermGrouper()
    .with(BooleanClause.Occur.MUST)
    .withBoost(1.4f);
grouper.addTerm(new QueryTerm("foo", "bar").withProximity(1));
String query = grouper.toString();
Output: +( foo:bar~1 )^1.4
Example:
QueryTermGrouper grouper = new QueryTermGrouper()
    .withConstantScore(5.0f);
grouper.addTerm(new QueryTerm("foo", "bar").withProximity(1));
String query = grouper.toString();
Output: ( foo:bar~1 )^=5
Instead of using string manipulation to create complex query strings the
QueryTermGrouper allows complex queries to be built inside an object model that
can be more easily changed.
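
For readers who want a feel for how such an object model can work, here is a
minimal sketch. It is not the code from the fork: TermSketch and GroupSketch
are hypothetical stand-ins for QueryTerm and QueryTermGrouper, they omit
Occur and proximity handling, and the output spacing is only modeled on the
outputs shown in this message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a single field:value term.
final class TermSketch {
    private final String field;
    private final String value;

    TermSketch(String field, String value) {
        this.field = field;
        this.value = value;
    }

    @Override
    public String toString() {
        // Quote multi-word values so they are treated as a phrase.
        String rendered = value.contains(" ") ? "\"" + value + "\"" : value;
        return field + ":" + rendered;
    }
}

// Hypothetical stand-in for a group that holds terms and nested groups.
final class GroupSketch {
    private final List<Object> children = new ArrayList<>();
    private Float boost;         // null means no "^boost" suffix
    private Float constantScore; // null means no "^=score" suffix

    GroupSketch addTerm(TermSketch term) {
        children.add(term);
        return this;
    }

    // Creates a nested group, attaches it, and returns it for further edits.
    GroupSketch addGroup() {
        GroupSketch child = new GroupSketch();
        children.add(child);
        return child;
    }

    // In this sketch a boost and a constant score are mutually exclusive.
    void setBoost(float boost) {
        this.boost = boost;
        this.constantScore = null;
    }

    void setConstantScore(float score) {
        this.constantScore = score;
        this.boost = null;
    }

    // The query string is produced only here, so every earlier change is cheap.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("( ");
        for (Object child : children) {
            sb.append(child).append(' ');
        }
        sb.append(')');
        if (constantScore != null) {
            sb.append("^=").append(format(constantScore));
        } else if (boost != null) {
            sb.append('^').append(format(boost));
        }
        return sb.toString();
    }

    // Render whole numbers without a trailing ".0" (5.0f becomes "5").
    private static String format(float f) {
        return f == Math.rint(f) ? String.valueOf((long) f) : String.valueOf(f);
    }
}

final class GroupSketchDemo {
    public static void main(String[] args) {
        GroupSketch group = new GroupSketch();
        group.addTerm(new TermSketch("foo", "bar"));
        group.setBoost(1.4f);
        System.out.println(group); // ( foo:bar )^1.4

        // Changing the scoring later is a field update, not a string edit.
        group.setConstantScore(5.0f);
        System.out.println(group); // ( foo:bar )^=5
    }
}
```

The design choice that matters is deferring string generation to toString():
boosts and scores remain ordinary fields until the query is actually needed.
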
If you need to generate a query like this:
(
  (
    cd:"back in black"
    cd:"point of no return"
    cd:"night at the opera"
  )^0.3
  (
    record:destroyer
    record:"the grand illusion"
  )^0.5
)
The code to do so is as simple as this:
QueryTermGrouper grouper = new QueryTermGrouper();
QueryTermGrouper cdGrouper = grouper.addGroup();
QueryTermGrouper recordsGroup = grouper.addGroup();
cdGrouper.addTerm(new QueryTerm("cd", "back in black"));
cdGrouper.addTerm(new QueryTerm("cd", "point of no return"));
cdGrouper.addTerm(new QueryTerm("cd", "night at the opera"));
cdGrouper.setBoost(0.3f);
recordsGroup.addTerm(new QueryTerm("record", "destroyer"));
recordsGroup.addTerm(new QueryTerm("record", "the grand illusion"));
recordsGroup.setBoost(0.5f);
The code can be found here:
https://github.com/gslinker/solr/tree/QUERY_TERM_GROUPER
Unit tests provide 100% line coverage and 100% branch coverage of the code.
Please share your thoughts.
Sincerely,
Geoffrey Slinker