Hello Geoffrey,
Thanks for your message and offer.
I think the overall idea is nice, but if we got more serious, we'd want
to bike-shed a number of details, such as naming ("grouper" is
dubious to me), and explore further simplifications. For example, why
"addTerm(new QueryTerm("cd", "back in black"))" when you could do
"addTerm("cd", "back in black")"? I also suspect you are conflating a
"query term" (ultimately a TermQuery in Lucene) with a query/clause
generally. There is so much to bike-shed here that we'd probably start
from scratch, to be honest.
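To illustrate the kind of simplification meant above, here is a minimal sketch of a builder whose addTerm takes the field and value directly. The class name ClauseGroupSketch and its methods are hypothetical, made up for this example; this is not the code in the linked fork and not an existing Solr or SolrJ API. It also assumes values need no escaping beyond quoting phrases that contain a space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: not the QueryTermGrouper from the fork, not a Solr API.
class ClauseGroupSketch {
    private final List<String> clauses = new ArrayList<>();
    private Float boost; // null means "emit no boost suffix"

    // Convenience overload: the caller passes field and value directly,
    // with no intermediate QueryTerm object.
    ClauseGroupSketch addTerm(String field, String value) {
        // Assumption: quoting multi-word values is enough; special characters
        // in the value are NOT escaped in this sketch.
        String rendered = value.contains(" ") ? "\"" + value + "\"" : value;
        clauses.add(field + ":" + rendered);
        return this;
    }

    ClauseGroupSketch withBoost(float boost) {
        this.boost = boost;
        return this;
    }

    @Override
    public String toString() {
        String body = "( " + String.join(" ", clauses) + " )";
        return boost == null ? body : body + "^" + boost;
    }

    public static void main(String[] args) {
        String query = new ClauseGroupSketch()
            .addTerm("cd", "back in black")
            .addTerm("record", "destroyer")
            .withBoost(0.3f)
            .toString();
        System.out.println(query); // prints: ( cd:"back in black" record:destroyer )^0.3
    }
}
```

Whether such an overload should sit next to the object-taking form, or replace it, is exactly the sort of detail that would need discussion.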
If we hypothetically incorporated this immediately, where in the
codebase would it be used? Please give a specific example. If it's
nowhere at all, it might be an awkward thing to include. There may be
100% test coverage, but I suspect there are no tests checking that the
generated string is actually parseable and parsed as intended (i.e.
yields the expected Query structure). If it's for users of Solr (which
I believe is your intention), it should live in SolrJ, but you placed
it in solr-core.
~ David
On Fri, Aug 16, 2024 at 3:48 PM Geoffrey Slinker
<[email protected]> wrote:
>
> I have been using Apache Solr for many years in a live environment that
> services queries at 3K rpm (unless there is a campaign in progress) and
> updates from 3K to 10K rpm. The schema is quite robust with each record
> potentially having 90 fields populated. The system stores 900 million records
> and is hosted on several powerful server instances.
>
> Back in 2015 I attended a session at Lucid Works called “Solr Unleashed”.
> When I described the system that I was building I recall the presenter
> saying, “Good luck with that.” We have had very good luck.
>
> When I first started generating Solr query strings I did it with
> StringBuilder. That became problematic when I wanted to change a boost or a
> constant score for a query term group that had been generated previously. So,
> I eventually wrote some Java classes to provide an object structure that I
> could manipulate and navigate. It has been very helpful, and recently I
> revamped my query generation and was glad I had objects to work with instead
> of strings.
>
> My employer has often encouraged the development staff to participate in the
> Open Source community and they are supportive of sharing this query
> generation functionality.
>
> I will attach a link to the fork of Apache Solr that I am using below.
>
> I have some questions.
>
> 1) Do these Java classes provide functionality that the community would like
> to have? Maybe there is functionality already available or similar.
> 2) I just made a guess in the project structure on where to put the
> functionality. Maybe it should be in SolrJ, or maybe in Lucene, or somewhere
> else.
>
> The main or working java class is called QueryTermGrouper.
>
> QueryTermGrouper aggregates QueryTerms and other QueryTermGroupers to form
> complex queries that can be used in a standard Solr query.
> Example:
> QueryTermGrouper grouper = new QueryTermGrouper()
>     .with(BooleanClause.Occur.MUST).withBoost(1.4f);
> grouper.addTerm(new QueryTerm("foo", "bar").withProximity(1));
>
> String query = grouper.toString();
>
> Output: +( foo:bar~1 )^1.4
>
> Example:
> QueryTermGrouper grouper = new QueryTermGrouper()
>     .withConstantScore(5.0f);
> grouper.addTerm(new QueryTerm("foo", "bar").withProximity(1));
>
> String query = grouper.toString();
>
> Output: ( foo:bar~1 )^=5
>
> Instead of using string manipulation to create complex query strings the
> QueryTermGrouper allows complex queries to be built inside an object model
> that can be more easily changed.
> If you need to generate a query like this:
> (
> (
> cd:"back in black"
> cd:"point of no return"
> cd:"night at the opera"
> )^0.3
>
> (
> record:destroyer
> record:"the grand illusion"
> )^0.5
>
> )
>
> The code to do so is as simple as this:
> QueryTermGrouper grouper = new QueryTermGrouper();
> QueryTermGrouper cdGrouper = grouper.addGroup();
> QueryTermGrouper recordsGroup = grouper.addGroup();
>
> cdGrouper.addTerm(new QueryTerm("cd", "back in black"));
> cdGrouper.addTerm(new QueryTerm("cd", "point of no return"));
> cdGrouper.addTerm(new QueryTerm("cd", "night at the opera"));
> cdGrouper.setBoost(0.3f);
>
> recordsGroup.addTerm(new QueryTerm("record", "destroyer"));
> recordsGroup.addTerm(new QueryTerm("record", "the grand illusion"));
> recordsGroup.setBoost(0.5f);
>
>
> The code can be found here:
>
> https://github.com/gslinker/solr/tree/QUERY_TERM_GROUPER
>
> Unit tests provide 100% coverage on all lines of code and on all branches in
> the code.
>
> Please share your thoughts.
>
> Sincerely
> Geoffrey Slinker
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>