Thanks for sharing, Ian. Perhaps one day Solr will have a third-party plugin
registry of sorts where you could publish it.

It wasn't clear if the bugs you listed were in your plugin or in
Lucene/Solr.  If the latter, and if you have time, please file JIRA
issue(s).
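
For anyone reading along, here is a minimal sketch of how the first bug in the quoted message below (integer overflow in the total record count) can arise. It assumes the count is accumulated from per-shard hit counts, which the message does not actually state. The class and method names are hypothetical and are not taken from Lucene/Solr or from the plugin.

```java
// Hypothetical illustration only; not code from Lucene/Solr or the plugin.
class TotalCountSketch {

    // Accumulates per-shard counts in an int; the sum silently wraps
    // around once it exceeds Integer.MAX_VALUE (2,147,483,647).
    static int sumAsInt(int[] shardCounts) {
        int total = 0;
        for (int count : shardCounts) {
            total += count;
        }
        return total;
    }

    // Accumulates the same counts in a long, which holds the true total.
    static long sumAsLong(int[] shardCounts) {
        long total = 0L;
        for (int count : shardCounts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two assumed shards, each reporting 1.5 billion matching records.
        int[] shardCounts = {1_500_000_000, 1_500_000_000};
        System.out.println(sumAsInt(shardCounts));  // prints -1294967296
        System.out.println(sumAsLong(shardCounts)); // prints 3000000000
    }
}
```

Where the total must stay an int, Math.addExact would at least turn the silent wrap into an ArithmeticException.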

On Sun, Nov 18, 2018 at 10:25 PM Ian Caldwell <[email protected]> wrote:

> We have been working on a search index that contains archived web pages
> collected over a number of years, so the same page (URL) may have been
> collected on many dates. We wanted to group results by site (domain), but
> that left us with the same page being found many times, so we needed a
> second level of grouping.
>
> I have extended the Solr 5.5.3 grouping code to allow for two-level
> grouping. In discussions with some of the people involved in archiving
> websites, it was requested that the code be shared with the Solr
> developers. I have made the code public on GitHub: Solr Grouping
> <https://github.com/nla/solr-grouping>.
>
> When extending the Solr grouping code I tried to keep it generic so that
> it could possibly be used elsewhere, but I did not try to make all
> existing features work, focusing only on the parts that we needed for our
> system. Along the way I found a couple of bugs that I fixed in this code:
> (1) an integer overflow when holding the total record count, and
> (2) not all shards being searched during the second phase of the query
> (getting all records within a group).
>
> Ian Caldwell
>
> National Library of Australia
>
-- 
Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
