Re: solrdedup crashes if digest-field not compiled

lewis john mcgibbney Sun, 09 Oct 2011 14:39:44 -0700

Hi Richard,

Yes is the simple answer. We have been aware for some time that Dedup is
broken in nutchgora [1]; Markus has also reported an issue with current trunk
development [2]. Can you please review and comment on whether you can
reproduce it, or alternatively browse through our indexer issues [3] and
comment accordingly? A patch would be excellent in any case. Thank you.


[1] https://issues.apache.org/jira/browse/NUTCH-992
[2] https://issues.apache.org/jira/browse/NUTCH-1100
[3] https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+NUTCH+AND+resolution+%3D+Unresolved+AND+component+%3D+indexer+ORDER+BY+priority+DESC&mode=hide

On Sun, Oct 9, 2011 at 9:35 PM, Rich d'Rich <[email protected]> wrote:

> >>Dedup will not work without digest field. Perhaps we can extend solrdedup
> >>so it skips all documents with a digest field. Will that work for you?
> >You mean skip all documents *without* a digest field?
> >Yes, that would work.
> >But wouldn't it be better for performance reasons to query only against
> >documents with the field already compiled?
>
> I'm getting this issue as well - we've got a heterogeneous SOLR index with
> various sources apart from Nutch, and the lack of a digest field crashes
> dedup when it hits a non-Nutch doc, as described by Matthias.
>
> Is there an issue logged for this? I might be making a patch just to keep
> us going.
>
> --
> Richard
>
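For anyone sketching a workaround, the idea quoted above (skip documents
without a digest field, or query only for documents that have one) might look
roughly like the following. This is only an illustration in Python, not the
Nutch solrdedup code: the function name, the "id" and "tstamp" field names and
the "keep the newest copy" rule are assumptions made for the example. The
query string uses the standard Solr range syntax for "field has any value".

```python
from collections import defaultdict

# Standard Solr/Lucene range query: matches only documents that have some
# value in the "digest" field, so non-Nutch documents are never fetched.
HAS_DIGEST_QUERY = "digest:[* TO *]"


def find_duplicates(docs, digest_field="digest", id_field="id",
                    ts_field="tstamp"):
    """Return the ids of documents that could be deleted as duplicates.

    Hypothetical helper: `docs` is a list of plain dicts standing in for
    Solr documents. Documents without a digest are skipped instead of
    causing a failure. Within each digest group the newest document is
    kept (an assumed rule, chosen only for this example).
    """
    groups = defaultdict(list)
    for doc in docs:
        digest = doc.get(digest_field)
        if digest is None:
            continue  # no digest (e.g. a non-Nutch document): leave it alone
        groups[digest].append(doc)

    to_delete = []
    for same_digest in groups.values():
        if len(same_digest) < 2:
            continue  # unique content, nothing to deduplicate
        # Newest first; every document after the first is a duplicate.
        same_digest.sort(key=lambda d: d.get(ts_field, 0), reverse=True)
        to_delete.extend(d[id_field] for d in same_digest[1:])
    return to_delete
```

Filtering on the server side with the query above addresses the performance
point raised in the quoted thread, while the in-code check guards against
documents that slip through without the field.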



-- 
*Lewis*
