[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2010-03-24 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849225#action_12849225 ] Andrzej Bialecki commented on SOLR-799: This issue is closed - please use the mailin

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2010-03-24 Thread Thomas Heigl (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849091#action_12849091 ] Thomas Heigl commented on SOLR-799: --- Hello, For my current project I need to implement an

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2009-02-23 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676041#action_12676041 ] Hoss Man commented on SOLR-799: --- The separation of concerns between schema.xml and solrconfig.x

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2009-02-22 Thread Shalin Shekhar Mangar (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675770#action_12675770 ] Shalin Shekhar Mangar commented on SOLR-799: bq. I don't think signatureField i

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2009-02-22 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675748#action_12675748 ] Lance Norskog commented on SOLR-799: I came into Solr with no search experience and it wa

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-12-04 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653484#action_12653484 ] Yonik Seeley commented on SOLR-799: --- Why not plug in an entirely new chain? That is one of

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-12-03 Thread Ryan McKinley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652923#action_12652923 ] Ryan McKinley commented on SOLR-799: I'm not sure how you have the test set up, so I coul

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-12-03 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652837#action_12652837 ] Mark Miller commented on SOLR-799: -- Okay, I see. I was too intent on changing the current ch

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-12-03 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652836#action_12652836 ] Yonik Seeley commented on SOLR-799: --- bq. Now that I look to fix this, I am not understandin

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-12-03 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652815#action_12652815 ] Mark Miller commented on SOLR-799: -- I'm going to put up another patch for this soon. I'd lik

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-11-21 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649826#action_12649826 ] Mark Miller commented on SOLR-799: -- bq. There's probably no need for a separate test solrcon

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-11-10 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646417#action_12646417 ] Hoss Man commented on SOLR-799: --- bq. It seems like uniqueField should normally enforce uniquene

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-11-04 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645073#action_12645073 ] Mark Miller commented on SOLR-799: -- Ok. I can't muster up much of a defense for leaving it ou

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-11-04 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645061#action_12645061 ] Yonik Seeley commented on SOLR-799: --- bq. Maybe we just do overwrite dupe for now? +1, as l

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-23 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642245#action_12642245 ] Mark Miller commented on SOLR-799: -- I find the pluggable delete policy idea appealing, but

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-14 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639479#action_12639479 ] Otis Gospodnetic commented on SOLR-799: --- Thanks Yonik. Good thing I asked for the clar

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-14 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639474#action_12639474 ] Yonik Seeley commented on SOLR-799: --- Otis: this issue only handles the index side of things

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-14 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639470#action_12639470 ] Yonik Seeley commented on SOLR-799: --- "overwriting" is implemented and supported in Lucene n

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-14 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639456#action_12639456 ] Otis Gospodnetic commented on SOLR-799: --- Haven't looked at the patch yet. Have looked a

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-13 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639304#action_12639304 ] Hoss Man commented on SOLR-799: --- If we assume for a minute that users who want to prevent or ov

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-12 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638850#action_12638850 ] Mark Miller commented on SOLR-799: -- bq. 1. Prevent new insert - SignatureUpdateProcessor gen

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-09 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638427#action_12638427 ] Hoss Man commented on SOLR-799: --- some misc comments from a user perspective based on the curren

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-09 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638426#action_12638426 ] Hoss Man commented on SOLR-799: --- (disclaimer: haven't looked at the patch) bq. Though in some

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-09 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638217#action_12638217 ] Andrzej Bialecki commented on SOLR-799: +1 on the incremental sig calculation. Re:

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-08 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638048#action_12638048 ] Mark Miller commented on SOLR-799: -- bq. I agree that it is wise to separate the detection

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-08 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638009#action_12638009 ] Yonik Seeley commented on SOLR-799: --- Some thoughts... - How should different "types" be ha

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-08 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637976#action_12637976 ] Yonik Seeley commented on SOLR-799: --- bq. I agree that it is wise to separate the detection

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637965#action_12637965 ] Grant Ingersoll commented on SOLR-799: -- Haven't looked at the patch, but I agree that it

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-07 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637719#action_12637719 ] Mark Miller commented on SOLR-799: -- Thanks for the review Andrzej. I've made the first two c

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2008-10-07 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637649#action_12637649 ] Andrzej Bialecki commented on SOLR-799: Interesting development in light of NUTCH-44