[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2007-06-27 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508445 ] Doğacan Güney commented on NUTCH-289: - It seems this issue has kind of died down, but this would be a great

[jira] Commented: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code

2007-06-27 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508449 ] Sami Siren commented on NUTCH-499: -- +1, seems good to me Refactor LinkDb and LinkDbMerger to reuse code

JIRA email question

2007-06-27 Thread Doğacan Güney
Hi list, There is this sentence at the end of every JIRA message: "You can reply to this email to add a comment to the issue online." But replying to a JIRA message through nutch-dev doesn't add it as a comment. So you have to either reply to an email through JIRA (in which case, it looks like

[jira] Closed: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable

2007-06-27 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doğacan Güney closed NUTCH-434. --- Issue resolved and committed. Replace usage of ObjectWritable with something based on GenericWritable

[jira] Resolved: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code

2007-06-27 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doğacan Güney resolved NUTCH-499. - Resolution: Fixed Fix Version/s: 1.0.0 Committed in rev. 551098. Refactor LinkDb and

[jira] Closed: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code

2007-06-27 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doğacan Güney closed NUTCH-499. --- Issue resolved and committed. Refactor LinkDb and LinkDbMerger to reuse code

[jira] Commented: (NUTCH-479) Support for OR queries

2007-06-27 Thread Rob Young (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508479 ] Rob Young commented on NUTCH-479: - Hi, I've found a bug in this patch. If I search for title:red OR title:blue I

[jira] Updated: (NUTCH-479) Support for OR queries

2007-06-27 Thread Rob Young (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Young updated NUTCH-479: Attachment: or.patch I've changed the patch slightly to work around the bug I mentioned earlier. Now the

[jira] Commented: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation

2007-06-27 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508505 ] Doğacan Güney commented on NUTCH-498: - I tested creating a linkdb from ~6M urls: Combine input records

[jira] Commented: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation

2007-06-27 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508506 ] Andrzej Bialecki commented on NUTCH-498: - +1. Use Combiner in LinkDb to increase speed of linkdb

[jira] Commented: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation

2007-06-27 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508508 ] Sami Siren commented on NUTCH-498: -- +1 Use Combiner in LinkDb to increase speed of linkdb generation

[jira] Resolved: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation

2007-06-27 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doğacan Güney resolved NUTCH-498. - Resolution: Fixed Fix Version/s: 1.0.0 Assignee: Doğacan Güney Committed in rev.

Re: JIRA email question

2007-06-27 Thread Doug Cutting
The problem is that nutch-dev (like most Apache mailing lists) sets the Reply-to header to be itself, so that responses don't go back to the sender. If you override this when responding (changing the To: line) and respond to the sender, then it should end up as a comment, which will be then
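
A minimal sketch of the Reply-To behavior described in the message above, using Python's standard `email` package. The addresses and the `reply_recipient` helper are hypothetical and only illustrate how a mail client might pick the reply target; they are not taken from the thread or from any JIRA or mailing-list configuration.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def reply_recipient(msg, override_reply_to=False):
    """Return the address a reply to msg would be sent to.

    By default a mail client honors Reply-To when it is present.
    With override_reply_to=True the reply goes to the From address,
    which mirrors manually changing the To: line.
    """
    reply_to = msg["Reply-To"]
    if reply_to and not override_reply_to:
        return parseaddr(str(reply_to))[1]
    return parseaddr(str(msg["From"]))[1]


# Hypothetical addresses, for illustration only.
notification = EmailMessage()
notification["From"] = "JIRA <jira@example.org>"
notification["Reply-To"] = "nutch-dev@example.org"  # set by the list
notification["Subject"] = "[jira] Commented: (NUTCH-289)"

# Default reply goes to the list, not back to the sender.
print(reply_recipient(notification))
# Overriding the recipient sends the reply to the original sender.
print(reply_recipient(notification, override_reply_to=True))
```

With a list-rewritten Reply-To, the default reply lands on the list address; only the overridden reply reaches the sender.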

Re: NUTCH-119 :: how hard to fix

2007-06-27 Thread Kai_testing Middleton
Wow, setting db.max.outlinks.per.page immediately fixed my problem. It looks like I totally misdiagnosed things. May I pose two questions: 1) How did you view all the outlinks? 2) How severe is NUTCH-119 - does it occur on a lot of sites? - Original Message From: Doğacan Güney