[jira] Created: (NUTCH-527) MapWritable doesn't support all hadoops writable types
MapWritable doesn't support all hadoops writable types

Key: NUTCH-527
URL: https://issues.apache.org/jira/browse/NUTCH-527
Project: Nutch
Issue Type: Bug
Affects Versions: 0.9.0
Environment: Tested on Solaris and Windows with Java 1.5
Reporter: Rob Young

The map of classes which implement org.apache.hadoop.io.Writable is not complete. It does not, for example, include org.apache.hadoop.io.BooleanWritable.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (NUTCH-527) MapWritable doesn't support all hadoops writable types
[ https://issues.apache.org/jira/browse/NUTCH-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Young updated NUTCH-527:

Description: The map of classes which implement org.apache.hadoop.io.Writable is not complete. It does not, for example, include org.apache.hadoop.io.BooleanWritable. I would happily provide a patch if someone would explain what the Byte parameter is.
(was: The map of classes which implement org.apache.hadoop.io.Writable is not complete. It does not, for example, include org.apache.hadoop.io.BooleanWritable)

MapWritable doesn't support all hadoops writable types

Key: NUTCH-527
URL: https://issues.apache.org/jira/browse/NUTCH-527
Project: Nutch
Issue Type: Bug
Affects Versions: 0.9.0
Environment: Tested on Solaris and Windows with Java 1.5
Reporter: Rob Young
[jira] Updated: (NUTCH-527) MapWritable doesn't support all hadoops writable types
[ https://issues.apache.org/jira/browse/NUTCH-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Young updated NUTCH-527:

Attachment: mapwritable.patch

I am not sure what the second parameter is so this may not be right. However, it seems to work and it removes the error I was having.

MapWritable doesn't support all hadoops writable types

Key: NUTCH-527
URL: https://issues.apache.org/jira/browse/NUTCH-527
Project: Nutch
Issue Type: Bug
Affects Versions: 0.9.0
Environment: Tested on Solaris and Windows with Java 1.5
Reporter: Rob Young
Attachments: mapwritable.patch

The map of classes which implement org.apache.hadoop.io.Writable is not complete. It does not, for example, include org.apache.hadoop.io.BooleanWritable. I would happily provide a patch if someone would explain what the Byte parameter is.
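The question about the Byte parameter can be made concrete with a small sketch. One plausible reading, stated here as an assumption and not as a fact about Nutch's MapWritable, is that the byte is a compact identifier written to the stream in place of a full class name, so every supported class needs its own unique id. The class `WritableIdRegistry` and its methods below are hypothetical, and plain JDK classes stand in for Hadoop Writable types so the example has no external dependencies.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical class-to-byte-id registry. This is an illustration of the
 * general technique only; it is NOT Nutch's actual MapWritable code.
 */
final class WritableIdRegistry {
    private final Map<Class<?>, Byte> classToId = new HashMap<>();
    private final Map<Byte, Class<?>> idToClass = new HashMap<>();

    /** Registers a class under a one-byte id; rejects a reused class or id. */
    void register(Class<?> type, byte id) {
        if (classToId.containsKey(type) || idToClass.containsKey(id)) {
            throw new IllegalArgumentException(
                "duplicate registration: " + type.getName() + " / " + id);
        }
        classToId.put(type, id);
        idToClass.put(id, type);
    }

    /** Returns the id to write for a class, or fails if it was never registered. */
    byte idFor(Class<?> type) {
        Byte id = classToId.get(type);
        if (id == null) {
            throw new IllegalStateException("unregistered type: " + type.getName());
        }
        return id;
    }

    /** Returns the class to instantiate for an id read back from the stream. */
    Class<?> classFor(byte id) {
        Class<?> type = idToClass.get(id);
        if (type == null) {
            throw new IllegalStateException("unknown id: " + id);
        }
        return type;
    }
}
```

Under this assumption, supporting an additional type would amount to one more `register` call with an id that no other class uses, and a lookup for a type that was never registered fails in the way the report describes.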
[jira] Created: (NUTCH-521) Modified injector to allow newly injected CrawlDatum to overwrite original
Modified injector to allow newly injected CrawlDatum to overwrite original

Key: NUTCH-521
URL: https://issues.apache.org/jira/browse/NUTCH-521
Project: Nutch
Issue Type: Improvement
Components: injector
Affects Versions: 0.9.0
Environment: Tested on Solaris and Windows with Java 1.5
Reporter: Rob Young
Attachments: inject.patch

Before this patch, if a CrawlDatum is already in the crawldb, it is used in preference to the CrawlDatum created by the newly injected URL. This patch gives the user the ability to force the injected CrawlDatum to be used instead. The use case for this patch was the requirement for injected URLs to jump to the top of the TopN list so that we can guarantee they will be crawled immediately (useful for intranet crawling, where changes can trigger injects).
[jira] Updated: (NUTCH-521) Modified injector to allow newly injected CrawlDatum to overwrite original
[ https://issues.apache.org/jira/browse/NUTCH-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Young updated NUTCH-521:

Attachment: inject.patch

Modified injector to allow newly injected CrawlDatum to overwrite original

Key: NUTCH-521
URL: https://issues.apache.org/jira/browse/NUTCH-521
Project: Nutch
Issue Type: Improvement
Components: injector
Affects Versions: 0.9.0
Environment: Tested on Solaris and Windows with Java 1.5
Reporter: Rob Young
Attachments: inject.patch

Before this patch, if a CrawlDatum is already in the crawldb, it is used in preference to the CrawlDatum created by the newly injected URL. This patch gives the user the ability to force the injected CrawlDatum to be used instead. The use case for this patch was the requirement for injected URLs to jump to the top of the TopN list so that we can guarantee they will be crawled immediately (useful for intranet crawling, where changes can trigger injects).
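The behaviour described for NUTCH-521 can be sketched as a small selection rule. This is a minimal illustration, not the contents of inject.patch: the class `InjectMerge`, the simplified `Datum` type, and the `overwrite` flag are all hypothetical names chosen for the example.

```java
/**
 * Hypothetical sketch of choosing between an existing crawldb entry and a
 * newly injected one. Names and types are illustrative assumptions only.
 */
final class InjectMerge {

    /** Simplified stand-in for a crawldb entry; not Nutch's CrawlDatum. */
    static final class Datum {
        final String origin;
        final float score;

        Datum(String origin, float score) {
            this.origin = origin;
            this.score = score;
        }
    }

    /**
     * Without the overwrite flag the existing entry wins, as described for
     * the pre-patch behaviour. With the flag set, the injected entry wins.
     * If only one of the two entries is present, that one is returned.
     */
    static Datum choose(Datum existing, Datum injected, boolean overwrite) {
        if (existing == null) {
            return injected;
        }
        if (injected == null) {
            return existing;
        }
        return overwrite ? injected : existing;
    }
}
```

With `overwrite` set to false the call returns the existing entry, and with it set to true the same call returns the injected entry, which is the choice the issue says the patch exposes to the user.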
[jira] Commented: (NUTCH-479) Support for OR queries
[ https://issues.apache.org/jira/browse/NUTCH-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508479 ]

Rob Young commented on NUTCH-479:

Hi, I've found a bug in this patch. If I search for
  title:red OR title:blue
I would expect it to be expanded to
  +title:red title:blue
but in fact it expands to
  +title:red title:blue
so there is no way to do term-specific queries.

Support for OR queries

Key: NUTCH-479
URL: https://issues.apache.org/jira/browse/NUTCH-479
Project: Nutch
Issue Type: Improvement
Components: searcher
Affects Versions: 1.0.0
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki
Fix For: 1.0.0
Attachments: or.patch

There have been many requests from users to extend Nutch query syntax to add support for OR queries, in addition to the implicit AND and NOT queries supported now.
[jira] Updated: (NUTCH-479) Support for OR queries
[ https://issues.apache.org/jira/browse/NUTCH-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Young updated NUTCH-479:

Attachment: or.patch

I've changed the patch slightly to work around the bug I mentioned earlier. Now the queries look like this
  name:name value OR name:other value
and are expanded to
  +name:name value name:other value

Support for OR queries

Key: NUTCH-479
URL: https://issues.apache.org/jira/browse/NUTCH-479
Project: Nutch
Issue Type: Improvement
Components: searcher
Affects Versions: 1.0.0
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki
Fix For: 1.0.0
Attachments: or.patch, or.patch

There have been many requests from users to extend Nutch query syntax to add support for OR queries, in addition to the implicit AND and NOT queries supported now.
[jira] Commented: (NUTCH-479) Support for OR queries
[ https://issues.apache.org/jira/browse/NUTCH-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507221 ]

Rob Young commented on NUTCH-479:

How would this work in the following case?
  search phrase category:cat1 OR category:cat2
Would it end up as
  (search phrase AND category:cat1) OR category:cat2
or as
  search phrase AND (category:cat1 OR category:cat2)

Support for OR queries

Key: NUTCH-479
URL: https://issues.apache.org/jira/browse/NUTCH-479
Project: Nutch
Issue Type: Improvement
Components: searcher
Affects Versions: 1.0.0
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki
Fix For: 1.0.0
Attachments: or.patch

There have been many requests from users to extend Nutch query syntax to add support for OR queries, in addition to the implicit AND and NOT queries supported now.
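The two groupings asked about in the comment above give different results for some documents, which a short sketch makes visible. The class `OrGrouping` and its method names are hypothetical; the booleans stand for whether a given document matches the search phrase, category:cat1, and category:cat2. The sketch says nothing about which grouping the patch actually produces.

```java
/**
 * Hypothetical illustration of the two ways the example query could be
 * grouped. Each argument says whether a document matches that clause.
 */
final class OrGrouping {

    /** Reading 1: (search phrase AND category:cat1) OR category:cat2 */
    static boolean phraseAndCat1OrCat2(boolean phrase, boolean cat1, boolean cat2) {
        return (phrase && cat1) || cat2;
    }

    /** Reading 2: search phrase AND (category:cat1 OR category:cat2) */
    static boolean phraseAndEitherCategory(boolean phrase, boolean cat1, boolean cat2) {
        return phrase && (cat1 || cat2);
    }
}
```

A document that is in cat2 but does not contain the search phrase matches under the first reading and not under the second, so the choice of grouping changes the result set.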