[
https://issues.apache.org/jira/browse/NUTCH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515461
]
Doug Cook commented on NUTCH-25:
> Can you provide a link on icu4j's language detection?
http://www.icu-project.org/a
On 7/25/07, Robert Young <[EMAIL PROTECTED]> wrote:
The message which was appearing in the logs is pasted below.
Basically, in org.apache.nutch.crawl.MapWritable#getKeyValueEntry the
Writable is instantiated. Its class is determined by a two-byte code
(which is written to crawldb I guess), if t
[
https://issues.apache.org/jira/browse/NUTCH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515365
]
Doğacan Güney commented on NUTCH-25:
[snip snip]
> Internal to guessEncoding, we could certainly add the clue valu
The message which was appearing in the logs is pasted below.
Basically, in org.apache.nutch.crawl.MapWritable#getKeyValueEntry the
Writable is instantiated. Its class is determined by a two-byte code
(which is written to crawldb I guess), if there is no entry for the
class it fails to create it,
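The lookup described in this message can be illustrated with a minimal sketch. This is not Nutch's MapWritable code: the class name ClassCodeRegistry, its methods, and the exception types are hypothetical and chosen only for illustration. It assumes nothing more than a table from a two-byte code to a class, and shows why reading fails when a class was never registered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the actual org.apache.nutch.crawl.MapWritable code.
// It models only the idea in the message: a two-byte code identifies a class,
// and instantiation fails when the code has no entry in the table.
public class ClassCodeRegistry {
    private final Map<Short, Class<?>> codeToClass = new HashMap<>();
    private final Map<Class<?>, Short> classToCode = new HashMap<>();

    // Associate a two-byte code with a class, in both directions.
    public void register(short code, Class<?> clazz) {
        codeToClass.put(code, clazz);
        classToCode.put(clazz, code);
    }

    // Code that a writer would store next to the serialized value.
    public short codeFor(Class<?> clazz) {
        Short code = classToCode.get(clazz);
        if (code == null) {
            throw new IllegalArgumentException("No code registered for " + clazz.getName());
        }
        return code;
    }

    // What a reader does with a stored code: look up the class and create an
    // empty instance of it. An unregistered code cannot be resolved.
    public Object instantiate(short code) {
        Class<?> clazz = codeToClass.get(code);
        if (clazz == null) {
            throw new IllegalStateException("Unknown class code: " + code);
        }
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }
}
```

In this sketch, a type missing from the table surfaces as an exception at read time, which is the kind of failure the message describes.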
[
https://issues.apache.org/jira/browse/NUTCH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515342
]
Doug Cook commented on NUTCH-25:
Doğacan,
Thanks for the quick feedback.
> * EncodingDetector api is way too open. IM
[
https://issues.apache.org/jira/browse/NUTCH-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515283
]
Doğacan Güney commented on NUTCH-527:
What was the error you were having? MapWritable supports reading and writin
I've been through the code of the CrawlDbReader class. I discovered the
method "processTopNJob", which uses the classes CrawlDbTopNMapper and
CrawlDbTopNReducer.
I'm wondering why we have this function. Is it an old implementation that
was used before the Generator to get the TopN links to fetch or
[
https://issues.apache.org/jira/browse/NUTCH-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rob Young updated NUTCH-527:
Attachment: mapwritable.patch
I am not sure what the second parameter is so this may not be right. However,
[
https://issues.apache.org/jira/browse/NUTCH-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515259
]
Doğacan Güney commented on NUTCH-524:
Have you tried playing with max.threads.per.host option instead? If you set
[
https://issues.apache.org/jira/browse/NUTCH-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rob Young updated NUTCH-527:
Description: The map of classes which implement
org.apache.hadoop.io.Writable is not complete. It does not,
MapWritable doesn't support all hadoops writable types
--
Key: NUTCH-527
URL: https://issues.apache.org/jira/browse/NUTCH-527
Project: Nutch
Issue Type: Bug
Affects Versions: 0.9.0
[
https://issues.apache.org/jira/browse/NUTCH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515230
]
Doğacan Güney commented on NUTCH-25:
Overall I think the idea behind EncodingDetector is very solid. I will take a