it is better for fetchItemQueues to select items from greater queues first
--
Key: NUTCH-1297
URL: https://issues.apache.org/jira/browse/NUTCH-1297
Project: Nutch
Issue
Thanks for the reply. I posted the same query on user@ as you mentioned but I
didn't get any reply.
MS LETOR Dataset can be found at
http://research.microsoft.com/en-us/projects/mslr/ . My aim is to crawl
LETOR dataset present on the local file s
[
https://issues.apache.org/jira/browse/NUTCH-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221762#comment-13221762
]
gee commented on NUTCH-1084:
Invoked Nutch with JDWP and used Eclipse to debug Nutch on distributed
[
https://issues.apache.org/jira/browse/NUTCH-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221661#comment-13221661
]
Sujit Pal commented on NUTCH-945:
-
Thanks Markus, and sorry, I should have mentioned that i
[
https://issues.apache.org/jira/browse/NUTCH-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221598#comment-13221598
]
Markus Jelsma commented on NUTCH-1282:
--
There is an issue for that. In my opinion wit
[
https://issues.apache.org/jira/browse/NUTCH-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
behnam nikbakht updated NUTCH-1270:
---
Attachment: NUTCH-1270.patch
> some of Deflate encoded pages not fetched
> --
[
https://issues.apache.org/jira/browse/NUTCH-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221580#comment-13221580
]
behnam nikbakht commented on NUTCH-1282:
another option is when we construct web g
[
https://issues.apache.org/jira/browse/NUTCH-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221568#comment-13221568
]
behnam nikbakht commented on NUTCH-1278:
I edited this patch to make these changes:
u
[
https://issues.apache.org/jira/browse/NUTCH-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
behnam nikbakht updated NUTCH-1278:
---
Attachment: NUTCH-1278-v.2.zip
> Fetch Improvement in threads per host
>
[
https://issues.apache.org/jira/browse/NUTCH-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
behnam nikbakht updated NUTCH-1269:
---
Attachment: NUTCH-1269-v.2.patch
> Generate main problems
> --
>
>
[
https://issues.apache.org/jira/browse/NUTCH-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
behnam nikbakht updated NUTCH-1269:
---
I rebuilt the patch against the current trunk and uploaded it to solve these two problems:
1. when generate
11 matches