Hi Adrash,
We built a search engine for a limited Web space: ~100M pages. Our
background is in semantic search, but first we needed to address all
the general crawl and search issues, as in a traditional search engine.
They are in no way less work than introducing some semantics. So - I'd
[
https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12859116#action_12859116
]
Ilguiz Latypov commented on NUTCH-427:
--
I hesitate to add the .zip file because (a) it
[
https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ilguiz Latypov updated NUTCH-427:
-
Attachment: (was: protocol-smb.zip)
protocol-smb: plugin protocol implementing the CIFS/SMB
[
https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ilguiz Latypov updated NUTCH-427:
-
Attachment: protocol-smb-dist.zip
Applied my diff to simplify importing into the Subversion tree.