Re: Developing Nutch for semantic search

2010-04-20 Thread borislav popov
Hi Adrash, we built a search engine for a limited Web space: ~100M pages. Our background is in semantic search, but first we needed to address all the general crawl and search issues, as in a traditional search engine. They are in no way less work than introducing some semantics. So - i'd

[jira] Commented: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implme

2010-04-20 Thread Ilguiz Latypov (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859116#action_12859116 ] Ilguiz Latypov commented on NUTCH-427: I hesitate adding the .zip file because (a) it

[jira] Updated: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implment

2010-04-20 Thread Ilguiz Latypov (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilguiz Latypov updated NUTCH-427: Attachment: (was: protocol-smb.zip) protocol-smb: plugin protocol implementing the CIFS/SMB

[jira] Updated: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implment

2010-04-20 Thread Ilguiz Latypov (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilguiz Latypov updated NUTCH-427: Attachment: protocol-smb-dist.zip Applied my diff to simplify importing into the Subversion tree.