Hi Vadim B,

I am getting the same error:

org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=smb

Were you able to fix this error? If yes, can you please tell me what you did to clear it? I have already posted all the details here:
http://www.nabble.com/Windows-Share-Crawling---searching-tf4277499.html#a12175266

I am using Linux, not Cygwin on Windows.

Thanks,
Bikram

Hi,

I am working on the same issue as you. So far I can crawl file:///C:/* but I am stuck on the smb part. It looks to me like this plugin isn't working properly, so it needs to be fixed for the newer version of Nutch. The error I get differs a bit from yours:

2007-05-25 18:06:29,573 INFO fetcher.Fetcher - fetching smb://mobidick/test/
2007-05-25 18:06:29,573 INFO fetcher.Fetcher - fetch of smb://mobidick/test/ failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=smb

I will dive into plugin-smb and try to narrow down the problem. Maybe we can work together to get a quick solution.

---SNIP---
# accept hosts in MY.DOMAIN.NAME
# Standard
+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
+^file:///C:/Policies/   <<-- Why did you put this here? It doesn't make sense: the +^(file|smb) line above already matches, so this line will be skipped.
---SNIP---

---SNIP---
2007-05-24 14:04:22,000 WARN crawl.PartitionUrlByHost - Malformed URL: 'smb://sql1/Sales/DATA/'   // Did you quote the URL, or is it displayed like this in the logs? I don't get this error.
---SNIP---

Try this in the class org.apache.nutch.crawl.Crawl:

    public static void main(String args[]) throws Exception {
      // new: tell java.net.URL to look for protocol handlers under the jcifs package
      System.setProperty("java.protocol.handler.pkgs", "jcifs");
      // new: log the value to confirm the property was set
      LOG.info("SMB Info: " + System.getProperty("java.protocol.handler.pkgs"));
      // new: log the permission needed to read and write the property
      LOG.info("SMB Info: " + new java.util.PropertyPermission("java.protocol.handler.pkgs", "read, write").toString());
      if (args.length < 1) {
        System.out.println("Usage: Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN N]");
        return;
      }
---SNIP---

Check this out: http://java.sun.com/developer/onlineTraining/protocolhandlers/

--
View this message in context: http://www.nabble.com/WIN-XP-PRO--Djava.protocol*-file%3A---c%3A-folder--Crawling-Parents-tf3809966.html#a12269503
Sent from the Nutch - User mailing list archive at Nabble.com.
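
To check the java.protocol.handler.pkgs idea outside of Nutch, a small standalone test might help. The sketch below is only an illustration: the class name SmbHandlerCheck and both helper methods are made up for this example and are not part of Nutch or JCIFS. It assumes the JCIFS jar is on the classpath; without it, the smb:// URL will still be rejected after the property is set. It also only covers whether java.net.URL accepts the scheme (which is probably what the "Malformed URL" warning is about). It does not by itself register a Nutch protocol plugin.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;

// Hypothetical helper class for illustration only; not part of Nutch or JCIFS.
public class SmbHandlerCheck {

    static final String HANDLER_PROPERTY = "java.protocol.handler.pkgs";

    /**
     * Adds a package prefix to the handler search path unless it is already
     * listed, keeping any existing entries. Returns the resulting value.
     */
    static String addHandlerPackage(String pkg) {
        String current = System.getProperty(HANDLER_PROPERTY);
        String updated;
        if (current == null || current.isEmpty()) {
            updated = pkg;
        } else if (Arrays.asList(current.split("\\|")).contains(pkg)) {
            updated = current;               // already registered, leave as-is
        } else {
            updated = pkg + "|" + current;   // entries are separated by '|'
        }
        System.setProperty(HANDLER_PROPERTY, updated);
        return updated;
    }

    /** Returns true if java.net.URL accepts the string, false if it is rejected. */
    static boolean canParse(String url) {
        try {
            new URL(url);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String url = "smb://mobidick/test/";   // URL taken from the log above
        System.out.println("parsable before: " + canParse(url));
        System.out.println(HANDLER_PROPERTY + " = " + addHandlerPackage("jcifs"));
        // Only expected to succeed if the JCIFS jar is on the classpath.
        System.out.println("parsable after:  " + canParse(url));
    }
}
```

If the second check still reports false, the handler class (jcifs.smb.Handler, following the usual <package>.<protocol>.Handler lookup) is most likely not visible to the JVM, so the classpath would be the first thing to look at.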
