Hi,

Could you please submit a JIRA issue and attach this (or perhaps the
diff for the whole plugin, excluding the jcifs .jar because it is LGPL) to it?


René Treffer wrote:
> Hi,
> 
> I've just written a protocol-smb plugin; it's really simple (code attached).
> It uses the jcifs lib and seems to work, but there is some stuff I'd
> like to discuss...
> 
> Nutch is glued to URL, which works if you write a URLHandler. No
> problem so far, but you can't install a URLHandler everywhere - have a
> look at the jcifs FAQ ( http://jcifs.samba.org/src/docs/faq.html ). Most
> important: it won't work in your war, so protocol plugins will be
> useless in a web context! That might cause a lot of trouble.
> Moreover, Nutch will never be able to handle \\192.168.0.1\ correctly
> with URL....

Perhaps a custom URL parser (Nutch currently uses the URL class only for
parsing URLs) could do the job here. I have seen custom implementations,
at least in Tomcat, which we could perhaps borrow and extend if required.
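
As a rough illustration of parsing without a registered handler, here is a
minimal sketch that uses java.net.URI, which accepts any scheme, and maps a
UNC path such as \\192.168.0.1\ onto an smb:// URI. The class and method
names (SmbUriSketch, uncToSmb, parse) are hypothetical and not part of Nutch
or jcifs; this only shows the parsing idea, not how Nutch would wire it in.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Nutch or jcifs.
final class SmbUriSketch {

    /** Converts a UNC path (\\host\share\dir) into an smb:// URI. */
    static URI uncToSmb(String unc) throws URISyntaxException {
        if (!unc.startsWith("\\\\")) {
            throw new IllegalArgumentException("not a UNC path: " + unc);
        }
        // Drop the leading \\ and switch to forward slashes.
        String rest = unc.substring(2).replace('\\', '/');
        int slash = rest.indexOf('/');
        String host = slash < 0 ? rest : rest.substring(0, slash);
        String path = slash < 0 ? "/" : rest.substring(slash);
        // The multi-argument constructor percent-encodes illegal
        // characters (e.g. spaces) in the path.
        return new URI("smb", host, path, null);
    }

    /** Parses either a UNC path or an already well-formed URI string. */
    static URI parse(String location) throws URISyntaxException {
        return location.startsWith("\\\\") ? uncToSmb(location) : new URI(location);
    }

    public static void main(String[] args) throws Exception {
        URI u = parse("\\\\192.168.0.1\\share\\my docs");
        System.out.println(u.getScheme() + " | " + u.getHost() + " | " + u.getPath());
    }
}
```

Unlike java.net.URL, no protocol handler has to be installed for the smb
scheme to be parsed this way, which is what would matter inside a war.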

> 
> Converting directories into HTML lists sucks. And reproducing the code is
> even worse. Perhaps a virtual mime-type could be added (e.g.
> "nutch/dir"). Almost forgot: tell me how I should index files with "
> and ' in their names (currently I check for ' and change the href
> quotes). Same problem for file://
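
For the quote problem, one option is to percent-encode each file name
before it goes into the href, so the attribute can always be written with
double quotes. The following is only a sketch under that assumption; the
class and method names (HrefEscapeSketch, encodeSegment, link) are
hypothetical and not taken from the attached plugin.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the attached plugin.
final class HrefEscapeSketch {

    /** Percent-encodes every byte that is not an RFC 3986 unreserved character. */
    static String encodeSegment(String name) {
        StringBuilder sb = new StringBuilder();
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xff;
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                sb.append((char) c);
            } else {
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    /** Builds one directory-listing entry for a single file name. */
    static String link(String name) {
        // The visible text only needs HTML escaping, not percent-encoding.
        String text = name.replace("&", "&amp;")
                          .replace("<", "&lt;")
                          .replace(">", "&gt;");
        return "<a href=\"" + encodeSegment(name) + "\">" + text + "</a>";
    }

    public static void main(String[] args) {
        System.out.println(link("it's \"x\".txt"));
    }
}
```

Both ' and " end up as %27 and %22 in the href, so the choice of attribute
quote no longer depends on the file name. The same encoding would apply to
file:// listings.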

There could perhaps be a different crawler implementation to crawl the local
filesystem and these shared Windows resources (and perhaps WebDAV too)
efficiently.

--
  Sami Siren

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
