Christian Herta wrote:
> I tried to index my local file system according to the FAQ:
> http://wiki.apache.org/nutch/FAQ#head-c721b23b43b15885f5ea7d8da62c1c40a37878e6
> 
> But if I add the plugin into the nutch-site.xml file like this:
> 
>       <property>
>         <name>plugin.includes</name>
>         <value>protocol-file|protocol-http|parse-(text|html)|index-basic|query-(basic|site|url)</value>
>       </property>
> 

Try this value instead:

<value>protocol-(file|http)|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value>
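In context, the whole property would look roughly like the sketch below.
It assumes the usual <configuration> root element of nutch-site.xml and
leaves any other properties you already have untouched. Since
urlfilter-regex is enabled here, it may also be worth checking that your
URL filter rules do not exclude file: URLs; I have not verified that for
your setup.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Overrides the default plugin list; protocol-file enables file: URLs -->
  <property>
    <name>plugin.includes</name>
    <value>protocol-(file|http)|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value>
  </property>
</configuration>
```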

If that does not work, check the log file logs/hadoop.log for more
specific information about the problem.



> Additionally I have another question:
>  * Is there a possibility to use a directory of the HDFS Filesystem as a
> spool directory to index from?

Not directly, but if you can expose[1] HDFS via some available protocol,
then it is possible to index the contents of HDFS as well.

One could also write a protocol-hdfs plugin to do the job.

--
  Sami Siren


[1] http://issues.apache.org/jira/browse/HADOOP-4

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general