Trying to Add an new NutchDoc from plugin

2010-02-15 Thread UDd

Hi there,
I'm new to the forum and to Nutch as well.
I wrote a Nutch plugin that implements IndexingFilter.
Now I want to add a new document to the index from the plugin (to split the
current doc).
I tried testing it with something like this:

  NutchIndexWriter[] writers =
      NutchIndexWriterFactory.getNutchIndexWriters(getConf());
  writers[0].write(doc);

Here doc is the doc passed into the filter method, not something new I
created (just for testing).

And I get the error "it doesn't make sense to have a field that is neither
indexed nor stored".
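
As far as I can tell, the message complains about a field whose "stored" and "indexed" flags are both off. To show what I think is being rejected, here is a minimal sketch of that kind of check. The names FieldFlagsSketch, SketchField and tryCreate are made up for illustration; this is not the real Lucene or Nutch code, only the rule as I understand it from the message.

```java
// Hypothetical stand-in types for illustration; NOT the real Lucene or Nutch classes.
public class FieldFlagsSketch {

    /** A field carrying the two flags the error message talks about. */
    public static final class SketchField {
        final String name;
        final String value;
        final boolean stored;
        final boolean indexed;

        SketchField(String name, String value, boolean stored, boolean indexed) {
            // Assumed rule: a field that is neither stored nor indexed could
            // never be searched or read back, so it is rejected up front.
            if (!stored && !indexed) {
                throw new IllegalArgumentException(
                    "it doesn't make sense to have a field that is neither indexed nor stored");
            }
            this.name = name;
            this.value = value;
            this.stored = stored;
            this.indexed = indexed;
        }
    }

    /** Returns null if the field is accepted, otherwise the rejection message. */
    public static String tryCreate(String name, boolean stored, boolean indexed) {
        try {
            new SketchField(name, "some value", stored, indexed);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("stored only:  " + tryCreate("title", true, false));
        System.out.println("indexed only: " + tryCreate("content", false, true));
        System.out.println("neither:      " + tryCreate("js", false, false));
    }
}
```

If that reading is right, at least one field of the doc I pass to write() ends up with both flags off.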

Any suggestions?
-- 
View this message in context: 
http://old.nabble.com/Trying-to-Add-an-new-NutchDoc-from-plugin-tp27598076p27598076.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.



Re: Trying to Add an new NutchDoc from plugin

2010-02-15 Thread Sahil Shah
Maybe I can try. Debugging an indexing plugin is kind of tricky.
Can you attach the required files and folders and tell me exactly what
procedure to follow?
Also, are there any settings that need to be modified?





Re: Trying to Add an new NutchDoc from plugin

2010-02-15 Thread UDd

Thanks for the quick response.
I wrote a very simple plugin that tries to write the same doc a second time
and, if there is an error, puts the error message into a custom field of the
original doc:

  public NutchDocument filter(NutchDocument doc, Parse parse, Text url,
      CrawlDatum datum, Inlinks inlinks) throws IndexingException {

    // Log the URL of the document being indexed.
    // Text.toString() decodes only the valid bytes of the Text.
    LOGGER.debug("Found Url: " + url.toString());

    NutchIndexWriter[] writers =
        NutchIndexWriterFactory.getNutchIndexWriters(getConf());
    //doc.add("js", String.valueOf(writers.length));
    try {
      // Test only: write the incoming doc a second time.
      writers[0].write(doc);
    } catch (Exception e) {
      // Record the failure in the custom "js" field so it is visible in the index.
      LOGGER.debug("Error adding Doc " + e.getMessage());
      doc.add("js", e.getMessage());
    }
    // Marker value showing that the filter ran to the end.
    doc.add("js", "AfterTest");
    return doc;
  }

After the Nutch run I just look at the index with lukeall-1.0.0.
I have attached the compiled plugin jar in case you can try to debug it. If
you can tell me how to debug it myself, that would be great (I have Nutch
running from Eclipse).
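
In case it helps to reproduce the idea without a Nutch install, the filter above boils down to the pattern in this sketch: try the write, record any exception message in the "js" field, then add a marker value. SketchDoc and SketchWriter are made-up stand-ins for NutchDocument and NutchIndexWriter, so this only shows the control flow and not the real indexing behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for NutchDocument / NutchIndexWriter; NOT the real Nutch API.
public class WriteProbeSketch {

    /** Minimal multi-valued document: field name -> list of values. */
    public static final class SketchDoc {
        private final Map<String, List<String>> fields = new LinkedHashMap<>();

        public void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        public List<String> get(String name) {
            return fields.getOrDefault(name, new ArrayList<>());
        }
    }

    /** Anything that can write a document and may fail while doing so. */
    public interface SketchWriter {
        void write(SketchDoc doc) throws Exception;
    }

    /** Tries the write, records a failure message in "js", then adds a marker. */
    public static SketchDoc filter(SketchDoc doc, SketchWriter writer) {
        try {
            writer.write(doc);
        } catch (Exception e) {
            // String.valueOf guards against a null exception message.
            doc.add("js", String.valueOf(e.getMessage()));
        }
        doc.add("js", "AfterTest");
        return doc;
    }

    public static void main(String[] args) {
        // A writer that always fails, to show the error being captured.
        SketchWriter failing = d -> {
            throw new IllegalArgumentException("simulated write failure");
        };
        SketchDoc result = filter(new SketchDoc(), failing);
        System.out.println(result.get("js"));
    }
}
```

With a writer that succeeds, "js" holds only the marker value; with one that throws, the exception message comes first, which is what I look for in Luke.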




http://old.nabble.com/file/p27598879/myplugins.rar myplugins.rar 