Re: Nutch Extension for realtime processing

2014-06-19 Thread Jake Dodd
>>> Looking forward to seeing your contributions!
>>>
>>> ++
>>> Chris Mattmann, Ph.D.
>>> Chief Architect
>>> Instrument Software and Science Data Systems Section (398)

Re: Nutch Extension for realtime processing

2014-06-19 Thread Julien Nioche
ems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: chris.a.mattm...@nasa.gov
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++

Re: Nutch Extension for realtime processing

2014-06-18 Thread Jake Dodd
ng
>>> indexing filtering framework and plugins to enrich the data. If you then
>>> couple the indexing logic to FetcherOutputFormat, you can skip the parse
>>> (because this requires a parsing fetcher) and updatedb jobs, as well as
>>> the separate indexing jo

Re: Nutch Extension for realtime processing

2014-06-18 Thread Julien Nioche
++
>
> -Original Message-
> From: Markus Jelsma
> Reply-To: "dev@nutch.apache.org"
> Date: Tuesday, June 17, 2014 10:55 AM
> To: "dev@nutch.apache.org"
> Subject: R

Re: Nutch Extension for realtime processing

2014-06-17 Thread Jake Dodd
> -Original Message-
> From: Markus Jelsma
> Reply-To: "dev@nutch.apache.org"
> Date: Tuesday, June 17, 2014 10:55 AM
> To: "dev@nutch.apache.org"
> Subject: RE: Nutch Extension for realtime processing

Re: Nutch Extension for realtime processing

2014-06-17 Thread Mattmann, Chris A (3980)
: "dev@nutch.apache.org"
Date: Tuesday, June 17, 2014 10:55 AM
To: "dev@nutch.apache.org"
Subject: RE: Nutch Extension for realtime processing

>Hi Jake,
>
>It would be more pluggable if you just implement an indexer backend
>plugin for your target (storm, spark) so yo

RE: Nutch Extension for realtime processing

2014-06-17 Thread Markus Jelsma
minutes.

Markus

-Original message-
> From:Jake Dodd
> Sent: Tuesday 17th June 2014 19:30
> To: dev@nutch.apache.org
> Subject: Nutch Extension for realtime processing
>
> Hi all,
>
> My organization is mulling the creation of a Nutch Extension Point that would

Nutch Extension for realtime processing

2014-06-17 Thread Jake Dodd
Hi all,

My organization is mulling the creation of a Nutch Extension Point that would enable realtime processing of Nutch documents as they’re fetched. We have the desire to pass Nutch-fetched documents to a realtime framework such as Storm or Spark. Currently, it’s trivial to implement a custo