>> 2. It sounds like a pretty fundamental API shift in Nutch, to support a
>> single type of content, RSS. Even if there are more content types that
>> follow this model, as Doug and Renaud both pointed out, there aren't a
>> multitude of them (perhaps archive files, but can you think of any others)?
> Also true. On the other hand, Nutch provides 98% of an RSS search
> engine. It'd be a shame to have to re-invent everything else and it
> would be great if Nutch could evolve to support RSS well.
>
> Could image search also benefit from this? One could generate a
> Parse for each image on a page whose text was from the page. Product
> search too, perhaps.

Another application could be splitting up certain enterprise documents, either with passage-retrieval algorithms or simply by table-of-contents entries. For example, a long contract or user guide could be split into separate searchable documents.

Best regards,
Alan
_________________________
Alan Tanaman
iDNA Solutions
http://blog.idna-solutions.com
