The code was developed for integration with NutchWAX. The project page is: https://webarchive.jira.com/wiki/display/SOC06/Text-based+image+search+capability+for+NutchWAX
The code is available for checkout, but it only works with an older version of Nutch:

http://archive-access.svn.sourceforge.net/svnroot/archive-access/trunk/archive-access/projects/nutchwax/imagesearch/

SVN revisions required for the code to work:
- nutch: REV 678533
- nutchwax: REV 2587
- imagesearch: HEAD

On Wed, Aug 10, 2011 at 4:14 PM, Lewis John McGibbney (JIRA) <j...@apache.org> wrote:
>
> [ https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082388#comment-13082388 ]
>
> Lewis John McGibbney commented on NUTCH-296:
> --------------------------------------------
>
> Hi Simão, any chance we could obtain the code? If this is the case we will
> reopen this issue and mark it somewhere down the line of things to deal with.
>
> Thank you for getting back to us on this one.
>
>> Image Search
>> ------------
>>
>> Key: NUTCH-296
>> URL: https://issues.apache.org/jira/browse/NUTCH-296
>> Project: Nutch
>> Issue Type: New Feature
>> Reporter: Thomas Delnoij
>> Assignee: Lewis John McGibbney
>> Priority: Minor
>>
>> Per the discussion in the Nutch-User mailing list, there is a wish for an
>> "Image Search" add-on component that will index images.
>> Must have:
>> - retrieve outlinks to image files from fetched pages
>> - generate thumbnails from images
>> - thumbnails are stored in the segments as ImageWritable that contains the
>> compressed binary data and some meta data
>> Should have:
>> - implemented as hadoop map reduce job
>> - should be separate from main Nutch codeline as it breaks general Nutch
>> logic of one url == one index document.
>> Could have:
>> - store the original image in the segments
>> Would like to have:
>> - search interface for image index
>> - parameterizable thumbnail generation (width, height, quality)