[jira] [Commented] (NUTCH-296) Image Search
[ https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162229#comment-13162229 ]

Sanjib Narzary commented on NUTCH-296:

I am working on a content-based image retrieval system that uses Nutch as the main search engine, with the help of the LIRe library. I would be happy if this project were ongoing.

Image Search

Key: NUTCH-296
URL: https://issues.apache.org/jira/browse/NUTCH-296
Project: Nutch
Issue Type: New Feature
Reporter: Thomas Delnoij
Assignee: Lewis John McGibbney
Priority: Minor

Per the discussion in the Nutch-User mailing list, there is a wish for an Image Search add-on component that will index images.

Must have:
- retrieve outlinks to image files from fetched pages
- generate thumbnails from images
- thumbnails are stored in the segments as ImageWritable that contains the compressed binary data and some metadata

Should have:
- implemented as a Hadoop MapReduce job
- should be separate from the main Nutch codeline, as it breaks the general Nutch logic of one url == one index document

Could have:
- store the original image in the segments

Would like to have:
- search interface for image index
- parameterizable thumbnail generation (width, height, quality)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
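Editorial illustration of the first "Must have" item in the issue description (retrieving outlinks to image files from fetched pages). This is only a minimal sketch under stated assumptions: it is written in Python with the standard library rather than as a Nutch plugin, it is not the GSoC code or any Nutch API, and the names `IMAGE_EXTENSIONS`, `ImageOutlinkParser`, and `extract_image_outlinks` are hypothetical. It assumes that every `<img src>` target counts as an image and that an `<a href>` target counts only when its path ends in a known image extension.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical extension list for this sketch; not taken from Nutch.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


class ImageOutlinkParser(HTMLParser):
    """Collects absolute URLs of images referenced by <img src> or <a href>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.outlinks = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img":
            target = attr_map.get("src")
        elif tag == "a":
            target = attr_map.get("href")
        else:
            return
        if not target:
            return
        # Resolve relative links against the URL of the fetched page.
        absolute = urljoin(self.base_url, target)
        # <img> targets are always kept; <a> targets only if the path
        # ends in one of the assumed image extensions.
        is_image_path = urlparse(absolute).path.lower().endswith(IMAGE_EXTENSIONS)
        if (tag == "img" or is_image_path) and absolute not in self.outlinks:
            self.outlinks.append(absolute)


def extract_image_outlinks(html, base_url):
    """Return de-duplicated image URLs found in html, in document order."""
    parser = ImageOutlinkParser(base_url)
    parser.feed(html)
    return parser.outlinks
```

A real implementation would more likely decide by fetched content type than by file extension; the extension check is used here only to keep the sketch self-contained.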
[jira] [Commented] (NUTCH-296) Image Search
[ https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102085#comment-13102085 ]

Lewis John McGibbney commented on NUTCH-296:

Having had a look at this, it is not appropriate for inclusion in current Nutch implementations and would have suited a JSP-based web application, e.g. Nutch-1.2. I'm going to reclose the issue at this point in time. Should we get another web application up and running, at least there has been some recent correspondence, and the code is available should anyone wish to pursue the issue further.
[jira] [Commented] (NUTCH-296) Image Search
[ https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082373#comment-13082373 ]

Simão Fontes commented on NUTCH-296:

The GSoC did generate some code. There have been no contributions to Nutch, or to NutchWAX for that matter, but the code is available.

-1 Close
[jira] [Commented] (NUTCH-296) Image Search
[ https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082388#comment-13082388 ]

Lewis John McGibbney commented on NUTCH-296:

Hi Simão, any chance we could obtain the code? If this is the case, we will reopen this issue and mark it somewhere down the line of things to deal with. Thank you for getting back to us on this one.
Re: [jira] [Commented] (NUTCH-296) Image Search
The code was developed for integration with NutchWAX. The link to the project is:
https://webarchive.jira.com/wiki/display/SOC06/Text-based+image+search+capability+for+NutchWAX

The code has been made available for checkout, but it works on a previous version of Nutch:
http://archive-access.svn.sourceforge.net/svnroot/archive-access/trunk/archive-access/projects/nutchwax/imagesearch/

Correct svn revisions for the code to work:
- nutch: REV 678533
- nutchwax: REV 2587
- imagesearch: HEAD

On Wed, Aug 10, 2011 at 4:14 PM, Lewis John McGibbney (JIRA) j...@apache.org wrote:
> Lewis John McGibbney commented on NUTCH-296:
>
> Hi Simão, any chance we could obtain the code? If this is the case, we will reopen this issue and mark it somewhere down the line of things to deal with. Thank you for getting back to us on this one.
[jira] [Commented] (NUTCH-296) Image Search
[ https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081714#comment-13081714 ]

Lewis John McGibbney commented on NUTCH-296:

The parsing and extraction of metadata from images is handled by Apache Tika. If we were still working with a web app, it would have been possible to get a plugin which combined metadata extraction with indexable thumbnail image snippets which would be available when searching. However, this is not the case, as search and indexing have been shifted to Solr.

What is the status of this issue? Personally, I am tempted to suggest we close it. My reasoning is that it has not been given any attention in years, it reflects a requirement from an old generation of Nutch functionality, all image-related processing is covered by parse-tika, and finally there are far, far more important issues to be dealt with.

One last thing: there has been no code contribution from the 2008 GSoC, therefore I'm guessing it was never pursued.
[jira] [Commented] (NUTCH-296) Image Search
[ https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081750#comment-13081750 ]

Markus Jelsma commented on NUTCH-296:

Would be a nice feature, but there are no patches. +1 close.