[jira] [Commented] (NUTCH-296) Image Search

2011-12-03 Thread Sanjib Narzary (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162229#comment-13162229
 ] 

Sanjib Narzary commented on NUTCH-296:

I am working on a content-based image retrieval system that uses Nutch as the 
main search engine, with the help of the LIRe library. I would be glad if this 
project is still ongoing.

 Image Search
 

 Key: NUTCH-296
 URL: https://issues.apache.org/jira/browse/NUTCH-296
 Project: Nutch
  Issue Type: New Feature
Reporter: Thomas Delnoij
Assignee: Lewis John McGibbney
Priority: Minor

 Per the discussion in the Nutch-User mailing list, there is a wish for an 
 Image Search add-on component that will index images.
 Must have:
 - retrieve outlinks to image files from fetched pages
 - generate thumbnails from images
 - thumbnails are stored in the segments as ImageWritable that contains the 
 compressed binary data and some metadata 
 Should have:
 - implemented as a Hadoop MapReduce job
 - should be separate from the main Nutch codeline, as it breaks the general 
 Nutch logic of one url == one index document.
 Could have:
 - store the original image in the segments
 Would like to have:
 - search interface for image index
 - parameterizable thumbnail generation (width, height, quality)
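
The "must have" and "would like to have" items above (thumbnail generation with configurable width, height and quality, stored as compressed binary data plus some metadata) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this issue: the class name ThumbnailRecord and its fields are hypothetical stand-ins for the ImageWritable mentioned above, and the write/readFields pair merely mirrors the shape of Hadoop's Writable contract using plain java.io types so that the sketch runs without Hadoop on the classpath.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

/** Hypothetical stand-in for the ImageWritable described in the issue. */
class ThumbnailRecord {
    String sourceUrl; // metadata: where the original image was fetched from
    int width;        // metadata: thumbnail width in pixels
    int height;       // metadata: thumbnail height in pixels
    byte[] jpeg;      // compressed thumbnail bytes

    /** Serializes the record; same shape as Hadoop's Writable.write. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(sourceUrl);
        out.writeInt(width);
        out.writeInt(height);
        out.writeInt(jpeg.length);
        out.write(jpeg);
    }

    /** Restores the record; same shape as Hadoop's Writable.readFields. */
    void readFields(DataInput in) throws IOException {
        sourceUrl = in.readUTF();
        width = in.readInt();
        height = in.readInt();
        jpeg = new byte[in.readInt()];
        in.readFully(jpeg);
    }

    /** Scales src to width x height and encodes it as JPEG; quality is in [0.0, 1.0]. */
    static ThumbnailRecord create(String url, BufferedImage src,
                                  int width, int height, float quality) throws IOException {
        // Draw the source image into a smaller RGB buffer.
        BufferedImage scaled = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, width, height, null);
        g.dispose();

        // Configure the JPEG writer with an explicit compression quality.
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality);

        ImageIO.setUseCache(false); // keep the encoding in memory, no temp files
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream ios = ImageIO.createImageOutputStream(bytes)) {
            writer.setOutput(ios);
            writer.write(null, new IIOImage(scaled, null, null), param);
        } finally {
            writer.dispose();
        }

        ThumbnailRecord rec = new ThumbnailRecord();
        rec.sourceUrl = url;
        rec.width = width;
        rec.height = height;
        rec.jpeg = bytes.toByteArray();
        return rec;
    }
}
```

In a real Nutch/Hadoop job the class would presumably implement org.apache.hadoop.io.Writable directly and be written into the segments by the job's output format; that wiring is left out here because the thread does not describe it.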

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (NUTCH-296) Image Search

2011-09-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102085#comment-13102085
 ] 

Lewis John McGibbney commented on NUTCH-296:


Having had a look at this, it is not appropriate for inclusion in current Nutch 
implementations; it would have suited a JSP-based web application, e.g. 
Nutch-1.2.

I'm going to reclose the issue at this point. Should we get another web 
application up and running, at least there has been some recent correspondence, 
and the code is available should anyone wish to pursue the issue further. 





[jira] [Commented] (NUTCH-296) Image Search

2011-08-10 Thread Simão Fontes (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082373#comment-13082373
 ] 

Simão Fontes commented on NUTCH-296:


The GSoC did generate some code. There have been no contributions to Nutch or 
Nutchwax for that matter, but the code is available.
-1 Close





[jira] [Commented] (NUTCH-296) Image Search

2011-08-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082388#comment-13082388
 ] 

Lewis John McGibbney commented on NUTCH-296:


Hi Simão, any chance we could obtain the code? If this is the case, we will 
reopen this issue and mark it somewhere down the line of things to deal with.

Thank you for getting back to us on this one.





Re: [jira] [Commented] (NUTCH-296) Image Search

2011-08-10 Thread Simão Fontes
The code was developed for integration with NutchWAX. The link to the project is:
https://webarchive.jira.com/wiki/display/SOC06/Text-based+image+search+capability+for+NutchWAX

The code has been made available for checkout, but it works with a
previous version of Nutch.
http://archive-access.svn.sourceforge.net/svnroot/archive-access/trunk/archive-access/projects/nutchwax/imagesearch/

Correct svn revisions for the code to work:
  - nutch: REV 678533
  - nutchwax: REV 2587
  - imagesearch: HEAD

On Wed, Aug 10, 2011 at 4:14 PM, Lewis John McGibbney (JIRA)
j...@apache.org wrote:

    [ 
 https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082388#comment-13082388
  ]

 Lewis John McGibbney commented on NUTCH-296:
 

 Hi Simão, any chance we could obtain the code? If this is the case, we will 
 reopen this issue and mark it somewhere down the line of things to deal with.

 Thank you for getting back to us on this one.






[jira] [Commented] (NUTCH-296) Image Search

2011-08-09 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081714#comment-13081714
 ] 

Lewis John McGibbney commented on NUTCH-296:


The parsing and extraction of metadata from images is handled by Apache Tika. 
If we were still working with a web app, it would have been possible to get a 
plugin that combined metadata extraction with indexable thumbnail image 
snippets available at search time. However, this is no longer the case, as 
search and indexing have been shifted to Solr.

What is the status of this issue? Personally I am tempted to suggest we close 
it, the reasoning being that it has not been given any attention in years, it 
reflects a requirement from an old generation of Nutch functionality, all 
image-related processing is covered by parse-tika, and finally there are far 
more important issues to be dealt with. 

One last thing: there has been no code contribution from the 2008 GSoC, so I'm 
guessing it was never pursued.





[jira] [Commented] (NUTCH-296) Image Search

2011-08-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081750#comment-13081750
 ] 

Markus Jelsma commented on NUTCH-296:

Would be a nice feature but no patches. +1 close.
