[ https://issues.apache.org/jira/browse/TIKA-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895726#comment-15895726 ]
Aayush Kumar Singha commented on TIKA-2262:
-------------------------------------------

Hello everyone,
I am a final-year student at SRM University. I have worked with Python, Java, and a range of ML problems. I am currently an intern at Amazon, and my tenure ends on May 5th, so I will be fully available to contribute after that. Meanwhile, I will look into keras and deeplearning4j. So far I have used scikit for most of my ML work. I'm looking forward to contributing.

> Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types
> ------------------------------------------------------------------------
>
>                 Key: TIKA-2262
>                 URL: https://issues.apache.org/jira/browse/TIKA-2262
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Thamme Gowda
>              Labels: deeplearning, gsoc2017, machine_learning
>
> h2. Background:
> An image caption is a short piece of text, usually one line, added to the
> metadata of an image to give a brief summary of the scene it shows.
> Generating captions is a challenging and interesting problem in computer vision.
> Tika already has support for image recognition via the
> [Object Recognition Parser, TIKA-1993|https://issues.apache.org/jira/browse/TIKA-1993],
> which uses an InceptionV3 model pre-trained on the ImageNet dataset using tensorflow.
> Captioning an image is a very useful feature, since it helps text-based
> Information Retrieval (IR) systems to "understand" the scenery in images.
> h2. Technical details and references:
> * Google open-sourced their 'show and tell' neural network and its model for
> auto-generating captions some time ago.
> [Source Code|https://github.com/tensorflow/models/tree/master/im2txt],
> [Research blog|https://research.googleblog.com/2016/09/show-and-tell-image-captioning-open.html]
> * Integrate it the same way as the ObjectRecognitionParser
> ** Create a RESTful API service
> [similar to this|https://wiki.apache.org/tika/TikaAndVision#A2._Tensorflow_Using_REST_Server]
> ** Extend or enhance ObjectRecognitionParser or one of its implementations
> h2. {skills, learning, homework} for GSoC students
> * Knowledge of languages: Java AND Python, and the Maven build system
> * RESTful APIs
> * tensorflow/keras
> * deep learning
> ----
> Alternatively, a somewhat harder path for experienced contributors:
> [Import keras/tensorflow model to deeplearning4j|https://deeplearning4j.org/model-import-keras]
> and run it natively inside the JVM.
> h4. Benefits
> * No RESTful integration required, and thus no external dependencies
> * Easy to distribute on hadoop/spark clusters
> h4. Hurdles:
> * This is a work-in-progress feature in deeplearning4j, so expect plenty of
> trouble along the way!

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)