This is awesome. Thanks :-)

--
Thamme Gowda
TG | @thammegowda <https://twitter.com/thammegowda>
~Sent via somebody's Webmail server!
On Wed, Apr 19, 2017 at 1:43 PM, Kranthi Kiran G V <kkran...@student.nitw.ac.in> wrote:

Hello mentors,

I have released a trained model of the neural image captioning system, im2txt. It can be found here:
https://github.com/KranthiGV/Pretrained-Show-and-Tell-model

I am hopeful it will benefit both the research community and Apache Tika's community for image captioning.

Have a look at it!

Thank you,
Kranthi Kiran GV,
CS 3/4 Undergrad,
NIT Warangal

On Wed, Mar 29, 2017 at 6:50 PM, Mattmann, Chris A (3010) <chris.a.mattm...@jpl.nasa.gov> wrote:

Sounds great, and understood. Please prepare your proposal and share it with Thamme and me for feedback as your (potential) mentors.

Thanks much.

Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, NSF & Open Source Projects Formulation and Development Offices (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-503
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/

From: Kranthi Kiran G V <kkran...@student.nitw.ac.in>
Date: Wednesday, March 29, 2017 at 9:17 AM
To: Thamme Gowda <thammego...@apache.org>
Cc: Chris Mattmann <mattm...@apache.org>, "dev@tika.apache.org" <dev@tika.apache.org>
Subject: Re: Regarding Image Captioning in Tika for Image MIME Types

Hello,

1) I have
submitted a PR, which can be found here:
https://github.com/apache/tika/pull/163

2) After working on the Show and Tell model for a week, I realized that the computational resources I have are enough to take up the challenge.

Here is a sample caption I generated after a few days of training:

  INFO:tensorflow:Loading model from checkpoint: /media/timberners/magicae/models/im2txt/im2txt/model/train/model.ckpt-174685
  INFO:tensorflow:Successfully loaded checkpoint: model.ckpt-174685
  Captions for image COCO_val2014_000000224477.jpg:
    0) a man riding a wave on top of a surfboard . (p=0.016002)
    1) a man riding a surfboard on a wave in the ocean . (p=0.007747)
    2) a man riding a wave on a surfboard in the ocean . (p=0.007673)

The evaluation is on the image in the example on im2txt's page:
https://github.com/tensorflow/models/tree/master/im2txt#generating-captions

I'm excited to release the pre-trained model (if I'm allowed to) to the public during my GSoC journey, so that everyone can use it even if they don't have enough resources. I think it would be a great contribution to both Apache Tika and the computer vision community as a whole.

3) I am working on the schedule. I will be submitting a draft on the GSoC page. Should I send it here, too?

Regarding my other commitments, I will be working with Amazon India Development Centre from May 10th to July 10th. They offer flexible working hours. I would be able to dedicate 40-45 hours per week. My ability to balance both is shown by my current work with the Deep Learning Research Group at NIT Warangal alongside college.

What do you think?

On Mon, Mar 27, 2017 at 11:00 PM, Thamme Gowda <thammego...@apache.org> wrote:

Hi Kranthi Kiran,

1. Thanks for the update. I look forward to your PR.

2. I don't have complete details about compute resources from GSoC.
I think Google offers free credits (approx. $300) when students sign up for Google Compute Engine. I am not worried about it at this time; we can sort it out later.

3. Great to know!

Best,
TG

On Fri, Mar 24, 2017 at 10:42 PM, Kranthi Kiran G V <kkran...@student.nitw.ac.in> wrote:

Apologies if I was ambiguous.

1) I have already started working on the improvement. The general method is working. I'll send a merge request after I port the REST method, too.

2) I was referring to the computational resources needed to train the final layer of im2txt to output the captions. Google hasn't released a pre-trained model.

3) I will update the developer community with a tentative GSoC schedule by tonight. It would be great if the community could give me suggestions.

On Mar 25, 2017 12:06 AM, "Thamme Gowda" <thammego...@apache.org> wrote:

Hi Kranthi Kiran,

Please find my replies below. Let me know if you have more questions.

Thanks,
TG

On Tue, Mar 21, 2017 at 12:21 PM, Kranthi Kiran G V <kkran...@student.nitw.ac.in> wrote:

Hello Thamme Gowda,

Thank you for letting me know of the developer mailing list. I have created an issue [1] and I will be working on it.

The change is not straightforward, since the Inception V3 pre-trained model ships as a graph while the Inception V4 pre-trained model is packaged in the form of a checkpoint (ckpt) [2].

Okay, I see: Inception V3 has a graph, V4 has a checkpoint. I assume there should be a way to restore the model from a checkpoint?
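(Editorial aside: a minimal, stdlib-only illustration of the restore-by-name idea behind checkpoint files. This is not TensorFlow's API -- in TF 1.x the actual call is roughly tf.train.Saver().restore(sess, "/path/to/model.ckpt-174685") -- and the variable names and values below are invented for illustration.)

```python
import os
import pickle
import tempfile

# Toy "checkpoint": a mapping from variable names to values, serialized to
# disk. TensorFlow's .ckpt files store the same kind of name -> tensor
# mapping (at scale, plus metadata); in TF 1.x you would restore it with
# tf.train.Saver().restore(sess, ckpt_path) instead of this sketch.

def save_checkpoint(path, variables):
    """Write a dict of named variables to a checkpoint-style file."""
    with open(path, "wb") as f:
        pickle.dump(variables, f)

def restore_checkpoint(path, model):
    """Restore saved values into `model` by variable name."""
    with open(path, "rb") as f:
        saved = pickle.load(f)
    model.update(saved)
    return model

# Hypothetical variable names, chosen only to mimic TF naming conventions.
ckpt = os.path.join(tempfile.mkdtemp(), "toy.ckpt")
save_checkpoint(ckpt, {"conv1/weights": [0.1, 0.2], "fc/bias": [0.0]})
model = restore_checkpoint(ckpt, {})
print(model["conv1/weights"])  # -> [0.1, 0.2]
```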
Please refer to https://www.tensorflow.org/programmers_guide/variables#checkpoint_files

What do you think of using Keras to implement the Inception V4 model? It would make the job of scaling it on CPU clusters easier if we can use deeplearning4j's model import. Should I proceed in that direction?

Regarding GSoC, what kind of computational resources are we given access to? We would have to train the Show and Tell network, and it takes a lot of computational resources. If GPUs are not available, we would have to use a CPU cluster, so the code would have to be re-written (from the Google implementation of Inception V4).

Training Inception V4 from scratch requires too much effort, time, and resources. We are not aiming for that, at least not as part of Tika and GSoC. The suggestion I mentioned earlier was to upgrade the Inception V3 model to the Inception V4 pre-trained model/checkpoint, since that will be more beneficial to the Tika user community :-)

[1] https://issues.apache.org/jira/browse/TIKA-2306
[2] https://github.com/tensorflow/models/tree/master/slim#pre-trained-models

On Mon, Mar 20, 2017 at 3:17 AM, Thamme Gowda <thammego...@apache.org> wrote:

Hi Kranthi Kiran,

Welcome to the Tika community. We are glad you are interested in working on the issue. Please remember to CC the dev@tika mailing list for future discussions related to Tika.

*Should the model be trainable by the user?*
The basic minimum requirement is to provide a pre-trained model and make the parser work out of the box without training (expect no GPUs; expect a JVM and nothing else). Of course, the parser configuration should have options to change the models by changing the path.

As part of this GSoC project, integration isn't enough work.
If you go through the links provided on the Jira page, you will notice that there are models for image recognition but no ready-made models for captioning. We will have to train the im2txt network from the dataset and make it available. Thus we will have to open-source the training utilities, documentation, and any supplementary tools we build along the way. We will have to document all of this in the Tika wiki for advanced users!

This is a GSoC issue, and thus we expect the work to happen during the summer.

For now, if you want a small task to familiarise yourself with Tika, I have a suggestion: currently, Tika uses the Inception V3 model from Google for image recognition. The Inception V4 model came out recently and has proved to be more accurate than V3. How about upgrading Tika to use the newer Inception model?

Let me know if you have more questions.

Cheers,
TG

On Sun, Mar 19, 2017 at 11:56 AM, Kranthi Kiran G V <kkran...@student.nitw.ac.in> wrote:

Hello,

I'm Kranthi, a 3rd-year computer science undergrad at NIT Warangal and a member of the Deep Learning research group at our college. I'm interested in taking up the issue. I believe it would be a great contribution to the Apache Tika community.

This is what I have done until now:

1) Built Tika from source using Maven and explored it.
2) Tried the object recognition module from the command line. (I should probably start using the Docker version to speed up my progress.)

I am yet to import a Keras model into DL4J. I have some doubts regarding the requirements since I'm new to this community.
*Should the model be trainable by the user?*
This is important because the Inception V3 model without re-training has performed poorly for me (I'm currently training it with fewer steps due to the limited computational resources I have -- a GTX 1070).

TODO (before submitting the proposal):

1) Create a test REST API for Tika.
2) Import a few models in DL4J.
3) Train im2txt on my computer.

Thank you,
Kranthi Kiran
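(Editorial aside: the im2txt log earlier in the thread ranks beam-search caption hypotheses by probability. Below is a toy, stdlib-only sketch of that ranking idea; the vocabulary and probabilities are invented for illustration and are not im2txt's.)

```python
import math

# Toy conditional word model: P(next word | previous word). The numbers are
# made up, purely to show how im2txt-style beam search prunes and ranks
# caption hypotheses by total (log-)probability.
MODEL = {
    "<s>": {"a": 0.9, "the": 0.1},
    "a": {"man": 0.6, "wave": 0.4},
    "the": {"ocean": 1.0},
    "man": {"surfing": 0.7, "</s>": 0.3},
    "wave": {"</s>": 1.0},
    "ocean": {"</s>": 1.0},
    "surfing": {"</s>": 1.0},
}

def beam_search(beam_size=2, max_len=5):
    beams = [(["<s>"], 0.0)]          # (tokens so far, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, logp in beams:
            for word, p in MODEL.get(tokens[-1], {}).items():
                cand = (tokens + [word], logp + math.log(p))
                (finished if word == "</s>" else candidates).append(cand)
        # Keep only the `beam_size` most probable partial captions.
        beams = sorted(candidates, key=lambda c: -c[1])[:beam_size]
        if not beams:
            break
    # Rank finished captions, like the "(p=...)" lines in the im2txt log.
    return sorted(finished, key=lambda f: -f[1])

for tokens, logp in beam_search():
    print(" ".join(tokens[1:-1]), "(p=%.6f)" % math.exp(logp))
```

With beam_size=2 the low-probability "the ocean" branch is pruned in the second step, and the three surviving captions are printed in descending probability, mirroring the ranked output format in the training log above.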