Hello Alexandru

Now I understand. I wanted to keep only two objectives and complete the
pipeline in such a way that further plugins could be integrated later.

I understand that object detection would not work if classes are predefined.
Could we focus on a few classes in the beginning? Classification-based
approaches can use a pre-trained model. If we find relevant objects, good;
if not, we can give more weight to the text we extract from the video. This
way it could be scalable (since model training could be done offline,
perhaps at regular intervals).

I would start by using standard object detection approaches and audio
extraction, get some idea of the statistics of the data in Wikipedia,
and then prioritize which approach to build on. Perhaps this could be a
warm-up task: setting up the repositories and getting the articles (with
videos and images) from the dump?

As for the approaches to rank objects, interestingly, I think we will have
to develop something specific to Wikipedia. We can exploit web-page-based
text features (a page representation) and rank objects with respect to this
page representation. Does this sound doable?
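
To make the ranking idea a bit more concrete, here is a minimal sketch in
Python. It is only an illustration under simple assumptions, not a proposed
final method: the page representation is assumed to be a plain bag of words
over the article text, each detected object is assumed to arrive as a text
label, and the function names (tokenize, cosine, rank_objects) are
hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_objects(page_text, detected_labels):
    """Sort detected object labels by similarity to the page representation.

    Assumption: the page representation is a bag of words over the article
    text; a real system would likely use richer page features.
    """
    page_vec = Counter(tokenize(page_text))
    scored = [
        (label, cosine(Counter(tokenize(label)), page_vec))
        for label in detected_labels
    ]
    # sorted() is stable, so labels with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    page = "Sachin Tendulkar is a former cricket player known for his cricket bat."
    for label, score in rank_objects(page, ["chair", "cricket bat", "person"]):
        print(label, round(score, 3))
```

Labels that share no words with the page get a score of 0, which is the
case where we would fall back on the text extracted from the video.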

Does this project sound worth proposing?

Best
Manisha

On Tue, Mar 10, 2015 at 10:04 PM, Alexandru Todor <to...@inf.fu-berlin.de>
wrote:

> Hi Manisha,
>
> A lot of the videos and pictures are about certain people. Face detection
> and tracking can be very useful since we want to know which people they are
> about and where in the video they appear. Basically people need to have the
> square around the face to annotate it and we don't want to annotate that
> face again. Plus this is actually one of the easiest things we can do.
>
> Object recognition and tracking is hard and error prone. For a lot of the
> dialogue rich videos it's a lot more helpful to use speech recognition to
> extract some text out of it, even if it's not very good. Once we have some
> text to work with we can extract named entities and index the video a lot
> better than with CV approaches (depends on the video material).
>
>> Could we scope it the following way:
>>
>> Take the existing videos
>> Recognize and rank objects. I imagine not all objects will be equally
>> relevant.
>> Build user interface for annotation via CrowdFlower.
>>
>
> What kind of approach do you want to use for object recognition?
> Traditional approaches use classifiers trained for one class of objects and
> don't scale that well to multiple classes. To achieve any kind of good
> results we'd need to use a deep learning toolkit like Caffe [1].
> But it seems you've worked in this field; which approaches are you
> comfortable with?
>
> Cheers,
> Alexandru
>
> 1. http://caffe.berkeleyvision.org/
>
>
> On Tue, Mar 10, 2015 at 7:27 PM, manisha verma <manishaverma...@gmail.com>
> wrote:
>
>> Hi Alexandru
>>
>> I am not a pro at image processing, but I'll give it a go.
>> I have some questions (sorry if they are naive!):
>>
>> 1. Why is face detection or face tracking required?
>> 2. How will voice recognition help?
>>
>> Last question: are we enriching DBpedia with the video being the object and
>> the data extracted from the video as its attributes?
>>
>> The rest is interesting. I checked: there are ~6K documents with videos in
>> Wikipedia.
>>
>> Could we scope it the following way:
>>
>> Take the existing videos
>> Recognize and rank objects. I imagine not all objects will be equally
>> relevant.
>> Build user interface for annotation via CrowdFlower.
>>
>>
>> I think I can do the above. I have knowledge of Java and Hadoop, and I can
>> get up to speed with OpenCV.
>>
>> Best
>> Manisha
>>
>>
>>
>> On Tue, Mar 10, 2015 at 5:46 PM, Alexandru Todor <to...@inf.fu-berlin.de>
>> wrote:
>>
>>> Hi Manisha,
>>>
>>> We had a media linkage project in GSoC last year, and were planning
>>> to go into extracting semantic information from media this year. I gave up
>>> on writing it up as an idea for GSoC since it's a bit out of the scope of
>>> a 3-4 month project and really requires deeper knowledge of Computer
>>> Vision.
>>>
>>> The idea was to build a semi-automatic annotation system that would go
>>> over the videos in Wikimedia Commons. These are already linked (sometimes)
>>> to articles in Wikipedia and are fairly representative. Using YouTube or
>>> other sources is also possible.
>>> The system would:
>>>
>>> 1) Do shot detection and divide the video into a series of
>>> representative shots.
>>> 2) Do voice recognition
>>> 3) Use existing trained classifiers to recognize certain object classes
>>> 4) Present annotation options so that crowd workers can annotate objects
>>> that haven't been recognized yet. The output of the annotation step would
>>> be used to train new classifiers.
>>> 5) Do face recognition/face tracking
>>>
>>> I've implemented parts of the system using JavaCV but it would basically
>>> need to be rewritten to use HIPI instead.
>>> Technologies used would be HIPI [1] for CV tasks, CMU Sphinx [2] for
>>> voice recognition and CrowdFlower [3] for the crowd-sourcing/annotation
>>> component.
>>> Implementing this, even partially, would require good knowledge of
>>> Java/Scala and experience with Hadoop/HIPI and OpenCV.
>>>
>>> Which parts of the system do you think you could implement, or to what
>>> extent, in the available time ?
>>>
>>> 1. http://hipi.cs.virginia.edu/about.html
>>> 2. http://cmusphinx.sourceforge.net/
>>> 3. http://www.crowdflower.com/
>>>
>>> On Mon, Mar 9, 2015 at 8:33 PM, manisha verma <manishaverma...@gmail.com>
>>> wrote:
>>>
>>>> Hello everyone
>>>>
>>>> I am Manisha Verma, a PhD student in information retrieval. I was
>>>> wondering whether GSoC projects have to be one of the listed ideas or
>>>> whether students can propose their own projects too. I understand that
>>>> mentoring is a volunteer effort; however, I just wanted to run an idea by
>>>> you, in case anyone finds it useful. I would be willing to work on it and
>>>> submit a project. I could finish some warm-up tasks as well.
>>>>
>>>> I have gone through the guidelines, and I wanted to know how big the
>>>> project needs to be. Could it be just a prototype, or does it have to be
>>>> an end-to-end system that runs at scale?
>>>>
>>>>
>>>> So here it is.
>>>>
>>>> I understand that Wikipedia is fairly text based (there are some
>>>> articles with pictures and audio files). Would it be feasible to integrate
>>>> videos into DBpedia's ontology as well? There are several categories of
>>>> articles that have tremendous amounts of videos available on the internet.
>>>> Video content could cover part of an article or the entire article. For
>>>> example, an article on SVM could use one of the introductory videos from
>>>> YouTube. Similarly, for people like Sachin Tendulkar, there are some
>>>> documentaries.
>>>>
>>>> The focus is to enrich existing articles. There are several things
>>>> that need to be taken into account here.
>>>>
>>>> 1. First is the collection of videos itself. For the project, I would
>>>> start with a focused dataset of videos.
>>>> 2. Second is either their transcription or linking their metadata with
>>>> text. Basically, finding the most appropriate video for an article.
>>>> 3. This could be tested against some ground truth, which could be built
>>>> using MTurk. It would cost some amount, but I think I can cover that.
>>>> 4. Last is integrating it with articles. Would you link parts of the
>>>> text with the video or not? Or will it just be part of the infobox?
>>>>
>>>>
>>>> Within 4 months, a prototype could be generated. I am not sure if it
>>>> will be at a huge scale.
>>>>
>>>> Sorry for such a lengthy email.
>>>>
>>>> Best
>>>> Manisha
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Dive into the World of Parallel Programming The Go Parallel Website,
>>>> sponsored
>>>> by Intel and developed in partnership with Slashdot Media, is your hub
>>>> for all
>>>> things parallel software development, from weekly thought leadership
>>>> blogs to
>>>> news, videos, case studies, tutorials and more. Take a look and join the
>>>> conversation now. http://goparallel.sourceforge.net/
>>>> _______________________________________________
>>>> Dbpedia-gsoc mailing list
>>>> Dbpedia-gsoc@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
>>>>
>>>>
>>>
>>
>