Re: [Dbpedia-gsoc] GSoC 2017

2017-03-13 Thread Marco Fossati
Hi Agneet and welcome.

Warm-up tasks for the DBTax project are publicly available here:
http://wiki.dbpedia.org/ideas/idea/291/unsupervised-learning-of-a-dbpedia-taxonomy-dbtax/

Cheers,

Marco

On 2/27/17 23:19, Agneet Chatterjee wrote:
> Hello everyone,
>  I am a second year CS undergrad at Jadavpur University.
> I am comfortable with C/C++/Python/Java.
> I've worked with Python libraries such as numpy, tensorflow, opencv.
> I have a keen interest in machine learning and deep learning, and I
> have implemented a few models in both.
> I have gone through the GSoC 2017 ideas list and I am interested in
> two projects, "Knowledge base embeddings for DBPedia" and "Unsupervised
> Learning for DBPedia Taxonomy".
>
>
> Could someone get me started or assign me a starting task?
>
> Thank you,
> Agneet Chatterjee
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>
>
>
> ___
> Dbpedia-gsoc mailing list
> Dbpedia-gsoc@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
>



Re: [Dbpedia-gsoc] Query regarding List_Extractor

2017-03-05 Thread Marco Fossati
Hi guys,

On 3/5/17 12:08, Federica Baiocchi wrote:
> He told me that he was planning to implement an automatic restart sooner
> or later
I fear he would never have done that, so I've just fixed it.
The service is now scheduled to restart once a month.
However, please use it only for low-traffic requests.
> look for a way to include the JSONpedia framework in the code to
> use it offline.
Absolutely yes, it would be a much more efficient solution.
> I did try to do it at first, but I encountered many
> obstacles and compatibility problems with Python and Java and wasted a
> lot of time.
Did you try Jython, with no success?
Or maybe we should translate the current code from Python to Java.
We could give it a try with something like this:
https://github.com/chrishumphreys/p2j

What do you think?
Best,

Marco



Re: [Dbpedia-gsoc] DBTax Project

2017-02-14 Thread Marco Fossati
Hi Shashank and thanks for your work,

The codebase is currently a scattered set of scripts: the very goal of 
the project is to make it ready for production use.
The former GSoC project code can be found at:
https://github.com/kkasunperera/dbpedia/tree/processcategories/WikipediaCategoryProcessor
And here is a how-to:
https://docs.google.com/document/d/1IMD8UhqXI3iX9jvZ38gHe1NlJo1X5Zrj3YKRx3wq_2s/edit?usp=sharing

I will upload these links to the ideas page as well.
Keep up the good work.
Best,

Marco

On 2/12/17 15:47, Shashank Motepalli wrote:
> Hi All,
>
> I have found the project DBTax interesting. I have gone through the
> paper and also completed the Warm-Up
> tasks(https://github.com/sm86/DBpediaTask).
>
> I have cloned the project from Bitbucket, but I couldn't figure out how
> to run it. Can someone help me get started with the project?
>
> Thanks in advance.
> Regards,
> Shashank Motepalli


Re: [Dbpedia-gsoc] Re-Introduction

2016-03-01 Thread Marco Fossati
Hey Felix, welcome back!

Marco

On 3/1/16 02:13, Felix Sonntag wrote:
> Hi everyone,
>
> I’m Felix, I already introduced myself last year, but I guess I’ll shortly 
> reintroduce myself. I’m a Master student in Informatics at TUM in Munich. I’m 
> pretty excited about the DBpedia project: I’m using Spotlight for a project 
> about analyzing artist data at the moment, and I’ve used DBpedia data for an 
> app last year. I already tried to participate in GSOC with you last year, 
> unfortunately it didn’t work out (apparently it was really close :P).
>
> I’ve just finished my first Master semester, focusing on ML, Data 
> Analytics and NLP in my studies.
>
> There are some project ideas I’m keen on working on, but I’ll post directly on 
> the project sites.
>
> One general question: for the Spotlight project there exists only a rough 
> idea by Philipp, and there are also no warm up tasks. Can we expect more from 
> that? :)
>
> Best,
> Felix
> --
> Site24x7 APM Insight: Get Deep Visibility into Application Performance
> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> Monitor end-to-end web transactions and take corrective actions now
> Troubleshoot faster and improve end-user experience. Signup Now!
> http://pubads.g.doubleclick.net/gampad/clk?id=272487151=/4140


Re: [Dbpedia-gsoc] Hello All

2016-03-01 Thread Marco Fossati
Hi Aditya and welcome!

You can go ahead and look at our ideas for a project:
http://wiki.dbpedia.org/ideas/ideas/scope:all/sort:activity-desc/tags:gsoc/

Cheers,

Marco

On 2/24/16 15:31, Aditya Nambiar wrote:
> I am a final-year undergraduate student pursuing Computer Science and
> Engineering at IIT Bombay, and I would like to be a part of DBpedia in
> GSoC 2016.
>
> I have already been a part of Google Summer of Code 2014 and was able
> to successfully complete my project, which later went into production. I
> find that my interests are closely aligned with what DBpedia is all about
> and hence am excited about being a part of it.
>
> I have previous experience working with Wikipedia and
> published a paper, "Concept Hierarchies and Human Navigation", at the IEEE
> BigData Conference 2015. I also have
> pretty good experience in Scala.
>
> You can find my resume at - https://www.cse.iitb.ac.in/~adityan/Resume.pdf
>
> Thanks & Regards,
> Aditya Nambiar
>


Re: [Dbpedia-gsoc] DBpedia GSoC 2016 Table Extractor

2016-03-01 Thread Marco Fossati
Hi Nitish and welcome!

FYI, I have updated the idea with a big warm-up task that everyone 
should take to show their skills:
http://wiki.dbpedia.org/ideas/idea/59/the-table-extractor/?answer=90#post-id-90
Cheers,

Marco

On 2/22/16 17:11, Nitish Jain wrote:
> Hi,
>
> I am a student from India currently pursuing my Bachelors degree in
> Computer Science and Engineering from IIIT Hyderabad. I am currently
> doing my research in the field of Data retrieval and Clustering. I am
> interested in working on the idea of "The Table Extractor" as a part of
> GSoC 2016. I have gone through the code that was suggested in the idea
> and I would like to work on it further. Can someone please guide me
> on the warm-up tasks required for it?
>
> Thanks,
> Nitish Jain


Re: [Dbpedia-gsoc] Introduction

2015-03-23 Thread Marco Fossati
Hi Felix,

On 3/22/15 11:59 PM, Felix Sonntag wrote:
 Hi everybody,

 I’m Felix from Germany and I’m excited to meet all of you! I’m a CS student 
 with a focus on computational linguistics. I have used DBPedia before in an 
 app I built for a Hackathon, which produced an automatically generated quiz from 
 DBPedia data. So I would love to contribute to the main project!

 I will graduate in the next few months and afterwards I’m planning to start 
 my master studies. Right now I’m writing my bachelor’s thesis on topic models 
 used in social networks.

  From the given project ideas, I can see myself working on 5.1 or 5.9. I 
 already looked into the warm-up tasks and links provided (thumbs up for the 
 Germany national football team). Unfortunately I’m really busy at the moment 
 due to my thesis, but I will do my best to make small contributions.

 I already have a question about the 5.1 task:
 How would you suggest doing the verb ranking and what would you use it for? 
 Spontaneously I would think about frequency ranking. But I’m not sure what 
 that could be used for. For deciding which facts are really meaningful?
Frequency-based ranking is a baseline.
For a better weighted ranking, I computed the TF/IDF matrix of each 
verbal token against the corpus, and ranked by lemma via the standard 
deviation of the sum of the TF/IDF scores.
It's a big monster, but may yield interesting verbs.
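
A minimal sketch of the weighted ranking described above, offered as one possible reading rather than the code actually used. It assumes the verbs have already been tagged and lemmatized upstream, sums the TF-IDF scores of all tokens sharing a lemma within each document, and ranks lemmas by the standard deviation of those sums across documents. The function name and the input shape are hypothetical.

```python
import math
from collections import Counter, defaultdict
from statistics import pstdev


def rank_verb_lemmas(docs):
    """Rank verb lemmas by the spread of their summed TF-IDF scores.

    `docs` is a list of documents, each a list of (token, lemma) pairs
    for the verbs found in that document (tagging is assumed done upstream).
    """
    n_docs = len(docs)

    # Document frequency of each verbal token.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update({token for token, _ in doc})

    # For every lemma, one summed TF-IDF score per document.
    lemma_scores = defaultdict(lambda: [0.0] * n_docs)
    for i, doc in enumerate(docs):
        if not doc:
            continue
        counts = Counter(token for token, _ in doc)
        lemma_of = dict(doc)
        for token, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[token])
            lemma_scores[lemma_of[token]][i] += tf * idf

    # A high standard deviation means the verb is concentrated in few documents.
    return sorted(
        ((lemma, pstdev(scores)) for lemma, scores in lemma_scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Toy corpus, made up for illustration only.
toy_docs = [
    [("scored", "score"), ("scores", "score"), ("is", "be")],
    [("is", "be"), ("was", "be")],
    [("is", "be"), ("played", "play")],
]
ranking = rank_verb_lemmas(toy_docs)
```

In the toy corpus the most frequent lemma is "be", which a frequency baseline would put first; the weighted ranking instead favours a lemma that is concentrated in a single document.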
 You also mentioned that the construction of the training set will be done 
 with crowdsourcing. FrameNet already provides 170,000 annotated sentences. So 
 why produce additional data?
Good point.
For the soccer use case, I couldn't find any frames that could be reused 
from FrameNet. Can you?
Instead, http://kicktionary.de/ is a FrameNet for soccer, but it seems 
overspecific.
Anyway, it would be interesting to investigate further.
I'll file an issue for that.

Cheers!

 I look forward to taking a closer look at the code!

 Best,
 Felix
 --
 Dive into the World of Parallel Programming The Go Parallel Website, sponsored
 by Intel and developed in partnership with Slashdot Media, is your hub for all
 things parallel software development, from weekly thought leadership blogs to
 news, videos, case studies, tutorials and more. Take a look and join the
 conversation now. http://goparallel.sourceforge.net/


-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Another GSoC Introduction

2015-03-17 Thread Marco Fossati
Hi Philipp,

Sounds good! Feel free to post your thoughts here and to submit pull 
requests to the repos you want to be involved in.
Cheers,

On 3/17/15 4:31 PM, Philipp Dowling wrote:
 Hey everyone,

 My name is Philipp, I'm from Germany and I'm happy to meet you guys! I
 mostly do work in computational linguistics and NLP, so DBpedia was one
 of the most interesting projects in GSoC this year for me. My main
 strengths and/or interests are continuous space vector models, neural
 networks and information retrieval.

 A little bit about me: I'm just about finished with my undergrad studies
 in Munich, and will start my masters next. Most recently, I was in Hong
 Kong for my B.Sc. thesis, conducting research on semantic MT evaluation.
 I also work at a local startup, building data mining and knowledge
 discovery systems.

 To be more specific about my interests for GSoC: I'm most interested in
 tasks 5.15, 5.1, 5.9 and 5.12 (roughly in that order).
 5.15 especially overlaps a lot with research I've been doing for my
 thesis, where I investigated the performance of continuous space models
 such as Word2Vec as a replacement for discrete context vectors, with
 very positive results. I got very familiar with different vector models
 from this, and would love to now continue working on something like this
 in a knowledge mining context.
 I also got exposed to frame semantics a little in the same context, and
 I'm currently working on knowledge mining, so 5.1 would also be a very
 interesting project.
 I'll come back with more specific questions when I've gotten a chance to
 look at everything else in detail, but overall I'm very excited to start
 getting to work!

 I'll get into some of the warm up tasks as soon as I get a chance. I
 haven't worked with DBpedia much before, so it'll be interesting to dive
 into the code base.

 Cheers,
 Philipp Dowling




Re: [Dbpedia-gsoc] GSOC newbie

2015-03-16 Thread Marco Fossati
Hi Amitrajit,

On 3/14/15 8:12 PM, Amitrajit Sarkar wrote:
 hi Dimitris,

 thanks for the tip. in the meanwhile I shall read up and keep working on
 the warm up tasks, then..

 cheers,
 Amitrajit

 On Sat, Mar 14, 2015 at 8:09 PM, Dimitris Kontokostas jimk...@gmail.com
 wrote:

 Hi Amitrajit & welcome,

 Until the application period we keep all conversations public and we
 discuss any question about the project in this mailing list.
 There is already some discussion on this project if you search the
 ml archives and you are welcome to ask more detailed questions.
 Marco, the mentor for this project, will happily guide you.

 Cheers,
 Dimitris

 On Sat, Mar 14, 2015 at 4:22 PM, Amitrajit Sarkar
 aaiijm...@gmail.com wrote:

 hi..

 my name is Amitrajit. I am a CS undergraduate student from
 Jadavpur University, India. I'm fluent in C, C++, Java, Python,
 and have been working on Natural Language Processing and
 Artificial Intelligence for a while now. but this is my first
 time applying for Google Summer of Code. the ideas: 'fact
 extraction from Wikipedia text' and 'reverse engineering and
 aligning Freebase with DBpedia' caught my attention. I dropped
 an email yesterday introducing myself. since then, I went ahead
 and tried out one of the warmup tasks on dbpedia/fact-extractor.
 I've issued a pull request on GitHub..
Saw that, thanks!
Check out my comments directly on the pull request conversation.

 to the best of my understanding (which may not be much), fact
 extraction would be an unsupervised (or semisupervised)
 dependency parsed pattern interpretation,
Nope, it will be fully supervised, possibly backed by distant 
supervision, as another potential candidate has interestingly pointed out.
Dependency parsing is currently not needed, since it's more a matter of 
chunking/entity linking.
 whereas database
 alignment would be a matter of finding and linking (and
 sometimes creating) common vertices and edges on the knowledge
 graph. but I was hoping I'd be able to talk to someone about the
 projects..

 any help would be welcome. thank you..

 




 --
 Kontokostas Dimitris






Re: [Dbpedia-gsoc] GSOC Introduction

2015-03-16 Thread Marco Fossati
Hi Ankush,

On 3/14/15 10:12 PM, Ankush Jindal wrote:
 Hi Mentors,

 I read about the fact extractor project from the idea page, and I would
 like to work on it.
Sounds good.
The project has received quite a lot of interest, so please first read 
all the ongoing discussions in this mailing list.
 I am Ankush Jindal, pursuing a bachelor's degree in Computer Science at IIT
 Mandi. This winter I worked on Natural Language Processing
 (sentiment analysis and feature extraction for hotel reviews) and I feel
 that I would be a good fit for this project.
 I am a little hesitant because I am late in discussing the
 project, and I am really sorry for that. However, I will make sure to
 work on the warm-up tasks and go through the code as soon as I can. If
 you could suggest some particular warm-up tasks or some other exercise
 that you require from me, given the time frame, I would be more than
 happy to do them.
Everything is here:
https://github.com/dbpedia/fact-extractor/issues
Since lots of people took issue #1, I would suggest focusing on the 
other ones, especially those that have a higher difficulty (check out 
the tags).

Cheers!

 //
 /Ankush Jindal/
 /Student, IIT Mandi, India
 Phone: +91-9805901195/
 /Facebook: @/jindalankush95 http://facebook.com/jindalankush95
 Github: @travis-bickle https://github.com/travis-bickle




Re: [Dbpedia-gsoc] [Dbpedia-discussion] GSOC candidate

2015-03-13 Thread Marco Fossati
Hi John,

Please post your inquiries in the specific mailing list (in CC).
Cheers!

On 3/13/15 4:31 PM, john okorie wrote:
 Hi everyone,
 My name is John Okorie, a student at Regent University College of
 Science and Technology, Ghana. I am pursuing an undergraduate degree in
 computer engineering. I would like to work for this organisation on
 keyword search on DBpedia. I am excited to be here.

 *John Okorie*
 *Google Student Ambassador*
 *Regent University College Of Science And Technology*

 *MyBlog www.chikaokorie.blogspot.com http://www.chikaokorie.blogspot.com*





 ___
 Dbpedia-discussion mailing list
 dbpedia-discuss...@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion




Re: [Dbpedia-gsoc] Introduction - GSoC 2015

2015-03-12 Thread Marco Fossati
Hi Nina,

I see you have already submitted a pull request for the project repo, 
thanks!
I've just merged it. Keep up the good work!
Cheers,

On 3/12/15 2:48 PM, Nina Wan wrote:
 Hi all,

 My name is Nina Wan. I am a first-year graduate student of George
 Washington University. My major is Computer Science, focusing on
 Software Engineering.

 I'm interested in the following project:
 *5.1.* Fact Extraction from Wikipedia Text

 I have about 4 years of experience in Java programming and with several popular
 relational databases. I am a beginner with NLP and open source projects.
 I'm working on the warm-up tasks to get familiar with the technologies
 used in the above project. I hope I can learn from and contribute
 to the GSoC projects of DBpedia.

 Thanks!
 Nina




Re: [Dbpedia-gsoc] 404 on GSOC- 2015 ideas Page

2015-03-11 Thread Marco Fossati
Hi Naveen,

Please check it again now.
Cheers!

On 3/10/15 6:56 PM, Naveen Madhire wrote:
 Hi,

 I am getting a 404 on the GSOC ideas page,

 Please look into this.

 http://wiki.dbpedia.org/gsoc2015/ideas


 Thanks,
 Naveen




[Dbpedia-gsoc] [Fact Extraction from Wikipedia text] Repo + Warm-up Tasks

2015-03-09 Thread Marco Fossati
Hi everyone,

Here is the repo containing the preliminary codebase for the Fact 
Extraction from Wikipedia Text idea 
(http://wiki.dbpedia.org/gsoc2015/ideas#h460-7):
https://github.com/dbpedia/fact-extractor

DISCLAIMER: the code is completely unpolished, it's up to you to review 
and take care of it!

I'm currently populating the 'issues' section of the repo with a list of 
warm-up tasks.

Those potential candidates who have expressed their interest in the idea 
can take one or more tasks, in a first-come first-served fashion.
Cheers!


Re: [Dbpedia-gsoc] GSoC 2015 Fact Extraction from Wikipedia Text

2015-03-09 Thread Marco Fossati
Hi Kenji,

Find my answers inline.

On 3/9/15 10:56 AM, Kenji Yamauchi wrote:
 Hello, everyone.

 Could you answer the following questions to clarify the project Fact
 Extraction from Wikipedia Text, and could you suggest any existing
 warm-up tasks related to this project?
I dropped a mail with the link, repasting it:
https://github.com/dbpedia/fact-extractor
 (I've chosen Relation Extraction as my master thesis topic and I would
 like to combine it with this project, i.e. this project would be the
 practical part of my master thesis, if possible.)

 Firstly, just a confirmation, is it the aim of this project that we
 enrich the existing datasets? For instance, given the thing in [1], do
 we finally add new facts about the thing by using the developed
 framework?
Yes, you got it.

 Secondly, is it fixed that we use FrameNet for the fact extraction?
 Could I use other approaches, such as Distant Supervision[2][3][4], to
 extract facts if they are proper?
I would prefer a Frame-based representation (as I'm confident with it), 
but I am completely open to distant supervision.
So feel free to motivate and expand this, both here and in your proposal.
Can you repaste reference 4, as I don't see it in the list below?
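
For readers unfamiliar with the term, distant supervision can be sketched roughly as follows: any sentence that mentions both arguments of a fact already in the knowledge base is taken as a (noisy) training example for that relation. This is only an illustration under simplifying assumptions; the function name is hypothetical, the data is made up, and matching is done on raw substrings where a real system would rely on entity linking.

```python
def distant_supervision_examples(sentences, known_facts):
    """Label every sentence that mentions both arguments of a known fact.

    `known_facts` is an iterable of (subject, relation, object) triples whose
    subject and object are plain surface strings (an assumption: real systems
    would match linked entities, not raw substrings).
    """
    examples = []
    for sentence in sentences:
        for subject, relation, obj in known_facts:
            if subject in sentence and obj in sentence:
                examples.append((sentence, subject, relation, obj))
    return examples


# Toy, made-up data for illustration only.
facts = [("Player A", "team", "Club B")]
sentences = [
    "Player A joined Club B in the summer.",
    "Player A was born in a small town.",
]
training = distant_supervision_examples(sentences, facts)
```

Only the first sentence ends up in `training`, because the second one mentions just one of the two arguments. The labels are noisy by construction, which is why the approach is usually combined with a supervised model.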

 Finally, if we use FrameNet, could you consider the following 3 points?
 1. As far as I understand, the project description shows only the flow
 of extracting frames from the source text. How should I use the
 frames? Do we directly use the frames as new relations between the
 entities, or do we convert the frames to existing DBpedia RDF
 properties (such as dbpedia-owl:successor) after the extraction?
Good point.
I would prioritize what emerges from the corpus, in a data-driven way, 
and model the frames accordingly.
Whenever possible, we should reuse existing resources like FrameNet, and 
existing ontology properties from DBpedia (or other aligned schemas).
In other words, this means:
A. get some results first
B. model them to fit into DBpedia
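
Steps A and B could look roughly like the sketch below, which is an assumption-laden illustration and not the project's design. An extracted frame is turned into triples, reusing an existing property when a mapping is known and falling back to a placeholder name otherwise. Only dbpedia-owl:successor comes from this thread; the frame name, the frame element names, the `ex:` prefix and the function are all invented for the example.

```python
# Hypothetical lookup from (frame name, frame element) to an ontology property.
# Only dbpedia-owl:successor comes from the thread; the frame and element
# names are invented for this example.
FE_TO_PROPERTY = {
    ("Succession", "Successor"): "dbpedia-owl:successor",
}


def frame_to_triples(frame, mapping=FE_TO_PROPERTY, fallback_prefix="ex:"):
    """Turn one extracted frame into (subject, property, object) triples."""
    triples = []
    for element, value in frame["elements"].items():
        # Step B: reuse an existing property when the mapping knows one.
        prop = mapping.get((frame["name"], element))
        if prop is None:
            # Otherwise keep the data-driven result under a placeholder name.
            prop = f"{fallback_prefix}{frame['name']}_{element}"
        triples.append((frame["subject"], prop, value))
    return triples


# Step A: a made-up extraction result with placeholder entities.
frame = {
    "name": "Succession",
    "subject": "ex:PersonA",
    "elements": {"Successor": "ex:PersonB", "Time": "2015"},
}
triples = frame_to_triples(frame)
```

Keeping the mapping in a separate table means the extraction output stays data-driven, while the alignment to DBpedia can be refined later without touching the extractor.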

 2. On the step Verb extraction and ranking, which verbs are the
 target of ranking?
 All of the verbs in the article, as Peresa said? Or can I propose other 
 targets?
All the domain-specific verbs that emerge from a weighted measure (i.e., 
TF-IDF) against the corpus.

 3. Creating frames seems to depend on each language. Will I develop
 the framework on English articles?
Frames should actually be language-agnostic.
However, we should start from a specific domain (i.e., soccer) in a 
specific language (i.e., English), and see if the learned frames can be 
generalized to all the languages DBpedia extracts.

Cheers!

 Thank you and regards,
 Kenji Yamauchi

 [1] http://www.dbpedia.org/page/It%C5%8D_Hirobumi
 [2] http://nlp.stanford.edu/software/mimlre.shtml
 [3] http://nlp.stanford.edu/pubs/emnlp2012-mimlre.pdf

 -
 Kenji Yamauchi
 Master's course on Kyoto University

 --
 Dive into the World of Parallel Programming The Go Parallel Website, sponsored
 by Intel and developed in partnership with Slashdot Media, is your hub for all
 things parallel software development, from weekly thought leadership blogs to
 news, videos, case studies, tutorials and more. Take a look and join the
 conversation now. http://goparallel.sourceforge.net/
 ___
 Dbpedia-gsoc mailing list
 Dbpedia-gsoc@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc


-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Gsoc '15 participant mailing list introduction

2015-03-09 Thread Marco Fossati
Hi Rishi,

you should focus on one idea, read the references and start expanding it 
for your proposal.
Cheers!

On 3/8/15 6:18 AM, Rishi Mittal wrote:
 I am an M.Tech student at the International Institute of Information
 Technology, Hyderabad (IIIT-H).
 I have found the following projects aligned with my interests.

 5.1 Fact Extraction from Wikipedia Text
 5.2 New Dynamic Extractors from Wikipedia Content with JSONpedia Faceted
 Browsing
 5.9 Keyword Search on DBpedia

 My LinkedIn profile: http://in.linkedin.com/in/rishimittalneo
 Please tell me how to get started.

 Thanks & Regards
 Rishi Mittal
 M.Tech ( IIIT-Hyderabad)




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoc-Student-15

2015-03-09 Thread Marco Fossati
Hi Ayushi,

You should pick one idea and start expanding it.
Cheers!

On 3/8/15 8:15 AM, Ayushi Pandey wrote:
 Dear DBpedia

 I am a Master's student of Computational Linguistics at The EFL
 University, Hyderabad. I'm interested in
 1) Fact Extraction from Wikipedia Text
 2)Keyword Search on DBpedia, as suggested on your ideas webpage.

 I have worked within the area of topic-modelling on an under-resourced
 language using Latent Dirichlet Allocation (LDA) and have also performed
 Information Retrieval for text-based classification using Machine Learning.

 In addition to Python coding based tasks, I would also like to use my
 knowledge from Linguistics to contribute to the projects.

 Thank you,
 Ayushi Pandey
 Masters Student (Sem IV)
 Dept. of  Computational Linguistics
 The EFL- University
 Hyderabad
 500 065





-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC Self Introduction

2015-03-09 Thread Marco Fossati
Hi Kenji,

There are a couple of threads in this mailing list expanding that idea. 
Have a look at them!
Cheers!

On 3/8/15 9:12 AM, Kenji Yamauchi wrote:
 Hello, my name is Kenji Yamauchi.
 I am a master's course's student at Kyoto University and my major is NLP.
 I've joined a research about Semantic Web during the 4th grade on my
 undergraduate course.

 I am interested in the idea Fact Extraction from Wikipedia Text.
 I'll follow existing papers and discussions.

 Thanks and regards

 
 Kenji Yamauchi



-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC 2015

2015-03-09 Thread Marco Fossati
Hi Oleksandr,

There is no ranking for the ideas, so you should pick one and start 
expanding it.
Cheers!

On 3/8/15 10:59 PM, Oleksandr Olgashko wrote:
 Hello,

 I'd like to investigate possibilities to participate in GSoC as part of
 DBpedia organizations. Since I never participated in GSoC before, some
 questions may sound naive.

 My name is Oleksandr Olgashko, I'm a first year master's student in
 Taras Shevchenko National University of Kyiv (Ukraine). Some links about me:
 https://github.com/dveim
 https://www.linkedin.com/in/olgashko
 https://www.coursera.org/user/i/d5878dc26bfe6cbe456d0e119d96e551

 My primary interests are machine learning (particularly, natural
 language processing, what I was doing on previous project) and data
 analysis, also I'm a fan of Scala programming language. DBpedia has most
 natural combination of those skills.

 On your ideas page I've found several interesting projects, like 5.3,
 5.7, 5.14, 5.16, 5.17. Which of them are more relevant, so I can start
 research deeper?

 Thanks for answers






-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC 2015 Introduction

2015-03-09 Thread Marco Fossati





-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Fact extraction from wikipedia text

2015-03-02 Thread Marco Fossati
Hi Emilio,

On 3/2/15 11:40 AM, Emilio Dorigatti wrote:
 Hello,
 I am also interested in working in the project about fact extraction
 from wikipedia text, I would like to ask for some clarifications about
 the machine learning part of it. The core of the project is to train a
 classifier using a training set built following the approaches described
 in the linked papers. As I understood it, the following tasks are
 needed; given a sentence

   1a. Identify all the LUs using NLP techniques;
   1b. Identify all the entities in the sentence which may represent FEs
 using again NLP techniques (ASRL perhaps?)
Entity linking is the way to go.
   2. Use the FrameNet definition for the identified LUs to find the
 required FEs;
FrameNet may be either too specific or too complex for crowdsourcing.
Hence, we should adapt/simplify the frame and FEs definitions accordingly.
   3. Ask the user whether a certain entity fits a certain FE (for all
 entities and FEs);
   4. Understand which is the correct LU based on the meanings given in
 step (3).
The correct LU should be already there, and we want to minimize LU 
ambiguity, i.e., how many frames can be triggered by one LU.
Thus, the selection of LUs via verb ranking will be a VERY important step.
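
To make the ambiguity criterion concrete, here is a small sketch, assuming a lexicon that maps each LU to the set of frames it can trigger. The helper names, the `max_frames` threshold and the frame labels are made up for illustration and are not taken from FrameNet.

```python
def lu_ambiguity(lexicon):
    """Number of frames each LU can trigger."""
    return {lu: len(frames) for lu, frames in lexicon.items()}


def select_lus(ranked_verbs, lexicon, max_frames=1):
    """Keep ranked verbs that trigger between 1 and max_frames frames,
    preserving the ranking order."""
    return [
        verb for verb in ranked_verbs
        if 0 < len(lexicon.get(verb, ())) <= max_frames
    ]


# Toy lexicon with made-up frame labels (not taken from FrameNet).
toy_lexicon = {
    "score": {"Scoring"},
    "play": {"Match", "Performance"},
    "beat": {"Victory", "Hitting"},
}
```

With `max_frames=1` only the unambiguous `score` survives; raising the threshold trades more coverage for more ambiguity.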

 In the linked papers little is mentioned about steps (1a) and (1b) (but
 clarification has already been asked for), step (2) is straightforward
 and step (4) has already been implemented; the classifier is needed for
 step (3). Thus, it has to answer questions such as "can this entity be
 this FE?" or "is this entity this FE in this context?" (the latter being
 a lot harder in my opinion). It is not clear to me, though, which
 features should be used to train this classifier.
Good point.
I already have a baseline including linguistic features other than the 
FEs and frames themselves (that will come as output of the crowdsourced 
annotation).
We should first test it, and then tune the features if needed.

 Frequently, in text classification, there is a one-to-one mapping
 between words and features; in this case FEs have to be used instead of
 words (FrameNet currently recognizes slightly more than 10k FEs). There
 is also a need for features identifying the possible entities, but
 clearly we cannot use the whole DBpedia knowledge base (roughly 4.6
 million entities) for this. I see that FEs belonging to a frame are
 usually of different types, so I think using /classes/ instead of
 /instances/ could be a promising alternative (DBpedia has 685 classes).
+1 for the entity types. This feature is actually implemented as a 
suggestion mechanism in the referenced workshop paper, and we could 
reuse it as an extra feature.
But first we need to focus on something that works, then we can tune.
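
As a rough illustration of the entity-type idea, the sketch below turns the ontology classes of an entity into binary features, so the feature space is bounded by the number of classes rather than the number of instances. The function name and the small lookup table are assumptions made for the example; a real setup would read the types from the DBpedia instance-types data.

```python
def entity_type_features(entity, types_of):
    """Return one binary feature per ontology class of the entity."""
    return {"type=" + cls: 1 for cls in sorted(types_of.get(entity, ()))}


# Toy lookup table (illustrative, not an actual DBpedia dump).
types_of = {
    "dbr:Gianluigi_Buffon": {"dbo:SoccerPlayer", "dbo:Athlete", "dbo:Person"},
    "dbr:Juventus_F.C.": {"dbo:SoccerClub", "dbo:Organisation"},
}

features = entity_type_features("dbr:Juventus_F.C.", types_of)
```

Entities missing from the table simply yield an empty feature dictionary.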
 Probably other features are needed though.

 Sorry for the long wall of text, I tried to express my thoughts in the
 shortest way I could. What do you think?
That's great feedback, please keep it up!
Cheers!

 Emilio.




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Fwd: GSOC_2015 Fact Extraction from Wikipedia Text

2015-03-02 Thread Marco Fossati
Hi Kasun and thanks for the feedback on the project idea!
You can find my answers inline.
Cheers!

On 3/2/15 5:10 AM, kasun perera wrote:

 Forwarding my last email since I didn't get any feedback.
 Thanks

 -- Forwarded message --

 Hi Marco and others

 I would like to work on the GSoC project "Fact Extraction from Wikipedia Text"
 during this summer.

 I went through the project description and the research papers mentioned
 under the description. I have a few questions to clarify.

 1- As mentioned in the project idea the main objective is the
 implementation of a new text extractor. Will this need to be implemented
 inside the current extraction-framework?
Ideally yes.
 Or would it be a completely new
 tool?

 2- Also it mentions the use of NLP techniques to process Wikipedia
 text. Does this mean extraction of dependency relationships to get the
 frame elements (FEs) and lexical units (LUs)?
Dependency parsing may not be needed, since entity linking can be 
applied to fulfill the task.
 There are several NLP
 libraries like Stanford parser, RelEx, NLTK etc. Is there any decision
 made which NLP library to use?
NLTK could be a way to go if we decide to use Python, but there is no 
constraint on libraries.
The ones that serve our purposes are the good ones. :-)

 3- Also regarding the content of a Wikipedia page; do we use all the
 sentences from the Wikipedia page? My idea is it's better if we can use
 important sentences rather than all the sentences. If that is the better
 idea, we have to come up with a criterion to select important sentences.
Good point.
I would first proceed with a domain-specific use case (e.g., soccer) to 
assess the feasibility of the idea. Then, we can generalize.
Hence, we want to extract specific facts from sentences that may trigger 
soccer-related frames.
Verb extraction and ranking (i.e., step A of the idea) would cater for 
this task.
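
A naive sketch of how the ranked verbs could drive sentence selection, assuming a plain list of inflected trigger forms and a simple punctuation-based sentence splitter. A real pipeline would rather use a proper tokenizer and lemmatizer; `candidate_sentences` and the sample text are purely illustrative.

```python
import re


def candidate_sentences(text, trigger_forms):
    """Return the sentences containing at least one trigger verb form."""
    triggers = {form.lower() for form in trigger_forms}
    # Naive split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        sentence for sentence in sentences
        if triggers & set(re.findall(r"[a-z]+", sentence.lower()))
    ]


sample = (
    "Rossi scored twice in the final. He was born in Prato. "
    "Italy defeated Germany."
)
selected = candidate_sentences(sample, ["scored", "defeated"])
```

Only the first and the last sentence of the sample are kept, since the middle one contains no trigger verb.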

Cheers!



 --
 Regards

 Kasun Perera




 --
 Regards

 Kasun Perera


-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



[Dbpedia-gsoc] Fwd: Re: [Dbpedia-discussion] Getting started with DBpedia (GSoC 2015)

2015-03-02 Thread Marco Fossati
Forwarding this to the specific mailing list.
@Alberto, please continue the conversation here.
Cheers!


-------- Forwarded Message --------
Subject: Re: [Dbpedia-discussion] Getting started with DBpedia (GSoC 2015)
Date: Mon, 02 Mar 2015 12:25:37 +0100
From: Marco Fossati hell.j@gmail.com
To: Dimitris Kontokostas jimk...@gmail.com, Alberto Nicoletti 
alby...@gmail.com
CC: dbpedia-discuss...@lists.sourceforge.net 
dbpedia-discuss...@lists.sourceforge.net

Hi Alberto,

have a look at our idea page:
http://wiki.dbpedia.org/gsoc2015/ideas#h460-6

Cheers!

On 2/27/15 9:52 AM, Dimitris Kontokostas wrote:
 Hi Alberto and welcome to DBpedia

 please look at the suggested topics we provided. Then, depending on your
 preferences, we could give you some warm-up tasks related to the topics
 of your interest.

 For everyone, a very good introduction to DBpedia is the following article:
 Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris
 Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey,
 Patrick van Kleef, Sören Auer, Christian Bizer. DBpedia – A Large-scale,
 Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web
 Journal, Vol. 6 No. 2, pp 167–195, 2015.

 Cheers,
 Dimitris

 On Tue, Feb 24, 2015 at 8:44 PM, Alberto Nicoletti alby...@gmail.com wrote:

 Hi everyone,
 I'm a Computer Science student from the University of Bologna, in Italy.
 I'm looking forward to this year's Google Summer of Code and I've
 seen DBpedia was selected in 2013 and 2014 so, hoping it will
 be selected this year too, I'm interested in this organization :)

 I just wanted to ask if you could give me some advice to get me
 started and some documentation I can read to understand your work.

 I noticed there are some issues on GitHub tagged as "GSoC Warmup
 task", so I think they are small issues that can be resolved
 by a newbie like me, am I right?

 I'm also new to open source development in a real organization so if
 there is something I should know, please let me know.

 Thank you very much in advance,
 Alberto Nicoletti

 
 ___
 Dbpedia-discussion mailing list
 dbpedia-discuss...@lists.sourceforge.net
 mailto:dbpedia-discuss...@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion




 --
 Kontokostas Dimitris




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j





Re: [Dbpedia-gsoc] Fine grained massive extraction of Wikipedia content warm up task

2014-04-14 Thread Marco Fossati
Yep, English is an exception (it's the international one).
Cheers!

On 4/11/14, 7:36 PM, Roberto Bampi wrote:
 Ok! Is English an exception to this, or shall I write en.dbpedia.org
 every time the language is en?
 Rob


 On 11 April 2014 18:38, Marco Fossati hell.j@gmail.com wrote:

 Hi Roberto,

 You can go ahead with the next warmup task.
 Can you please take into account the comment I just made [1] on your
 pull request for the next task as well?
 Cheers!

 [1] https://github.com/dbpedia/extraction-framework/pull/199#discussion-diff-11540045


 On 4/11/14, 2:25 PM, Roberto Bampi wrote:

 I pushed the code for pull request #190 [1].
 If this is ok I will proceed with pr #195.
 Roberto

 [1] https://github.com/dbpedia/extraction-framework/pull/199


 On 10 April 2014 17:27, Marco Fossati hell.j@gmail.com wrote:



  On 4/10/14, 5:18 PM, Roberto Bampi wrote:

  Ok, I'm almost done with pull request #190 [1].
  I just have a little question: the URL you provided on the pull
  request returns an empty reply. The only way I managed to make it
  work was to change the URL from [2] to [3].
  Is this correct?

  Yep, that's right, Michele has recently implemented regex-based
  filters and this must be due to the Java regex engine.
  I updated the issue instructions.

  Thanks,
  Roberto

  [1] https://github.com/dbpedia/extraction-framework/issues/190
  [2] http://json.it.dbpedia.org/annotate/resource/json/en%3ARamones?filter=url:jpg&procs=Extractors,Structure
  [3] http://json.it.dbpedia.org/annotate/resource/json/en%3ARamones?filter=url:.*jpg&procs=Extractors,Structure


  On 10 April 2014 14:12, Marco Fossati hell.j@gmail.com wrote:

   Create your own branch of course, the link was meant for the
   location, not the branch.


   On 4/9/14, 3:09 PM, Marco Fossati wrote:

   If no one objects, I think you can put them here:

 
   https://github.com/dbpedia/extraction-framework/tree/master/scripts

   You don't need to run the extraction for that.
   Cheers!

   On 4/8/14, 8:12 PM, Roberto Bampi wrote:

   Where should I put the scripts then ?
   Thanks,
   Rob


   On 8 April 2014 14:24, Marco Fossati
  hell.j@gmail.com

Re: [Dbpedia-gsoc] Fine grained massive extraction of Wikipedia content warm up task

2014-04-11 Thread Marco Fossati
Hi Roberto,

You can go ahead with the next warmup task.
Can you please take into account the comment I just made [1] on your 
pull request for the next task as well?
Cheers!

[1] https://github.com/dbpedia/extraction-framework/pull/199#discussion-diff-11540045

On 4/11/14, 2:25 PM, Roberto Bampi wrote:
 I pushed the code for pull request #190 [1].
 If this is ok I will proceed with pr #195.
 Roberto

 [1] https://github.com/dbpedia/extraction-framework/pull/199


 On 10 April 2014 17:27, Marco Fossati hell.j@gmail.com wrote:



 On 4/10/14, 5:18 PM, Roberto Bampi wrote:

 Ok, I'm almost done with pull request #190 [1].
 I just have a little question: the URL you provided on the pull
 request returns an empty reply. The only way I managed to make it work
 was to change the URL from [2] to [3].
 Is this correct?

 Yep, that's right, Michele has recently implemented regex-based
 filters and this must be due to the Java regex engine.
 I updated the issue instructions.
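
One plausible reading of the empty reply, assuming the filter value is matched against the whole field the way Java's `String.matches()` does, rather than searched for anywhere inside it: a bare `jpg` can then never match a full URL, while `.*jpg` can. The Python sketch below mimics that behaviour with `re.fullmatch`; the helper name and the sample URL are made up, and the actual JSONpedia implementation is not shown in this thread.

```python
import re


def filter_matches(pattern, value):
    """Whole-string matching, analogous to Java's String.matches()."""
    return re.fullmatch(pattern, value) is not None


# Hypothetical field value, for illustration only.
url = "http://example.org/images/Ramones.jpg"

whole_plain = filter_matches("jpg", url)       # pattern must cover the entire URL
whole_prefixed = filter_matches(".*jpg", url)  # ".*" absorbs everything before "jpg"
anywhere = re.search("jpg", url) is not None   # substring-style search
```

Under this assumption only the `.*jpg` form and the substring search succeed, which matches the behaviour reported for URLs [2] and [3].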

 Thanks,
 Roberto

 [1] https://github.com/dbpedia/extraction-framework/issues/190
 [2] http://json.it.dbpedia.org/annotate/resource/json/en%3ARamones?filter=url:jpg&procs=Extractors,Structure
 [3] http://json.it.dbpedia.org/annotate/resource/json/en%3ARamones?filter=url:.*jpg&procs=Extractors,Structure


 On 10 April 2014 14:12, Marco Fossati hell.j@gmail.com wrote:

  Create your own branch of course, the link was meant for the
  location, not the branch.


  On 4/9/14, 3:09 PM, Marco Fossati wrote:

  If no one objects, I think you can put them here:

 
  https://github.com/dbpedia/extraction-framework/tree/master/scripts

  You don't need to run the extraction for that.
  Cheers!

  On 4/8/14, 8:12 PM, Roberto Bampi wrote:

  Where should I put the scripts then ?
  Thanks,
  Rob


  On 8 April 2014 14:24, Marco Fossati hell.j@gmail.com wrote:

   Hi Roberto,

   You should not use the extraction framework at all for those tasks.
   They are made to get acquainted with *JSONpedia*.

   Marco

   On 4/8/14, 11:08 AM, Roberto Bampi wrote:
 Hello everyone,
 I am trying to work on the warmup tasks [1,2] for my proposal but I am
 having a hard time setting up the extraction framework so I can make it
 run.
 I managed to download the dump for the English Wikipedia by following
 the instructions provided on GitHub [3], but when I try to run the
 extraction step it gets stuck complaining that it can't
 find the ontology [4]. My configuration is basically a copy-paste of the
 examples in the dump directory [5].
 Could you please help me out?
 Thanks,
 Rob

 [1] https://github.com/dbpedia/extraction-framework/issues/190

Re: [Dbpedia-gsoc] Fine grained massive extraction of Wikipedia content warm up task

2014-04-10 Thread Marco Fossati
Create your own branch of course, the link was meant for the location, 
not the branch.

On 4/9/14, 3:09 PM, Marco Fossati wrote:
 If no one objects, I think you can put them here:

 https://github.com/dbpedia/extraction-framework/tree/master/scripts

 You don't need to run the extraction for that.
 Cheers!

 On 4/8/14, 8:12 PM, Roberto Bampi wrote:
 Where should I put the scripts then ?
 Thanks,
 Rob


 On 8 April 2014 14:24, Marco Fossati hell.j@gmail.com wrote:

 Hi Roberto,

 You should not use the extraction framework at all for those tasks.
 They are made to get acquainted with *JSONpedia*.

 Marco

 On 4/8/14, 11:08 AM, Roberto Bampi wrote:
   Hello everyone,
   I am trying to work on the warmup tasks [1,2] for my proposal but I am
   having a hard time setting up the extraction framework so I can make it
   run.
   I managed to download the dump for the English Wikipedia by following
   the instructions provided on GitHub [3], but when I try to run the
   extraction step it gets stuck complaining that it can't
   find the ontology [4]. My configuration is basically a copy-paste of the
   examples in the dump directory [5].
   Could you please help me out?
   Thanks,
   Rob
  
   [1] https://github.com/dbpedia/extraction-framework/issues/190
   [2] https://github.com/dbpedia/extraction-framework/issues/190
   [3] https://github.com/dbpedia/extraction-framework/wiki/Extraction-Instructions
   [4] http://pastebin.com/AMWzUKjw
   [5] http://pastebin.com/PsyyrTsk
  
  
  

 --

   Put Bad Developers to Shame
   Dominate Development with Jenkins Continuous Integration
   Continuously Automate Build, Test  Deployment
   Start a new project now. Try Jenkins in the cloud.
   http://p.sf.net/sfu/13600_Cloudbees
  
  
  
   ___
   Dbpedia-gsoc mailing list
   Dbpedia-gsoc@lists.sourceforge.net
 mailto:Dbpedia-gsoc@lists.sourceforge.net
   https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
  

 --
 Marco Fossati
 http://about.me/marco.fossati
 Twitter: @hjfocs
 Skype: hell_j






-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Fine grained massive extraction of Wikipedia content warm up task

2014-04-09 Thread Marco Fossati
If no one objects, I think you can put them here:

https://github.com/dbpedia/extraction-framework/tree/master/scripts

You don't need to run the extraction for that.
Cheers!

On 4/8/14, 8:12 PM, Roberto Bampi wrote:
 Where should I put the scripts then ?
 Thanks,
 Rob


 On 8 April 2014 14:24, Marco Fossati hell.j@gmail.com wrote:

 Hi Roberto,

 You should not use the extraction framework at all for those tasks.
 They are made to get acquainted with *JSONpedia*.

 Marco

 On 4/8/14, 11:08 AM, Roberto Bampi wrote:
   Hello everyone,
   I am trying to work on the warm-up tasks [1,2] for my proposal, but I am
   having a hard time setting up the extraction framework so that I can
   run it.
   I managed to download the dump for the English Wikipedia by following
   the instructions provided on GitHub [3], but when I try to run the
   extraction step it gets stuck, complaining that it can't find the
   ontology [4]. My configuration is basically a copy-paste of the
   examples in the dump directory [5].
   Could you please help me out?
   Thanks,
   Rob

   [1] https://github.com/dbpedia/extraction-framework/issues/190
   [2] https://github.com/dbpedia/extraction-framework/issues/190
   [3] https://github.com/dbpedia/extraction-framework/wiki/Extraction-Instructions
   [4] http://pastebin.com/AMWzUKjw
   [5] http://pastebin.com/PsyyrTsk
  
  
  
 
  

 --
 Marco Fossati
 http://about.me/marco.fossati
 Twitter: @hjfocs
 Skype: hell_j

 



-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC2014 - Natural language question answering engine

2014-03-20 Thread Marco Fossati
3 months, cf. [1].
Cheers!

[1] http://www.google-melange.com/gsoc/events/google/gsoc2014

On 3/19/14, 7:41 PM, Abhijit Pratap Singh Tomar wrote:
 Thanks Marco. Can you tell me how many weeks the project is expected to
 last ?


 On Wed, Mar 19, 2014 at 2:00 PM, Marco Fossati hell.j@gmail.com wrote:

 Hi Abhijit,

 CCing the TOOSO project founders, who can give you more details.
 Cheers!


 On 3/19/14, 5:14 PM, Abhijit Pratap Singh Tomar wrote:

 Hello,

 Could you please tell me more about the TOOSO project and how it fits
 into our project?

 Thanks,

 Abhijit

 (P.S. I realize that I am sending a lot of emails in quick succession,
 but please bear with me. I intend to finish the first draft today and
 submit it to you for review. I apologize for any inconvenience.)


 On Wed, Mar 19, 2014 at 12:29 AM, Dimitris Kontokostas
 jimk...@gmail.com wrote:

  Hi Abhijit


  On Mar 18, 2014 6:00 PM, Abhijit Pratap Singh Tomar
  apt...@nyu.edu wrote:
   
Hi Marco,
   
I am trying to create a first draft of my application. Could you
please give me a few pointers for these questions:

Please describe a tentative project architecture or an approach
to it:

  In this idea, you don't have to create a new architecture from
  scratch. You can reuse the existing ones: DBpedia+Open QA framework.
  The point here is to show that you understand what you are going
  to implement and to mention where your contribution will be.

Please detail an expected project plan and timeline with milestones:

  Break your summer coding time into weekly or biweekly tasks, e.g.
  week 1: do X, week 2: do Y, ...

Please include in your plan how you will evaluate the performance
of your contribution (in terms of time, or accuracy, or both),
as well as which data sets you will use for that evaluation.

  The form is general and does not apply as is to all projects. In
  this case the idea mentors can help you.
  However, note that #1 and #2 are more important.

  Best,
  Dimitris

Thanks,
   
Abhijit
   
   
On Mon, Mar 17, 2014 at 5:18 AM, Marco Fossati
  hell.j@gmail.com wrote:
   
Hi Abhijit,
   
Understanding of graph-structured data, specifically RDF syntax
and SPARQL, is a mandatory requirement for working with us.
I suggest you first read the Wikipedia article about RDF [1],
then dig into the official specs [2, 3, 4].

Cheers!

[1] http://en.wikipedia.org/wiki/Resource_Description_Framework
[2] http://www.w3.org/TR/rdf11-concepts/
[3] http://www.w3.org/TR/sparql11-query/
[4] http://www.w3.org/TR/owl-overview/
   
   
On 3/17/14, 3:54 AM, Abhijit Pratap Singh Tomar wrote:
   
Hi Marco,
   
I was going over the tutorial about SPARQL that you mentioned
in your last email. Could you explain what is meant by RDF? Also, I
seem to run into trouble a few times while trying to execute some
queries. For example, on slide 9, when I try to run a query on
http://sparql.org/sparql.html, nothing happens. Is the data
returned for all queries in JSON format?

Finally, is it essential that I go over the complete tutorial
before making my proposal?
   
Thanks,
   
Abhijit
   
   
On Tue, Mar 11, 2014 at 8:44 AM, Marco Fossati
  hell.j@gmail.com mailto:hell.j@gmail.com
 mailto:hell.j

Re: [Dbpedia-gsoc] GSoC2014 - Natural language question answering engine

2014-03-20 Thread Marco Fossati
This is quite a complex problem to tackle, but I would give high 
priority to it. The TOOSO guys have a formal semantics approach; we 
can get help from them.
QAKiS is a different project. You should investigate how to take the 
best out of several projects, not implement a part of one project 
in another.
Cheers!

On 3/20/14, 3:55 AM, Abhijit Pratap Singh Tomar wrote:
 Hi Marco,

 Could you tell me how essential it is for the project to be
 language-agnostic? Are we going to model this property over QAKiS?

 thanks,

 Abhijit


 On Wed, Mar 19, 2014 at 2:00 PM, Marco Fossati hell.j@gmail.com wrote:

 Hi Abhijit,

 CCing the TOOSO project founders, who can give you more details.
 Cheers!


 On 3/19/14, 5:14 PM, Abhijit Pratap Singh Tomar wrote:

 Hello,

 Could you please tell me more about the TOOSO project and how it fits
 into our project?

 Thanks,

 Abhijit

 (P.S. I realize that I am sending a lot of emails in quick succession,
 but please bear with me. I intend to finish the first draft today and
 submit it to you for review. I apologize for any inconvenience.)


 On Wed, Mar 19, 2014 at 12:29 AM, Dimitris Kontokostas
 jimk...@gmail.com wrote:

  Hi Abhijit


  On Mar 18, 2014 6:00 PM, Abhijit Pratap Singh Tomar
  apt...@nyu.edu wrote:
   
Hi Marco,
   
I am trying to create a first draft of my application. Could you
please give me a few pointers for these questions:

Please describe a tentative project architecture or an approach
to it:

  In this idea, you don't have to create a new architecture from
  scratch. You can reuse the existing ones: DBpedia+Open QA framework.
  The point here is to show that you understand what you are going
  to implement and to mention where your contribution will be.

Please detail an expected project plan and timeline with milestones:

  Break your summer coding time into weekly or biweekly tasks, e.g.
  week 1: do X, week 2: do Y, ...

Please include in your plan how you will evaluate the performance
of your contribution (in terms of time, or accuracy, or both),
as well as which data sets you will use for that evaluation.

  The form is general and does not apply as is to all projects. In
  this case the idea mentors can help you.
  However, note that #1 and #2 are more important.

  Best,
  Dimitris

Thanks,
   
Abhijit
   
   
On Mon, Mar 17, 2014 at 5:18 AM, Marco Fossati
  hell.j@gmail.com wrote:
   
Hi Abhijit,
   
Understanding of graph-structured data, specifically RDF syntax
and SPARQL, is a mandatory requirement for working with us.
I suggest you first read the Wikipedia article about RDF [1],
then dig into the official specs [2, 3, 4].

Cheers!

[1] http://en.wikipedia.org/wiki/Resource_Description_Framework
[2] http://www.w3.org/TR/rdf11-concepts/
[3] http://www.w3.org/TR/sparql11-query/
[4] http://www.w3.org/TR/owl-overview/
   
   
On 3/17/14, 3:54 AM, Abhijit Pratap Singh Tomar wrote:
   
Hi Marco,
   
I was going over the tutorial about SPARQL that you mentioned
in your last email. Could you explain what is meant by RDF? Also, I
seem to run into trouble a few times while trying to execute some
queries. For example, on slide 9, when I try to run a query on
http://sparql.org/sparql.html, nothing happens. Is the data
returned for all queries in JSON format?
   
Finally, is it essential that I go over the complete
 tutorial

Re: [Dbpedia-gsoc] GSoC2014 - Natural language question answering engine

2014-03-19 Thread Marco Fossati
Hi Abhijit,

On 3/19/14, 5:10 PM, Abhijit Pratap Singh Tomar wrote:
 Hi Dimitris, Marco

 How essential is Quepy for this project? Do I need to learn it before
 writing my proposal? Can I put Quepy down as one of the milestones in
 my proposal?
Compared to the other referenced projects, Quepy is more like a toy. It 
basically uses regexes to model natural language, so it's quite naive.
However, it can be useful as a very first milestone of the engine.
If you dig a bit into the library, you will have a clearer idea.
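
To make the regex idea concrete, here is a minimal sketch in Python of how question patterns could be mapped to SPARQL templates. It does not use Quepy's actual API: the pattern list, the helpers `to_resource` and `question_to_sparql`, and the choice of `dbo:`/`dbr:` terms are illustrative assumptions only.

```python
import re

# Hypothetical question patterns; each pairs a regex with a SPARQL template.
# PREFIX declarations for dbo: (ontology) and dbr: (resource) are omitted
# and would have to be added before sending the query to an endpoint.
PATTERNS = [
    (re.compile(r"^who is the (?P<role>author|director) of (?P<title>.+?)\??$", re.I),
     "SELECT ?x WHERE {{ dbr:{title} dbo:{role} ?x }}"),
    (re.compile(r"^where was (?P<name>.+?) born\??$", re.I),
     "SELECT ?x WHERE {{ dbr:{name} dbo:birthPlace ?x }}"),
]


def to_resource(label):
    """Turn a label such as 'Albert Einstein' into a DBpedia-style local name."""
    return "_".join(label.strip().split())


def question_to_sparql(question):
    """Return a SPARQL query for the first matching pattern, or None."""
    for regex, template in PATTERNS:
        match = regex.match(question.strip())
        if match:
            slots = {key: to_resource(value) for key, value in match.groupdict().items()}
            if "role" in slots:
                # The property name must be lowercase regardless of input case.
                slots["role"] = slots["role"].lower()
            return template.format(**slots)
    return None


print(question_to_sparql("Where was Albert Einstein born?"))
# SELECT ?x WHERE { dbr:Albert_Einstein dbo:birthPlace ?x }
```

The sketch also shows why such an approach is naive: any phrasing that is not listed returns `None`, and labels containing characters that are not valid in a prefixed name would need escaping.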

 Kindly respond at your earliest convenience.

 Thanks,

 Abhijit


 On Wed, Mar 19, 2014 at 12:29 AM, Dimitris Kontokostas
 jimk...@gmail.com wrote:

 Hi Abhijit


 On Mar 18, 2014 6:00 PM, Abhijit Pratap Singh Tomar
 apt...@nyu.edu wrote:
  
   Hi Marco,
  
   I am trying to create a first draft of my application. Could you
   please give me a few pointers for these questions:

   Please describe a tentative project architecture or an approach
   to it:

 In this idea, you don't have to create a new architecture from
 scratch. You can reuse the existing ones: DBpedia+Open QA framework.
 The point here is to show that you understand what you are going
 to implement and to mention where your contribution will be.

   Please detail an expected project plan and timeline with milestones:

 Break your summer coding time into weekly or biweekly tasks, e.g.
 week 1: do X, week 2: do Y, ...

   Please include in your plan how you will evaluate the performance
   of your contribution (in terms of time, or accuracy, or both),
   as well as which data sets you will use for that evaluation.

 The form is general and does not apply as is to all projects. In
 this case the idea mentors can help you.
 However, note that #1 and #2 are more important.

 Best,
 Dimitris

   Thanks,
  
   Abhijit
  
  
   On Mon, Mar 17, 2014 at 5:18 AM, Marco Fossati
 hell.j@gmail.com wrote:
  
   Hi Abhijit,
  
   Understanding of graph-structured data, specifically RDF syntax
   and SPARQL, is a mandatory requirement for working with us.
   I suggest you first read the Wikipedia article about RDF [1],
   then dig into the official specs [2, 3, 4].
  
   Cheers!
  
   [1] http://en.wikipedia.org/wiki/Resource_Description_Framework
   [2] http://www.w3.org/TR/rdf11-concepts/
   [3] http://www.w3.org/TR/sparql11-query/
   [4] http://www.w3.org/TR/owl-overview/
  
  
   On 3/17/14, 3:54 AM, Abhijit Pratap Singh Tomar wrote:
  
   Hi Marco,
  
   I was going over the tutorial about SPARQL that you mentioned
   in your last email. Could you explain what is meant by RDF? Also, I
   seem to run into trouble a few times while trying to execute some
   queries. For example, on slide 9, when I try to run a query on
   http://sparql.org/sparql.html, nothing happens. Is the data
   returned for all queries in JSON format?

   Finally, is it essential that I go over the complete tutorial
   before making my proposal?
  
   Thanks,
  
   Abhijit
  
  
   On Tue, Mar 11, 2014 at 8:44 AM, Marco Fossati
 hell.j@gmail.com wrote:
  
   Hi Abhijit (in CC the mailing list for transparency),
  
  
  
  
   On 3/9/14, 11:54 PM, Abhijit Pratap Singh Tomar wrote:
  
   Hi Marco, how are you ?
  
   Sorry about not responding sooner (schoolwork :( ). I have been
   looking at the documentation of the various modules on

   http://okbqa.org/

   I intended to take a look at some queries on Google Drive, but I
   think I might need permission from you guys.

   @Christina, could you take care of this please?

   Anyway, could you tell me a bit more about SPARQL? Is it a special
   query processing tool? What level of competency is required in
   SPARQL if I want to make a meaningful contribution to the project?
   I hope I am making sense here!

   SPARQL knowledge is vital for this project, since you must be able
   to map natural language to SPARQL queries.
   Have a look at this tutorial [1].


   Finally, could you give me an idea as to what the degree of my
   contribution is going to be in this project, if I get selected?
  
   You will be an official committer of the DBpedia codebase.
 Also, if
   the project has some interesting

Re: [Dbpedia-gsoc] GSoC2014 - Natural language question answering engine

2014-03-19 Thread Marco Fossati
Hi Abhijit,

CCing the TOOSO project founders, who can give you more details.
Cheers!

On 3/19/14, 5:14 PM, Abhijit Pratap Singh Tomar wrote:
 Hello,

 Could you please tell me more about the TOOSO project and how it fits
 into our project ?

 Thanks,

 Abhijit

 (P.S. I realize that I am sending a lot of emails in quick succession,
 but please bear with me. I intend to finish the first draft today and
 submit it to you for review. I apologize for any inconvenience.)


 On Wed, Mar 19, 2014 at 12:29 AM, Dimitris Kontokostas
 jimk...@gmail.com wrote:

 Hi Abhijit


 On Mar 18, 2014 6:00 PM, Abhijit Pratap Singh Tomar
 apt...@nyu.edu wrote:
  
   Hi Marco,
  
   I am trying to create a first draft of my application. Could you
   please give me a few pointers for these questions:

   Please describe a tentative project architecture or an approach
   to it:

 In this idea, you don't have to create a new architecture from
 scratch. You can reuse the existing ones: DBpedia+Open QA framework.
 The point here is to show that you understand what you are going
 to implement and to mention where your contribution will be.

   Please detail an expected project plan and timeline with milestones:

 Break your summer coding time into weekly or biweekly tasks, e.g.
 week 1: do X, week 2: do Y, ...

   Please include in your plan how you will evaluate the performance
   of your contribution (in terms of time, or accuracy, or both),
   as well as which data sets you will use for that evaluation.

 The form is general and does not apply as is to all projects. In
 this case the idea mentors can help you.
 However, note that #1 and #2 are more important.

 Best,
 Dimitris

   Thanks,
  
   Abhijit
  
  
   On Mon, Mar 17, 2014 at 5:18 AM, Marco Fossati
 hell.j@gmail.com wrote:
  
   Hi Abhijit,
  
   Understanding of graph-structured data, specifically RDF syntax
   and SPARQL, is a mandatory requirement for working with us.
   I suggest you first read the Wikipedia article about RDF [1],
   then dig into the official specs [2, 3, 4].
  
   Cheers!
  
   [1] http://en.wikipedia.org/wiki/Resource_Description_Framework
   [2] http://www.w3.org/TR/rdf11-concepts/
   [3] http://www.w3.org/TR/sparql11-query/
   [4] http://www.w3.org/TR/owl-overview/
  
  
   On 3/17/14, 3:54 AM, Abhijit Pratap Singh Tomar wrote:
  
   Hi Marco,
  
   I was going over the tutorial about SPARQL that you mentioned
   in your last email. Could you explain what is meant by RDF? Also, I
   seem to run into trouble a few times while trying to execute some
   queries. For example, on slide 9, when I try to run a query on
   http://sparql.org/sparql.html, nothing happens. Is the data
   returned for all queries in JSON format?

   Finally, is it essential that I go over the complete tutorial
   before making my proposal?
  
   Thanks,
  
   Abhijit
  
  
   On Tue, Mar 11, 2014 at 8:44 AM, Marco Fossati
 hell.j@gmail.com wrote:
  
   Hi Abhijit (in CC the mailing list for transparency),
  
  
  
  
   On 3/9/14, 11:54 PM, Abhijit Pratap Singh Tomar wrote:
  
   Hi Marco, how are you ?
  
   Sorry about not responding sooner (schoolwork :( ). I have been
   looking at the documentation of the various modules on

   http://okbqa.org/

   I intended to take a look at some queries on Google Drive, but I
   think I might need permission from you guys.

   @Christina, could you take care of this please?

   Anyway, could you tell me a bit more about SPARQL? Is it a special
   query processing tool? What level of competency is required in
   SPARQL if I want to make a meaningful contribution to the project?
   I hope I am making sense here!

   SPARQL knowledge is vital for this project, since you must be able
   to map natural language to SPARQL queries.
   Have a look at this tutorial [1].


   Finally, could you give me an idea as to what the degree of my
   contribution is going to be in this project, if I get selected?

   You will be an official committer of the DBpedia codebase. Also, if
   the project has some interesting outcome from a research
   perspective, we may follow up for a publication.
  
   Cheers

Re: [Dbpedia-gsoc] Gsoc2014(Automated Wikidata mappings to DBpedia ontology)

2014-03-19 Thread Marco Fossati
Hi Nitesh,

Have a look at the following threads [1] for more details.
Cheers!

[1] http://sourceforge.net/p/dbpedia/mailman/message/32095225/

On 3/18/14, 5:54 PM, Dimitris Kontokostas wrote:
 Hi Nitesh and welcome to our community,

 We have discussed this project with some students already.
 You can look for related discussions and warm-up tasks in our mailing
 list archive [1].

 What you should definitely do is explore the DBpedia mappings wiki [2]
 and how Wikidata properties are used.
 A good overview of DBpedia can be found in our latest publication [3].

 Best,
 Dimitris

 [1] http://sourceforge.net/p/dbpedia/mailman/search/?q=wikidata&mail_list=dbpedia-gsoc&limit=100
 [2] http://mappings.dbpedia.org
 [3] http://dbpedia.org/publications


 On Tue, Mar 18, 2014 at 1:44 PM, Nitesh Garg garg93nit...@gmail.com wrote:

 Hi,

 I am a pre-final-year undergraduate student at The LNM Institute of
 Information Technology, Jaipur, where I have been pursuing my
 B.Tech. (Hons.) in Computer Science. I am really interested in doing
 the project titled "Automated Wikidata mappings to DBpedia
 ontology". I have been studying this topic for a few days. In one
 of my projects I successfully implemented a Content Based Mailing
 Classifier: with this design we can classify a user's Gmail inbox
 in a better way, and the mail can be further classified into more
 categories such as spam, non-spam, private, public, professional, etc.
 For this purpose I used the Naive Bayes classifier algorithm.

 I am currently facing some problems writing the proposal for the
 project, so it would help me if you could give me some more references
 or samples to read about how the project should proceed step by step.
 Please also suggest some warm-up tasks so that I can get a feel for
 the code as soon as possible.

 Thanks

 Nitesh


 




 --
 Kontokostas Dimitris




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC2014 - Natural language question answering engine

2014-03-17 Thread Marco Fossati
Hi Abhijit,

Understanding of graph-structured data, specifically RDF syntax and 
SPARQL, is a mandatory requirement for working with us.
I suggest you first read the Wikipedia article about RDF [1], then 
dig into the official specs [2, 3, 4].

Cheers!

[1] http://en.wikipedia.org/wiki/Resource_Description_Framework
[2] http://www.w3.org/TR/rdf11-concepts/
[3] http://www.w3.org/TR/sparql11-query/
[4] http://www.w3.org/TR/owl-overview/
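
As a rough illustration of what "graph-structured data" means here, the sketch below (Python, with made-up toy data) stores RDF-style statements as (subject, predicate, object) triples and matches a single triple pattern against them, which is the basic building block of a SPARQL `WHERE` clause. The `match_pattern` helper and the four triples are assumptions for illustration; a real setup would use an RDF library or a SPARQL endpoint.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# Toy data using DBpedia-style prefixed names.
TRIPLES = [
    ("dbr:Berlin", "rdf:type", "dbo:City"),
    ("dbr:Berlin", "dbo:country", "dbr:Germany"),
    ("dbr:Paris", "rdf:type", "dbo:City"),
    ("dbr:Paris", "dbo:country", "dbr:France"),
]


def match_pattern(pattern, triples):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


# Roughly: SELECT ?city WHERE { ?city dbo:country dbr:Germany }
print(list(match_pattern(("?city", "dbo:country", "dbr:Germany"), TRIPLES)))
# [{'?city': 'dbr:Berlin'}]
```

A full SPARQL query joins several such patterns on their shared variables; the specs in [2] and [3] define the exact semantics.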

On 3/17/14, 3:54 AM, Abhijit Pratap Singh Tomar wrote:
 Hi Marco,

 I was going over the tutorial about SPARQL that you mentioned in your
 last email. Could you explain what is meant by RDF ? Also, I seem to run
 into trouble a few times while trying to execute some queries. For
 example, on slide 9, when I try to run a query on
 http://sparql.org/sparql.html, nothing happens. Is the data returned for
 all queries in JSON format?

 Finally, is it essential that I go over the complete tutorial before
 making my proposal ?

 Thanks,

 Abhijit


 On Tue, Mar 11, 2014 at 8:44 AM, Marco Fossati hell.j@gmail.com wrote:

 Hi Abhijit (in CC the mailing list for transparency),




 On 3/9/14, 11:54 PM, Abhijit Pratap Singh Tomar wrote:

 Hi Marco, how are you ?

 Sorry about not responding sooner (schoolwork :( ). I have been
 looking at the documentation of the various modules on

 http://okbqa.org/

 I intended to take a look at some queries on Google Drive, but I
 think I might need permission from you guys.

 @Christina, could you take care of this please?

 Anyway, could you tell me a bit more about SPARQL? Is it a special
 query processing tool? What level of competency is required in
 SPARQL if I want to make a meaningful contribution to the project?
 I hope I am making sense here!

 SPARQL knowledge is vital for this project, since you must be able
 to map natural language to SPARQL queries.
 Have a look at this tutorial [1].


 Finally, could you give me an idea as to what the degree of my
 contribution is going to be in this project, if I get selected?

 You will be an official committer of the DBpedia codebase. Also, if
 the project has some interesting outcome from a research
 perspective, we may follow-up for a publication.

 Cheers!

 [1] https://www.cambridgesemantics.com/semantic-university/sparql-by-example


 Thanks,

 Abhijit


 On Thu, Mar 6, 2014 at 4:47 AM, Marco Fossati
 hell.j@gmail.com wrote:

  Hi Abhijit and welcome on board!

  Have a look at the following threads [1, 2] for more detailed
  discussions about the project.
  Don't hesitate to come back to us once you get a clearer view.
  Cheers!

  [1] http://sourceforge.net/p/dbpedia/mailman/message/32024743/
  [2] http://sourceforge.net/p/dbpedia/mailman/message/32026164/


  On 3/5/14, 11:12 PM, Abhijit Pratap Singh Tomar wrote:
Hi,
   
My name is Abhijit. I am a Computer Science graduate student at
NYU. I am really interested in knowing more about the project
"Natural language question answering engine".

I studied Natural Language Processing last semester. I
created an Email Summarization Tool and later extended it to General
Text Summarization. This semester I am taking two courses: Machine
Learning and Computational Geometry. I would like to implement those
techniques, if needed.

Below are links to my work on Natural Language Processing on my
GitHub profile.

https://github.com/abtpst/Email-Thread-Summarization

https://github.com/abtpst/Text-Summarization

Kindly provide me with more information about this project, and please
let me know if you need some more information on my background.
   
   
Thanks,
   
Abhijit

Re: [Dbpedia-gsoc] Pattern Discovery and Knowledge Base Completion

2014-03-14 Thread Marco Fossati
Hi Joey and welcome on board!

Take a look at the following threads to dig deeper into the idea [1, 2, 3].
Cheers!

[1] http://sourceforge.net/p/dbpedia/mailman/message/32037592/
[2] http://sourceforge.net/p/dbpedia/mailman/message/32029333/
[3] http://sourceforge.net/p/dbpedia/mailman/message/32083999/
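
For readers new to the Association Rule Mining approach raised in the quoted message, here is a minimal sketch in Python of how support and confidence could be computed over the properties of entities. The toy `ENTITIES` data, the property names, and the thresholds are assumptions for illustration only, not part of the project idea.

```python
from itertools import permutations

# Toy data: each entity is described by the set of properties it has.
ENTITIES = {
    "e1": {"birthPlace", "birthDate", "nationality"},
    "e2": {"birthPlace", "nationality"},
    "e3": {"birthPlace", "birthDate"},
    "e4": {"birthDate"},
}


def mine_rules(entities, min_support=0.5, min_confidence=0.6):
    """Return rules (a, b, support, confidence) meaning 'has a => has b'."""
    total = len(entities)
    properties = sorted(set().union(*entities.values()))
    rules = []
    for a, b in permutations(properties, 2):
        both = sum(1 for props in entities.values() if a in props and b in props)
        has_a = sum(1 for props in entities.values() if a in props)
        support = both / total                       # share of entities with a and b
        confidence = both / has_a if has_a else 0.0  # share of a-entities that also have b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


for rule in mine_rules(ENTITIES):
    print(rule)
```

A rule such as "nationality => birthPlace" with high confidence suggests that an entity with a nationality but no birth place may be missing a fact. Enumerating all pairs is quadratic in the number of properties, which is one reason the approach can become computation-intensive on a full knowledge base.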

On 3/13/14, 6:49 PM, joey.zh...@gmail.com wrote:
 Hi all,
 My name is Gensheng Zhang (or simply, Joey). I am a second year Computer
 Science Ph.D student at UT Arlington, and interested in the Knowledge
 Base Completion project. My current research interests are also
 knowledge base related. I am very comfortable with coding as well, given
 that I had worked as a software engineer for a few years before pursuing
 my Ph.D. degree.
 The listed idea of “Pattern Discovery and Knowledge Base Completion”
 seems a natural fit for Association Rule Mining methods, although the
 methods may be too computation-intensive. Could you please tell me if
 there are any specific requirements for applying to this project? I have
 very limited knowledge of Scala, but I am more than willing to learn it.
 Joey




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] The First Step for GSOC 2014: About QA Engine Project

2014-03-14 Thread Marco Fossati
Hi Wencan and welcome to our community!

QA is quite a hot topic. Take a look at the following threads [1, 2].

On 3/14/14, 2:36 AM, wencan luo wrote:
 This is Wencan Luo. A third-year PhD student in University of
 Pittsburgh. My research interest is Natural Language Processing (NLP).

 I'm interested in the idea "Natural language question answering engine"
 (http://wiki.dbpedia.org/gsoc2014/ideas#h359-22). I know it is a
 difficult project, since a general QA system is really hard. However, if it
 is a success, it will be another deep QA system like Watson and will
 beat Google to some extent. I believe my NLP background, strong
 programming skills, and great interest in the DBpedia project can move
 this project a step further.

 Luckily, QA was the final project in my NLP course (CS2731, Fall 2011), and
 the QA system developed by our team (with two other students) achieved the
 best performance among all 7 teams in that class, based on a hybrid
 model that used different models for different types of questions (why,
 who, what, how, when).

 I also noticed that other applicants are interested in this
 project. Unfortunately, I missed the first Skype chat mentioned
 on the mailing list. Is it possible to share the discussion results with
 us, so that applicants who are interested in this project can also benefit
 from the discussion?
Take a look at this thread [3] and don't worry, if it's not on the 
mailing list, it doesn't exist.

 One additional question for this project: is there any training
 data for this problem (i.e. data including both the given query and the
 corresponding answer)? If so, how large is the data set?
@Axel, @Christina and @Elena can give you more details about that.

Cheers!

[1] http://sourceforge.net/p/dbpedia/mailman/message/32024743/
[2] http://sourceforge.net/p/dbpedia/mailman/message/32026164/
[3] http://sourceforge.net/p/dbpedia/mailman/message/32073626/

 Thanks.
 I hope this inspiring project will be a success.

 --
 Wencan Luo
 CS Department- Univ. of Pittsburgh
 210 S. Bouquet Street
 6501 Sennott Square
 Pittsburgh, PA 15260
 E-mail: wencanluo...@gmail.com or wen...@cs.pitt.edu




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] QA engine

2014-03-13 Thread Marco Fossati
Hi Brij and welcome to our community!

Have a look at the following threads [1, 2] for more detailed 
discussions about the project.
Since research in this area is very active, THE warm-up task is to 
maximize code porting from existing projects.
Cheers!

[1] http://sourceforge.net/p/dbpedia/mailman/message/32024743/
[2] http://sourceforge.net/p/dbpedia/mailman/message/32026164/

On 3/12/14, 9:00 PM, Brij Mohan Lal Srivastava wrote:
 Hi,

 This is Brij Mohan. I study at IIIT-Hyderabad, India. I am interested in
 working on the Natural Language question answering engine project
 proposed by DBpedia.

 I have 2+ years of industry experience and have worked on various IBM
 projects at Avnet Inc. I have a strong interest in NLP and Machine
 Learning, so I am pursuing my Masters by Research at IIIT. My
 research project is Dialog Systems in Speech.

 I have read the following papers related to NLIDB:
 1. http://www.iiit.ac.in/~sangal/files/papers/PID2519697.pdf
 2. http://www.aclweb.org/anthology/I/I13/I13-1173.pdf

 These papers offer deep insights into solving NLIDB problems and will
 definitely be of great help in the given DBpedia project.

 Please find my CV at
 https://www.dropbox.com/s/q0jfwtpfr9hqsz3/Brij-CV-IIIT.pdf. Or check my
 profile at http://about.me/brijmohan , http://lnkd.in/GBCuQs

 I like to work on Java / C++ / Python and am comfortable with any OOP
 language including JS (related frameworks like Node.js, etc).

 I am preparing my project proposal and will be submitting it soon.
 Please guide me on getting started with the warm-up exercises so that I
 can gain some insight into the project.

 Nice to have all of your support.

 Thanks & Regards
 --
 Brij Mohan Lal Srivastava
 M.S. by Research (CSE)
 IIIT-H, Gachibowli
 Hyderabad
 +91 77997 28715
 brijmohanla...@research.iiit.ac.in





-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC introduction

2014-03-13 Thread Marco Fossati
Hi Kenji and welcome to our community!

Take a look at this thread [1] and come back to us with more ideas and 
questions.
Cheers!

[1] http://sourceforge.net/p/dbpedia/mailman/message/32095225/

On 3/13/14, 10:05 AM, 山内健二 wrote:
 Dear all,

 My name is Kenji Yamauchi.
 I have learned the basics of Natural Language Processing and Machine
 Learning in my graduate course and would like to write my master's thesis
 on Natural Language Processing.
 I wrote some small Python scripts (available at
 https://github.com/ivstivs/wikipedia_searcher) that use SPARQL endpoints
 to query DBpedia for a paper last semester.

 I'm interested in the idea of automated Wikidata mappings to the DBpedia
 ontology (http://wiki.dbpedia.org/gsoc2014/ideas#h359-11), so I would like
 to learn more about the mappings beyond what the idea description covers.
 Could you tell me where I can find more information?
 From the idea description, I understand that the process is currently
 performed manually. Is that correct?

 Best regards.

 
 Kenji Yamauchi
 Kyoto University



-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC 14 - Question Answering [also an introduction]

2014-03-13 Thread Marco Fossati
Hi Yogarshi and welcome to our community!

Have a look at the following threads [1, 2] for more detailed 
discussions about the project.
Since research in this area is very active, THE warm-up task is to 
maximize code porting from existing projects.
Cheers!

[1] http://sourceforge.net/p/dbpedia/mailman/message/32024743/
[2] http://sourceforge.net/p/dbpedia/mailman/message/32026164/

On 3/13/14, 11:16 AM, Yogarshi Vyas wrote:
 Hi all,

 My name is Yogarshi Vyas. I'm a final year undergrad at the
 Indian Institute of Technology, Kharagpur. I'll be finishing my
 undergraduate education in a couple of months, after which I'm slated to
 move to grad-school for a PhD in NLP/ML in September - possibly at the
 University of Maryland, CP. I thought the intermediate period of summer
 months would be a good time to do some nice hands on work for NLP
 systems, and I'm interested in working with DBpedia in these months - if
 possible, through the GSoC 14 program.

  I have a fair amount of experience with NLP/IR techniques, both
 theoretical and practical. I've worked in the past on query
 segmentation, POS tagging and other related areas. The task of building
 and improving a DBpedia question-answering system seems a very
 interesting problem to take up and I'd like to talk to the people
 involved to get their perspective on the kind of work that they are
 looking to do for this. I've mostly worked on (and am most comfortable
 using) Python, but can also use Java to an extent.

   I realize this might be slightly late for a GSoC proposal, but
 nevertheless, I'm still interested in working/applying.

 Thanks.




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] New idea added

2014-03-12 Thread Marco Fossati
Hi Jim,

On 3/11/14, 7:04 PM, Jim O'Regan wrote:
 On 11 March 2014 10:55, Marco Fossati hell.j@gmail.com wrote:
 Hi everyone,

 I've just added a new idea on our ideas page. Check this out!

 http://wiki.dbpedia.org/gsoc2014/ideas?v=kce#h359-23

 Instead of just Python, why not JSR 223? Also, Ace (http://ace.c9.io/)
 already exists, so I don't see any point in making a student work on
 syntax highlighting.
I agree, these are a couple of basic features that can be ported from 
existing projects, as you pointed out.
I like Ace and will update the idea with your links, thanks!

Cheers!
 Otherwise, I quite like the idea. It could also
 be useful as a general framework for validating data (if there's an
 imdbId, verify date of birth, etc.)


-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



[Dbpedia-gsoc] New idea added

2014-03-11 Thread Marco Fossati
Hi everyone,

I've just added a new idea on our ideas page. Check this out!

http://wiki.dbpedia.org/gsoc2014/ideas?v=kce#h359-23

Cheers,
-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSoC2014 - Natural language question answering engine

2014-03-11 Thread Marco Fossati
Hi Abhijit (in CC the mailing list for transparency),



On 3/9/14, 11:54 PM, Abhijit Pratap Singh Tomar wrote:
 Hi Marco, how are you ?

 Sorry about not responding sooner (schoolwork :( ). I have been looking
 at the documentation of the various modules on

 http://okbqa.org/

 I intended to take a look at some queries on Google Drive, but I think I
 might need permission from you guys.
@Christina, could you take care of this please?
 Anyway, could you tell me a bit
 more about SPARQL? Is it a special query processing tool? What level
 of competency in SPARQL is required if I want to make a meaningful
 contribution to the project? I hope I am making sense here!
SPARQL knowledge is vital for this project, since you must be able to 
map natural language to SPARQL queries.
Have a look at this tutorial [1].
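
To make the "natural language to SPARQL" mapping concrete, here is a minimal sketch of the simplest case: a question that has already been reduced to one entity and one property. The helper name `build_query` and the hand-picked question are assumptions for illustration only; the sketch does not show how the project actually parses questions.

```python
# Hypothetical helper: fills a fixed SPARQL template once a question has been
# reduced to a DBpedia resource and an ontology property.
PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX dbr: <http://dbpedia.org/resource/>\n"
)


def build_query(entity: str, prop: str) -> str:
    """Return a SELECT query asking for the values of `prop` on `entity`."""
    return (
        PREFIXES
        + "SELECT ?answer WHERE {\n"
        + f"  dbr:{entity} dbo:{prop} ?answer .\n"
        + "}"
    )


# "Where was Albert Einstein born?" with entity and property chosen by hand.
query = build_query("Albert_Einstein", "birthPlace")
print(query)
```

The hard part of the project is choosing the entity and property automatically; the template step shown here is the easy end of the pipeline.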

 Finally, could you give me an idea of what the extent of my
 contribution would be in this project, if I get selected?
You will be an official committer of the DBpedia codebase. Also, if the
project has some interesting outcome from a research perspective, we may
follow up with a publication.

Cheers!

[1] https://www.cambridgesemantics.com/semantic-university/sparql-by-example

 Thanks,

 Abhijit


 On Thu, Mar 6, 2014 at 4:47 AM, Marco Fossati hell.j@gmail.com wrote:

 Hi Abhijit and welcome on board!

 Have a look at the following threads [1, 2] for more detailed
 discussions about the project.
 Don't hesitate to come back to us once you get a clearer view.
 Cheers!

 [1] http://sourceforge.net/p/dbpedia/mailman/message/32024743/
 [2] http://sourceforge.net/p/dbpedia/mailman/message/32026164/


 On 3/5/14, 11:12 PM, Abhijit Pratap Singh Tomar wrote:
   Hi,
  
   My name is Abhijit. I am a Computer Science graduate student at NYU. I
   am really interested in knowing more about the project Natural language
   question answering engine.
  
   I have studied Natural Language Processing in my last semester. I
   created an Email Summarization Tool and later Extended it to General
   Text Summarization. This semester I am taking two courses; Machine
   Learning and Computational Geometry. I would like to implement those
   techniques, if needed.
  
   Below is a link to my work on Natural Language Processing on my github
   profile.
  
   https://github.com/abtpst/Email-Thread-Summarization
  
   https://github.com/abtpst/Text-Summarization
  
   Kindly provide me with more information about this project and please
   let me know if you need some more information on my background.
  
  
   Thanks,
  
   Abhijit
  
  
  
  
 
  

 --
 Marco Fossati
 http://about.me/marco.fossati
 Twitter: @hjfocs
 Skype: hell_j

 



-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j


Re: [Dbpedia-gsoc] Gsoc 2014 consistency check and pattern discovery

2014-03-11 Thread Marco Fossati
Hi Cong,

On 3/11/14, 10:09 AM, Magnus Knuth wrote:

 On 09.03.2014 at 22:06, Cong Wang wrote:

 Hi Magnus, Heiko and Marco,

 Hello Cong Wang,
 and welcome to DBpedia community.

 My name is Cong Wang, a Ph.D. student at Wright State University, working with
 Pascal Hitzler. I am very interested in two of the GSoC 2014 topics,
 ontology consistency checking and pattern discovery. I find the two topics
 closely related, so I am writing this email to all of you.

 I have read the related papers. Pattern discovery and type inference are both
 about ontology enrichment (in a maximum-likelihood manner) [1,2]. When an
 ontology is enriched automatically, the enrichment itself can introduce
 inconsistencies. On the other hand, enrichment also helps to detect
 potential inconsistencies. Since the ontology is too big to load into any
 reasoner, the idea is to remove inconsistencies based on suggestions [3].

 Now, I have an idea based on your methods. First, design a set of patterns
 that are themselves inconsistent (the patterns must be highly frequent), then
 use some statistical method to extract axioms of these types. If axioms are
 extracted with high probability, we may argue that there are potential
 conflicts. What do you think of this idea?

 Could you provide an example? How is a pattern inconsistent? What will be
 done to axioms that are potentially conflicting?

 I think the first and main issue of this task is to develop a system to find
 such axioms. These need to be evaluated, and the removal of potentially
 incorrect axioms will be a later step.

 I also have one question. In [4], I don't understand how the Airpedia 
 project relates to consistency checking issues. Can you give me some more 
 details?

 Afaik Airpedia generates extraction mappings for multiple languages, which is 
 a step prior to extraction, so there might be errors in the created mappings, 
 but no inconsistencies. We might get inconsistencies in the data extracted 
 afterwards. But that should be answered finally by @Marco, since I am not 
 involved in this project.
The referred Airpedia dataset contains automatically generated instance 
type statements for different language chapters.
By running some basic statistics over such dataset (e.g. absolute 
frequency), we can obtain a ranked list of actually used ontology classes.
This is also complementary to more recent work by @Heiko [1].
I use the term 'consistency' in a rather shallow way. What we want to check
here is the real-world usage of the DBpedia ontology.
Sorry if the term is open to misinterpretation.
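
As a rough illustration of the "basic statistics" mentioned above, the sketch below counts how often each class appears as the object of an rdf:type statement and ranks the classes by absolute frequency. The function name `rank_classes` and the sample triples are made up for illustration; the real Airpedia dataset format is not shown in this thread.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rank_classes(triples):
    """Rank ontology classes by how many rdf:type statements use them."""
    counts = Counter(obj for _, pred, obj in triples if pred == RDF_TYPE)
    # most_common() returns (class, frequency) pairs, most frequent first.
    return counts.most_common()


# Illustrative sample only, not real Airpedia data.
sample = [
    ("ex:a", RDF_TYPE, "dbo:Person"),
    ("ex:b", RDF_TYPE, "dbo:Person"),
    ("ex:c", RDF_TYPE, "dbo:Place"),
    ("ex:a", "dbo:birthPlace", "ex:c"),
]
print(rank_classes(sample))  # [('dbo:Person', 2), ('dbo:Place', 1)]
```

Classes that never show up in such a ranking are candidates for a closer look at whether the ontology matches real-world usage.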

Cheers!

[1] http://www.heikopaulheim.com/documents/iswc2013.pdf

 Best regards
 Magnus

 Thanks.

 [1]. Lorenz Bühmann, Jens Lehmann. Pattern Based Knowledge Base Enrichment.
 [2]. Heiko Paulheim, Christian Bizer: Type Inference on Noisy RDF Data.
 [3]. Gerald Töpper, Magnus Knuth, Harald Sack. DBpedia Ontology Enrichment 
 for Inconsistency Detection
 [4]. http://wiki.dbpedia.org/gsoc2013/ideas/OntologyCheck?v=l3f

 Best regards.


 --
 Cong Wang
 Ph.D Candidate,
 Kno.e.sis Center,
 Wright State University.




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Automatic Wikidata to DBpedia ontology Mapping [Question]

2014-03-07 Thread Marco Fossati
Hi Hady,

Thanks for pointing out the manual Wikidata-to-DBpedia mapping you did 
last year.
How did you detect those 4k Wikidata entities that should be classes and 
not individuals? Your work can be a great starting point, since there 
seems to be no such distinction in Wikidata.
Cheers!

On 3/7/14, 9:13 AM, Dimitris Kontokostas wrote:
 Hello Hady,

 You did a good job last year and mapped a big part of the (most
 frequent) properties manually.
 However, mapping the long tail can be a much more tedious job and we
 also need to map new properties / classes as they are defined in wikidata.
 In the end we want this to be a (semi-)automatic approach with high
 confidence that we can run periodically and increase our recall.

 Making this a generic mapping approach might not give a good precision
 so we should focus on wikidata  dbpedia only.

 @Marco can help you with your other questions

 Best,
 Dimitris


 On Thu, Mar 6, 2014 at 7:22 PM, Hady elsahar hadyelsa...@gmail.com wrote:

 Dear all ,

 I have a question concerning the idea

 4.6. Automated Wikidata mappings to DBpedia ontology:
 These are the links to the manual mappings of Wikidata properties [1] and
 classes [2] to DBpedia.

 I collected them last year: there were 544 properties and 4184
 classes to map. Of course, ~4600 mappings is a tedious task for one
 person, but do we have other reasons for doing that automatically?

 For example:
 1- to adapt to Wikidata ontology changes
 2- to develop a generic approach that can be used to create
 mappings between other linked open databases and DBpedia


 In addition to the Levenshtein distance mentioned before, we might
 consider other approaches like Normalized Pointwise Mutual
 Information [3] and evaluate all methods against each other.
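
For readers unfamiliar with the Levenshtein approach mentioned above, here is a minimal sketch of label matching by edit distance. The helper `best_match` and the candidate labels are assumptions for illustration; the thread does not specify how labels would actually be compared.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def best_match(label, candidates):
    """Hypothetical helper: pick the candidate label closest to `label`."""
    return min(candidates, key=lambda c: levenshtein(label.lower(), c.lower()))


print(levenshtein("kitten", "sitting"))                      # 3
print(best_match("birthplace", ["birthPlace", "deathPlace"]))  # birthPlace
```

Pure string distance will miss synonyms such as "place of birth" versus "birthPlace", which is one reason the thread also suggests statistical measures.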


 Thanks
 Regards

 
 1- https://github.com/hadyelsahar/extraction-framework/wiki/Mapping-WikData-to-DBpedia-properties
 2- https://github.com/hadyelsahar/extraction-framework/wiki/Mapping-Wikidata-Types-to-DBpedia-ontology-Classes
 3- https://svn.spraakdata.gu.se/repos/gerlof/pub/www/Docs/npmi-pfd.pdf

 -
 Hady El-Sahar
 Research Assistant
 Center of Informatics Sciences | Nile University
 http://nileuniversity.edu.eg/



 




 --
 Kontokostas Dimitris




-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Automatic Wikidata to DBpedia ontology Mapping [Question]

2014-03-07 Thread Marco Fossati
Good point Hady, I'll add it to the idea description, thanks!

On 3/7/14, 1:26 PM, Hady elsahar wrote:
 Hi all,

 Thanks for your reply, Dimitris.

 Marco, we already extracted all Wikidata facts as triples [1], so by
 querying all unique values of the Wikidata property P31 [2], we can
 get all the Wikidata classes in use (I am not sure whether there are
 classes that don't have any instances).

 I think we can also get the same results by querying the Wikidata API [3].


 1- https://github.com/hadyelsahar/extraction-framework/wiki/WikiData-DBpedia-Dump-Release-v.0.2#wiki-facts-dump
 2- http://www.wikidata.org/wiki/Property:P31
 3- http://www.wikidata.org/w/api.php
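
The P31 approach described above can be sketched as a scan over already-extracted triples that collects every distinct object of the "instance of" property. The IRI form used for P31, the function name `classes_in_use` and the sample identifiers are assumptions for illustration; the actual dump format may differ.

```python
# Assumed IRI form for the Wikidata "instance of" property; the dump may use another.
P31 = "http://www.wikidata.org/entity/P31"


def classes_in_use(triples):
    """Collect the distinct objects of P31 statements, i.e. the classes in use."""
    return {obj for _, pred, obj in triples if pred == P31}


# Illustrative identifiers only, not real dump content.
sample = [
    ("item:A", P31, "class:X"),
    ("item:B", P31, "class:X"),
    ("item:C", P31, "class:Y"),
    ("item:A", "http://www.wikidata.org/entity/P19", "item:C"),
]
print(sorted(classes_in_use(sample)))  # ['class:X', 'class:Y']
```

As noted in the message, this only finds classes that have at least one instance.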



 Regards


 On Fri, Mar 7, 2014 at 12:02 PM, Marco Fossati hell.j@gmail.com wrote:

 Hi Hady,

 Thanks for pointing out the manual Wikidata-to-DBpedia mapping you did
 last year.
 How did you detect those 4k Wikidata entities that should be classes and
 not individuals? Your work can be a great starting point, since there
 seems to be no such distinction in Wikidata.
 Cheers!

 On 3/7/14, 9:13 AM, Dimitris Kontokostas wrote:
   Hello Hady,
  
   You did a good job last year and mapped a big part of the (most
   frequent) properties manually.
   However, mapping the long tail can be a much more tedious job and we
   also need to map new properties / classes as they are defined in
 wikidata.
   In the end we want this to be a (semi-)automatic approach with high
   confidence that we can run periodically and increase our recall.
  
   Making this a generic mapping approach might not give a good
 precision
   so we should focus on wikidata  dbpedia only.
  
   @Marco can help you with your other questions
  
   Best,
   Dimitris
  
  
   On Thu, Mar 6, 2014 at 7:22 PM, Hady elsahar hadyelsa...@gmail.com wrote:
  
   Dear all ,
  
   i have a question considering the idea
  
   *_4.6. Automated Wikidata mappings to DBpedia ontology:_*
   those are the links for manually mapping wikidata
 properties[1] and
   classes[2] to Dbpedia
  
   - i collected them last year, they were 544 properties and 4184
   classes to map , of course ~4600 mappings is a tedious task
 for one
   person , but do we have other reasons for doing that
 automatically ?
  
   *_for example : _*
   1- to adapt wikidata ontology changes
   2- we will consider generic approach that can be done to create
   mappings between other linked open databases and DBpedia
  
  
   in addition to levenshtein distance mentioned before , we might
   consider other approaches like Normalized Pointwise Mutual
   Information[3] and try evaluation against all methods.
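
For reference, Normalized Pointwise Mutual Information is pmi(x, y) / -log p(x, y), where pmi(x, y) = log(p(x, y) / (p(x) p(y))). The sketch below computes it from probabilities; how those probabilities would be estimated for Wikidata and DBpedia properties (for example, from co-occurrence on linked entities) is an assumption and is not specified in the thread.

```python
import math


def npmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Normalized PMI in [-1, 1]: -1 never co-occur, 0 independent, 1 always together."""
    if p_xy == 0:
        return -1.0  # limiting value when x and y never co-occur
    if not 0 < p_xy < 1:
        raise ValueError("p_xy must lie in [0, 1); NPMI is undefined at p_xy = 1")
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


print(npmi(0.25, 0.5, 0.5))    # independent events
print(npmi(0.25, 0.25, 0.25))  # perfect co-occurrence
print(npmi(0.0, 0.5, 0.5))     # never co-occur
```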
  
  
   Thanks
   Regards
  
  
 
 1-https://github.com/hadyelsahar/extraction-framework/wiki/Mapping-WikData-to-DBpedia-properties
  
 
 2-https://github.com/hadyelsahar/extraction-framework/wiki/Mapping-Wikidata-Types-to-DBpedia-ontology-Classes
  
 3-https://svn.spraakdata.gu.se/repos/gerlof/pub/www/Docs/npmi-pfd.pdf
  
   -
   Hady El-Sahar
   Research Assistant
   Center of Informatics Sciences | Nile University
   http://nileuniversity.edu.eg/
  
  
  
  
 
  
  
  
  
   --
   Kontokostas Dimitris
  
  
  
 

Re: [Dbpedia-gsoc] Automatic Wikidata to DBpedia ontology Mapping [Question]

2014-03-07 Thread Marco Fossati
Hi Sergey,

On 3/7/14, 4:04 PM, Sergey Skovorodkin wrote:
 Hi!

 Maybe I'll ask some irrelevant or strange questions, but I really want
 to understand what's going on :)
Don't worry, this is the right place to ask any kind of question ;-)

  i collected them last year, they were 544 properties and 4184 classes
 to map

 Does it mean, that we have to expand our ontology (as we have only 650
 classes)? Or should we match Wikidata's classes only to our existing
 classes?
We should try our best to fit into the existing ones.
However, we have learned from the Wikipedia infobox properties mapping process
[1] that we may need to add new classes to the DBpedia ontology in order
to suit the data we have.

[1] http://mappings.dbpedia.org

 And if I understood it correctly, Wikidata Item could be an instance
 of *any* other Item, so every Item could be a class.
Yes, this is exactly what the data model looks like, i.e. classes =
individuals.
 If so, why do we limit ourselves by these 4184 classes? Or is it just a
 preliminary result?
This is the result of @Hady's work from last year, which seems a
straightforward yet agile way to find the actual class usage.
The candidate for this project is expected to investigate that in detail.

Cheers!

 Thanks
 Sergey Skovorodkin


 On Fri, Mar 7, 2014 at 5:41 PM, Marco Fossati hell.j@gmail.com wrote:

 Good point Hady, I'll add it to the idea description, thanks!

 On 3/7/14, 1:26 PM, Hady elsahar wrote:
   HI all ,
  
   thanks for your reply Dimitris .
  
   Marco , We already extracted all Wikidata Facts in Triples[1] , so by
   querying all unique values for the wikidata property P31 [2] , we can
   get all the wikidata classes in use (not sure if there are
 classes that
   doesn't have any instances ).
  
   i think we can get to the same results also by querying the
 Wikidata API [3]
  
  
   1-
  
 
 https://github.com/hadyelsahar/extraction-framework/wiki/WikiData-DBpedia-Dump-Release-v.0.2#wiki-facts-dump
   2- http://www.wikidata.org/wiki/Property:P31
   3- http://www.wikidata.org/w/api.php
  
  
  
   Regards
  
  
   On Fri, Mar 7, 2014 at 12:02 PM, Marco Fossati hell.j@gmail.com wrote:
  
   Hi Hady,
  
   Thanks for pointing out the manual Wikidata-to-DBpedia
 mapping you did
   last year.
   How did you detect those 4k Wikidata entities that should be
 classes and
   not individuals? Your work can be a great starting point,
 since there
   seems to be no such distinction in Wikidata.
   Cheers!
  
   On 3/7/14, 9:13 AM, Dimitris Kontokostas wrote:
 Hello Hady,

 You did a good job last year and mapped a big part of the
 (most
 frequent) properties manually.
 However, mapping the long tail can be a much more tedious
 job and we
 also need to map new properties / classes as they are
 defined in
   wikidata.
 In the end we want this to be a (semi-)automatic approach
 with high
 confidence that we can run periodically and increase our
 recall.

 Making this a generic mapping approach might not give a good
   precision
 so we should focus on wikidata  dbpedia only.

 @Marco can help you with your other questions

 Best,
 Dimitris


 On Thu, Mar 6, 2014 at 7:22 PM, Hady elsahar hadyelsa...@gmail.com wrote:

 Dear all ,

 i have a question considering the idea

 *_4.6. Automated Wikidata mappings to DBpedia ontology:_*
 those are the links for manually mapping wikidata
   properties[1] and
 classes[2] to Dbpedia

 - i collected them last year, they were 544 properties
 and 4184
 classes to map , of course ~4600 mappings is a tedious
 task
   for one
 person , but do we have other reasons for doing that
   automatically ?

 *_for example : _*
 1- to adapt wikidata ontology changes
 2- we will consider generic approach that can be done
 to create
 mappings between other linked open databases and DBpedia


 in addition to levenshtein

Re: [Dbpedia-gsoc] GSoC2014 - Natural language question answering engine

2014-03-06 Thread Marco Fossati
Hi Abhijit and welcome on board!

Have a look at the following threads [1, 2] for more detailed 
discussions about the project.
Don't hesitate to come back to us once you get a clearer view.
Cheers!

[1] http://sourceforge.net/p/dbpedia/mailman/message/32024743/
[2] http://sourceforge.net/p/dbpedia/mailman/message/32026164/


On 3/5/14, 11:12 PM, Abhijit Pratap Singh Tomar wrote:
 Hi,

 My name is Abhijit. I am a Computer Science graduate student at NYU. I
 am really interested in knowing more about the "Natural language
 question answering engine" project.

 I studied Natural Language Processing last semester. I
 created an Email Summarization Tool and later extended it to general
 text summarization. This semester I am taking two courses: Machine
 Learning and Computational Geometry. I would like to apply those
 techniques, if needed.

 Below is a link to my work on Natural Language Processing on my github
 profile.

 https://github.com/abtpst/Email-Thread-Summarization

 https://github.com/abtpst/Text-Summarization

 Kindly provide me with more information about this project and please
 let me know if you need some more information on my background.


 Thanks,

 Abhijit





-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] QA engine (was Fwd: Request to access experimental code)

2014-03-03 Thread Marco Fossati
Hi everyone,

On 3/3/14, 7:55 AM, Sourish Dasgupta wrote:
 Hello all,

 I agree with Marco that we should be concentrating on English for the
 moment. This is because every language has some innate characteristic
 linguistic nuances which may not be found in other languages.

 However, I still feel that techniques borrowed from computational
 semantics might be helpful in improving the accuracy significantly. By
 and large languages follow the SVO structure (Subject Verb Object). It
 covers approx. 42% of world languages [1]. So the research insights that
 we get in working with English, both at a statistical and at a
 linguistic level, might be very important for future extension.
That's exactly my point, more technically explained.
If we manage to implement simple questions that fit well in all SVO 
languages, this would be a dramatic added value in terms of multilingual 
support.
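
To make the idea of simple, single-predicate questions concrete, here is 
a minimal sketch in Python. It is not TBSL or Quepy: the patterns, the 
`to_sparql` helper and the naive way a resource name is built from the 
question are all assumptions made for this example. Language-specific 
patterns live in a configuration table, so adding a language would not 
require touching the code.

```python
import re

# Hypothetical per-language configuration: each regex describes one simple
# question shape and is paired with a DBpedia ontology property.
PATTERNS = {
    "en": [
        (r"^who is the spouse of (?P<entity>.+?)\??$", "dbo:spouse"),
        (r"^where was (?P<entity>.+?) born\??$", "dbo:birthPlace"),
    ],
    "it": [
        (r"^dove è nato (?P<entity>.+?)\??$", "dbo:birthPlace"),
    ],
}

PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX dbr: <http://dbpedia.org/resource/>\n"
)


def to_sparql(question, lang="en"):
    """Map a simple question to a single-triple SPARQL query, or None."""
    text = question.strip()
    for pattern, prop in PATTERNS.get(lang, []):
        match = re.match(pattern, text, flags=re.IGNORECASE)
        if match:
            # Assumption: the entity name maps directly to a resource name.
            # A real system would need proper entity linking here.
            resource = match.group("entity").strip().replace(" ", "_")
            return (
                f"{PREFIXES}"
                f"SELECT ?answer WHERE {{ dbr:{resource} {prop} ?answer }}"
            )
    return None


print(to_sparql("Where was Barack Obama born?"))
print(to_sparql("Dove è nato Dante Alighieri?", lang="it"))
```

For simplicity the Italian pattern reuses the English resource namespace. 
A language chapter would presumably query its own endpoint and namespace.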

Cheers!

 Sourish

 [1]: Russell Tomlin, Basic Word Order: Functional Principles, Croom
 Helm, London, 1986,


 On Mon, Mar 3, 2014 at 12:18 AM, Pablo N. Mendes <pablomen...@gmail.com>
 wrote:


 I just want to suggest caution with multilinguality. Doesn't seem to
 me like an easy problem. QA is already hard enough in one language,
 if one tries to solve it for all languages at once, it will be
 overwhelming in three months. I'd suggest focusing on English, but
 not hardcoding anything that is language specific, keeping them in
 configuration files and properly engineered subclasses, thinking
 that one day it will all be ported to another language.

 On Mar 1, 2014 1:10 AM, Ankur Padia <padiaan...@gmail.com> wrote:

 Hello Marco,

 This is to inform you that I am resending a copy of my earlier
 mail, as I forgot to CC the author of the referenced paper, who is
 working on the development of a QA system using DL as a
 tool for language representation.

 I think the Google Translate API would come in handy to convert
 a foreign-language question into an English question in cases
 where a particular piece of knowledge or triple is missing [2] in
 a given chapter, and then fire it against the English knowledge
 base, which does have it. Also, there is a paper by
 Dr. Sourish Dasgupta (cc) on Wh-questions and their
 possible semantic formalization in description logics [1].

 - Ankur.

 Reference :
 ---
 [1] http://arxiv.org/abs/1312.6948
 [2] Approach taken in QAKiS.

 On Fri, Feb 28, 2014 at 4:54 PM, Marco Fossati <hell.j@gmail.com>
 wrote:

 Also, I think a crucial point will be the multilingual
 capabilities of the tool. In this way, all the DBpedia
 chapters can benefit from it.

 So, the first implementation should focus on very simple
 questions, but in multiple languages.
 WH questions would be great.
 Of course, this requires language-specific validation. We
 will definitely need the help of the worldwide community.

 Sounds like the project is getting more and more exciting!
 Cheers,

 On 2/28/14, 12:08 PM, Marco Fossati wrote:

 Hi Ankur,

 On 2/28/14, 2:00 AM, Ankur Padia wrote:

  Among the approaches listed before, I would prefer TBSL, as its
  scope is relatively wider.

 All right, go ahead with that.

  Ideally, a QA engine for DBpedia should be able to handle all
  kinds of questions, with appropriate semantic parsing and
  satisfactory conversion to SPARQL queries. The scope of a QA
  system depends heavily on the time at hand. For example, within
  the span of GSoC, addressing even a small number of English
  nuances in queries would be ambitious (correct me if I am wrong).

 Exactly, keep in mind that a successful project implies a tool
 that actually works.
 Hence, I suggest proceeding first with the implementation of
 single-predicate queries, in order to provide reasonable
 coverage of simple questions.
 Cheers,


 --
 Marco Fossati
 http://about.me/marco.fossati
 Twitter: @hjfocs
 Skype: hell_j




 

Re: [Dbpedia-gsoc] QA engine (was Fwd: Request to access experimental code)

2014-02-28 Thread Marco Fossati
Also, I think a crucial point will be the multilingual capabilities of 
the tool. In this way, all the DBpedia chapters can benefit from it.

So, the first implementation should focus on very simple questions, but 
in multiple languages.
WH questions would be great.
Of course, this requires language-specific validation. We will 
definitely need the help of the worldwide community.

Sounds like the project is getting more and more exciting!
Cheers,

On 2/28/14, 12:08 PM, Marco Fossati wrote:
 Hi Ankur,

 On 2/28/14, 2:00 AM, Ankur Padia wrote:

  Among the approaches listed before, I would prefer TBSL, as its scope
  is relatively wider.

 All right, go ahead with that.

  Ideally, a QA engine for DBpedia should be able to handle all kinds of
  questions, with appropriate semantic parsing and satisfactory
  conversion to SPARQL queries. The scope of a QA system depends
  heavily on the time at hand. For example, within the span of
  GSoC, addressing even a small number of English nuances in queries
  would be ambitious (correct me if I am wrong).

 Exactly, keep in mind that a successful project implies a tool that
 actually works.
 Hence, I suggest proceeding first with the implementation of
 single-predicate queries, in order to provide reasonable coverage of
 simple questions.
 Cheers,

-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Interested in Natural Language Question Answering Engine for GSOC

2014-02-27 Thread Marco Fossati
Hi Aarsh and welcome on board!

We have already discussed this project at length, so please have a look 
at the following mailing list thread and feel free to come back with 
more focused questions:

http://sourceforge.net/p/dbpedia/mailman/message/32026164/

Cheers!

Marco

On 2/27/14, 4:53 AM, Aarsh Shah wrote:
 Hello everyone,

 I am Aarsh Shah, a final-year computer science student at Nirma
 University, India. I just went through the DBpedia ideas list and am
 extremely interested in the Natural Language Question Answering Engine
 idea, as it matches my interests and the kind of projects I have
 done before.

 I implemented a chat analyser using NLTK and Python that
 could analyse IRC chat logs and identify the missing names of speakers
 in some sentences, based on the sentence style of
 each user. I am also quite familiar with RDF, R2RML, the Semantic Web,
 Linked Data and SPARQL. Moreover, Python has been my language
 of choice for quite some time. I am currently going through the Quepy
 documentation and it is quite interesting. Could you please advise me on
 how to proceed with this idea, to convince the organisation and my mentor
 Marco Fossati that I am the right man for this project? :) I would love to
 write some sample code, as requested by the org and the mentor, to prove my
 coding abilities and my familiarity with this project. This project
 combines both the Semantic Web and NLP, and it would be interesting to see
 the kind of results we can achieve by combining the power of NLTK and
 WordNet with SPARQL and DBpedia URIs and mappings.

 -Regards
 -Aarsh

-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] GSOC 2014 - Ontology Consistency check

2014-02-25 Thread Marco Fossati


On 2/25/14, 12:38 PM, Marco Fossati wrote:
 Hi Sindhu and welcome to our community (CCing the mailing list)!

 The basic idea of the project you are interested in is to investigate
 the actual usage of the ontology and to make it effective for querying
 all the different DBpedia chapters.
 I'm trying to add a couple of new references to that page, but the wiki
 doesn't want to save them.
 Pasting them below:

 -Type Inference on Noisy RDF Data
 http://www.heikopaulheim.com/documents/iswc2013.pdf

 -Towards an Automatic Creation of Localized Versions of DBpedia
 http://glearning.tju.edu.cn/pluginfile.php/156800/mod_folder/content/0/ISWC2013/82180481-towards-an-automatic-creation-of-localized-versions-of-dbpedia.pdf


 Please dig into the bibliography and come back to me with more focused
 questions.
 Cheers!

 On 2/25/14, 10:13 AM, Sindhu Kiranmai Ernala wrote:
 Hi,

 I am Sindhu Kiranmai, a 3rd-year undergrad at the International Institute
 of Information Technology, pursuing a B.Tech in Computer Science + MS by
 research in Exact Humanities. I am interested in contributing to the
 project "Ontology consistency check" as part of GSoC 2014. As part of my
 course work and research, I have been working extensively on ontologies
 and graph grammars. I also wrote a research paper on the ontology of cinema
 (Computationally Searching for cinematic art, submitted at xCoAx
 2014). I have read the ideas page for the project, but I would like you to
 explain the idea further. With good experience
 and interest in ontologies, I am hoping to enjoy this project.

 Thanks,
 Sindhu Kiranmai

-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] Handshaking moments... Hi.

2014-02-10 Thread Marco Fossati
Hi Ankur and welcome on board!

I assume you are referring to the following project idea page:
http://wiki.dbpedia.org/gsoc2013/ideas/OntologyCheck

The basic idea is to leverage machine learning-based methods for 
automatic ontology type/property inference in order to assess their 
actual usage.
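
To give a feeling for what property-based type inference can look like, 
here is a deliberately simplified sketch in Python. It is not the 
algorithm of either paper referenced below: it only lets each property 
used by a resource vote for the types it is usually seen with. The 
training triples, the function names and the 0.5 threshold are 
hypothetical.

```python
from collections import Counter, defaultdict


def learn_type_distributions(typed_resources):
    """For every property, estimate how often its subjects carry each type."""
    counts = defaultdict(Counter)
    totals = Counter()
    for types, properties in typed_resources:
        for prop in properties:
            totals[prop] += 1
            for type_name in types:
                counts[prop][type_name] += 1
    return {
        prop: {t: n / totals[prop] for t, n in type_counts.items()}
        for prop, type_counts in counts.items()
    }


def infer_types(properties, distributions, threshold=0.5):
    """Average the per-property votes and keep types above the threshold."""
    known = [p for p in properties if p in distributions]
    if not known:
        return {}
    scores = Counter()
    for prop in known:
        for type_name, probability in distributions[prop].items():
            scores[type_name] += probability / len(known)
    return {t: round(s, 2) for t, s in scores.items() if s >= threshold}


# Hypothetical training data: (types of a resource, properties it uses).
training = [
    ({"Person"}, {"birthPlace", "spouse"}),
    ({"Person"}, {"birthPlace"}),
    ({"Place"}, {"populationTotal", "country"}),
    ({"Person", "Athlete"}, {"birthPlace", "team"}),
]
distributions = learn_type_distributions(training)
print(infer_types({"birthPlace", "spouse"}, distributions))
```

Comparing the inferred types with the ones actually asserted for a 
resource is one possible way to spot ontology classes and properties 
that are used inconsistently.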

In addition to the given references, find below further ones:

-Towards an automatic creation of localized versions of DBpedia
http://glearning.tju.edu.cn/pluginfile.php/156800/mod_folder/content/0/ISWC2013/82180481-towards-an-automatic-creation-of-localized-versions-of-dbpedia.pdf

-Type inference on noisy rdf data
http://www.heikopaulheim.com/documents/iswc2013.pdf

Hope this helps.
Let me know!

Cheers,

On 2/8/14, 9:35 AM, Dimitris Kontokostas wrote:
 Hello Ankur and welcome to the community,

 IIRC Marco Fossati suggested this topic (or at least added the existing
 publication links).
 Marco, can you point Ankur to additional resources?

 Best,
 Dimitris


 On Fri, Feb 7, 2014 at 12:49 PM, Ankur Padia <padiaan...@gmail.com>
 wrote:

 Hello everyone,

 My name is Ankur Padia and I work in the area of Knowledge
 Representation. I would like to contribute to DBpedia in general and
 to ontology consistency in particular. Please let me know how to proceed.
 I am currently reading the paper suggested on the project page.

 I would be thankful if anyone could point me to further
 relevant research papers.

 - Ankur Padia.

 




 --
 Kontokostas Dimitris

-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] On Massive extraction of triples from Media Wikis idea

2013-04-18 Thread Marco Fossati
Hi Pablo,

It's a low-level generic parser for MediaWiki content.
It converts all the content of any MediaWiki resource into structured 
data. The output could be JSON (as it is now), JSON-LD or RDF, i.e., it 
can be modeled for our needs.
Compared to the DBpedia extraction framework, it does not do any semantic 
processing of the data (e.g. on infoboxes), but it handles every 
content item, e.g. article body, tables, etc.
I see some similarities with the Wiktionary extraction project [1] that 
Sebastian mentioned in the GSoC idea.
Since Sebastian proposed to configure the Wiktionary extractor in order 
to parse other wikis, I was just wondering whether these two projects are 
complementary, could be merged, or could otherwise help each other.
Of course, JSONpedia will be released with an open source licence.
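
To illustrate what "structured data without semantic processing" means 
here, below is a toy sketch in Python. It is not JSONpedia's code or its 
actual output format: the `parse_wikitext` function and the element 
layout are assumptions made for the example, and the parser only handles 
one-line templates, headings and plain text.

```python
import json
import re


def parse_wikitext(text):
    """Turn a tiny subset of wikitext into a list of structural elements."""
    elements = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        heading = re.fullmatch(r"(=+)\s*(.*?)\s*\1", line)
        template = re.fullmatch(r"\{\{(.*)\}\}", line)
        if heading:
            elements.append({
                "type": "section",
                "level": len(heading.group(1)),
                "title": heading.group(2),
            })
        elif template:
            # Naive split: nested templates or piped links would break this.
            name, *params = [p.strip() for p in template.group(1).split("|")]
            fields = {}
            for position, param in enumerate(params, start=1):
                key, separator, value = param.partition("=")
                if separator:
                    fields[key.strip()] = value.strip()
                else:
                    fields[str(position)] = param  # positional parameter
            elements.append({"type": "template", "name": name, "params": fields})
        else:
            elements.append({"type": "text", "content": line})
    return elements


sample = """{{Infobox person | name = Ada Lovelace | birth_date = 1815}}
== Life ==
Ada Lovelace was a mathematician."""

print(json.dumps(parse_wikitext(sample), indent=2))
```

Note that the infobox is kept as a plain template with parameters. 
Deciding that `birth_date` maps to an ontology property is exactly the 
semantic step that is left to a framework such as the DEF.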

@Sebastian, can you give us some more thoughts about that?
Cheers!

[1] http://dbpedia.org/Wiktionary

On 4/18/13 11:32 AM, Pablo N. Mendes wrote:

 What does it offer that the DEF does not have?

 Cheers,
 Pablo


 On Wed, Apr 17, 2013 at 10:33 PM, Marco Fossati <hell.j@gmail.com>
 wrote:

 Hi Sebastian,

 I was wondering if the JSONpedia project [1] could be helpful for the
 idea you are mentoring for GSoC 2013.
 Have a look at the slides [2].
 What do you think about it?
 Let me know.
 Cheers,

 [1] http://json.it.dbpedia.org/frontend/form.html
 [2] http://www.slideshare.net/spaziodati/introducing-jsonpedia
 --
 Marco Fossati
 http://about.me/marco.fossati
 Twitter: @hjfocs
 Skype: hell_j

 




 --

 Pablo N. Mendes
 http://pablomendes.com

-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-gsoc] On Massive extraction of triples from Media Wikis idea

2013-04-18 Thread Marco Fossati
Definitely, that's why Sebastian's idea can become a very interesting 
GSoC project.

On 4/18/13 4:41 PM, Pablo N. Mendes wrote:

 The difference between JSON and HTML are 15min, Scala and IntelliJ. :)

 I'd think the important part is how the markup is parsed, templates
 resolved, etc.


 On Thu, Apr 18, 2013 at 4:39 PM, Marco Fossati <hell.j@gmail.com>
 wrote:

 I can't say if it's a competitor. The main difference lies in the
 output, which is structured data (JSON) instead of semi-structured
 data (HTML).
 For more details, see the slides [1].
 Cheers!

 [1] http://www.slideshare.net/spaziodati/introducing-jsonpedia


 On 4/18/13 4:23 PM, Pablo N. Mendes wrote:


 Is JSONPedia a competitor of gwtwiki and Sweble?

 https://code.google.com/p/gwtwiki/
 http://en.wikipedia.org/wiki/Sweble#The_current_state_of_parsing


 On Thu, Apr 18, 2013 at 4:18 PM, Marco Fossati <hell.j@gmail.com>
 wrote:

  Hi Pablo,

  It's a low-level generic parser for MediaWiki content.
  It converts all the content of any MediaWiki resource into
  structured data. The output could be JSON (as it is now), JSON-LD or
  RDF, i.e., it can be modeled for our needs.
  Compared to DBpedia extraction framework, it does not make any
  processing on the semantics of data e.g. on infoboxes, but handles
  every content item e.g. article body, tables, etc.
  I see some similarities with the Wiktionary extraction project [1]
  that Sebastian mentioned in the GSoC idea.
  Since Sebastian proposed to configure the Wiktionary extractor in
  order to parse other Wikis, I was just wondering if these 2 projects
  were complementary, could be merged or whatever could help.
  Of course, JSONpedia will be released with an open source licence.

  @Sebastian, can you give us some more thoughts about that?
  Cheers!

  [1] http://dbpedia.org/Wiktionary


  On 4/18/13 11:32 AM, Pablo N. Mendes wrote:

   What does it offer that the DEF does not have?

   Cheers,
   Pablo


   On Wed, Apr 17, 2013 at 10:33 PM, Marco Fossati <hell.j@gmail.com>
   wrote:

    Hi Sebastian,

    I was wondering if the JSONpedia project [1] could be helpful for
    the idea you are mentoring for GSoC 2013.
    Have a look at the slides [2].
    What do you think about?
    Let me know.
    Cheers,

    [1] http://json.it.dbpedia.org/frontend/form.html
    [2] http://www.slideshare.net/spaziodati/introducing-jsonpedia
    --
    Marco Fossati
    http://about.me/marco.fossati
    Twitter: @hjfocs
    Skype: hell_j

 