[Dbpedia-gsoc] GSoC 2015 - Introduction (Mingzhe)

2015-03-27 Thread Mingzhe Du

Hi everyone,

My name is Mingzhe. I am a PhD student at the University of South Carolina.
My work mainly focuses on Natural Language Processing and NLP-related web
development.

I am the principal developer of Wikitheoria.com [1], an NSF-sponsored
web-based crowdsourcing tool for sharing and collaborating on researchable
sociological ideas. The ultimate goal of this project is to contribute
well-structured sociological information and knowledge to the Linked Data
community. I am proficient in Python and Java on the back end, and in
JavaScript and jQuery on the front end. I have also used Node.js, AngularJS
and MongoDB during my development work at HelpMonger.com [2].

I am particularly interested in project idea *5.10 DBpedia Metadata
Datasets*. I gained some experience with RDF and SPARQL in my courses on
Natural Language Processing and Service Oriented Computing. I believe
this project will help me gain more experience and knowledge that I could
apply to Wikitheoria in the future. I have submitted my proposal on
http://www.google-melange.com/.

Hoping to work with you soon.

References
[1] http://www.wikitheoria.com
[2] http://www.helpmonger.com


Best,
Mingzhe
--
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the 
conversation now. http://goparallel.sourceforge.net/___
Dbpedia-gsoc mailing list
Dbpedia-gsoc@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc

Re: [Dbpedia-gsoc] GSoC 2015 Introduction

2015-03-25 Thread Алексей Степанов

Hi,

Sorry for my silence; it has been two hard weeks at university.

I chose task 5.4.

https://docs.google.com/document/d/1TdzP45vntVU4ufTpKcN_ftfE9zIDgh8NBorxZW-Iyf0/edit?usp=sharing
- this is my proposal.

I will wait for a response.

Regards,
Alexey Stepanov


Re: [Dbpedia-gsoc] GSoC 2015 Introduction

2015-03-25 Thread Andre Pereira

Hi,

you need to go to the official GSoC site
https://www.google-melange.com/gsoc/homepage/google/gsoc2015, create a
student profile and submit your proposal there, otherwise you won't be
officially applying for GSoC.

Best regards,
André Pereira


Re: [Dbpedia-gsoc] GSOC 2015 - Introduction

2015-03-23 Thread Thiago Galery

Hi Vasanth, I suggest you take a look at the previous messages in the
mailing list archives and check out the discussion there, so you have a
better idea of what to do. Bear in mind that the submission date is really
close, so you'd need to look into this asap.
All the best,
Thiago

On Mon, Mar 23, 2015 at 5:07 PM, Vasanth Kalingeri 
vasanth.kaling...@gmail.com wrote:

 Hi,
 My name is Vasanth Kalingeri. I am a 3rd-year undergrad in
 computer science, pursuing my engineering degree at SJCE Mysore. I have
 completed a course on machine learning on Coursera, which led me to an
 interest in NLP. I have also been freelancing for 2 years.
 My interest in NLP grew primarily when I wanted to build a knowledge base
 from a given corpus of text, so that it could answer questions on the
 corpus. This led me to DBpedia and further to topic 5.1.
 I am extremely interested in building such a system to extract
 facts from a corpus. I will start working on the warm-up tasks soon.
 Regards,
 Vasanth



Re: [Dbpedia-gsoc] GSoC 2015 - Introduction

2015-03-12 Thread David Przybilla

Hi Shashank,

It looks alright.
I think you can skip the Spark part, as you are not interested in the
project concerning the model building.

As for the specific project you selected, I think it would be best to:

- Understand how a Spotlight model is divided (Surface Form Store, Context
Store, Candidate Store). This blog entry [1] can probably help you, as can
playing with [2].

- Also read the main paper on which Spotlight is based (I mentioned it
previously, and it is also listed in the literature section on GitHub).

[1]
http://engineering.idioplatform.com/2015/02/23/spotlight-model-editor.html
[2] https://github.com/idio/spotlight-model-editor

On Thu, Mar 12, 2015 at 1:35 PM, shashank juyal sjuyal...@gmail.com wrote:

 Hi David,

 Please find attached the warm up tasks I have done.
 I am still involved in some of the issues and documentation. I have also
 mentioned those in the pdf.
 Please let me know if any other warm up task has to be done.

 Thanks and Regards,
 Shashank Juyal








Re: [Dbpedia-gsoc] GSoC 2015 Introduction

2015-03-09 Thread Dimitris Kontokostas

Hi Alexey, and welcome to DBpedia!


On Sun, Mar 8, 2015 at 8:10 PM, Алексей Степанов fec...@gmail.com wrote:

 Hi everyone,

 My name is Alex. I'm a first-year postgraduate student at the Department
 of Computational Mathematics and Cybernetics, Moscow State University.

 I'm interested in one of the following topics:
 5.4. Mappings freshness & Better statistics / reporting tools
 5.5. Improved Mapping Support for the Mappings Wiki
 5.6. DBpedia Data Error Reporting Tool
 5.8. DBpedia Live scaling & new interface

 I have 2 years of experience in Java programming and good knowledge of
 SQL programming. My scientific adviser and I are interested in the Semantic
 Web/Linked Open Data and databases. I also want to gain knowledge and
 experience in Scala and JavaScript.

 Can you suggest anything I could work on for the GSoC warm-up that is
 related to topics 5.4 - 5.5?


Please have a look at this thread where we suggest some warm up tasks and
provide more details
http://www.mail-archive.com/dbpedia-gsoc@lists.sourceforge.net/msg00578.html

Cheers,
Dimitris



 Hoping to collaborate with you very soon, even if not in the GSoC program.



 --
 Regards,
 Alexey Stepanov






-- 
Kontokostas Dimitris

Re: [Dbpedia-gsoc] GSoC 2015 Introduction

2015-03-09 Thread Marco Fossati

Hi Robert,

On 3/8/15 8:20 PM, rlits...@mail.uni-mannheim.de wrote:
Hi Thiago, Hi DBPedia-Team,

thanks for your reply. I'd like to clarify a fundamental question:

- In the previous GSoC the participant seems to have built his own
gold standard of mappings. Are standard benchmarks for quality
measurement insufficient, i.e. does the schema matching quality vary
and depend that much on the source schemas used?

Which standards are you thinking about? Could you reference them?

- In my opinion one could tackle this task either in a practically
oriented way, by implementing a promising approach, or in a research-oriented
way, by working on an improvement of existing solutions. Which approach is
likely the way to go?

For the scope of GSoC, I would advocate the former.

I have done quite some research and gone through a few papers and I
think I have a fair understanding. Is there any particular warm-up task
related to this task that you could suggest?

Have a look at the unsolved issues on the specific repo of the GSoC 2014
project:
https://github.com/dbpedia/wikidata-mapper/issues
Those tasks can be applied to the Freebase schema too.

Best regards

Robert

Quoting Thiago Galery tgal...@gmail.com:

Hi Robert, I would advise taking a look at Marco's response to another
prospective student. He points to these links for a summary of a similar
project in 2014

- idea: http://wiki.dbpedia.org/gsoc2014/ideas#h359-11
- proposal:
https://docs.google.com/document/d/16lAqKLAsAGQW0cp9SA0Egb1vlb6mPCcHYezVN-zB870/edit?pli=1
- stuff done:
https://github.com/dbpedia/extraction-framework/wiki/GSoC-2014-Progress-Sergey-Skovorodkin

On Fri, Mar 6, 2015 at 12:04 PM, rlits...@mail.uni-mannheim.de wrote:

Hello everybody,

first off I'd like to introduce myself. I'm Robert, currently a Master's
student at the University of Mannheim. I'm studying Business Informatics and
pursuing the Data and Web Science specialization track. One of my major
interests lies in data mining, and I constantly complement my studies with
data mining related online courses (MOOCs) in my free time. Alongside my
studies I'm also employed as a student researcher at the Data and Web
Science research group [1] under the supervision of Prof. Bizer. You will
find many of its professors mentioned in the papers you suggest as a
starting point. A major part of the research is dedicated to Linked Open
Data, hence the teaching is closely tied to examples from research projects.

Furthermore, during one of my previous internships I was involved in
building an active learning system for Named Entity Recognition, which has
also enhanced my experience in this field. The first time I got in touch
with NLP and machine learning was during my Bachelor's thesis, which was
concerned with the classification of scientific papers.

Now coming to the GSoC project:

My first priority would be to work on 5.7. Reverse Engineering and Aligning
Freebase with DBpedia. I have a working knowledge of SPARQL and the Freebase
MQL query language if needed. During my previous semester I used DBpedia
and Freebase to perform web data integration in a closed domain. So I'm
aware of schema integration and schema matching procedures, which I think
qualifies me fairly well, along with my programming experience. After
digging into the project proposal, some uncertainties arose. In the
description you mention the introduction of new properties and classes if
needed. Your first reference [2] is mainly concerned with the
reduction/fusion of closely related or equivalent properties.

- Can you give me an intuition of a situation where a need for a new class
or property would arise?

- Can you also please give an example of tools that are based on Freebase
and that should be easily migrated to DBpedia?

- Speaking of the current approaches to mapping classes and properties, is
there any work currently going on that deals with hierarchies of subjects
and objects?

- Related to [2], do S1 and O1 represent actual subjects and objects or the
rdf:type classes of S1 and O1? I think one problem could (at least
partially) solve the other, namely a trustworthy class mapping could
assist in working out equivalent property mappings and vice versa.

I would be available full-time during the GSoC period, and it goes without
saying that I will get up to speed on the latest research before the GSoC
period starts.

- Can you please advise me what the next step would be?

- The project mentioned above is only one of my interests among your
proposals. Do I have to elaborate on my interest in my second and third
priorities in a similar way?

Best regards

Robert

[1] http://dws.informatik.uni-mannheim.de/en/home/
[2] http://wiki.knoesis.org/index.php/Property_Alignment

Re: [Dbpedia-gsoc] GSoC 2015 - Introduction

2015-03-09 Thread Alexandru Todor

Hi Guido,

Dimitris already gave you some hints on bugs/features you can work on.
What I can give you are some general tasks regarding topic 5.5, Improving
the Mappings Wiki (5.4 has similar requirements):

There are 2 main components you will be working with: the DBpedia mappings
wiki and the server component of the extraction framework.

The mappings wiki is a modified version of MediaWiki. It stores the
mappings between MediaWiki templates and DBpedia classes/properties. Each
template is mapped onto a DBpedia class and each property in the template
is mapped onto a DBpedia ontology property. Whenever an editor saves a
mapping, he has the option of validating it. This option is presented as a
validate button beside the save button. Clicking this button executes a
service call to the server component of the DBpedia Extraction
Framework. When the call is made, the contents of the MediaWiki article are
passed to the server; the server then analyzes whether the text conforms to
the DBpedia mappings syntax and validates it. If it passes the validation,
the mappings wiki tells the editor his mapping is valid, otherwise not valid.
Of course the mappings wiki does more things, but this is just to get a
quick idea.
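
To make the validate round trip concrete, here is a minimal Python sketch of the flow described above. It is only an illustration under stated assumptions: the function names (balanced_braces, validate_mapping, on_validate_clicked) are hypothetical, the "service" is a local function rather than the real server component, and the syntax check is a toy (balanced template braces plus a mapToClass entry), not the validation the extraction framework actually performs.

```python
import re


def balanced_braces(text: str) -> bool:
    """Return True if every '{{' in the text is closed by a later '}}'."""
    depth = 0
    i = 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            if depth < 0:
                return False
            i += 2
        else:
            i += 1
    return depth == 0


def validate_mapping(wiki_text: str) -> dict:
    """Toy stand-in for the server-side check (hypothetical, not the real API)."""
    errors = []
    if not balanced_braces(wiki_text):
        errors.append("unbalanced template braces")
    if not re.search(r"\{\{\s*TemplateMapping\b", wiki_text):
        errors.append("no TemplateMapping found")
    if not re.search(r"\bmapToClass\s*=\s*\S+", wiki_text):
        errors.append("no mapToClass entry found")
    return {"valid": not errors, "errors": errors}


def on_validate_clicked(wiki_text: str, service=validate_mapping) -> str:
    """What the wiki would show the editor after the validate button is pressed."""
    result = service(wiki_text)
    if result["valid"]:
        return "Mapping is valid."
    return "Mapping is not valid: " + "; ".join(result["errors"])


if __name__ == "__main__":
    page = (
        "{{TemplateMapping | mapToClass = Person | mappings = "
        "{{PropertyMapping | templateProperty = name "
        "| ontologyProperty = foaf:name }} }}"
    )
    print(on_validate_clicked(page))
```

In a real setup the service argument would be replaced by an HTTP call to the running server module; the point of passing it as a parameter is only that the button handler does not need to know how the validation is carried out.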

I can give you 2 quick warm-up tasks, with more to follow:

1) Create a MediaWiki extension [1] that hooks into the create/edit
workflow of MediaWiki [2], using the necessary hooks for that.
Insert another button beside save that calls a REST web service.
2) Get the server module of the extraction framework up and running and
experiment with it. [3] [4] (The documentation is a bit outdated but should
work with minor changes.)

[1] http://www.mediawiki.org/wiki/Manual:Developing_extensions
[2] http://www.mediawiki.org/wiki/Manual:Hooks
[3] http://wiki.dbpedia.org/Documentation#h25-10z
[4] http://wiki.dbpedia.org/Server

On Mon, Mar 9, 2015 at 10:08 AM, Dimitris Kontokostas jimk...@gmail.com
wrote:

 Hi Guido, and welcome to DBpedia

 Issues 355, 354 & 327 are related to the mappings wiki/server.

 Cheers,
 Dimitris

 On Sat, Mar 7, 2015 at 12:29 PM, Guido Pio Mariotti 
 guidopio.mariott...@gmail.com wrote:

 Hi,
 my name is Guido. I'm a student at the Politecnico of Turin, currently in
 the first year of the master's degree in Computer Engineering.
 I'm interested in topics 5.4 and 5.5, and I already have knowledge of
 Java and JavaScript. I'm also going to take a PHP course this semester,
 and I was thinking of starting to learn Scala.
 Do you have any suggestions on which bugs/features I could work on for the
 GSoC warm-up that are related to the two topics I'm interested in?

 Hoping to collaborate with you very soon, even if not in the GSoC
 program, I wish you a nice weekend.






 --
 Kontokostas Dimitris



Re: [Dbpedia-gsoc] GSoC 2015 Introduction

2015-03-08 Thread rlitschk

Hi Thiago, Hi DBPedia-Team,

thanks for your reply. I'd like to clarify a fundamental question:

- In the previous GSoC the participant seems to have built his own
gold standard of mappings. Are standard benchmarks for quality
measurement insufficient, i.e. does the schema matching quality vary
and depend that much on the source schemas used?

I have done quite some research and gone through a few papers and I
think I have a fair understanding. Is there any particular warm-up task
related to this task that you could suggest?

Best regards

Robert


Re: [Dbpedia-gsoc] GSoC 2015 Introduction and Parallel processing in DBpedia extraction Framework

2015-03-08 Thread Navin Pai

Yup, looking at the changelog of Apache Spark and having worked on
upgrading much smaller applications across Spark versions, I can attest
that this process shouldn't take too much time. The number of breaking
changes is minimal in recent versions.

An idea I had, which I would like feedback on, is having a configuration
picker rather than a list of preconfigured containers/images, kind of along
the lines of Fedora's Revisor project [1]. You could mix and match
depending on the configuration you want to use, and a customized
image/container is created for you. Of course, the feasibility of this is
an open question...

Honestly, if you ask me, this one project could probably be broken up into
multiple projects, each with a different end goal. Docker brings in a very
interesting set of things to play with, and it would be great if some of
the mentors could provide more feedback on what the end goal of this
specific GSoC project is. :)

Thanks

[1] http://revisor.fedoraunity.org/



Hi Xiao, and welcome!

 Some thoughts from my initial impression and I appreciate your feedback:
  - The project uses Spark 0.9.1 while the latest version of Spark is
  1.2.1. I suppose there will be some work to upgrade it to the new
  version.

 It'll perhaps be good to port the code to Spark 1.2.1; I can't imagine
 it'll take too much work because the Spark API has been pretty stable since
 then.

  - It looks like the process is putting the data into HDFS, using Spark to
  extract the data and writing the result back to HDFS. Are there any
  design documents for this project?

 Yes, but it can also work without HDFS. On a single-node cluster you can
 write directly to the file system (I'm not sure if there is enough
 documentation on that, but there should be; it's mostly about substituting
 hdfs:///home/user/blah with file:///home/user/blah). On a multi-node
 cluster with NFS you can also work without HDFS.

 I have been meaning to write a proper paper on the project for a few
 months but never managed to get around to it.

  - Spark can work with various distributed file systems (S3, GlusterFS,
  etc.), not only HDFS. So I suppose this could be configurable.


 It'd be a good idea to make this configurable, and I suppose it fits in
 well with the docker containers idea too. Different kinds of configurations
 for EC2/S3, Google Cloud etc.
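
 The hdfs:/// to file:/// substitution mentioned above could be made
 configurable along these lines. This is only a rough Python sketch under
 assumptions: the config keys (input_dir, output_dir) and the helper names
 are hypothetical and are not taken from the distributed extraction
 framework's actual configuration.

```python
def with_scheme(uri: str, scheme: str) -> str:
    """Swap the scheme of a URI such as hdfs:///home/user/blah."""
    _old_scheme, separator, rest = uri.partition("://")
    if not separator:
        raise ValueError(f"not a scheme-qualified URI: {uri!r}")
    return f"{scheme}://{rest}"


def retarget_config(config: dict, scheme: str) -> dict:
    """Return a copy of a path config with every URI moved to another scheme."""
    return {key: with_scheme(value, scheme) for key, value in config.items()}


if __name__ == "__main__":
    # Hypothetical configuration keys, for illustration only.
    cluster_config = {
        "input_dir": "hdfs:///home/user/blah",
        "output_dir": "hdfs:///home/user/out",
    }
    # Single-node run: write directly to the local file system.
    print(retarget_config(cluster_config, "file"))
```

 Keeping the scheme as a single setting means the same job definition could
 be pointed at the local file system, HDFS or another store per deployment,
 which also fits the idea of differently configured containers.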

 Feel free to ask any other questions that you may have while running it.

 Cheers,
 Nilesh

 You can also email me at cont...@nileshc.com or visit my website
 http://nileshc.com/


 On Thu, Mar 5, 2015 at 8:27 PM, Xiao Meng xiaom...@gmail.com wrote:

  Hi,

  My name is Xiao, currently a PhD student at Simon Fraser University,
  Canada.

  A little background on myself:

  - My research is mainly on data management, especially NoSQL databases.
  - I worked for GSoC 2008 on PostgreSQL [1] when I was an undergraduate
  student :-)
  - I have now been working on some open source projects for one year. They
  include Apache Hive [2] and Apache Drill [3], both SQL-on-Hadoop engines.
  I've also played with Apache Spark for a while and have some hands-on
  experience. I am learning Scala and quite like it.
  - While working on the Hadoop ecosystem, I gained experience deploying
  clusters for dev and test. Docker is a great tool for this purpose and I
  have been building several complex Docker containers [4].

  I heard of the great DBpedia project a long time ago and have always
  wanted to play with it :-)

  Given my background, I am pretty interested in the following project:
  Parallel processing in DBpedia extraction Framework [5].

  Some thoughts from my initial impression and I appreciate your feedback:

  - The project uses Spark 0.9.1 while the latest version of Spark is
  1.2.1. I suppose there will be some work to upgrade it to the new
  version.
  - It looks like the process is putting the data into HDFS, using Spark to
  extract the data and writing the result back to HDFS. Are there any
  design documents for this project?
  - Spark can work with various distributed file systems (S3, GlusterFS,
  etc.), not only HDFS. So I suppose this could be configurable.

  I will try it out in the following days. Any suggestions for evolving
  this project?

  Looking forward to contributing to DBpedia!

  [1] https://wiki.postgresql.org/wiki/GSoC_2008
  [2] https://github.com/apache/hive
  [3] https://github.com/apache/drill
  [4] https://github.com/xiaom/docker-drill
  [5] https://github.com/dbpedia/distributed-extraction-framework



Re: [Dbpedia-gsoc] GSoC 2015 - Introduction

2015-03-07 Thread David Przybilla

Hi Shashank,

On DBpedia Spotlight – Better Context Vectors:

Here are the DBpedia Spotlight warm-up tasks:
https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Warm-up-tasks

If you take a look at the GitHub issues page you should find some of the
problems we are dealing with. One of the ideas could be experimenting with
word2vec.

Have a nice weekend :)

On Sat, Mar 7, 2015 at 11:46 AM, shashank juyal sjuyal...@gmail.com wrote:

 Hi,

 I am a Master's student at the International Institute of Information
 Technology, Hyderabad (IIIT-H). I am interested in taking part in this
 year's GSoC. Many of the DBpedia projects sound very familiar and
 interesting to me, as I have worked closely with many of the concepts and
 technologies used in them.

 I have previously worked with Wikipedia data and built a small search
 engine over it based on tf-idf scores and my own parser. I am also
 currently working on a question answering project using NLP, which uses
 concepts like word2vec, CBOW, natural language processing and translation
 to a query language, which are mentioned in some of the DBpedia Spotlight
 projects.

 Based on this, I would like to work on the following projects:

 1) Fact Extraction from Wikipedia Text
 2) Keyword Search on DBpedia
 3) Deploying a DBpedia Question Answering Engine
 4) DBpedia Spotlight – Better Context Vectors

 Please let me know the warm-up tasks in the above projects.

 LinkedIn Profile: in.linkedin.com/in/shajuyal
 Github Profile: https://github.com/sjuyal

 Thanks and Regards,
 Shashank Juyal



Re: [Dbpedia-gsoc] GSoC 2015 Introduction and Parallel processing in DBpedia extraction Framework

2015-03-07 Thread Nilesh Chakraborty

Hi Xiao, and welcome!

 Some thoughts from my initial impression; I would appreciate your feedback:
 - The project uses Spark 0.9.1 while the latest version of Spark is
 1.2.1. I suppose there will be some work to upgrade it to the new
 version.


It would probably be good to port the code to Spark 1.2.1; I can't imagine
it will take too much work, because the Spark API has been pretty stable
since then.


 - It looks like the process puts the data into HDFS, uses Spark to
 extract the data, and writes the result back to HDFS. Is there any design
 document for this project?


Yes, but it can also work without HDFS. On a single-node cluster you can
write directly to the file system (I'm not sure if there is enough
documentation on that, but there should be; it's mostly about substituting
hdfs:///home/user/blah with file:///home/user/blah). On a multi-node
cluster with NFS you can also work without HDFS.
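
The path substitution described above can be sketched in a few lines of
Python. This is only an illustration, not code from the extraction
framework: the helper name `swap_scheme` is hypothetical, and the example
path is the placeholder used in this message.

```python
def swap_scheme(uri: str, new_scheme: str) -> str:
    """Return `uri` with its scheme replaced by `new_scheme`.

    Hypothetical helper for illustration; it only rewrites the string and
    does not check that the target file system is reachable.
    """
    scheme, sep, rest = uri.partition("://")
    if not sep or not scheme:
        raise ValueError(f"not a URI with a scheme: {uri!r}")
    return f"{new_scheme}://{rest}"


# Placeholder path taken from the message above.
local_path = swap_scheme("hdfs:///home/user/blah", "file")
print(local_path)  # file:///home/user/blah
```

Paths without a scheme are rejected rather than guessed at, so a
misconfigured path fails early.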

I have been meaning to write a proper paper on the project for a few
months but never managed to get around to it.

 - Spark can work with various distributed file systems (S3, GlusterFS,
 etc.), not only HDFS. So I suppose this could be configurable.


It'd be a good idea to make this configurable, and I suppose it fits in
well with the docker containers idea too. Different kinds of configurations
for EC2/S3, Google Cloud etc.
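
One way to make the storage location configurable is to keep a base URI per
deployment profile and build every input and output path from it. The
sketch below is an assumption about how this could look, not the project's
actual configuration: the profile names, the bucket name `example-bucket`,
the directory names, and the `resolve` helper are all hypothetical, and the
right S3 scheme depends on the Hadoop version in use.

```python
# Hypothetical deployment profiles, each mapped to a base URI.
STORAGE_BASES = {
    "local": "file:///home/user/dbpedia",
    "hdfs": "hdfs:///home/user/dbpedia",
    "s3": "s3a://example-bucket/dbpedia",
}


def resolve(profile: str, relative_path: str) -> str:
    """Build a full URI for `relative_path` under the chosen profile."""
    try:
        base = STORAGE_BASES[profile]
    except KeyError:
        raise ValueError(f"unknown storage profile: {profile!r}") from None
    # Join with exactly one slash between base and relative path.
    return base.rstrip("/") + "/" + relative_path.lstrip("/")


print(resolve("hdfs", "dumps/enwiki.xml"))
```

With this layout, switching a run from HDFS to the local file system or S3
means changing only the profile name.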

Feel free to ask any other questions that you may have while running it.

Cheers,
Nilesh

You can also email me at cont...@nileshc.com or visit my website
http://nileshc.com/


On Thu, Mar 5, 2015 at 8:27 PM, Xiao Meng xiaom...@gmail.com wrote:

 Hi,

 My name is Xiao, currently a PhD student at Simon Fraser University,
 Canada.

 A little background on myself:

 - My research is mainly on data management, especially NoSQL databases.
 - I worked on PostgreSQL for GSoC 2008 [1] when I was an undergraduate
 student :-)
 - I have been working on some open source projects for one year. They
 include Apache Hive [2] and Apache Drill [3], both SQL-on-Hadoop
 engines. I've also played with Apache Spark for a while and have some
 hands-on experience. I am learning Scala and like it a lot.
 - While working on the Hadoop ecosystem, I gained experience deploying
 clusters for dev and test. Docker is a great tool for this purpose and I
 have been building several complex Docker containers [4].

 I heard of the great DBpedia project a long time ago and have always
 wanted to play with it :-)

 Given my background, I am pretty interested in the following project:
 Parallel processing in DBpedia extraction Framework [5].


 Some thoughts from my initial impression; I would appreciate your feedback:

 - The project uses Spark 0.9.1 while the latest version of Spark is
 1.2.1. I suppose there will be some work to upgrade it to the new
 version.
 - It looks like the process puts the data into HDFS, uses Spark to
 extract the data, and writes the result back to HDFS. Is there any design
 document for this project?
 - Spark can work with various distributed file systems (S3, GlusterFS,
 etc.), not only HDFS. So I suppose this could be configurable.

 I will try it out in the following days. Any suggestions for evolving
 this project?

 Looking forward to contributing to DBpedia!


 [1] https://wiki.postgresql.org/wiki/GSoC_2008
 [2] https://github.com/apache/hive
 [3] https://github.com/apache/drill
 [4] https://github.com/xiaom/docker-drill
 [5] https://github.com/dbpedia/distributed-extraction-framework



[Dbpedia-gsoc] [GSOC 2015] Introduction and Help in starting the contribution

2015-03-03 Thread Nurendra Choudhary

Hi Everybody,

I am Nurendra Choudhary from the International Institute of Information
Technology, Hyderabad, India [1], majoring in Computational
Linguistics.
My interests lie in Natural Language Processing, Artificial Intelligence
and Machine Learning. I like coding in Python, C, and C++. Here are my
SourceForge [2] and GitHub [3] profiles. I normally go by the name akirato
when doing projects or any coding.
I went through the Ideas Page for GSoC 2015 and am really interested in the
Fact Extraction from Wikipedia Text project.
I have some ideas for the project. For example, the first step could be to
find the relations between verbs and the other parts of a sentence
(something like theta roles), which can then be extended to finding
relations between all pairs of words, and so on.
I have set up the development environment with Eclipse.
Could you help me proceed with what is required for the project?

[1] http://iiit.ac.in/
[2] http://sourceforge.net/u/akirato/profile/
[3] https://github.com/Akirato/

Regards,
Nurendra Choudhary