Hi All,

Below is the reply I got from Andreas Harth of the WebDataCommons project. He
suggests that the BTC 2012 dataset I mentioned in my previous mail contains
sufficient FOAF data.
Shall I go ahead with that dataset for my project?

"the BTC 2012 has FOAF data [1].  You'd get a more comprehensive FOAF
dataset if you first get all instances of foaf:Persons (simple grep)
and then start a crawl from those, e.g., via LDSpider [2].  I assume
that a hop-1 crawl would already get you a sizable dataset.

All the best with your project, I look forward to seeing the results!

Best regards,
Andreas.

[1] http://km.aifb.kit.edu/projects/btc-2012/
[2] http://code.google.com/p/ldspider/
"
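
For reference, the "simple grep" step Andreas describes could be sketched roughly as below. This is only a minimal sketch, assuming the dump is read as N-Quads text lines; the function name person_subjects and the regular expression are my own illustration and not part of BTC or LDSpider. It collects the distinct subject URIs typed as foaf:Person, which could then serve as seed URIs for a crawl. Blank-node subjects are skipped because they cannot be dereferenced.

```python
import re

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
FOAF_PERSON = "<http://xmlns.com/foaf/0.1/Person>"

# Matches: subject, rdf:type, foaf:Person, optional graph label, final dot.
# Assumption: one statement per line, as in N-Triples / N-Quads.
PERSON_LINE = re.compile(
    r"^\s*(<[^>]*>|_:\S+)\s+"
    + re.escape(RDF_TYPE)
    + r"\s+"
    + re.escape(FOAF_PERSON)
    + r"(\s+(<[^>]*>|_:\S+))?\s*\.\s*$"
)


def person_subjects(lines):
    """Yield each distinct subject URI that is typed as foaf:Person."""
    seen = set()
    for line in lines:
        match = PERSON_LINE.match(line)
        if not match:
            continue
        subject = match.group(1)
        # Skip blank nodes: they have no URI to start a crawl from.
        if subject.startswith("<") and subject not in seen:
            seen.add(subject)
            yield subject[1:-1]  # strip the surrounding angle brackets
```

The function takes any iterable of lines, so it can be fed an open file (for example one opened with gzip.open(path, "rt") if the dump files are compressed) and the resulting URIs written out one per line as a seed list.
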
Thanks,
Dileepa


On Tue, Jun 25, 2013 at 5:45 PM, Dileepa Jayakody <dileepajayak...@gmail.com> wrote:

> Hi All,
>
> For my project, FOAF co-reference based disambiguation, as the first
> milestone I'm developing an EntityHub ReferencedSite for a FOAF dataset.
> With help from Rupert and others I was able to index a sample FOAF dataset
> using the genericrdf indexing tool and set up a referenced site. FOAF data
> can be filtered by using propertyfilter.config to import foaf:*. This will
> import all entities which define FOAF properties. The next step will be
> to develop an EntityProcessor to further filter and clean the FOAF data by
> defining the required FOAF properties that are going to be used for
> disambiguation.
>
> To continue my project I would like to finalize the FOAF dataset I need to
> use, and would highly appreciate your input on this.
> The FOAF wiki site [1] lists many data source projects, but many of
> them are out of date.
>
> Following are my findings for a dataset for my project:
>
> 1. The Billion Triple Challenge 2012 project [2], a web-crawled dataset
> including data from dbpedia, freebase, datahub, timbl and rest data sources.
> Quantity-wise I think this has a sufficient amount of data (1436545545
> quads) and it's fairly up to date.
> 2. The WebDataCommons project [3], which has a dataset (1079175202 quads)
> created in August 2012. But the sources of the data are not specified in the
> project. I have posted on their group asking if they have FOAF data in
> their dataset, and am waiting for their suggestions on it.
>
> 3. DBpedia also has resources with FOAF properties. In particular, entities
> of type 'dbpedia-ont:Person' contain FOAF properties. I think we can map
> dbpedia-ont:Person to a FOAF profile here. WDYT?
>
> 4. There are several websites like http://iwlearn.net/ and opera-community
> that expose their contact lists as FOAF, but they don't contain data on
> public figures or celebrities AFAIK.
>
> Can I please have your opinions on finalizing a dataset for my project?
> Appreciate your help.
>
> Thanks,
> Dileepa
>
> [1] http://www.w3.org/wiki/FoafSites
> [2] http://km.aifb.kit.edu/projects/btc-2012/
> [3] http://webdatacommons.org/
>
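
To illustrate what importing only foaf:* amounts to in the quoted mail, here is a small sketch of the filtering idea. It is not how the Stanbol indexing tool or propertyfilter.config is implemented; the function names and the tuple representation of statements are assumptions made purely for illustration. It keeps only statements whose predicate lies in the FOAF namespace and lists the entities that define at least one FOAF property.

```python
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def keep_foaf_statements(statements):
    """Return only the (subject, predicate, object) tuples with a FOAF predicate."""
    return [s for s in statements if s[1].startswith(FOAF_NS)]


def entities_with_foaf(statements):
    """Return the set of subjects that define at least one FOAF property."""
    return {subj for subj, pred, _obj in statements if pred.startswith(FOAF_NS)}
```

An EntityProcessor that restricts the data further to a fixed list of required FOAF properties would follow the same pattern, with a set-membership check on the predicate instead of the namespace prefix check.
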
