Hi Rafa et al,

On Tue, Jun 18, 2013 at 7:57 PM, Rafa Haro <[email protected]> wrote:
> Hi Dileepa,
>
> On 18/06/13 13:20, Dileepa Jayakody wrote:
>
>> Hi All,
>>
>> After going through a lot of documentation on Stanbol and Entity
>> Disambiguation, I started trying out the Stanbol EntityHub indexing tool
>> [1] to create a site for a FOAF dataset. I found a sufficient FOAF dataset
>> in N-Quads format here [2], and would like to know if you guys are OK with
>> me going ahead with this dataset. I found a couple more sites providing
>> their contacts as FOAF, but thought of starting with this dataset as it's
>> collected/crawled from various sources.
>
> I would need more information about this dataset because, initially, it
> seems that DataHub already contains DBpedia and Freebase dumps as well as
> many more datasets from Linked Open Data, but then the documentation
> says that "The seed set for the Datahub crawl contained all example URIs
> marked example/*". Do you know what that means exactly?

My guess is that the statement means the 'datahub' dataset contains a set of
URIs of example FOAF profiles. I found many example URIs when I browsed
through the datahub dataset, e.g.:

<http://example.com/ns> <http://www.iana.org/domains/example/> .
<http://example.org/on-time-flight> <http://www.iana.org/domains/example/> .
<http://schemapedia.com/examples/9f0f7654ab69a4c5126e8591be4c528d> <http://schemapedia.com/examples/9f0f7654ab69a4c5126e8591be4c528d.rdf> .

> Also, you need to be careful about creating duplicate entries from Datahub,
> Freebase and DBpedia.

I think the datahub dataset doesn't contain parts of the DBpedia and Freebase
dumps, as those are crawled and collected separately, as mentioned by the
project.

While creating the datahub site with the indexing tool, I ran into some
problems retrieving the stored data after the index was created. I sent a
separate mail about this to the list, titled "Can't find entities after
configuring a entityhub site".

I appreciate your help on my way forward in the project.
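On the duplicate-entries point, a rough way to see how much overlap there is before indexing is to count FOAF-described subjects that show up under more than one source. The sketch below is only an illustration and not part of Stanbol: it assumes the fourth element of each N-Quad names the graph or source the statement came from, it uses a simplified regular expression rather than a full N-Quads parser, and the function names are made up for this example.

```python
import re
from collections import defaultdict

# Simplified N-Quads pattern: IRI subject, IRI predicate, any object, IRI graph label.
# Assumption: blank-node subjects and statements without a graph label are skipped.
QUAD = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s+<([^>]*)>\s*\.\s*$')

FOAF_NS = "http://xmlns.com/foaf/0.1/"


def foaf_subjects_by_graph(lines):
    """Map each subject that has a FOAF predicate to the set of graphs it occurs in."""
    graphs = defaultdict(set)
    for line in lines:
        match = QUAD.match(line.strip())
        if match is None:
            continue  # not a quad this simple pattern understands
        subject, predicate, _obj, graph = match.groups()
        if predicate.startswith(FOAF_NS):
            graphs[subject].add(graph)
    return dict(graphs)


def multi_source_subjects(graphs):
    """Return, sorted, the subjects that carry FOAF data in more than one graph."""
    return sorted(s for s, seen in graphs.items() if len(seen) > 1)
```

To run it over a dump, pass an open file handle, for example `foaf_subjects_by_graph(open("data.nq", encoding="utf-8"))`, where `data.nq` is a placeholder file name. Subjects reported by `multi_source_subjects` are only candidates for duplicates; whether they really describe the same entity still needs checking.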
Thanks,
Dileepa

>> For indexing purposes I'm following the steps given at [3]. I believe I
>> should configure this as a ReferencedSite and not a ManagedSite because the
>> data is collected from various sources. Please correct me if I'm wrong.
>
> I think that Referenced sites are just used when you already have your own
> external knowledge base, so using the entityhub indexing tool and then
> deploying the index will automatically create a ManagedSite.
>
> Regards
>
>> Your pointers and suggestions are very much appreciated.
>>
>> Thanks,
>> Dileepa
>>
>> [1] https://svn.apache.org/repos/asf/stanbol/trunk/entityhub/indexing/genericrdf
>> [2] http://km.aifb.kit.edu/projects/btc-2012/
>> [3] http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>
>> On Fri, Jun 14, 2013 at 9:46 AM, Dileepa Jayakody <[email protected]> wrote:
>>
>>> Thanks a lot Rafa.
>>>
>>> I will go through these docs and let you guys know if I have questions.
>>>
>>> Regards,
>>> Dileepa
>>>
>>> On Thu, Jun 13, 2013 at 5:48 PM, Rafa Haro <[email protected]> wrote:
>>>
>>>> Hi Dileepa,
>>>>
>>>> I can suggest a couple of useful links. The first one is a quite good
>>>> guide for creating new engines in Stanbol. I hope it is not out of date:
>>>> http://blog.iks-project.eu/getting-started-with-apache-stanbol-enhancement-engine/
>>>>
>>>> The second one is about working with custom vocabularies in Stanbol.
>>>> I think you are going to learn how to configure the entityhub indexing
>>>> tools for storing the FOAF data:
>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>
>>>> I hope this helps.
>>>>
>>>> Cheers,
>>>>
>>>> Rafa Haro
>>>>
>>>> On 12/06/13 23:05, Dileepa Jayakody wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> Can you guys please give me some directions on what components in the
>>>>> Stanbol code base I should study more for my project? (It seems not
>>>>> feasible to go through all the areas of the code base as it is pretty
>>>>> big :))
>>>>> At the moment I'm looking at the components below:
>>>>> /enhancer/generic/servicesapi
>>>>> /enhancement-engines/entityhublinking, disambiguation-mlt
>>>>> /entityhub/site/managed
>>>>>
>>>>> I'd greatly appreciate your pointers to relevant areas of the codebase
>>>>> that I should be more focused on.
>>>>>
>>>>> Thanks,
>>>>> Dileepa
>>>>>
>>>>> On Tue, Jun 11, 2013 at 2:36 PM, Dileepa Jayakody <[email protected]> wrote:
>>>>>
>>>>>> Hi Rafa
>>>>>>
>>>>>> On Tue, Jun 11, 2013 at 2:16 PM, Rafa Haro <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Dileepa,
>>>>>>>
>>>>>>> On 11/06/13 07:07, Dileepa Jayakody wrote:
>>>>>>>
>>>>>>>> My suggestion on integrating foaf-search [3] would basically need
>>>>>>>> to do an on-the-fly retrieval of data, but as you have pointed out
>>>>>>>> it could impose a performance hit. But foaf-search looks promising
>>>>>>>> with a big index of FOAF data.
>>>>>>>
>>>>>>> A concern about using foaf-search is that you should ensure that you
>>>>>>> manage FOAF information associated with the EntityHub site used for
>>>>>>> Entity Linking.
>>>>>>> So, for instance, if you want to link your entities with DBpedia,
>>>>>>> can you be sure that the results of searching with a surface form
>>>>>>> (name mention) in foaf-search are going to include the right
>>>>>>> entity's FOAF data? In other words, does the foaf-search index have
>>>>>>> information about all your entities in the DBpedia EntityHub site?
>>>>>>
>>>>>> AFAIK foaf-search has integrated DBpedia 3.8, therefore we can assume
>>>>>> it is up to date with DBpedia entities. However, the free API access
>>>>>> is restricted to 50000 calls per day and there are some terms of use
>>>>>> [8] that might be a bit of a concern (e.g. availability of service,
>>>>>> warranty).
>>>>>>
>>>>>> [8] http://www.foaf-search.net/Terms
>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> ------------------------------
>>>>>>> This message should be regarded as confidential. If you have received
>>>>>>> this email in error please notify the sender and destroy it
>>>>>>> immediately. Statements of intent shall only become binding when
>>>>>>> confirmed in hard copy by an authorised signatory.
>>>>>>>
>>>>>>> Zaizi Ltd is registered in England and Wales with the registration
>>>>>>> number 6440931. The Registered Office is 222 Westbourne Studios,
>>>>>>> 242 Acklam Road, London W10 5JJ, UK.
