Thanks everyone for your suggestions and clarifications.

On Tue, Dec 15, 2015 at 10:18 PM, Tom Morris <tfmor...@gmail.com> wrote:

> Two other sources you might consider are Freebase and Wikidata.  Using
> them together with DBpedia might give you better results.
>
> Tom
>
> On Tue, Dec 15, 2015 at 5:27 AM, Vihari Piratla <viharipira...@gmail.com>
> wrote:
>
>> Thanks Dimitris for a detailed response.
>> I see 2,945,956 unique titles in instance-types_en.nt.bz2 and 2,716,774
>> unique titles in instance-types-transitive_en.nt.bz2. The number of unique
>> titles in the two files together is 2,945,956.
>> Currently, Wikipedia contains 5,031,836 articles in English. I am
>> assuming the dump is missing 2 million or so titles because of the bug in
>> the extraction framework.
>>
>> When can we expect the 2016 release?
>>
>> Thanks
>>
>> On Mon, Dec 14, 2015 at 8:53 PM, Dimitris Kontokostas <jimk...@gmail.com>
>> wrote:
>>
>>> Hi Vihari,
>>>
>>> The main reason for the size reduction is the split between
>>> direct & transitive types [1].
>>> There was a bug [2] that indirectly affected some type assignments, but
>>> it is now fixed and the next release will not have this problem.
>>> Also note that besides SD-Types, in this release we published two
>>> additional type datasets, dbatx and LHD [3].
>>>
>>> Regarding your 2nd question ('__'): these resources are extracted from
>>> additional infoboxes on the same page. When those infoboxes cannot be
>>> merged, we create additional resources.
>>> This is also a way to create intermediate node mappings
>>> <http://mappings.dbpedia.org/index.php/Template:IntermediateNodeMapping> through
>>> the mappings wiki, e.g. in [4].
>>>
>>> [1]
>>> http://downloads.dbpedia.org/2015-04/core-i18n/en/instance-types-transitive_en.nt.bz2
>>> [2] https://github.com/dbpedia/extraction-framework/issues/404
>>> [3] http://wiki.dbpedia.org/dbpedia-data-set-2015-04
>>> [4]
>>> http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_officeholder
>>>
>>> On Mon, Dec 14, 2015 at 1:12 PM, Vihari Piratla <viharipira...@gmail.com
>>> > wrote:
>>>
>>>> Hi,
>>>> I am a software developer; we use DBpedia instance type or
>>>> mapping-based type files in a pipeline to recognize entities.
>>>> We found that the latest instance-types resource available at
>>>> http://downloads.dbpedia.org/2015-04/core-i18n/en/instance-types_en.nt.bz2
>>>> is much smaller than the corresponding 2014 release
>>>> http://data.dws.informatik.uni-mannheim.de/dbpedia/2014/en/instance_types_en.nt.bz2
>>>> .
>>>> As a result, the latest instance file is missing many entries present
>>>> on Wikipedia such as Taj_Mahal, J._Paul_Getty_Museum, Grand_Canyon.
>>>> What is the reason for the reduced size (110MB->35MB)?
>>>> Is this a bug?
>>>> Are there some other files that we have to consider along with this
>>>> file?
>>>>
>>>> We also sometimes see entries with '__', as in "Abraham_Lincoln__1" in
>>>> the line
>>>> <http://dbpedia.org/resource/Abraham_Lincoln__1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/TimePeriod>
>>>> What does '__' mean? Where can I find more information about these
>>>> things?
>>>>
>>>> Thanks
>>>> --
>>>> Vihari Piratla
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> Dbpedia-discussion mailing list
>>>> Dbpedia-discussion@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>>>
>>>>
>>>
>>>
>>> --
>>> Kontokostas Dimitris
>>>
>>
>>
>>
>> --
>> V
>>
>>
>>
>>
>
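
For anyone who wants to reproduce the unique-title counts quoted above, here is a minimal sketch in Python. It assumes the two dumps have been downloaded locally and that a "title" is whatever follows the http://dbpedia.org/resource/ prefix in the subject of each N-Triples line; the function names are made up for illustration and are not part of any DBpedia tooling. Note that under this assumption a resource such as Abraham_Lincoln__1 counts as its own title.

```python
import bz2

# Assumption: every subject of interest uses this DBpedia resource prefix.
RESOURCE_PREFIX = "<http://dbpedia.org/resource/"


def title_from_line(line):
    """Return the subject's resource title from one N-Triples line, or None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # skip blank lines and comment lines
    subject = line.split(None, 1)[0]
    if subject.startswith(RESOURCE_PREFIX) and subject.endswith(">"):
        return subject[len(RESOURCE_PREFIX):-1]
    return None  # subject is not a DBpedia resource IRI


def unique_titles(lines):
    """Collect the set of distinct subject titles from an iterable of lines."""
    titles = set()
    for line in lines:
        title = title_from_line(line)
        if title is not None:
            titles.add(title)
    return titles


def unique_titles_in_file(path):
    """Stream a .nt.bz2 dump and return its distinct subject titles."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return unique_titles(handle)


def summarize(direct, transitive):
    """Report the sizes of two title sets and of their union."""
    return {
        "direct": len(direct),
        "transitive": len(transitive),
        "union": len(direct | transitive),
    }
```

With the files in the working directory, something like summarize(unique_titles_in_file("instance-types_en.nt.bz2"), unique_titles_in_file("instance-types-transitive_en.nt.bz2")) would give the three numbers. If the union equals the direct count, as in the figures quoted above, every subject in the transitive file also appears in the direct file.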

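Since the '__' resources discussed above come from additional infoboxes on the same page, an entity-recognition pipeline may want to map them back to the page title. The sketch below assumes the convention is simply "page title, then '__', then a suffix", which is all the Abraham_Lincoln__1 example in this thread shows; the helper names are hypothetical, and a title that legitimately contains '__' would be misread by this heuristic.

```python
def split_intermediate(title):
    """Split 'Page__suffix' into (page, suffix); return (title, None) otherwise.

    Assumption: the last '__' in the title separates the page from the suffix.
    """
    page, separator, suffix = title.rpartition("__")
    if not separator or not page or not suffix:
        return title, None
    return page, suffix


def group_by_page(titles):
    """Map each page title to the sorted suffixes of its '__' resources."""
    groups = {}
    for title in titles:
        page, suffix = split_intermediate(title)
        suffixes = groups.setdefault(page, [])
        if suffix is not None:
            suffixes.append(suffix)
    return {page: sorted(suffixes) for page, suffixes in groups.items()}
```

Whether to fold these resources into the parent page or drop them depends on the pipeline; the grouping only makes the relationship visible.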

-- 
V
