> instance-types_en.nt.bz2 and instance-types-transitive_en.nt.bz2 together is
> 2,945,956.
> Currently, Wikipedia contains 5,031,836 articles in English.
> I am assuming the dump is missing 2 million or so titles because of the bug
> in the extraction framework.

Not necessarily. A great many articles have no infobox, so they don't get an "SD type" (as Dimitris called it). They may still get a heuristic type, a dbtax type or an LHD type. Can you find some examples of articles that have an infobox but no type?
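
If it helps with hunting for such examples, here is a rough sketch of how the comparison could be done, not a tested tool. It assumes the dumps are plain N-Triples with one triple per line, and that a file such as infobox-properties_en.nt.bz2 lists the articles that have an infobox; that file name and the helper names below are my assumptions, so adjust them to whatever you actually downloaded.

```python
import bz2
from typing import Iterable, List, Set


def subjects(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct subject IRIs from N-Triples lines."""
    found = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # In N-Triples the subject is the first whitespace-delimited token.
        found.add(line.split(None, 1)[0])
    return found


def subjects_in_dump(path: str) -> Set[str]:
    """Read a bz2-compressed N-Triples dump and return its distinct subjects."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return subjects(handle)


def infobox_without_type(infobox_subjects: Set[str],
                         typed_subjects: Set[str],
                         limit: int = 20) -> List[str]:
    """Return up to `limit` subjects that have infobox data but no type."""
    return sorted(infobox_subjects - typed_subjects)[:limit]


# Example use (file names are assumptions, see above):
# typed = subjects_in_dump("instance-types_en.nt.bz2")
# typed |= subjects_in_dump("instance-types-transitive_en.nt.bz2")
# infobox = subjects_in_dump("infobox-properties_en.nt.bz2")
# print(len(typed), "distinct typed resources")
# for iri in infobox_without_type(infobox, typed):
#     print(iri)
```

This counts distinct subjects rather than triples, so the result can be set against the number of articles. Note that it keeps every subject in memory, which may be a lot for the full English dumps.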