At that level you don't need particularly advanced tools to work with the
DBpedia dumps. You can go pretty far with something along the lines of:

awk -F\\t  '{print $1}' | sort | uniq -c | sort -nr | head
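
For anyone who would rather not use awk, here is a minimal Python sketch of
the same counting idea. It assumes the dump is in N-Triples layout (one triple
per line, subject as the first whitespace-delimited token), which differs from
the tab-separated input the one-liner above expects. The function name
`top_subjects` and the `SAMPLE` lines are made up for illustration and are not
real DBpedia data.

```python
from collections import Counter

def top_subjects(lines, n=10):
    """Return the n most frequent subjects from N-Triples-style lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the subject is the first whitespace-delimited token.
        subject = line.split(None, 1)[0]
        counts[subject] += 1
    return counts.most_common(n)

# Hypothetical sample input, not taken from an actual dump.
SAMPLE = [
    "<http://dbpedia.org/resource/A> <http://example.org/p> <http://dbpedia.org/resource/B> .",
    "<http://dbpedia.org/resource/A> <http://example.org/q> <http://dbpedia.org/resource/C> .",
    "<http://dbpedia.org/resource/B> <http://example.org/p> <http://dbpedia.org/resource/C> .",
]

if __name__ == "__main__":
    for subject, count in top_subjects(SAMPLE):
        print(count, subject)
```

For a real dump you would pass an open file object instead of `SAMPLE`. The
counter keeps one entry per distinct subject in memory, so for very large
dumps the sort-based pipeline may still be the more frugal choice.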

I'm surprised Kingsley hasn't mentioned that it is quite straightforward to
load DBpedia data into your own Virtuoso instance if you have a machine
with a lot of memory (say 32GB to be safe, but I have done it with 20GB).
32GB of RAM is not that expensive these days if you buy from some place
like Crucial. You can also rent a machine that is up to the task for 40
cents an hour on AWS.

You'll have to increase some of the timeouts and limits, but you can turn
around the kind of analytic queries that are being talked about here in
under a minute over all of dbpedia-en, and that is more stuff than is
loaded on the public endpoint.
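
As a rough illustration of the kind of analytic query meant here, the sketch
below builds a SPARQL request that ranks subjects by triple count across all
properties. The endpoint URL `http://localhost:8890/sparql` is an assumption
about a local Virtuoso setup, and `sparql_url` is a hypothetical helper, not
part of any library. The snippet only constructs the request URL and does not
contact a server.

```python
from urllib.parse import urlencode

# Rank subjects by the number of triples they appear in as subject.
QUERY = """
SELECT ?s (COUNT(*) AS ?triples)
WHERE { ?s ?p ?o }
GROUP BY ?s
ORDER BY DESC(?triples)
LIMIT 10
"""

def sparql_url(endpoint, query):
    """Build a GET URL carrying the query in the standard 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})

# Assumed local endpoint; adjust to wherever your own instance listens.
URL = sparql_url("http://localhost:8890/sparql", QUERY)

if __name__ == "__main__":
    print(URL)
```

Swapping `?s` for `?o` in the SELECT and GROUP BY clauses would give the
object-side ranking instead. A full scan like this is the sort of query
likely to hit default timeouts and limits.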

Last year I had a product on the AWS Marketplace that had Virtuoso and
DBpedia 3.8 pre-loaded, and it was starting to get popular. Then
Shellshock happened and I never got around to building a new one.




On Mon, Feb 2, 2015 at 11:19 AM, Kingsley Idehen <kide...@openlinksw.com>
wrote:

>  On 2/1/15 10:55 AM, Jörn Hees wrote:
>
>  Hi Kingsley,
>
> On 31 Jan 2015, at 19:58, Kingsley Idehen <kide...@openlinksw.com> wrote:
>
>
>  > Anyway, we've loaded stats data into the DBpedia public instance:
>
>  I know and this is cool, but it's only based on wikiPageWikiLink.
>
> I was interested in ordering nodes by triple count with them as subject / 
> object, based on all properties, not just wikiPageWikiLinks.
>
> If there's a simpler way to do this than parsing the dumps [1] (hopefully there 
> is and I was just too stupid to find it), please let me know.
>
> Best,
> Jörn
>
> [1]: 
> https://joernhees.de/blog/2015/01/28/dbpedia-2014-stats-top-subjects-predicates-and-objects/
>
>
>
> We are going to look into the queries posed in your examples.
>
> --
> Regards,
>
> Kingsley Idehen       
> Founder & CEO
> OpenLink Software
> Company Web: http://www.openlinksw.com
> Personal Weblog 1: http://kidehen.blogspot.com
> Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
> Twitter Profile: https://twitter.com/kidehen
> Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen
> Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
>
>
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming. The Go Parallel Website,
> sponsored by Intel and developed in partnership with Slashdot Media, is
> your
> hub for all things parallel software development, from weekly thought
> leadership blogs to news, videos, case studies, tutorials and more. Take a
> look and join the conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
>


-- 
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype   ontolo...@gmail.com
http://legalentityidentifier.info/lei/lookup