[Owlim-discussion] BigOWLIM 3.3 released; already in use by the BBC for the World Cup website

Damyan Ognyanoff Tue, 22 Jun 2010 04:03:01 -0700

BigOWLIM 3.3 Handles Millions of Queries per Day, Supports OWL 2 RL Reasoning 
and Provides Unmatched Linked Data Integration, Management and Retrieval 
Capabilities


Ontotext is pleased to announce version 3.3 of BigOWLIM, the world's most 
scalable semantic repository [3]. The main purpose of this release is to 
consolidate a number of advanced features, many of which have existed for some 
time as bespoke developments. These features are now part of the core BigOWLIM 
product and significantly broaden the application areas in which the repository 
can be used. We have also dedicated several months to making BigOWLIM more 
robust and easier to use.

The development of this version was influenced by the requirements of FactForge 
and LinkedLifeData (two of the most advanced linked data portals) and the BBC's 
2010 World Cup website - probably the most challenging real-world use case of 
semantic repositories implemented so far. Within the LarKC project, BigOWLIM is 
used as the data layer in a platform for Web-scale reasoning, which features a 
range of reasoning plug-ins, including the WebPIE massively parallel reasoning 
system [7,8].

The key characteristics and features of BigOWLIM include:

  a.. The most efficient semantic repository in the world [2,3], in terms of 
the speed with which it can load data, perform inference, and answer queries; 
  b.. Pure Java implementation, fully compatible with Sesame 2, which brings 
interoperability benefits and support for all popular RDF syntaxes and query 
languages, including SPARQL; 
  c.. Clustering support brings resilience, failover and horizontally scalable 
parallel query processing; 
  d.. Customisable reasoning, in addition to RDFS, OWL-Horst, and OWL 2 RL 
support. BigOWLIM 3.3 is the only semantic repository which provides 
comprehensive OWL 2 RL support today [5]; 
  e.. Optimized owl:sameAs handling, which delivers dramatic improvements in 
performance and usability when huge volumes of data from multiple sources are 
integrated; 
  f.. Full-text search, based on either Lucene or proprietary techniques, for 
searching RDF data; 
  g.. High-performance retraction of statements and their inferences. While 
forward-chaining and materialisation speed up query answering, this unique 
technique removes the performance degradation that materialisation-based 
systems face when retracting statements; 
  h.. Powerful and expressive consistency/integrity constraint checking 
mechanisms; 
  i.. RDF rank, similar to Google's PageRank, can be calculated for the nodes 
in an RDF graph and used for ordering query results by relevance, visualisation 
and many other purposes; 
  j.. RDF Priming, based upon activation spreading, allows efficient data 
selection and context-aware query answering for handling huge datasets; 
  k.. Notification mechanism, to allow clients to react to statements in the 
update stream.
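
As an illustration of the RDF rank idea in item i, here is a minimal Python 
sketch of plain PageRank computed over the subject-to-object links of a few 
triples. It is not BigOWLIM's implementation: the function name rank_nodes, 
the damping factor, the iteration count and the ex: identifiers are all 
assumptions made for this example.

```python
def rank_nodes(triples, damping=0.85, iterations=50):
    """Plain PageRank over subject -> object links (illustrative only)."""
    nodes = sorted({n for s, _, o in triples for n in (s, o)})
    out_links = {n: [] for n in nodes}
    for s, _, o in triples:
        out_links[s].append(o)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for n in nodes:
            for target in out_links[n]:
                new_rank[target] += damping * rank[n] / len(out_links[n])
        rank = new_rank
    return rank


# Hypothetical data: three resources link to ex:Hub, which links back to ex:A.
triples = [
    ("ex:A", "ex:links", "ex:Hub"),
    ("ex:B", "ex:links", "ex:Hub"),
    ("ex:C", "ex:links", "ex:Hub"),
    ("ex:Hub", "ex:links", "ex:A"),
]
rank = rank_nodes(triples)

# Order resources (for example, query results) by descending rank.
ordered = sorted(rank, key=rank.get, reverse=True)
```

Sorting by the returned scores puts the most heavily linked resources first, 
which is the kind of relevance ordering described in item i.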

These features are already proven at FactForge (previously known as LDSR), 
where BigOWLIM is used to load 8 of the central LOD [6] datasets (DBpedia, 
GeoNames, WordNet, MusicBrainz, Freebase, UMBEL, Lingvoj and the CIA World 
Factbook) into a repository which contains 1.2 billion explicit and 0.8 billion 
implicit statements. BigOWLIM's owl:sameAs optimization allows FactForge to 
deal with 'only' 2 billion statements in its indices, while the number of 
distinct statements retrievable from the repository is 10 billion. This feature 
allows FactForge to deliver non-inflated query results, while the semantics of 
owl:sameAs is still fully accounted for during query evaluation. FactForge is a 
public service that allows users to perform RDF search, execute SPARQL queries, 
and explore this data in real time, adhering to its semantics. FactForge is 
the only system which provides a solution to the Modigliani test, defined at 
ReadWriteWeb as the tipping point of the Semantic Web [4]. 
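
The gap between statements held in the indices and statements retrievable can 
be illustrated with a small Python sketch. It assumes one common way of 
handling owl:sameAs: group equivalent resources into classes, store each 
statement once against a canonical representative, and expand on retrieval. 
Whether BigOWLIM works exactly this way is not stated here; the class name 
SameAsIndex, its methods and all identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import product


class SameAsIndex:
    """Toy store that keeps one statement per owl:sameAs equivalence class."""

    def __init__(self):
        self._parent = {}   # union-find forest over resource names
        self._raw = set()   # statements exactly as they were asserted

    def _find(self, node):
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            # Path halving keeps later lookups short.
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def add_same_as(self, a, b):
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[max(root_a, root_b)] = min(root_a, root_b)

    def add(self, s, p, o):
        self._raw.add((s, p, o))

    def stored(self):
        """Statements an index would hold: canonical subject and object."""
        return {(self._find(s), p, self._find(o)) for s, p, o in self._raw}

    def expanded(self):
        """All distinct statements implied by the equivalence classes."""
        stored = self.stored()
        members = defaultdict(set)
        for node in list(self._parent):
            members[self._find(node)].add(node)
        return {
            (s2, p, o2)
            for s, p, o in stored
            for s2, o2 in product(members[s], members[o])
        }


idx = SameAsIndex()
idx.add_same_as("a:Berlin", "b:Berlin")
idx.add_same_as("b:Berlin", "c:Berlin")
idx.add_same_as("a:Germany", "b:Germany")
idx.add("a:Berlin", "ex:country", "a:Germany")
idx.add("b:Berlin", "ex:country", "b:Germany")   # same fact, other source
idx.add("c:Berlin", "ex:population", "3400000")  # made-up value

stored_count = len(idx.stored())      # statements kept in the index
expanded_count = len(idx.expanded())  # statements retrievable after expansion
```

In this toy data the index holds far fewer statements than can be retrieved, 
and the duplicate fact from a second source is stored only once. The sketch 
canonicalises subjects and objects only; predicates are left untouched for 
brevity.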

BigOWLIM is also at the heart of the LinkedLifeData RDF warehouse, which 
combines 25 of the most popular biomedical databases in a repository that 
contains more than 4 billion statements. This is the only public service that 
allows for efficient query evaluation against all of these datasets at once.

The latest version of the BigOWLIM repository has been successfully integrated 
into the high performance Semantic Web publishing stack powering the BBC's 2010 
World Cup football website, performing OWL reasoning with continuously changing 
data and handling millions of page requests per day.

Included with version 3.3 is a thoroughly reworked documentation set that 
includes:

  a.. User guide - updated with details of new features and all configuration 
parameters; 
  b.. Primer - updated with examples and recent trends in semantic 
technologies; 
  c.. Quick start guide - to help those new to BigOWLIM to get set up and 
running smoothly.

Some popular namespace prefixes come predefined within BigOWLIM 3.3 in order to 
simplify query writing. These include the prefixes for the RDF, RDFS, and OWL 
schemata, all the prefixes for linked data namespaces used in FactForge, and 
prefixes for projects like GoodRelations. A detailed list of the predefined 
prefixes can be found in the prefixes.txt file in the distribution.
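
To show what predefined prefixes save the query author, the Python sketch below 
prepends the PREFIX declarations that a query uses but does not declare. It is 
a client-side illustration only and says nothing about how BigOWLIM resolves 
prefixes internally. The PREDEFINED map is an assumed subset (the 
authoritative list is in prefixes.txt), the function name is hypothetical, and 
the regular expressions are deliberately naive.

```python
import re

# Assumed subset of predefined prefixes; the real list ships in prefixes.txt.
PREDEFINED = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "gr": "http://purl.org/goodrelations/v1#",
}


def add_missing_prefixes(query, prefixes=PREDEFINED):
    """Prepend PREFIX lines for known prefixes the query uses but omits."""
    declared = set(
        re.findall(r"\bPREFIX\s+([A-Za-z][\w-]*):", query, flags=re.IGNORECASE)
    )
    # Drop full IRIs so that "http:" and similar are not taken for prefixes.
    without_iris = re.sub(r"<[^<>\s]*>", "", query)
    used = set(
        re.findall(r"(?<![\w-])([A-Za-z][\w-]*):(?=[A-Za-z_])", without_iris)
    )
    missing = sorted(p for p in used - declared if p in prefixes)
    header = "".join(f"PREFIX {p}: <{prefixes[p]}>\n" for p in missing)
    return header + query


query = "SELECT ?c ?label WHERE { ?c rdf:type owl:Class ; rdfs:label ?label }"
full_query = add_missing_prefixes(query)
```

With predefined prefixes, the short form in query can be submitted as it 
stands; full_query is what the author would otherwise have to type.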

Furthermore, the OWLIM website (http://www.ontotext.com/owlim/) has been 
revised to include more relevant details, the latest benchmark results, etc. 
For further information, please contact owlim-i...@ontotext.com - we would be 
very pleased to hear from you.

Sometimes developing OWLIM feels like mountain climbing - each new achievement 
opens up new opportunities and challenges. We often think of OWLIM as a 
track-laying machine that extends the reach of the data railways, step by step, 
changing the data-economy of entire domains by allowing more and more complex 
data to be handled at lower cost. 

BigOWLIM 3.3 is as robust and advanced as it is today because of the numerous 
clients who believed in it, used it and provided us with feedback. By using 
OWLIM, you help us make it better and lay the track further! 

 

The OWLIM team, June 2010

------

[1] "In our tests, BigOWLIM provides the best average query response time and 
answers maximum number of queries for both the datasets ... it is clear to see 
that execution speed-wise BigOWLIM outperforms Allegrograph and Sesame for 
almost all of the dataset queries." - Thakker, D., Osman, T., Gohil, S., 
Lakin, P. from Press Association and the Nottingham Trent University. In: "A 
Pragmatic Approach to Semantic Repositories Benchmarking" (2010) Proceedings of 
the 7th Extended Semantic Web Conference, ESWC 2010

[2] BSBM Results for Virtuoso, Jena TDB, BigOWLIM (November 2009). Bizer, Ch., 
Schultz, A. 
http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V5/index.html
 

[3] Large Triple Stores. Wiki page supported by W3C. 
http://esw.w3.org/LargeTripleStores 

[4] The Modigliani Test for Linked Data: Results. Richard MacManus, 
ReadWriteWeb,  
http://www.readwriteweb.com/archives/the_modigliani_test_for_linked_data.php 

[5] Implementations - OWL. http://www.w3.org/2007/OWL/wiki/Implementations 

[6] Linking Open Data. W3C SWEO Community Project. 
http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData 

[7] Report on platform validation and recommendation for next version. 
Deliverable D5.5.3 of project LarKC. To appear. http://www.larkc.eu/ 

[8] OWL reasoning with WebPIE: calculating the closure of 100 billion triples. 
Urbani, J., Kotoulas, S., Maassen, J., van Harmelen, F. & Bal, H. In 
Proceedings of the ESWC '10.

_______________________________________________
OWLIM-discussion mailing list
OWLIM-discussion@ontotext.com
http://ontotext.com/mailman/listinfo/owlim-discussion
