BigOWLIM 3.3 Handles Millions of Queries per Day, Supports OWL 2 RL Reasoning and Provides Unmatched Linked Data Integration, Management and Retrieval Capabilities
Ontotext is pleased to announce version 3.3 of BigOWLIM, the world's most scalable semantic repository [3]. The main purpose of this release is to consolidate a number of advanced features, many of which have existed for some time as bespoke developments. These features are now part of the core BigOWLIM product and significantly broaden the application areas in which the repository can be used. We have also dedicated several months to making BigOWLIM more robust and easier to use.

The development of this version was influenced by the requirements of FactForge and LinkedLifeData (two of the most advanced linked data portals) and the BBC's 2010 World Cup website - probably the most challenging real-world use case of semantic repositories implemented so far. Within the LarKC project, BigOWLIM is used as the data layer in a platform for Web-scale reasoning, which features a range of reasoning plug-ins, including the WebPIE massively parallel reasoning system [7,8].

The key characteristics and features of BigOWLIM include:

a. The most efficient semantic repository in the world [2,3], in terms of the speed with which it can load data, perform inferencing, and answer queries;
b. Pure Java implementation, fully compatible with Sesame 2, which brings interoperability benefits and support for all popular RDF syntaxes and query languages, including SPARQL;
c. Clustering support, which brings resilience, failover and horizontally scalable parallel query processing;
d. Customisable reasoning, in addition to RDFS, OWL-Horst, and OWL 2 RL support. BigOWLIM 3.3 is the only semantic repository which provides comprehensive OWL 2 RL support today [5];
e. Optimized owl:sameAs handling, which delivers dramatic improvements in performance and usability when huge volumes of data from multiple sources are integrated;
f. Full-text search, based on either Lucene or proprietary techniques, for searching RDF data;
g. High performance retraction of statements and their inferences. While forward-chaining and materialisation speed up query answering, this unique technique removes the performance degradation that materialisation-based systems face when retracting statements;
h. Powerful and expressive consistency/integrity constraint checking mechanisms;
i. RDF rank, similar to Google's PageRank, which can be calculated for the nodes in an RDF graph and used for ordering query results by relevance, for visualisation, and for many other purposes;
j. RDF Priming, based upon activation spreading, which allows efficient data selection and context-aware query answering for handling huge datasets;
k. Notification mechanism, which allows clients to react to statements in the update stream.

These features are already proven at FactForge (previously known as LDSR), where BigOWLIM is used to load 8 of the central LOD [6] datasets (DBPedia, Geonames, Wordnet, Musicbrainz, Freebase, UMBEL, Lingvoj and the CIA World Factbook) in a repository which contains 1.2 billion explicit and 0.8 billion implicit statements. BigOWLIM's owl:sameAs optimization allows FactForge to deal with 'only' 2 billion statements in its indices, while the number of distinct statements retrievable from the repository is 10 billion. This feature allows FactForge to deliver non-inflated query results, while the semantics of owl:sameAs is still fully accounted for during query evaluation.

FactForge is a public service that allows users to perform RDF search, execute SPARQL queries, and explore this data in real time, adhering to its semantics. FactForge is the only system which provides a solution to the Modigliani test, defined at ReadWriteWeb as the tipping point of the Semantic Web [4].

BigOWLIM is also at the heart of the LinkedLifeData RDF warehouse, which combines 25 of the most popular biomedical databases in a repository that contains more than 4 billion statements. This is the only public service that allows for efficient query evaluation against all of these datasets at once.
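To make the owl:sameAs figures quoted for FactForge more concrete, here is a minimal sketch of one way such an optimisation can work. It is not BigOWLIM's actual implementation, which this announcement does not describe. The sketch assumes that each group of owl:sameAs-equivalent nodes is collapsed to a single representative in the index, and that the equivalent forms are expanded only on retrieval. The class name SameAsIndex, its methods, and the node identifiers are all hypothetical and chosen purely for illustration; only subjects and objects are canonicalised.

```python
from itertools import product


class SameAsIndex:
    """Toy triple index that stores one canonical node per owl:sameAs class."""

    def __init__(self):
        self.parent = {}      # union-find forest over node identifiers
        self.triples = set()  # statements expressed over canonical nodes only

    def find(self, node):
        # Return the representative of the node's equivalence class.
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            self.parent[node] = self.parent[self.parent[node]]  # path halving
            node = self.parent[node]
        return node

    def same_as(self, a, b):
        # Merge the two classes, then re-canonicalise the stored statements.
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
            self.triples = {(self.find(s), p, self.find(o))
                            for s, p, o in self.triples}

    def add(self, s, p, o):
        # Equivalent statements collapse into a single stored statement.
        self.triples.add((self.find(s), p, self.find(o)))

    def members(self, node):
        # All known nodes in the same equivalence class as `node`.
        root = self.find(node)
        return sorted(n for n in list(self.parent) if self.find(n) == root)

    def expanded(self):
        # Every distinct statement implied by the owl:sameAs classes.
        out = set()
        for s, p, o in self.triples:
            for s2, o2 in product(self.members(s), self.members(o)):
                out.add((s2, p, o2))
        return out


# Hypothetical identifiers, for illustration only.
idx = SameAsIndex()
idx.same_as("dbpedia:Sofia", "geonames:Sofia")
idx.same_as("dbpedia:Sofia", "freebase:Sofia")
idx.add("geonames:Sofia", "ex:capitalOf", "dbpedia:Bulgaria")
idx.add("freebase:Sofia", "ex:capitalOf", "dbpedia:Bulgaria")

print(len(idx.triples))     # 1 statement held in the index
print(len(idx.expanded()))  # 3 distinct statements retrievable
```

In this toy case one stored statement stands for three retrievable ones. The same effect, at a much larger scale, would explain how an index of about 2 billion statements can answer for about 10 billion distinct statements without inflating query results.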
The latest version of the BigOWLIM repository has been successfully integrated into the high performance Semantic Web publishing stack powering the BBC's 2010 World Cup football website, performing OWL reasoning with continuously changing data and handling millions of page requests per day.

Included with version 3.3 is a thoroughly reworked documentation set that includes:

a. User guide - updated with details of new features and all configuration parameters;
b. Primer - updated with examples and recent trends in semantic technologies;
c. Quick start guide - to help those new to BigOWLIM get set up and running smoothly.

Some popular namespace prefixes come predefined within BigOWLIM 3.3 in order to simplify query writing. These include the prefixes for the RDF, RDFS, and OWL schemata; all the prefixes for linked data namespaces used in FactForge; and prefixes for projects like Good Relations. A detailed list of the predefined prefixes can be found in the prefixes.txt file in the distribution.

Furthermore, the OWLIM website (http://www.ontotext.com/owlim/) has been revised to include more relevant details, the latest benchmark results, etc. For further information, please contact owlim-i...@ontotext.com - we would be very pleased to hear from you.

Sometimes developing OWLIM feels like mountain climbing - each new achievement opens up new opportunities and challenges. We often think of OWLIM as a track-laying machine that extends the reach of the data railways, step by step, changing the data-economy of entire domains by allowing more and more complex data to be handled at lower cost.

BigOWLIM 3.3 is as robust and advanced as it is today because of the numerous clients who believed in it, used it and provided us with feedback. By using OWLIM, you help us make it better and lay the track further!

The OWLIM team, June 2010

------

[1] "In our tests, BigOWLIM provides the best average query response time and answers maximum number of queries for both the datasets. ... it is clear to see that execution speed-wise BigOWLIM outperforms Allegrograph and Sesame for almost all of the dataset queries." - Thakker, D., Osman, T., Gohil, S., Lakin, P. from Press Association and the Nottingham Trent University. In: "A Pragmatic Approach to Semantic Repositories Benchmarking" (2010) Proceedings of the 7th Extended Semantic Web Conference, ESWC 2010
[2] BSBM Results for Virtuoso, Jena TDB, BigOWLIM (November 2009). Bizer, Ch., Schultz, A. http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V5/index.html
[3] Large Triple Stores. Wiki page supported by W3C. http://esw.w3.org/LargeTripleStores
[4] The Modigliani Test for Linked Data: Results. Richard MacManus, ReadWriteWeb, http://www.readwriteweb.com/archives/the_modigliani_test_for_linked_data.php
[5] Implementations - OWL. http://www.w3.org/2007/OWL/wiki/Implementations
[6] Linking Open Data. W3C SWEO Community Project. http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
[7] Report on platform validation and recommendation for next version. Deliverable D5.5.3 of project LarKC. To appear. http://www.larkc.eu/
[8] OWL reasoning with WebPIE: calculating the closure of 100 billion triples. Urbani, J., Kotoulas, S., Maassen, J., van Harmelen, F. & Bal, H. In Proceedings of the ESWC '10.