Ontotext are pleased to announce the second BETA release of OWLIM version 5.0 <http://www.ontotext.com/owlim> featuring:

 * *Transaction management and isolation mechanisms* have been
   completely refactored. The previous strategy used lazy writing of
   modified database pages, such that dirty pages were only flushed to
   disk when further updates occurred and no more memory was available.
   While this approach is extremely fast, it incurs a considerable
   recovery time for replaying the transaction log after an abnormal
   termination. The new mechanism
   uses two modes: 'bulk-loading' (fast) with similar behaviour to
   previous versions and 'normal' (safe) where database modifications
   are flushed to disk as part of the commit operation. When running in
   safe mode, *database recovery is instant* and there is a
   *significant improvement in concurrency between updates and queries*.

 * *New context indices* can be used to improve query performance when
   data is modelled using many named graphs. These are switched on and
   off using a single configuration parameter, enable-context-index.

 * The *SPARQL 1.1 Graph Store HTTP Protocol* is now supported
   according to the W3C Working Draft
   <http://www.w3.org/TR/sparql11-http-rdf-update/> from the 12th May
   2011. This provides a REST interface for managing collections of
   graphs, using either directly or indirectly named graphs.

 * *Sesame 2.6.5* <http://www.openrdf.org> with many bug-fixes and
   updates to bring SPARQL 1.1 Query
   <http://www.w3.org/TR/2012/WD-sparql11-query-20120105/> support up
   to the latest W3C Working Draft from the 5th January 2012. NOTE:
   This beta release is bundled with a snapshot version of Sesame. The
   final release of OWLIM will coincide with the official Sesame 2.6.5
   release.

 * *Significant reduction in disk-space requirements* is achieved with
   the following modifications:
     o *Index compression* can now be used to reduce disk storage
       requirements by using zip compression on database pages. This
       feature is off by default, but can be switched on when creating
       a new repository. The configuration parameter
       index-compression-ratio can be set to -1 (the default value
       indicating no compression) or a value in the range 10-50
       indicating the desired percentage reduction in page sizes. Any
       pages that cannot be compressed by the specified amount are
       stored uncompressed. Therefore a compression ratio that is too
       aggressive will not bring many benefits. Experiments have shown
       that for large datasets a value of about 30% is close to optimal
       and can lead to storage space savings of 50%.
     o *Restructuring of the triple indices* has also led to a
       reduction in disk-space requirements of around 18% independent
       of the compression functionality.
     o *Entity compression* is a modification that reduces the storage
       requirements for the lookup table that maps between internal
       identifiers and resources. This is transparent to the user and
       happens automatically, yielding further disk-space reductions in
       this version.

 * A new *literal index* is created automatically for numeric and
   date/time data-types. The index is used during query evaluation if a
   query or a subquery (e.g. union) has a filter that consists of a
   conjunction of literal constraints, e.g. FILTER(?x >= 3 && ?y <= 5
   && ?start > "2001-01-01"^^xsd:date). Other patterns, including those
   that use negation, will not use the index for this version of OWLIM.

 * All *control queries now use SPARQL Update syntax*. These queries
   are used mostly to control the Lucene-based full-text search, RDF
   Rank and geo-spatial plug-ins. The change has a number of advantages:
     o No special control query pseudo-graph is required by the
       Replication Cluster master in order to identify control queries
       that must be pushed to all worker nodes
     o SPARQL Updates use the corresponding SPARQL update protocol, so
       they can be automatically processed by load-balancers that
       examine URL patterns
     o It is more consistent with the SPARQL language, since these
       'control queries' cause a change of state in OWLIM

 * *Incremental Lucene-based full-text search indexing* allows the
   index to be updated for specific resources or for all un-indexed
   resources. Using this
   technique can avoid the more expensive approach of rebuilding the
   whole index frequently.

 * *Incremental RDF Rank* allows the RDF rank for specific resources to
   be (re-)computed as directed by the user. This technique can avoid
   the more expensive approach of rebuilding all RDF Rank values
   frequently.

 * The *getting started* application has been restructured so that it
   now works with remote repositories.
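
For the SPARQL 1.1 Graph Store HTTP Protocol item above, the difference
between directly and indirectly named graphs can be sketched in Python as
follows. This is only an illustration: the service URL and the function
names are hypothetical, and the 'graph' and 'default' query parameters are
taken from the W3C protocol draft rather than from the OWLIM documentation.

```python
from urllib.parse import urlencode

# Hypothetical Graph Store service endpoint (assumption, not an OWLIM default).
SERVICE = "http://localhost:8080/openrdf-sesame/repositories/demo/rdf-graphs/service"


def indirect_graph_url(service, graph_iri=None):
    """Build a request URL that names the graph indirectly.

    The graph IRI travels percent-encoded in the 'graph' query parameter.
    Passing None selects the default graph via the bare 'default' parameter.
    """
    if graph_iri is None:
        return service + "?default"
    return service + "?" + urlencode({"graph": graph_iri})


def direct_graph_url(graph_iri):
    """With direct identification the request URL is the graph IRI itself."""
    return graph_iri


if __name__ == "__main__":
    print(indirect_graph_url(SERVICE, "http://example.org/g1"))
    print(indirect_graph_url(SERVICE))
    print(direct_graph_url("http://example.org/g1"))
```

Under the protocol draft, the same URL is then used with GET to retrieve
the graph, PUT to replace it, POST to merge data into it and DELETE to
remove it.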

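The page-level rule described for index-compression-ratio above can be
illustrated with a short Python sketch. It is not OWLIM's implementation:
the page size, the function name and the use of zlib as a stand-in for
'zip compression' are all assumptions made for the example.

```python
import zlib
from typing import Tuple

PAGE_SIZE = 4096  # hypothetical page size in bytes (assumption)


def store_page(page: bytes, ratio_percent: int) -> Tuple[bool, bytes]:
    """Return (is_compressed, stored_bytes) for a single database page.

    ratio_percent mirrors index-compression-ratio: -1 disables compression,
    otherwise it must lie in the range 10-50 and states the minimum
    percentage by which a page has to shrink to be kept compressed.
    """
    if ratio_percent == -1:
        return False, page
    if not 10 <= ratio_percent <= 50:
        raise ValueError("ratio must be -1 or in the range 10-50")
    packed = zlib.compress(page)
    # Largest compressed size that still meets the requested reduction.
    target = len(page) * (100 - ratio_percent) / 100
    if len(packed) <= target:
        return True, packed
    # The page did not shrink enough, so it is stored uncompressed.
    return False, page


if __name__ == "__main__":
    repetitive = b"a" * PAGE_SIZE
    compressed, stored = store_page(repetitive, 30)
    print(compressed, len(stored))
```

The sketch shows why an overly aggressive ratio brings little benefit:
the higher the requested reduction, the more pages miss the target and
fall back to uncompressed storage.
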
*Known problems*

 * There are several outstanding fixes to be done in Sesame, which can
   cause incorrect query results when using sub-queries.
 * The behaviour of the 'include inferred' checkbox in the Sesame
   Workbench is unpredictable when using OWLIM repositories.
 * This version of OWLIM is *not backward compatible* with any
   previous version. This means that images created with OWLIM 4.3 and
   before will not work correctly with OWLIM 5.0 and must be
   re-created. There have been a great many modifications to the
   storage files, indexing structures, etc., and upgrade mechanisms have
   proven too complex and probably slower than re-loading the database
   anyway. Please *do not attempt to upgrade to OWLIM 5.0 unless you
   drop and recreate all databases*. A migration tool, which allows for
   automated re-loading of data from any Sesame-accessible repository,
   will be provided to ease the transition.

Full documentation for all OWLIM editions is available online <http://owlim.ontotext.com> (click on the OWLIM 5.0 beta link on the left-hand side).

The OWLIM team
March 2012

_______________________________________________
Owlim-discussion mailing list
[email protected]
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/owlim-discussion