Claude Warren created JENA-528:
----------------------------------

             Summary: Deprecation of BulkUpdateHandler has unintended 
consequences for other projects
                 Key: JENA-528
                 URL: https://issues.apache.org/jira/browse/JENA-528
             Project: Apache Jena
          Issue Type: Bug
          Components: Jena
    Affects Versions: Jena 2.10.1
            Reporter: Claude Warren
            Priority: Blocker


Description copied from email conversations:

SDB currently implements its own BulkUpdateHandler, and I just ran some tests 
that indicate it is significantly faster than using GraphUtil.add (2 seconds 
versus 40 seconds for 10k triples). Now that BulkUpdateHandler has been 
deprecated, and Model.add is already using GraphUtil.add, what call sequence 
are we supposed to use to retain the good performance of the BulkUpdateHandler? 
Could a method Graph.add(Iterable<Triple>) be added to allow graphs to optimize 
the behavior for specific Graph types?
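
To make the Graph.add(Iterable<Triple>) suggestion concrete, here is a minimal 
sketch of what such a hook could look like. The Triple, Graph, MemGraph and 
BatchingGraph types below are hypothetical stand-ins written for this 
illustration, not Jena's classes, and the "writes" counter only simulates 
round trips to a store. The idea is that the interface carries a default 
per-triple fallback, which a store-backed graph may override with a batched 
write.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BulkAddSketch {

    /** Hypothetical stand-in for a triple; not Jena's Triple class. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** Hypothetical graph interface carrying the proposed bulk-add hook. */
    interface Graph {
        void add(Triple t);

        // Default: fall back to one add per triple, as a generic helper would.
        default void add(Iterable<Triple> triples) {
            for (Triple t : triples) {
                add(t);
            }
        }
    }

    /** In-memory graph that relies on the default bulk add. */
    static class MemGraph implements Graph {
        final List<Triple> store = new ArrayList<>();
        int writes = 0; // simulated round trips to the backing store

        @Override
        public void add(Triple t) {
            store.add(t);
            writes++;
        }
    }

    /** Store-backed graph that overrides the bulk add to write one batch. */
    static class BatchingGraph extends MemGraph {
        @Override
        public void add(Iterable<Triple> triples) {
            for (Triple t : triples) {
                store.add(t);
            }
            writes++; // the whole batch counts as one simulated round trip
        }
    }

    /** Three sample triples used by the demo below. */
    static List<Triple> sample() {
        return Arrays.asList(
            new Triple("s1", "p", "o1"),
            new Triple("s2", "p", "o2"),
            new Triple("s3", "p", "o3"));
    }

    public static void main(String[] args) {
        MemGraph plain = new MemGraph();
        plain.add(sample());
        BatchingGraph batch = new BatchingGraph();
        batch.add(sample());
        System.out.println("default path writes: " + plain.writes);
        System.out.println("batched path writes: " + batch.writes);
    }
}
```

With the three sample triples, the default path records three simulated 
writes and the overriding graph records one. Callers use the same method in 
both cases, so a helper such as GraphUtil would not need to know which kind 
of graph it is talking to.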


I understand SDB is rather unsupported, but the issue is really a question on 
the core API.

Deprecating the BulkUpdateHandler will not only affect SDB but any other 
database such as Oracle RDF (the Jena adapter of which implements its own BUH 
right now). Granted, the class is not gone yet, but some existing API calls 
(Model.add) already bypass the BulkUpdateHandler, and I believe this was 
premature (revision 1419595). My suggestion is to continue to delegate 
Model.add through the BulkUpdateHandler for the upcoming release until the 
interface has been truly removed/replaced with something else. BUH does not 
represent much implementation overhead for Graph implementers, because they can 
simply use the default implementation. The current GraphUtil-based code path 
is too inefficient for our product.

If there is a cleaner mechanism to get the same performance, then I'd be happy 
to hear about it.
===
On 9/4/2013 3:15, Claude Warren wrote:
As I recall, the discussion around this topic dealt with the idea that you
could add each triple inside a transaction and, when the transaction
committed, the transaction code would do the bulk update if supported.
However, I may be way off base here.  I have no objection to retaining the BUH.
==

If this were the case, then the code in the GraphUtil helper functions should 
probably wrap the individual performUpdate calls with a transaction, but they 
don't.
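
For illustration, the transaction-based approach described above might look 
roughly like the following sketch. Everything here is an assumption made for 
the example: BufferingGraph, its begin/commit/abort methods and the addAll 
helper are hypothetical names, not Jena API, and triples are plain strings. 
Whether a real store would actually batch the writes at commit time depends 
on that store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TxnBulkSketch {

    /** Hypothetical transactional graph; names are illustrative only. */
    static class BufferingGraph {
        final List<String> stored = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        private boolean inTxn = false;
        int flushes = 0; // simulated round trips to the backing store

        void begin() {
            inTxn = true;
        }

        void add(String triple) {
            if (inTxn) {
                pending.add(triple); // defer the write until commit
            } else {
                stored.add(triple);  // no transaction: write immediately
                flushes++;
            }
        }

        void commit() {
            if (!inTxn) {
                throw new IllegalStateException("no active transaction");
            }
            if (!pending.isEmpty()) {
                stored.addAll(pending); // one batched write for the whole set
                flushes++;
            }
            pending.clear();
            inTxn = false;
        }

        void abort() {
            pending.clear(); // drop everything buffered in this transaction
            inTxn = false;
        }
    }

    /** Helper in the spirit of the GraphUtil functions, wrapped in a transaction. */
    static void addAll(BufferingGraph g, Iterable<String> triples) {
        g.begin();
        try {
            for (String t : triples) {
                g.add(t);
            }
            g.commit();
        } catch (RuntimeException e) {
            g.abort();
            throw e;
        }
    }

    public static void main(String[] args) {
        BufferingGraph g = new BufferingGraph();
        addAll(g, Arrays.asList("t1", "t2", "t3"));
        System.out.println("stored=" + g.stored.size() + " flushes=" + g.flushes);
    }
}
```

In this sketch the helper, not the caller, opens and closes the transaction, 
so existing call sites would get the batched behaviour without changes. A 
graph with no transaction support would need a separate fallback, which is 
not shown.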

I would greatly appreciate seeing this resolved before the final release. As 
suggested earlier, we could gain time for a proper redesign by avoiding the 
calls to the GraphUtil replacement functions, or changing those functions so 
that they call graph.getBulkUpdateHandler() for the time being, and possibly 
undeprecate BUH for now.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
