[jira] [Commented] (JENA-1083) MInor refactoring in TupleTables

ASF GitHub Bot (JIRA) Wed, 30 Dec 2015 02:57:07 -0800

    [ 
https://issues.apache.org/jira/browse/JENA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074931#comment-15074931
 ]


ASF GitHub Bot commented on JENA-1083:
--------------------------------------

Github user afs commented on the pull request:

    https://github.com/apache/jena/pull/110#issuecomment-167976621
  
    Putting an extra indirection in Triple to get to the array makes it less 
cache friendly.  As a data structure that can be right at the heart of CPU 
bound operations, this may make a noticeable difference to Triple. At a bare 
minimum, experimentation would be necessary to verify the consequences of the 
change.
    
    Every time the code has `(Node ... arg)` there is an array created.  So if 
I read it right, in `QuadOrdering.unmapAndCreate` there are 2 temporary arrays 
and several method calls including inside `unmap`, yet the current code does 
this directly (`QuadTableForm.quad`) by reordering use of arguments.  Actually, 
the important case is the reverse direction: `PMapQuadTable._find` (more 
results than patterns, so more comes out than goes in).
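
    To make the allocation point concrete, here is a minimal sketch. All names 
(`VarargsCostSketch`, `Quad4`, `unmap`, `quadViaArrays`, `quadDirect`) are 
hypothetical and are not the code in the PR or in Jena; terms are plain 
strings to keep it self-contained. It contrasts a varargs-based reorder with 
reordering by use of arguments for an index stored in P,O,S,G order.

```java
final class VarargsCostSketch {
    /** Hypothetical stand-in for a quad in canonical G,S,P,O order; not Jena's Quad. */
    static final class Quad4 {
        final String g, s, p, o;
        Quad4(String g, String s, String p, String o) {
            this.g = g; this.s = s; this.p = p; this.o = o;
        }
        @Override public String toString() { return g + " " + s + " " + p + " " + o; }
    }

    // For an index stored in P,O,S,G order: the stored slot that supplies
    // each canonical slot (G,S,P,O).
    private static final int[] POSG_TO_GSPO = { 3, 2, 0, 1 };

    // Varargs version: each call allocates an array to hold `stored`
    // (temporary 1) and another for the result (temporary 2).
    static String[] unmap(int[] fetch, String... stored) {
        String[] out = new String[stored.length];
        for (int i = 0; i < fetch.length; i++) {
            out[i] = stored[fetch[i]];
        }
        return out;
    }

    static Quad4 quadViaArrays(String x0, String x1, String x2, String x3) {
        String[] c = unmap(POSG_TO_GSPO, x0, x1, x2, x3);
        return new Quad4(c[0], c[1], c[2], c[3]);
    }

    // Direct version: the reordering is only the order in which the
    // arguments are used; no intermediate arrays.
    static Quad4 quadDirect(String p, String o, String s, String g) {
        return new Quad4(g, s, p, o);
    }
}
```

    Both methods build the same quad. Whether the JIT manages to remove the 
temporary arrays in the first version is the kind of thing only a profiler 
run would settle.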
    
    Have you run a profiler on the code?
    
    I think there is a lot to be said for the reordering version - write out 
the combinations once, test extensively. TDB is not a good example here. (1) it 
is disk backed and the costs are different (2) seeing the current TxnMem code, 
it could be better though there are more important gains elsewhere.
    
    This PR seems like a lot of change to tidy up relatively few lines.
    
    The concept of reordering is specific to indexing.  At some point, 
somewhere there is going to be some fiddly bit of code to do reordering. It's 
the mismatch between a hierarchical index, where order is important, and 
uniform access to sequences. It's analogous to SQL tables and e.g. ISAM, VSAM 
style access, which can be used to build SQL tables.
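
    As a rough illustration of that mismatch (a sketch only; `PosIndexSketch` 
and its methods are made-up names, not Jena classes, and terms are plain 
strings): a hierarchical index nested in P, then O, then S order has to 
reorder canonical S,P,O arguments on the way in and restore the canonical 
order on the way out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PosIndexSketch {
    // Hierarchical index: predicate -> object -> subjects. Order matters,
    // because lookups can only fix a prefix of (P, O, S).
    private final Map<String, Map<String, List<String>>> index = new TreeMap<>();

    // The "fiddly bit" on the way in: (s, p, o) is used in (p, o, s) order.
    void add(String s, String p, String o) {
        index.computeIfAbsent(p, k -> new TreeMap<>())
             .computeIfAbsent(o, k -> new ArrayList<>())
             .add(s);
    }

    // The reverse direction: results come out of the index in P,O,S order
    // and are put back into canonical S,P,O order for the caller.
    List<String[]> findByPredicate(String p) {
        List<String[]> results = new ArrayList<>();
        Map<String, List<String>> byObject =
            index.getOrDefault(p, Collections.<String, List<String>>emptyMap());
        for (Map.Entry<String, List<String>> e : byObject.entrySet()) {
            for (String s : e.getValue()) {
                results.add(new String[] { s, p, e.getKey() });
            }
        }
        return results;
    }
}
```

    One pattern goes in and many results come out, which is why the unmapping 
direction is the one worth keeping cheap.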
    
    One of the things I tried to improve with `TupleMap` is the ability to use 
it for the index mapping, not just `Tuple` to `Tuple`.  I found getting the 
naming right hard; it depends on how you think about the mapping: whether it 
is getting from the unmapped tuple or putting into the mapped tuple.
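
    The two ways of thinking about it can be written down side by side. This 
is a sketch under assumed names (`OrderMapSketch`, `getFrom`, `putTo`, 
`mapSlot`); it is not the `TupleMap` API, only an illustration of why the 
naming is awkward: the same reordering is one array when read as "get from" 
and the inverse array when read as "put to".

```java
final class OrderMapSketch {
    // "Get" view: getFrom[i] is the unmapped slot that supplies mapped slot i.
    private final int[] getFrom;
    // "Put" view: putTo[j] is the mapped slot that receives unmapped slot j.
    private final int[] putTo;

    // Orders are given as strings of distinct letters, e.g. "SPO" and "POS".
    OrderMapSketch(String from, String to) {
        if (from.length() != to.length()) {
            throw new IllegalArgumentException("Orders differ in length");
        }
        int n = from.length();
        getFrom = new int[n];
        putTo = new int[n];
        for (int i = 0; i < n; i++) {
            int j = from.indexOf(to.charAt(i));
            if (j < 0) {
                throw new IllegalArgumentException("No slot for " + to.charAt(i));
            }
            getFrom[i] = j;
            putTo[j] = i;
        }
    }

    // Tuple to tuple: canonical order -> index order.
    String[] map(String[] unmapped) {
        String[] out = new String[unmapped.length];
        for (int i = 0; i < out.length; i++) out[i] = unmapped[getFrom[i]];
        return out;
    }

    // Tuple to tuple: index order -> canonical order.
    String[] unmap(String[] mapped) {
        String[] out = new String[mapped.length];
        for (int j = 0; j < out.length; j++) out[j] = mapped[putTo[j]];
        return out;
    }

    // Index mapping without building a tuple: where does unmapped slot j land?
    int mapSlot(int j) { return putTo[j]; }
}
```

    For `new OrderMapSketch("SPO", "POS")`, `map` turns (s, p, o) into 
(p, o, s) and `unmap` reverses it; `mapSlot` gives the same information one 
slot at a time, which is the form an index needs.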
    
    An illustration of one possibility: 
[TripleOps](https://github.com/afs/jena-workspace/blob/master/src/main/java/tupledev/TripleOps.java)
 shows two things: adding an external function to give triples an SPO order and 
using that operation to work between triples and tuples.


> MInor refactoring in TupleTables
> --------------------------------
>
>                 Key: JENA-1083
>                 URL: https://issues.apache.org/jira/browse/JENA-1083
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: A. Soroka
>            Priority: Minor
>
> There are some minor refactorings available for TupleTable and its subtypes, 
> particularly PMapTripleTable and PMapQuadTable that will clarify their use. 
> Specifically, current impls of those abstract types have to override several 
> methods for adding, removing, and finding tuples. In fact, the only 
> information being added when those methods are overridden is conversion 
> between canonical and internal tuple ordering. This refactoring is to provide 
> methods that do that conversion and nothing else, which will make two methods 
> the most that any implementation of those abstract classes will have to 
> provide.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
