[ https://issues.apache.org/jira/browse/JENA-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112769#comment-13112769 ]
Stephen Allen commented on JENA-121:
------------------------------------

Looks good.  The changes to TokenizerText.java are probably not part of this change, but must have just gotten swept up in the patch.

> Improvements to Bindings
> ------------------------
>
>                 Key: JENA-121
>                 URL: https://issues.apache.org/jira/browse/JENA-121
>             Project: Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Stephen Allen
>            Priority: Minor
>         Attachments: JENA-121-r1174067.patch
>
>
> The Binding interface is a key object for query execution.  It has some
> issues such that it may be a good idea to think about tweaking the design a
> bit.  Some thoughts:
> 1) Bindings should be immutable (in the strong Java sense:
> http://www.ibm.com/developerworks/java/library/j-jtp02183/index.html)
> 2) Add a BindingPair class that represents a Var/Node value (could be called
> something else, BindingValue?)
> 2a) Binding constructor/factory method takes an Iterable<BindingPair> to
> initialize it
> 2b) Binding can now implement Iterable<BindingPair> which would be more
> efficient than iterating over the variables then looking up each node
> 3) An implementation that has better memory usage than BindingMap (a HashMap
> may be overkill here, if we can use the iterator from 2b in more places)
> 4) An implementation that copies parent BindingPairs instead of maintaining a
> reference.  If the parent bindings are not held onto by themselves after
> being incorporated into a child, we can save memory by copying and letting
> the parent be GCed (indeed in the common join case, this appears to be what
> happens).  We would also get speed benefits from storing the BindingPairs in
> a single data structure, making iterating and looking up values faster.
> Additionally, more Binding objects die young instead of being held as part of
> a higher level algebra collection (like sort or distinct), which can help
> with GC overhead.
> 5) Expose an iterator of BindingPairs ordered by variable.  This is needed
> for BindingComparator, and may be an option for Algebra.merge()/compatible()
> if we eliminate fast get(Var) lookups of nodes (as a consequence of 3).  The
> ordering could be determined at construction or be initialized lazily.
> 6) Method for estimating memory size for the binding.  Would be very useful
> for setting threshold policies for DataBags.  Although this may be tough to
> do, especially if Nodes are shared between bindings.
> Some of these points need some more investigation, and some good profiling to
> ensure that they are beneficial, especially 3, 4, and 5.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
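
For illustration only, points 1, 2, 2a, 2b, 4 and 5 of the quoted description could be sketched roughly as below. This is not the code in the attached patch and not ARQ's actual API: the class name ImmutableBinding, the extend() method, and the use of plain Strings in place of Var and Node are all assumptions made so the sketch compiles without Jena on the classpath.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical Var/Node pair (point 2). Strings stand in for ARQ's Var and Node.
final class BindingPair {
    private final String var;
    private final String node;

    BindingPair(String var, String node) {
        this.var = var;
        this.node = node;
    }

    String getVar() { return var; }
    String getNode() { return node; }
}

// Hypothetical immutable binding (point 1): all state is final and the pair
// list is copied once at construction, then wrapped as unmodifiable.
final class ImmutableBinding implements Iterable<BindingPair> {
    private final List<BindingPair> pairs;

    // Point 2a: initialise from any Iterable<BindingPair>.
    ImmutableBinding(Iterable<BindingPair> source) {
        List<BindingPair> copy = new ArrayList<BindingPair>();
        for (BindingPair p : source) {
            copy.add(p);
        }
        // Point 5: fix the ordering by variable name at construction time.
        Collections.sort(copy, new Comparator<BindingPair>() {
            @Override
            public int compare(BindingPair a, BindingPair b) {
                return a.getVar().compareTo(b.getVar());
            }
        });
        this.pairs = Collections.unmodifiableList(copy);
    }

    // Point 4: a child copies the parent's pairs instead of referencing the
    // parent, so the parent can be garbage collected independently.
    ImmutableBinding extend(BindingPair extra) {
        List<BindingPair> merged = new ArrayList<BindingPair>(pairs);
        merged.add(extra);
        return new ImmutableBinding(merged);
    }

    // Linear scan over a single list rather than a HashMap lookup (point 3);
    // returns null when the variable is unbound.
    String get(String var) {
        for (BindingPair p : pairs) {
            if (p.getVar().equals(var)) {
                return p.getNode();
            }
        }
        return null;
    }

    int size() { return pairs.size(); }

    // Point 2b: iterate the pairs directly, in variable order. The iterator
    // comes from an unmodifiable list, so remove() is unsupported.
    @Override
    public Iterator<BindingPair> iterator() {
        return pairs.iterator();
    }
}
```

Whether the linear scan in get() beats a HashMap for typical binding sizes is exactly the kind of question the description says needs profiling; the sketch takes no position on it.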