On 30/12/13 19:39, Claude Warren wrote:
With a Node interface I can implement a serializable node that handles all
the core node types. It means I only have to convert the node once.
Without an interface I have to convert to a serializable format and then
convert back to "native" form.
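A minimal sketch of the single-conversion idea, assuming a simplified, hypothetical Node interface (Kind, WireNode and roundTrip are illustrative names, not part of Jena): one serializable class covers all the core node kinds, so the object that goes over the wire is already usable as a Node when it arrives.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

final class SerializableNodeSketch {
    // Hypothetical stand-ins for the core node kinds; not the real Jena types.
    enum Kind { URI, BLANK, LITERAL, VARIABLE }

    // Hypothetical, cut-down Node contract.
    interface Node {
        Kind kind();
        String label();
    }

    // One serializable implementation that handles every core kind.
    static final class WireNode implements Node, Serializable {
        private static final long serialVersionUID = 1L;
        private final Kind kind;
        private final String label;

        WireNode(Kind kind, String label) {
            this.kind = kind;
            this.label = label;
        }

        @Override public Kind kind() { return kind; }
        @Override public String label() { return label; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof WireNode)) return false;
            WireNode other = (WireNode) o;
            return kind == other.kind && label.equals(other.label);
        }

        @Override public int hashCode() { return Objects.hash(kind, label); }
    }

    // Serialize and deserialize in memory, standing in for an RMI call.
    static Node roundTrip(Node n) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(n);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The deserialized object is used directly as a Node; there is no second conversion step back to a "native" class.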
So it is a single class? Lucky it's only the concrete types! - there are
subclasses of variable in at least two places. Quite a bit of
instanceof Node_RuleVariable in the reasoner code.
I took a look at the code bases looking for other instanceof tests. I
found OWLDLProfile, OWLProfile and OWLLiteProfile that do instanceof
tests when they should be doing .isXXX tests. Changed.
Other than that, there does not seem to be much code that makes use of
the class hierarchy although my looking was not systematic. Of course,
other extension code may be doing so.
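To illustrate why .isXXX tests are safer than instanceof once other Node implementations exist, here is a small sketch. All the types in it (Node, VarNode, WrappedNode) are hypothetical, simplified stand-ins rather than the real Jena classes.

```java
final class NodeKindSketch {
    // Hypothetical, cut-down Node contract with kind tests.
    interface Node {
        default boolean isURI() { return false; }
        default boolean isVariable() { return false; }
    }

    static class UriNode implements Node {
        @Override public boolean isURI() { return true; }
    }

    static class VarNode implements Node {
        @Override public boolean isVariable() { return true; }
    }

    // A second implementation (for example a wrapper used for transport)
    // that is not a subclass of VarNode but still represents a variable.
    static class WrappedNode implements Node {
        private final Node inner;
        WrappedNode(Node inner) { this.inner = inner; }
        @Override public boolean isURI() { return inner.isURI(); }
        @Override public boolean isVariable() { return inner.isVariable(); }
    }

    // Tied to the class hierarchy: misses any other implementation.
    static boolean isVariableByClass(Node n) { return n instanceof VarNode; }

    // Tied to the contract: works for every implementation.
    static boolean isVariableByContract(Node n) { return n.isVariable(); }

    public static void main(String[] args) {
        Node wrapped = new WrappedNode(new VarNode());
        System.out.println(isVariableByClass(wrapped));    // false
        System.out.println(isVariableByContract(wrapped)); // true
    }
}
```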
Andy
On Mon, Dec 30, 2013 at 7:24 PM, Andy Seaborne <[email protected]> wrote:
On 30/12/13 18:58, Claude Warren wrote:
I did a quick Node (Interface) and NodeImpl implementation while working
on the RMI code (it made some things easier). There was not much change to
the code to put in an interface that has the current methods of Node. I
would like to move this into the current code base, but if we decide not
to do that I can work around it.
This would be better done on a branch for discussion. I'm -1 to just
putting it into trunk.
"not much change" needs a migration strategy because this is going to
affect all modules, and it's not just the project's code either.
What does it make easier?
Andy
On Mon, Dec 30, 2013 at 6:23 PM, Andy Seaborne <[email protected]> wrote:
PS
http://mail-archives.apache.org/mod_mbox/jena-dev/201207.mbox/%[email protected]%3E
On 30/12/13 18:21, Andy Seaborne wrote:
On 30/12/13 16:28, Claude Warren wrote:
For RMI I am only implementing a Graph.
It may make sense to wrap model and dataset in order to achieve better
performance (e.g. wrapping a TDB model/dataset may provide better
performance than creating a model against multiple graphs on the client
side), but for now it is just a Graph.
Will it be more performant in any measurable way?
I did have to create a model wrapper for the Security code, but that
is
another kettle of fish.
My plan is to complete the RMI implementation - 90% or so complete
now, and
add security to it (so you can restrict RMI access to specific graphs
etc).
Is there any issue with turning on the UUID inside the NodeFactory? I
see
that there is code for this.
I would also like to see Node changed to an interface -- but that is
another discussion -- I think it will keep the core cleaner as things
like
Node_Null won't pollute it.
I agree about interfaces (that's why NodeFactory has gone in) but the
detail of the exact contract needs to be clear.
One argument for them is holding per-storage info in a Node impl but
that is limited in a system like Jena where Node.equals is global and
determined by RDF semantics.
I'm looking to simplify Graph/Triple/Node, so get rid of AnonIds (a
nuisance - they show up in the RDF API). And TripleMatch. Some renaming
to sane length method names. Extensions for graphs as nodes (nested
graphs) and module-specific Nodes to reuse the storage (they never
leave a model - they help reuse things like "Triple" and "Graph" - I
found them useful in ARQ/TDB etc, for example "this pattern slot is
defined").
There is lots of potential flexibility that is not used and I think we
know now that some of that is not of any use and it just confuses.
By the way, abstract interface classes (i.e. all methods abstract) are
reported as a bit faster than interfaces.
The most important factor to me is that we do realistic steps so we do
not get caught with an unresourceable transition from Jena2 to Jena3. I
think we should only consider things that people will resource.
Node_NULL is not used anywhere - @deprecate and delete!
(Looks like it is left over from RDB days.)
Andy
JENA-189
Claude
On Mon, Dec 30, 2013 at 3:27 PM, Andy Seaborne <[email protected]>
wrote:
On 29/12/13 20:40, Claude Warren wrote:
The RMI simply exposes an existing graph implementation on a remote
system.
The normal disclaimers apply but given the standard Jena
configuration:
NodeFactory.createAnon() uses UID to create an id that would be
passed to
the graph on the remote server where the anon would be recreated.
The result is that both the client and the server have the same anon
id
for
the blank node.
Am I missing something?
Only that UIDs are, strictly, only unique for the machine they are
allocated on. RMI etc can pass them around but they only safely
identify things on the same machine as their origin (they aren't long
enough for wider uniqueness). It's the UID user's responsibility not to
present them on a non-origin machine.
Ideally, in Jena3, I'd like to use UUIDs, and then store only two longs
for blank nodes. Then they are globally safe as well as being smaller.
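A minimal sketch of the "two longs" idea, using only java.util.UUID. The class name BlankNodeIdSketch and its methods are hypothetical and not part of Jena; the point is just that a UUID splits losslessly into two 64-bit values.

```java
import java.util.UUID;

// Hypothetical blank node id that stores a UUID as two longs.
final class BlankNodeIdSketch {
    private final long hi;
    private final long lo;

    private BlankNodeIdSketch(long hi, long lo) {
        this.hi = hi;
        this.lo = lo;
    }

    // Allocate a fresh id from a random (version 4) UUID.
    static BlankNodeIdSketch fresh() {
        return of(UUID.randomUUID());
    }

    // Split an existing UUID into its two 64-bit halves.
    static BlankNodeIdSketch of(UUID uuid) {
        return new BlankNodeIdSketch(uuid.getMostSignificantBits(),
                                     uuid.getLeastSignificantBits());
    }

    // Rebuild the UUID, for example to print it as a label.
    UUID toUUID() {
        return new UUID(hi, lo);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof BlankNodeIdSketch)) return false;
        BlankNodeIdSketch other = (BlankNodeIdSketch) o;
        return hi == other.hi && lo == other.lo;
    }

    @Override public int hashCode() {
        return Long.hashCode(hi) * 31 + Long.hashCode(lo);
    }
}
```

Two longs are 16 bytes of payload, against the 36-character string form of the same UUID.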
Out of curiosity - why do you need to extend to Model? Is there a
client-side implementation of graph and then it's just a case of
wrapping a Graph just like any other graph? Or am I missing something?
Another issue in parsing is keeping label->bnode mapping. Labels
must be
matched to any previous use in the parser run.
The RIOT parsers do not use jena-core UID generation for bnode ids. If
it's a map of label to node allocated, there is a growing data
structure, something that we occasionally get reports of being a problem
as the map grows for very large parser runs.
Instead, RIOT allocates a large number (122 bits of random) and xors
it
with the label. So the internal id is calculated from the label and
is
unique yet there is no growing data structure.
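The map-free allocation described above can be sketched as follows. This is an illustration under stated assumptions, not the RIOT code: the class name LabelToIdSketch is hypothetical, and hashing the label to 128 bits with MD5 before the xor is an assumption (the message only says the random number is xor'ed with the label).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical per-parser-run allocator: id = seed XOR hash(label).
final class LabelToIdSketch {
    private final long seedHi;
    private final long seedLo;

    // One random seed per parser run, e.g. UUID.randomUUID().
    LabelToIdSketch(UUID seed) {
        this.seedHi = seed.getMostSignificantBits();
        this.seedLo = seed.getLeastSignificantBits();
    }

    // Same label within a run always yields the same id; nothing is stored.
    UUID idFor(String label) {
        ByteBuffer digest = ByteBuffer.wrap(md5(label)); // 16 bytes
        long hi = digest.getLong() ^ seedHi;
        long lo = digest.getLong() ^ seedLo;
        return new UUID(hi, lo);
    }

    // Assumption: MD5 is used only to spread the label over 128 bits.
    private static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5")
                                .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch two distinct labels in one run get distinct ids unless their hashes collide, and the same label in a different run (different seed) gets a different id, which matches the label scoping the parser needs.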
Andy
Claude
On Sun, Dec 29, 2013 at 7:43 PM, Andy Seaborne <[email protected]>
wrote:
On 29/12/13 16:58, Claude Warren wrote:
Greetings,
I have an initial implementation of an RMI based Graph that allows
one
JVM
to access a graph in a different JVM. I hope to extend this to the
Model
level in the near future. I just wanted to know if anyone was
interested
in this project.
Claude
The perennial question ...
How do you treat blank nodes?
Andy