Re: [Discuss] Apache Portable Uniform RDF Runtime (PURR)

2012-11-25 Thread Paolo Castagna
On 22 Nov 2012 15:04, "Andy Seaborne"  wrote:
>
> Rob's comments on inverting the reader process [2] suggest to me pulling
out an API and I wonder if we can identify a portability layer that enables
some (not all) interoperability and mix-n-match.
>
> The term "API" is creating some confusion in the discussions triggered by
the Clerezza incubator project being noted [1][3] as "low activity"
>
> To some, it's what the application sees -- a presentation API.  To others
it is some kind of abstraction between machinery like storage, inference,
parsing and writing.  They don't have to be the same.
>
> Even if the only outcome is parser and stream processing modularity, I
think it is worth doing. Just being able to add an external "parser" to
Jena in a cleaner way than is currently possible is useful.
>
> ** Apache Portable Uniform RDF Runtime (PURR) **
>
> (OK - the "U" is a bit forced :-)
>
> To me, what we need is an abstraction that allows multiple
implementations by swapping the jars (or OSGi bundles).  So PURR is a set
of interfaces.  No state.  cf. SLF4J.
>
> There would be many presentation APIs: Model-like, RDF-ORM, Ontology, and
also for natural use in other JVM-based languages - Scala, Clojure,
whatever is the next JVM language du jour.
>
> It's not a full application library. It's rather low level.  Writing much
code directly at the interface may not be pretty.
>
> This is not the Jena Graph SPI, although that was trying to perform that
purpose but has wider coverage of functionality.  I think we can go more
minimal yet.
>
> The Jena Graph SPI has a number of handlers - events, stats, transactions
- which seem to make the problem too large.  These would be part of another
subsystem ("extends PURR") and be different in different providers.  One of
those would be Jena Graph.
>
> PURR would provide the basic concepts from RDF:
>
> Terms: IRIs, Literals, bNodes
> Triples and Quads
> Graph, Dataset
> Factories for each.
>
> and for each be quite vanilla.
>
> e.g. a literal is a lexical form, a datatype and an optional language
tag.  Immutable with getters.  Structural equality.  No value, XSD or
otherwise.
>
> (I'll do a quick sketch in another message - but don't read it as fixed,
just a concrete discussion point)
>
> Parsers:
>
> Parsers, in the general sense of anything that produces RDF from
whatever input, be it an RDF syntax or a mapping from another data format (a
conversion process), need an input stream and a factory, and emit
Triples and Quads composed of terms.  They don't need a full "graph" -
they need a destination to send Triples/Quads to (or they can be pull parsers).
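
A minimal sketch of the "destination, not graph" idea described above, in Java. Everything here is hypothetical and for illustration only: `TripleSink` and `ToyParser` are not Jena or Clerezza types, terms are plain strings (so the term factory mentioned above is left out), and the one-statement-per-line input format is invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical destination for parser output; not a Jena or Clerezza type. */
interface TripleSink {
    void triple(String subject, String predicate, String object);
}

/** Toy push "parser": one whitespace-separated "s p o" statement per line. */
final class ToyParser {
    private ToyParser() {}

    /** Reads the input line by line and pushes each statement to the sink. */
    static void parse(Reader input, TripleSink sink) {
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] terms = line.split("\\s+", 3);
                if (terms.length != 3) {
                    throw new IllegalArgumentException("Expected three terms: " + line);
                }
                sink.triple(terms[0], terms[1], terms[2]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience for the example: collect everything the parser emits. */
    static List<String> collect(String text) {
        List<String> seen = new ArrayList<>();
        parse(new StringReader(text), (s, p, o) -> seen.add(s + " " + p + " " + o));
        return seen;
    }
}
```

The parser never sees a graph. A caller that wants one can pass a sink that adds to a graph, and a caller that only wants to count or filter can pass a sink that does just that.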
>
> Writers:
>
> Writing is not the reverse of parsing - parsers produce a stream, writers
for Turtle etc need to poke around the graph to decide what will "look
nice".  Even N-triples written clustered by subject can be useful.
>
> Negatives:
>
> 1/ It's wildly ambitious and impractical to even consider portability and
abstraction.  Too much time has passed.  Waste of effort.
>
> 2/ The portability layer is so narrow that it is not helpful.
>
> 3/ No SPARQL.
> (counter: (1) SPARQL is a remote protocol - this is same-JVM).
> (counter: (2) develop a SPARQL API using PURR basic terms)
>
>
> Opinions?

Considering projects such as Any23 (currently not using Jena) and Marmotta
(about to enter incubation and not using Jena), it's a good thing to try
doing.

I suppose this will be a module within Jena. Would Any23 or Marmotta use it
or contribute to it?

Paolo

>
> Andy
>
>
> [1] http://wiki.apache.org/incubator/November2012
>
> [2] http://s.apache.org/KCv
> -->
>
http://mail-archives.apache.org/mod_mbox/jena-dev/201210.mbox/%3CC0B6979A3CA668458B697E4EA907CA940A01AF5C%40CFWEX01.americas.cray.com%3E
>
> [3] http://s.apache.org/lK
> -->
>
http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/201211.mbox/%3CCAEWfVJ%3DcKATgo32u-AZDQKq%2BmsaVM_CWRnLo_OLdTYP1jFVzAw%40mail.gmail.com%3E


Re: User Defined Functions

2012-11-25 Thread Damian Steer

On 24 Nov 2012, at 21:55, Andy Seaborne  wrote:

> On 20/11/12 18:46, Rob Vesse wrote:
>> Hi All
>> 
>> I have put in place some new functionality which I'm calling User
>> Defined Functions – it is essentially a lightweight way for users to
>> define new functions for use in their SPARQL queries without having
>> to write the Java code for the function themselves.  This means it is
>> less powerful than adding a full extension function as you can't use
>> arbitrary Java code but it provides a simple way to encapsulate
>> complex or large expressions into simple function calls, in essence
>> it is an expression aliasing mechanism.
> 
> Good idea.

+1

>> 
>> Let me know what you think and any ideas for refinement beyond what I 
>> already listed here,
> 
> - - - - - - -
> off the top of my head syntax but the big thing to do would be to add syntax 
> (esp for SPARQL Update)
> 
> DEFUN my:function1(?x, ?y, ?z) =
>  SPARQL expression ...
> ENDDEF

I tried something similar (i.e. defining functions without Java) using 
javax.script, within assemblers. Has the advantage of standard syntax and no 
Java.

Damian

Re: BulkUpdateHandler and events

2012-11-25 Thread Claude Warren
I care.  I have been working on a security wrapper (dynamic proxy) and
I have to modify it to match all the changes.  Part of the security
design was to do it so that changes to the interfaces required changes
to the security classes since this forces security evaluation of new
methods.

However, I will manage to merge those in when necessary.  Actually,
these changes will simplify the implementation.

+1

-- Claude

On Sun, Nov 25, 2012 at 4:55 PM, Andy Seaborne  wrote:
> I've had a go at removing all bulk update handler add and delete calls in
> the (graph level) BulkUpdateHandler.
>
> There is a branch "jena-core-simplified".
>
> Graph acquires
>
> removeAll()
> remove(s,p,o)   // Remove by pattern.
>
> and the utility class GraphUtil has the implementations taken from
> SimpleBulkUpdateHandler, so the code change for callers has been to replace
> the bulk update call with a call to the equivalent static in GraphUtil.
>
> 1/
> Currently, there isn't a Graph.add(Graph)
>
> 2/
> ARP has been "de-bulked" - it sends triples off to the target graph of the
> parser straight away; it used to batch them into units of 1000.
>
> 3/
> There are some issues arising around event callbacks.
>
> There are various events, one for each way triples are added.
> -- see GraphListener.
>
> notifyAddArray( Graph g, Triple [] ts )
> notifyAddTriple( Graph g, Triple t )
> notifyAddList( Graph g, List L )
> notifyAddIterator( Graph g, List it )
> notifyAddIterator( Graph g, Iterator it )
> notifyAddGraph( Graph g, Graph added )
>
> This seems rather complicated but it can't simply be removed now because all
> this is reflected at the model level:
>
> http://jena.apache.org/documentation/notes/event-handler-howto.html
>
> The main way to get events is via ModelListenerAdapter which wires the
> ModelChangedListener level to the graph level.
>
> And we don't believe there are any other implementations of Model (at least,
> amongst people who upgrade).
>
> If we switch to just two events,
>   notifyAddTriple
>   notifyDeleteTriple
>
> will anyone notice or care?
>
> In the branch, because all former bulk update operations currently
> go via GraphUtil, the old-style events are generated and the tests all pass
> unchanged.
>
> Proposal - part 1:
>
> Ask in users@j.a.o to see what, if any, use is made of model listeners.
>
> Plan to make all changes either a single "added" or "removed" call.
> If there is no evidence of use, remove extra calls now, and the next version
> is 2.10.0.
>
> Proposal - part 2:
>
> Leave, deprecated, the call Graph.getBulkUpdateHandler and implementation
> machinery of SimpleBulkUpdateHandler for one release, then remove it.
>
> Andy
>
>



-- 
I like: Like Like - The likeliest place on the web
Identity: https://www.identify.nu/user.php?cla...@xenei.com
LinkedIn: http://www.linkedin.com/in/claudewarren


BulkUpdateHandler and events

2012-11-25 Thread Andy Seaborne
I've had a go at removing all bulk update handler add and delete calls 
in the (graph level) BulkUpdateHandler.


There is a branch "jena-core-simplified".

Graph acquires

removeAll()
remove(s,p,o)   // Remove by pattern.

and the utility class GraphUtil has the implementations taken from 
SimpleBulkUpdateHandler, so the code change for callers has been to replace the 
bulk update call with a call to the equivalent static in GraphUtil.


1/
Currently, there isn't a Graph.add(Graph)

2/
ARP has been "de-bulked" - it sends triples off to the target graph of 
the parser straight away; it used to batch them into units of 1000.


3/
There are some issues arising around event callbacks.

There are various events, one for each way triples are added.
-- see GraphListener.

notifyAddArray( Graph g, Triple [] ts )
notifyAddTriple( Graph g, Triple t )
notifyAddList( Graph g, List L )
notifyAddIterator( Graph g, List it )
notifyAddIterator( Graph g, Iterator it )
notifyAddGraph( Graph g, Graph added )

This seems rather complicated but it can't simply be removed now because 
all this is reflected at the model level:


http://jena.apache.org/documentation/notes/event-handler-howto.html

The main way to get events is via ModelListenerAdapter which wires the 
ModelChangedListener level to the graph level.


And we don't believe there are any other implementations of Model (at 
least, amongst people who upgrade).


If we switch to just two events,
  notifyAddTriple
  notifyDeleteTriple

will anyone notice or care?
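
A rough sketch of what the two-event model could look like, assuming bulk operations are reduced to per-triple calls by a utility class. All names here (`SimpleGraphListener`, `SketchGraph`, `SketchGraphUtil`, `RecordingListener`) are hypothetical stand-ins, not the Jena types; triples are plain strings, and notifying only when the graph actually changes is a choice made for this sketch, not a statement about current behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical listener with only the two proposed events. */
interface SimpleGraphListener {
    void notifyAddTriple(String triple);
    void notifyDeleteTriple(String triple);
}

/** Toy graph: a set of triples plus registered listeners. */
final class SketchGraph {
    private final Set<String> triples = new LinkedHashSet<>();
    private final List<SimpleGraphListener> listeners = new ArrayList<>();

    void register(SimpleGraphListener listener) {
        listeners.add(listener);
    }

    void add(String triple) {
        // Assumption of this sketch: fire only if the set changed.
        if (triples.add(triple)) {
            for (SimpleGraphListener l : listeners) {
                l.notifyAddTriple(triple);
            }
        }
    }

    void delete(String triple) {
        if (triples.remove(triple)) {
            for (SimpleGraphListener l : listeners) {
                l.notifyDeleteTriple(triple);
            }
        }
    }

    int size() {
        return triples.size();
    }
}

/** Bulk helpers expressed purely as loops over the single-triple calls. */
final class SketchGraphUtil {
    private SketchGraphUtil() {}

    static void add(SketchGraph graph, Iterable<String> triples) {
        for (String t : triples) {
            graph.add(t);
        }
    }

    static void delete(SketchGraph graph, Iterable<String> triples) {
        for (String t : triples) {
            graph.delete(t);
        }
    }
}

/** Listener that records events in order, handy for checking behaviour. */
final class RecordingListener implements SimpleGraphListener {
    final List<String> events = new ArrayList<>();

    @Override public void notifyAddTriple(String triple) {
        events.add("add " + triple);
    }

    @Override public void notifyDeleteTriple(String triple) {
        events.add("delete " + triple);
    }
}
```

With this shape a listener implements two methods regardless of how the triples arrived (array, list, iterator or graph), at the cost of losing the information that they arrived as one batch.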

In the branch, because all former bulk update operations currently 
go via GraphUtil, the old-style events are generated and the 
tests all pass unchanged.


Proposal - part 1:

Ask in users@j.a.o to see what, if any, use is made of model listeners.

Plan to make all changes either a single "added" or "removed" call.
If there is no evidence of use, remove extra calls now, and the next 
version is 2.10.0.


Proposal - part 2:

Leave, deprecated, the call Graph.getBulkUpdateHandler and 
implementation machinery of SimpleBulkUpdateHandler for one release, 
then remove it.


Andy




[jira] [Updated] (JENA-353) RDFList.append doc indicates a copy is created not always so.

2012-11-25 Thread Ian Dickinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Dickinson updated JENA-353:
---

Assignee: Ian Dickinson

> RDFList.append doc indicates a copy is created not always so.
> -
>
> Key: JENA-353
> URL: https://issues.apache.org/jira/browse/JENA-353
> Project: Apache Jena
> Issue Type: Bug
> Affects Versions: Jena 2.7.4
> Reporter: Claude Warren
> Assignee: Ian Dickinson
>
> Documentation for RDFList.append() indicates that a copy is created, both 
> lists are merged into the copy and the copy returned so as to be "non 
> side-effecting operation on either this list or the given list".
> However, in the case where this list is empty the other list is returned.  I 
> believe a copy of the other list should be returned in this case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [Discuss] Apache Portable Uniform RDF Runtime (PURR)

2012-11-25 Thread Andy Seaborne

On 23/11/12 10:49, Reto Bachmann-Gmür wrote:

> In Clerezza we put a strong emphasis on identity criteria. This is also a
> reason why no factories are part of the API. Two nodes are identical and
> can be used interchangeably iff they are equal according to the relevant
> specs, so it shall not matter if you got your instance from a factory or
> implemented the interface yourself, the two instances behave the same in
> all contexts.


Same debate for Jena Graph SPI a long time ago.

The obvious follow-through to me is that there should be classes (final 
even), not interfaces, because they have strong equality rules.


Only if the interfaces are to be implemented directly, and not via a 
copy transformation, do interfaces seem to have a real reason for being 
there.
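
To make the "final class with strong equality rules" option concrete, here is a minimal sketch of a literal as described earlier in the thread: a lexical form, a datatype and an optional language tag, immutable, with getters and structural equality, and no value interpretation. The class name `SketchLiteral` and its method names are hypothetical, and comparing language tags exactly as given (no case normalisation) is an assumption of this sketch.

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical PURR-style literal; not a Jena or Clerezza class. */
final class SketchLiteral {
    private final String lexicalForm;
    private final String datatypeIri;
    private final String languageTag; // null when there is no language tag

    SketchLiteral(String lexicalForm, String datatypeIri, String languageTag) {
        this.lexicalForm = Objects.requireNonNull(lexicalForm, "lexicalForm");
        this.datatypeIri = Objects.requireNonNull(datatypeIri, "datatypeIri");
        this.languageTag = languageTag;
    }

    String getLexicalForm() {
        return lexicalForm;
    }

    String getDatatypeIri() {
        return datatypeIri;
    }

    Optional<String> getLanguageTag() {
        return Optional.ofNullable(languageTag);
    }

    /** Structural equality: all three fields compared as plain strings. */
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SketchLiteral)) {
            return false;
        }
        SketchLiteral that = (SketchLiteral) other;
        return lexicalForm.equals(that.lexicalForm)
                && datatypeIri.equals(that.datatypeIri)
                && Objects.equals(languageTag, that.languageTag);
    }

    @Override
    public int hashCode() {
        return Objects.hash(lexicalForm, datatypeIri, languageTag);
    }
}
```

Because the class is final and equality is purely structural, two instances built anywhere with the same three strings are interchangeable. Since there is no value space, "1" and "01" typed as xsd:integer are different literals here.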



> Identity was also the reason for having the distinction
> between immutable and mutable graphs. The RDF specification defines when two
> graphs are equal (they are if they are isomorphic) but this criterion can
> only be matched to Object.equals if the graphs aren't mutable (as you
> otherwise run into big problems).


Slight confusion between identity and equivalence.  Two graphs are 
equivalent by bNode-isomorphism; they are still different.


Equivalence is context sensitive.  There may be other information, now 
or later, that breaks that equivalence.  But if two things have the same 
identity, that can't happen.


RDF 1.1 is (probably) going to spell this out more clearly.

Work-in-progress:
http://www.w3.org/2011/rdf-wg/wiki/User:Rcygania2/B-Scopes

Andy



Contract Tests?

2012-11-25 Thread Claude Warren
I am looking for a set of tests that I would call contract tests.

I am not looking for unit tests that show a model insert works but
rather things like the following:

// create a model
Model m = ModelFactory.createDefaultModel();
// create a resource attached to the model
Resource r = m.createResource( "foo" );
// show the resource has no properties in the model yet
Assert.assertEquals( 0, r.listProperties().toList().size() );
// add a statement about the resource to the model
m.add( r, ResourceFactory.createProperty( "bar" ), "Bar" );
// add a statement using a non-model resource with the same URI
m.add( ResourceFactory.createResource( "foo" ),
       ResourceFactory.createProperty( "baz" ), "Baz" );
// verify that the existing resource now sees both statements
Assert.assertEquals( 2, r.listProperties().toList().size() );

Basically a set of tests that prove an implementation meets all the
contracts of an API is what I am looking for.
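
One way such a suite could be organised, sketched without any test framework: the checks are written once against the interface and parameterised by a factory, so every implementation runs the same rules. The `TripleStore` interface, the two implementations and the three rules are all invented for this example; they are not Jena's API or its actual contracts.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical API whose contract every implementation should honour. */
interface TripleStore {
    void add(String triple);
    boolean contains(String triple);
    int size();
}

/** Contract checks written once, run against any implementation. */
final class TripleStoreContract {
    private TripleStoreContract() {}

    /** Returns the violated rules; an empty list means the contract holds. */
    static List<String> check(Supplier<TripleStore> factory) {
        List<String> failures = new ArrayList<>();
        TripleStore store = factory.get();
        if (store.size() != 0) {
            failures.add("a new store must be empty");
        }
        store.add("a b c");
        if (!store.contains("a b c")) {
            failures.add("an added triple must be contained");
        }
        store.add("a b c");
        if (store.size() != 1) {
            failures.add("adding a duplicate must not change the size");
        }
        return failures;
    }
}

/** Implementation with set semantics: satisfies the contract. */
final class SetStore implements TripleStore {
    private final Set<String> triples = new HashSet<>();
    @Override public void add(String triple) { triples.add(triple); }
    @Override public boolean contains(String triple) { return triples.contains(triple); }
    @Override public int size() { return triples.size(); }
}

/** Implementation that keeps duplicates: violates the contract. */
final class ListStore implements TripleStore {
    private final List<String> triples = new ArrayList<>();
    @Override public void add(String triple) { triples.add(triple); }
    @Override public boolean contains(String triple) { return triples.contains(triple); }
    @Override public int size() { return triples.size(); }
}
```

Calling `TripleStoreContract.check(SetStore::new)` and `TripleStoreContract.check(ListStore::new)` runs identical rules against both. In a JUnit setting the same idea is usually expressed as an abstract test class with an abstract factory method that each implementation's test subclass overrides.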



Re: [jira] [Commented] (JENA-353) RDFList.append doc indicates a copy is created not always so.

2012-11-25 Thread Claude Warren
Ian,

My issue is that the contract as spelled out in the documentation is
not always kept.  From the documentation I would assume that a copy is
returned thus I can execute:

L3 = L1.append( L2 );
L3.removeHead();

and not have L1 or L2 affected.

However, if L1 == nil

then L3.removeHead() removes the head of L2.

Conversely, from the documentation I would expect that

L3 = L1.append( L2 );
L2.removeHead();

would not impact L3, but it does in the case where L1 == nil.
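
The aliasing concern can be illustrated with a toy list, independent of Jena. `ToyList` below is a hypothetical stand-in and not RDFList: `appendSharing` mimics the reported behaviour (an empty receiver returns the argument itself), while `appendCopying` always builds a fresh list.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy mutable list used only to illustrate aliasing; not Jena's RDFList. */
final class ToyList {
    private final List<String> items = new ArrayList<>();

    static ToyList of(String... values) {
        ToyList list = new ToyList();
        for (String v : values) {
            list.items.add(v);
        }
        return list;
    }

    boolean isEmpty() {
        return items.isEmpty();
    }

    int size() {
        return items.size();
    }

    /** Removes the first element in place. */
    void removeHead() {
        items.remove(0);
    }

    /** Mimics the reported behaviour: an empty receiver hands back the argument itself. */
    ToyList appendSharing(ToyList other) {
        if (isEmpty()) {
            return other; // result and argument are now the same object
        }
        return appendCopying(other);
    }

    /** Always builds a new list, so later edits cannot leak between result and inputs. */
    ToyList appendCopying(ToyList other) {
        ToyList result = new ToyList();
        result.items.addAll(items);
        result.items.addAll(other.items);
        return result;
    }
}
```

With `appendSharing`, removing the head of the result of `ToyList.of().appendSharing(l2)` also shortens `l2`, because both names refer to one object. With `appendCopying` the two stay independent, at the cost of a copy on every call.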

Disclaimer:  I am new to the RDFList construct so perhaps I am missing
something.

-- Claude


On Sun, Nov 25, 2012 at 12:30 AM, Ian Dickinson (JIRA)  wrote:
>
> [ 
> https://issues.apache.org/jira/browse/JENA-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503443#comment-13503443
>  ]
>
> Ian Dickinson commented on JENA-353:
> 
>
> What's the use case when this makes a difference? The documentation is 
> correct: when you do
>
> L1.append( L2 )
>
> where L1 == nil, the post-condition remains true that neither list has a 
> different set of triples than it did before the append operation. If your 
> application needs to copy the list so that the appended list (or the 
> original) can be side-effected by other code, you have `RDFList.copy`. I'm 
> happy to consider changes to the current contract, but the motivation needs 
> to be clearer. Copying a list is potentially an expensive operation, so it 
> should be very clear to the caller when it is going to take place.
>
>> RDFList.append doc indicates a copy is created not always so.
>> -
>>
>> Key: JENA-353
>> URL: https://issues.apache.org/jira/browse/JENA-353
>> Project: Apache Jena
>> Issue Type: Bug
>> Affects Versions: Jena 2.7.4
>> Reporter: Claude Warren
>>
>> Documentation for RDFList.append() indicates that a copy is created, both 
>> lists are merged into the copy and the copy returned so as to be "non 
>> side-effecting operation on either this list or the given list".
>> However, in the case where this list is empty the other list is returned.  I 
>> believe a copy of the other list should be returned in this case.
>