meaning of org.apache.jena.sparql.core.DatasetChanges.listen

2015-11-16 Thread Chris Dollin

Dear All

Some time recently org.apache.jena.sparql.core.DatasetChanges
grew a listen() method with the comment "Release any resources".

What sort of resources are released? Presumably finish()
does resource cleanup, so what is reset doing that finish doesn't
do? My best guess is that it is for abandoning state that is
handling an incomplete series of triples without abandoning
the entire state of the DatasetChanges implementation.

[I'm asking because ppd-index implements TextDocProducerBatch
which implements DatasetChanges and I want to know what
expectations callers of TextDocProducerBatch.reset() may have.]

Chris

--
"It does not need to take events in their correct order." /Hexwood/

Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)


Re: meaning of org.apache.jena.sparql.core.DatasetChanges.listen

2015-11-16 Thread Andy Seaborne

On 16/11/15 14:36, Chris Dollin wrote:

> Dear All

Hi Chris,

> Some time recently org.apache.jena.sparql.core.DatasetChanges

git log jena-arq/src/main/java/org/apache/jena/sparql/core/DatasetChanges.java

> grew a listen() method with the comment "Release any resources".

s/listen\(\)/reset\(\)/

> What sort of resources are released? Presumably finish()
> does resource cleanup, so what is reset doing that finish doesn't
> do? My best guess is that it is for abandoning state that is
> handling an incomplete series of triples without abandoning
> the entire state of the DatasetChanges implementation.


I can't find any use of reset().

But a sequence of changes might consist of several start-finish pairs,
each grouping related changes, while being part of a larger process that
runs across the same internal resources. In that case a final "reset()"
indicates that it is all over, e.g. at a commit. It decouples the app's
need for grouping (e.g. a small set of related changes) from a larger
grouping like a transaction.


start-finish-start-finish...-start-finish-reset
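
To make that call sequence concrete, here is a minimal sketch of a listener that keeps one shared resource open across several start-finish groups and releases it only on reset(). Everything in it is an assumption made for illustration: ChangeListener is a hypothetical stand-in rather than Jena's DatasetChanges, the String-based change() callback is not Jena's signature, and the "resource" is just a flag standing in for something like an index writer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in with the four lifecycle calls discussed above (not Jena's interface). */
interface ChangeListener {
    void start();
    void change(String action, String quad); // simplified, assumed signature
    void finish();
    void reset();
}

/** Keeps a shared resource open across start-finish groups; reset() releases it. */
class BatchingListener implements ChangeListener {
    private final List<String> currentGroup = new ArrayList<>();
    private boolean inGroup = false;
    private boolean resourceOpen = false; // stands in for e.g. an index writer
    private int groupsSinceReset = 0;

    @Override public void start() {
        if (inGroup) throw new IllegalStateException("start() without matching finish()");
        resourceOpen = true;   // acquired lazily, then reused by later groups
        inGroup = true;
    }

    @Override public void change(String action, String quad) {
        if (!inGroup) throw new IllegalStateException("change() outside start-finish");
        currentGroup.add(action + " " + quad);
    }

    @Override public void finish() {
        if (!inGroup) throw new IllegalStateException("finish() without start()");
        // The group is complete, but the shared resource deliberately stays open.
        currentGroup.clear();
        inGroup = false;
        groupsSinceReset++;
    }

    @Override public void reset() {
        // End of the larger unit (e.g. a commit): drop any partial group, release everything.
        currentGroup.clear();
        inGroup = false;
        resourceOpen = false;
        groupsSinceReset = 0;
    }

    boolean isResourceOpen() { return resourceOpen; }
    int groupsSinceReset() { return groupsSinceReset; }
}

class LifecycleDemo {
    public static void main(String[] args) {
        BatchingListener listener = new BatchingListener();
        // start-finish-start-finish-reset
        listener.start(); listener.change("ADD", "quad1"); listener.finish();
        listener.start(); listener.change("DELETE", "quad2"); listener.finish();
        System.out.println("open before reset: " + listener.isResourceOpen());
        listener.reset();
        System.out.println("open after reset: " + listener.isResourceOpen());
    }
}
```

In this sketch finish() only closes the small application-level group, while reset() marks the end of the larger unit, which is the decoupling described above. Whether a real implementation should also discard an incomplete group on reset() is a design choice this sketch merely assumes.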

Advance notice:

It looks like DatasetChanges, or an interface extending DatasetChanges, or
a better parallel interface, needs to reflect transaction boundaries
properly.

This has now come up a couple of times in different places, which
indicates that DatasetChanges isn't the right design.



> [I'm asking because ppd-index implements TextDocProducerBatch

not part of Jena

> which implements DatasetChanges and I want to know what
> expectations callers of TextDocProducerBatch.reset() may have.]

Any experience to report, especially regarding transactions and
DatasetChanges changes or replacement?

> Chris

Andy




Re: meaning of org.apache.jena.sparql.core.DatasetChanges.listen

2015-11-17 Thread Claude Warren

Looks to me like we need a set of contract tests for Dataset.  It would
make extension/implementation validation simple.

Claude



-- 
I like: Like Like - The likeliest place on the web

LinkedIn: http://www.linkedin.com/in/claudewarren


Re: meaning of org.apache.jena.sparql.core.DatasetChanges.listen

2015-11-17 Thread Andy Seaborne

On 17/11/15 10:21, Claude Warren wrote:

> Looks to me like we need a set of contract tests for Dataset.  It would
> make extension/implementation validation simple.


There are already various abstract test classes for datasets, e.g.
AbstractDatasetGraphTests, and we are using them for JENA-624.


We have identified an area where testing caught an issue very late (the 
SPARQL scripted tests picked up an index ordering issue). We are pulling 
that back into the more basic DatasetGraph tests.


DatasetChanges is not part of the Dataset(Graph) API.  It's the
interface for handling signalled changes so that behaviour can be
attached, e.g. logging changes, keeping a text index up to date (Chris's
use case), generating RDF patch files, ...


Andy


