On 10/10/11 12:10, Paolo Castagna wrote:
>> I think it's time to promote TxTDB to be the TDB trunk.
>
> +1
I'll make it so today. Probably a good idea if only one of us does that.
JENA-97 TDB 0.9.0 snapshot sometimes returns a SELECT binding twice
Awaiting confirmation. Test case does not illustrate the problem.
Not new to TxTDB.
A possible alternative reading of the report has been fixed.
I am still unclear on this. Is it a bug?
There was a bug, it's fixed. Details in JIRA.
"... actually mean the same variable appears in the iterator of
variables more than once."
If it is a bug, it is present in the latest stable TDB release and therefore
both TDB and TxTDB are affected. A TxTDB <-> TDB switch over will not make
the situation worse with respect to this.
It would be good to get to the bottom of this though before the next (Tx)TDB
release.
?? I have as far as I'm concerned.
so I propose doing a switch over by:
svn mv \
https://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk \
https://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/tags/TDB-0.8.X
svn cp \
https://svn.apache.org/repos/asf/incubator/jena/Experimental/TxTDB/trunk \
https://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk
The only reason for the second being a "cp" (I strongly prefer not
leaving visible orphan copies around) is to have a temporary version
that marks the changeover. By diff'ing TDB/trunk against
Experimental/TxTDB/trunk, it would be possible to find items to backport
to TDB-0.8.X should that be necessary. I expect the copy to be around
for a short period of time only.
+1
It's a good plan.
JENA-102 tdbloader creates stats.opt file in existing DB
Not a blocker because it's a problem with the current release.
It is well worth addressing stats.opt maintenance properly,
not just solving the point problem.
+1 on "worth addressing stats.opt maintenance properly".
A first step on this would be to add a comment on JENA-102 to clarify what
"properly" would mean in practice: what needs to be done? Or, open a new
JIRA issue for it and link it with JENA-102.
If we end up with an in-memory and on-disk solution which could (eventually)
be used to answer specific SPARQL queries such as the ones I often see in the
office and Dave mentioned recently:
SELECT DISTINCT ?p WHERE {?s ?p ?o.}
SELECT DISTINCT ?cls WHERE {?i a ?cls.}
That would be awesome.
And IMHO the wrong way to do it.
Use query caching, and prepopulate the cache with the answers. Don't
build it into the engine itself; have a special tool that does a fast
calculation and runs to maintain the cache.
Bonus prize: make the cache insensitive to the names of variables.
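
To make the caching idea concrete, here is a minimal sketch of a result cache whose keys ignore variable names. It is only an illustration under stated assumptions, not anything that exists in TDB or ARQ: the names QueryCache and normalize are hypothetical, results are modelled as plain lists of dicts, and variables are found with a simplified regular expression rather than a real SPARQL parser (so string literals, comments and keyword case are not handled).

```python
import re

# Simplified matcher for SPARQL variables such as ?s or $cls.
# Assumption: no '?' or '$' occurs inside string literals or IRIs.
_VAR = re.compile(r"[?$]([A-Za-z_][A-Za-z0-9_]*)")


def normalize(query):
    """Return (canonical query text, original variable names in first-use order)."""
    names = []

    def rename(match):
        name = match.group(1)
        if name not in names:
            names.append(name)
        return "?v%d" % names.index(name)

    # Collapse whitespace, then rename variables to ?v0, ?v1, ...
    canonical = _VAR.sub(rename, " ".join(query.split()))
    return canonical, names


class QueryCache:
    """Hypothetical cache: canonical query text -> rows keyed by canonical names."""

    def __init__(self):
        self._store = {}

    def put(self, query, rows):
        # A separate maintenance tool would call this to prepopulate answers.
        canonical, names = normalize(query)
        to_canonical = {name: "v%d" % i for i, name in enumerate(names)}
        self._store[canonical] = [
            {to_canonical[var]: value for var, value in row.items()}
            for row in rows
        ]

    def get(self, query):
        # Returns None on a miss; on a hit, rows use the caller's variable names.
        canonical, names = normalize(query)
        rows = self._store.get(canonical)
        if rows is None:
            return None
        to_original = {"v%d" % i: name for i, name in enumerate(names)}
        return [
            {to_original[var]: value for var, value in row.items()}
            for row in rows
        ]
```

With this sketch, an entry stored for "SELECT DISTINCT ?p WHERE {?s ?p ?o.}" would also be found by "SELECT DISTINCT ?pred WHERE {?x ?pred ?y.}", with the rows handed back under ?pred. Keeping the cache current after updates is the job of the separate tool and is not shown here.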
I am not proposing we do this in one shot; using stats to answer the above
SPARQL queries is a completely separate (and not necessary) step. However, keeping
this in mind and coming up with a solution which would make that possible would be
good.
Are you proposing we close JENA-102 and deal with stats.opt maintenance properly
before the TxTDB <-> TDB switch over?
No. After.
Before or after does not make a big difference to me, so long as we fix it.
Paolo
Andy
[*] As in "finish" and "decide which of two ... or both options"