SPARQL LOAD Error on Fuseki 3.4.0

2017-08-30 Thread Yasunori Yamamoto
Hello, I have upgraded Fuseki from 2.6.0 to 3.4.0 and found that the SPARQL 
LOAD update now fails, which it did not on 2.6.0.
When I start the server with the --update option and send the update request to the 
SPARQL update endpoint (i.e., “LOAD 
” to 
http://localhost:3030/myDB/update ), I get an HTTP 500 error.
The server log shows the following.

Caused by: org.apache.jena.riot.RiotException: Failed to determine the content 
type: (URI=http://dbpedia.org/resource/Joseph_Hocking : stream=text/html)

The same error occurs with a URI from another dataset server.

Caused by: org.apache.jena.riot.RiotException: Failed to determine the content 
type: (URI=http://www.wikidata.org/entity/Q2 : stream=application/json)

Neither error occurs with Fuseki 2.6.0.

Regards,
Yasunori


moving graph between tdb

2017-08-30 Thread Andrew U Frank
I have a large TDB-stored dataset with multiple graphs. Can I move one
graph from one dataset to another? What would be the command using the HTTP
protocol?
Thank you!
Andrew





Re: Fuseki TDB database size growth

2017-08-30 Thread Chris Tomlinson
Hi,

We’re going to explore using TDB2 with online compaction. We’ll be looking at 
the behavior under graph deletion and large literals. Our use case is a library 
with associated cultural heritage information (the current instance is 
https://www.tbrc.org). New information is added and corrections are made to 
existing items.

If an update is made to a Work, a Person, and so on, it would be effected in 
Jena by deleting the corresponding named graph and uploading a revised named 
graph for the individual. We intend to keep track of diffs by using git with 
Turtle files for each named graph, external to Jena.

So the online compaction would be an essential feature to keep blank nodes from 
growing unbounded.

No, we do not generally expect to rewrite the entire database.

Thanks,
Chris


> On Aug 30, 2017, at 4:27 AM, Rob Vesse  wrote:
> 
> No, it is perfectly usable as a primary database
> 
> However, if your use case regularly rewrites your entire database then you 
> are going to have problems and this would be true of any database system, 
> although obviously implementation specifics will have an impact on this.
> 
> Rob
> 
> On 22/08/2017 03:22, "Chris Tomlinson"  wrote:
> 
>Hi,
> 
>This is interesting to know about blank nodes and reference counting. Does 
> the comment regarding deleting triples not recovering blank nodes apply if an 
> entire named graph which includes some blank nodes is deleted?
> 
>If so it seems that in production Jena/TDB is expected to be periodically 
> reloaded from scratch or to not use blank nodes very much. 
> 
>In this case is Jena/TDB more aimed at use cases where it perhaps 
> functions like an index cache rather than a primary database. Is this 
> accurate? If so what sort of primary database systems are typically found 
> coupled with Jena/TDB?
> 
>Regards,
>Chris
> 
>> On Aug 21, 2017, at 05:28, Rob Vesse  wrote:
>> 
>> All the data structures used in TDB are, broadly speaking, append-only. This 
>> means that the database will tend to grow in size over time.
>> 
>> Certain ways of using the database can exacerbate this. In your example I 
>> would guess that you have a lot of blank nodes present in the data?
>> 
>> Each unique blank node generates a unique identifier inside the system and 
>> will continually expand the node table. TDB does not implement reference 
>> counting, so even if you delete every triple that references a given RDF node 
>> it will never be removed from the node table.
>> 
>> Similarly, as the indexes are updated they do not reclaim space, so the 
>> B+Trees will continue to grow over time.
>> 
>> Reloading from scratch creates a smaller database because it is able to 
>> maximally pack the data into the data structures on disk and you do not have 
>> any unused identifiers allocated.
>> 
>> Rob
>> 
>> On 21/08/2017 11:20, "Lorenzo Manzoni"  wrote:
>> 
>>   Hi,
>> 
>>   I'm writing to you because we see a behavior of Fuseki TDB that we cannot 
>>   understand:
>> 
>>   */the Fuseki database filesystem size continues to grow even if the 
>>   number of triples does not increase substantially./*
>> 
>>   We are using the latest version of Fuseki (3.4.0) as the triple store of a 
>>   Semantic MediaWiki (MW 1.24, SMW 2.1.1), and every night we have a 
>>   scheduled job that updates the wiki pages and executes maintenance 
>>   scripts (e.g. 
>>   https://www.semantic-mediawiki.org/wiki/Help:Maintenance_script_%22rebuildData.php%22). 
>>   These scripts update the semantic data on the wiki and the triples on 
>>   Fuseki. Basically, every triple is rewritten.
>> 
>>   We have observed that the Fuseki database filesystem size grew over time 
>>   to 20 GB, but when we recreate it from scratch the database size is only 
>>   500 MB.
>> 
>>   After that, the Fuseki database grows by about 200 MB every day, while the 
>>   number of triples does not change substantially.
>> 
>>   I originally assumed that the rebuild data script was the problem, but 
>>   when I executed it alone the Fuseki database space did not increase.
>> 
>>   We are running Fuseki on a 64-bit Red Hat machine.
>> 
>>   Can someone help us?
>> 
>>   Thanks in advance,
>> 
>>   Lorenzo
>> 
>> 
>> 
>> 
>> 
> 
> 
> 
> 
> 



Re: SPARQL vs Jena rules

2017-08-30 Thread baran . ha
On Sun, 27 Aug 2017 11:54:36 +0200, Lorenz B.  
 wrote:



Hello Baran,




Kind regards,
Lorenz



I think statements like

On Fri, 25 Aug 2017 11:52:46 +0200, Lorenz Buehmann
 wrote:


 Inferencing and querying are totally different
things. So why are you thinking about refactoring the whole project?


or in next posting


Again, why do you compare those two Jena mechanisms? What is the
expected outcome?


are, generally speaking, confusing and misleading for Jena users thinking
about a proper design.

Assume I have the following dev scenario:

A Jena-app with InfModel + Rules-List -> output RDF -> TDB/Fuseki -> A
Query-UI

(Query UI contains a lot of queries which users activate with a mouse
click and get responses presented in the same UI.)

Now I can try to add rules to my rules list so that I can
formulate some of the queries of my Query-UI in a more lightweight
way with better performance.

(Or, vice versa, I can change some of my queries so that I can delete some
rules from my rules list, which is not so interesting.)

The question here was to compare rules vs. SPARQL CONSTRUCT queries to
modify an existing dataset. Indeed, the combination of both might be
more powerful, although it remains open what part of the semantics to
cover by which mechanism.
Nevertheless, it's totally use case dependent - what kind of rules, how
many rules, size of the dataset, is the data volatile (forward chaining
vs. backward chaining), what kind of queries, and so on and so forth.

Good luck with the project.


Last but not least, I would still like to add this:

Experimenting with CONSTRUCT queries can in fact be a relatively direct help  
if you are struggling with the relationship between Jena rules and SPARQL at  
a 'teased' triples level, even though you may never use CONSTRUCT queries at  
the product level of your app.


Thanks to tina sani for starting this thread, 'SPARQL vs Jena rules'. This  
topic has not been handled on this list as much as it deserves. I am also  
to blame for this, because I did not have the aggressive energy and courage  
to post the right messages, though I have had a lot of problems and some  
very encouraging app cases. What a pity when I look back.


Thanks, baran

PS: I wonder why Dave doesn't comment in this thread. Perhaps because he  
thinks "Lorenz is OK", or "I cannot stand the low level of knowledge of the  
users in this thread", or "no matter what you do, with heavy data input an  
app with InfModel would hang anyway"? Lorenz is of course OK, but I 'guess'  
Jena users are also very curious about Dave's comments...




Re: 400 Not a file upload

2017-08-30 Thread Mikael Pesonen


Upgraded to version 3.4.0 and now it works!

Br,
Mikael


On 30.8.2017 13:22, Andy Seaborne wrote:

What does the server log say?
Can you upgrade?

    Andy

On 30/08/17 11:10, Mikael Pesonen wrote:


Hi,

I'm getting that error from cmd:

apache-jena-fuseki-2.4.1/bin/s-update 
--service=http://:3030/ds/update 
--file=/tmp/sparql_update_59a68dfd76a5e


/tmp/sparql_update_59a68dfd76a5e exists. What could cause that error?




Br,



--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND



Re: Latest Fuseki standalone server

2017-08-30 Thread Mikael Pesonen


Strange, now I downloaded 3.4.0 and it contains fuseki-server too. My 
mistake, sorry about that.


Br,
Mikael


On 30.8.2017 13:33, Andy Seaborne wrote:

There were also 2.5.0 and 2.6.0 before the alignment jump to 3.4.0.

What were you expecting to find, and where? (= what was not clear?)

 Andy

On 29/08/17 14:15, aj...@apache.org wrote:
Jena used to maintain separate versioning for the core and for 
Fuseki2. But no longer! You have the latest version, and you need to 
keep track of only one version number. (Thanks, Andy!)


https://issues.apache.org/jira/browse/JENA-1373


ajs6f

Mikael Pesonen wrote on 8/29/17 8:51 AM:


Hi,

is 2.4.1 still the latest fuseki-server.jar server? At least it's 
not included in current apache-jena-fuseki-3.4.0.tar.gz archive.


Br,






Re: Latest Fuseki standalone server

2017-08-30 Thread Mikael Pesonen


Okay so I'll upgrade to 2.6.0.  Thanks!

Br,
Mikael


On 30.8.2017 13:33, Andy Seaborne wrote:

There were also 2.5.0 and 2.6.0 before the alignment jump to 3.4.0.

What were you expecting to find, and where? (= what was not clear?)

 Andy

On 29/08/17 14:15, aj...@apache.org wrote:
Jena used to maintain separate versioning for the core and for 
Fuseki2. But no longer! You have the latest version, and you need to 
keep track of only one version number. (Thanks, Andy!)


https://issues.apache.org/jira/browse/JENA-1373


ajs6f

Mikael Pesonen wrote on 8/29/17 8:51 AM:


Hi,

is 2.4.1 still the latest fuseki-server.jar server? At least it's 
not included in current apache-jena-fuseki-3.4.0.tar.gz archive.


Br,






Re: 400 Not a file upload

2017-08-30 Thread Mikael Pesonen


Hi,

this is the log (removed query):

[2017-08-30 14:08:13] Fuseki INFO  [606] POST 
http://:3030/ds/query
[2017-08-30 14:08:13] Fuseki INFO  [606] POST /ds :: 'query' :: 
[application/x-www-form-urlencoded] ?

[2017-08-30 14:08:13] Fuseki INFO  [606] Query = 
[2017-08-30 14:08:13] Fuseki INFO  [606] 200 OK (12 ms)
[2017-08-30 14:08:13] Fuseki INFO  [607] POST 
http://:3030/ds/update
[2017-08-30 14:08:13] Fuseki INFO  [607] POST /ds :: 'update' :: 
[application/sparql-update] ?

[2017-08-30 14:08:13] Fuseki INFO  [607] 400 Not a file upload (0 ms)

Br
Mikael

On 30.8.2017 13:22, Andy Seaborne wrote:

What does the server log say?
Can you upgrade?

    Andy

On 30/08/17 11:10, Mikael Pesonen wrote:


Hi,

I'm getting that error from cmd:

apache-jena-fuseki-2.4.1/bin/s-update 
--service=http://:3030/ds/update 
--file=/tmp/sparql_update_59a68dfd76a5e


/tmp/sparql_update_59a68dfd76a5e exists. What could cause that error?




Br,






Re: Latest Fuseki standalone server

2017-08-30 Thread Andy Seaborne

There were also 2.5.0 and 2.6.0 before the alignment jump to 3.4.0.

What were you expecting to find, and where? (= what was not clear?)

 Andy

On 29/08/17 14:15, aj...@apache.org wrote:
Jena used to maintain separate versioning for the core and for Fuseki2. 
But no longer! You have the latest version, and you need to keep track of 
only one version number. (Thanks, Andy!)


https://issues.apache.org/jira/browse/JENA-1373


ajs6f

Mikael Pesonen wrote on 8/29/17 8:51 AM:


Hi,

is 2.4.1 still the latest fuseki-server.jar server? At least it's not 
included in current apache-jena-fuseki-3.4.0.tar.gz archive.


Br,



Re: 400 Not a file upload

2017-08-30 Thread Andy Seaborne

What does the server log say?
Can you upgrade?

Andy

On 30/08/17 11:10, Mikael Pesonen wrote:


Hi,

I'm getting that error from cmd:

apache-jena-fuseki-2.4.1/bin/s-update 
--service=http://:3030/ds/update 
--file=/tmp/sparql_update_59a68dfd76a5e


/tmp/sparql_update_59a68dfd76a5e exists. What could cause that error?




Br,



Re: Backup TDB and named model

2017-08-30 Thread Andy Seaborne



On 30/08/17 08:36, george.n...@gmx.net wrote:

*From: *Andy Seaborne 
*Sent: *Tuesday, August 29, 2017 19:31
*To: *users@jena.apache.org 
*Subject: *Re: Backup TDB and named model

On 29/08/17 15:45, aj...@apache.org wrote:

 > tdbdump (along with all of the TDB shell utilities) is available in the
 > Jena full distribution:
 >
 > https://jena.apache.org/download/index.cgi
 >
 > ajs6f
 >
 > George News wrote on 8/29/17 2:30 AM:
 >> Hi,
 >>
 >> I have a named graph that is becoming very big, and therefore searches
 >> on it are quite slow. I'm planning to make a backup from time to time
 >> and reset the data in the original.
 >>
 >> The code that I'm currently using is the one below, which in summary
 >> consists of creating a new graph based on the original one, deleting the
 >> original, and creating it from scratch.
 >>
 >> public void reset() {
 >>   dataset.begin(ReadWrite.WRITE);
 >>   try {
 >>     LocalDateTime date = LocalDateTime.now();
 >>     DateTimeFormatter formatter =
 >>         DateTimeFormatter.ofPattern("MMddHHmm");
 >>     String dateString = date.format(formatter);
 >>     String backupModelName = modelName + "-" + dateString;
 >>     dataset.addNamedModel(backupModelName, getModel());

A SPARQL Update using "MOVE" is neater.

For TDB, there is little choice but to do some kind of copy to rename.
It is a change to the quads for the graph, with no indirection to flip
the name in the storage.

  Andy

Thanks. Can you provide an example, please? When you say it’s neater, 
is it also quicker and more robust?



Personal preference:

Txn.executeWrite(dataset,()->
 UpdateAction.parseExecute("MOVE  TO ")
);

You need to sort out the  and 
(untested)

MOVE works remotely.
You could use the RDFConnection as well.

It's not likely to be quicker - it's got to do the same amount of work 
and there is no TDB magic for this.


Andy
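
A minimal sketch of how the MOVE update string for this kind of rename could be put together, assuming the timestamped naming scheme from the reset() method quoted above. The class name, method names, and the example graph IRI are hypothetical and are not part of Jena; the sketch only builds the SPARQL Update text, which would then be handed to whatever update mechanism is in use.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: builds a SPARQL 1.1 Update MOVE request for a timestamped backup graph. */
final class MoveUpdateSketch {

    // Same pattern as in the quoted reset() method: month, day, hour, minute.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("MMddHHmm");

    /** Backup graph name: the original name, a dash, then the timestamp. */
    static String backupName(String graphIri, LocalDateTime when) {
        return graphIri + "-" + when.format(STAMP);
    }

    /**
     * MOVE replaces the destination graph's contents with the source graph's data
     * and then removes the source graph.
     */
    static String moveUpdate(String fromIri, String toIri) {
        requirePlainIri(fromIri);
        requirePlainIri(toIri);
        return "MOVE GRAPH <" + fromIri + "> TO GRAPH <" + toIri + ">";
    }

    // Reject characters that SPARQL does not allow between < and >,
    // so that a bad name cannot produce a malformed update.
    private static void requirePlainIri(String iri) {
        if (iri.isEmpty()) {
            throw new IllegalArgumentException("Empty IRI");
        }
        for (char c : iri.toCharArray()) {
            if (c <= ' ' || "<>\"{}|^`\\".indexOf(c) >= 0) {
                throw new IllegalArgumentException("Not usable as an IRI: " + iri);
            }
        }
    }

    public static void main(String[] args) {
        String graph = "http://example.org/graph/observations"; // hypothetical graph name
        String backup = backupName(graph, LocalDateTime.now());
        System.out.println(moveUpdate(graph, backup));
    }
}
```

After the MOVE, the original graph name no longer holds any data, so new data can be added under it again. As noted above, this is not expected to be faster than copying; it is just a shorter way to express the rename.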





400 Not a file upload

2017-08-30 Thread Mikael Pesonen


Hi,

I'm getting that error from cmd:

apache-jena-fuseki-2.4.1/bin/s-update 
--service=http://:3030/ds/update 
--file=/tmp/sparql_update_59a68dfd76a5e


/tmp/sparql_update_59a68dfd76a5e exists. What could cause that error?

Br,




Re: Fuseki TDB database size growth

2017-08-30 Thread Rob Vesse
No, it is perfectly usable as a primary database

However, if your use case regularly rewrites your entire database then you are 
going to have problems and this would be true of any database system, although 
obviously implementation specifics will have an impact on this.

Rob
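
The effect described earlier in this thread (node identifiers are allocated once and never reclaimed, because there is no reference counting) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it is not TDB's code or on-disk layout, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a node table without reference counting (not TDB's actual implementation). */
final class AppendOnlyNodeTableSketch {
    private final Map<String, Long> nodeIds = new HashMap<>(); // node label -> id, never shrinks
    private final Set<List<Long>> triples = new HashSet<>();   // triples stored as id tuples

    // Allocate an id the first time a node is seen; known nodes reuse their id.
    private long idFor(String node) {
        Long id = nodeIds.get(node);
        if (id == null) {
            id = (long) nodeIds.size();
            nodeIds.put(node, id);
        }
        return id;
    }

    void add(String s, String p, String o) {
        triples.add(List.of(idFor(s), idFor(p), idFor(o)));
    }

    // Deleting removes the triple only; node table entries are left in place.
    void delete(String s, String p, String o) {
        Long si = nodeIds.get(s), pi = nodeIds.get(p), oi = nodeIds.get(o);
        if (si != null && pi != null && oi != null) {
            triples.remove(List.of(si, pi, oi));
        }
    }

    int tripleCount() { return triples.size(); }
    int nodeCount()   { return nodeIds.size(); }

    public static void main(String[] args) {
        AppendOnlyNodeTableSketch store = new AppendOnlyNodeTableSketch();
        // Each "nightly rewrite" replaces one triple, using a fresh blank node label.
        for (int day = 0; day < 5; day++) {
            if (day > 0) {
                store.delete("ex:page", "ex:has", "_:b" + (day - 1));
            }
            store.add("ex:page", "ex:has", "_:b" + day);
        }
        System.out.println("triples=" + store.tripleCount() + " nodes=" + store.nodeCount());
    }
}
```

In this model the triple count stays constant across rewrites while the node count keeps rising, which mirrors the reported pattern of a growing database with a stable number of triples.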

On 22/08/2017 03:22, "Chris Tomlinson"  wrote:

Hi,

This is interesting to know about blank nodes and reference counting. Does 
the comment regarding deleting triples not recovering blank nodes apply if an 
entire named graph which includes some blank nodes is deleted?

If so it seems that in production Jena/TDB is expected to be periodically 
reloaded from scratch or to not use blank nodes very much. 

In this case is Jena/TDB more aimed at use cases where it perhaps functions 
like an index cache rather than a primary database. Is this accurate? If so 
what sort of primary database systems are typically found coupled with Jena/TDB?

Regards,
Chris

> On Aug 21, 2017, at 05:28, Rob Vesse  wrote:
> 
> All the data structures used in TDB are, broadly speaking, append-only. 
> This means that the database will tend to grow in size over time.
> 
> Certain ways of using the database can exacerbate this. In your example I 
> would guess that you have a lot of blank nodes present in the data?
> 
> Each unique blank node generates a unique identifier inside the system 
> and will continually expand the node table. TDB does not implement reference 
> counting, so even if you delete every triple that references a given RDF node 
> it will never be removed from the node table.
> 
> Similarly, as the indexes are updated they do not reclaim space, so the 
> B+Trees will continue to grow over time.
> 
> Reloading from scratch creates a smaller database because it is able to 
> maximally pack the data into the data structures on disk and you do not have 
> any unused identifiers allocated.
> 
> Rob
> 
> On 21/08/2017 11:20, "Lorenzo Manzoni"  wrote:
> 
>Hi,
> 
>I'm writing to you because we see a behavior of Fuseki TDB that we cannot 
>understand:
> 
>*/the Fuseki database filesystem size continues to grow even if the 
>number of triples does not increase substantially./*
> 
>We are using the latest version of Fuseki (3.4.0) as the triple store of a 
>Semantic MediaWiki (MW 1.24, SMW 2.1.1), and every night we have a 
>scheduled job that updates the wiki pages and executes maintenance 
>scripts (e.g. 
>https://www.semantic-mediawiki.org/wiki/Help:Maintenance_script_%22rebuildData.php%22). 
>These scripts update the semantic data on the wiki and the triples on 
>Fuseki. Basically, every triple is rewritten.
> 
>We have observed that the Fuseki database filesystem size grew over time 
>to 20 GB, but when we recreate it from scratch the database size is only 
>500 MB.
> 
>After that, the Fuseki database grows by about 200 MB every day, while the 
>number of triples does not change substantially.
> 
>I originally assumed that the rebuild data script was the problem, but 
>when I executed it alone the Fuseki database space did not increase.
> 
>We are running Fuseki on a 64-bit Red Hat machine.
> 
>Can someone help us?
> 
>Thanks in advance,
> 
>Lorenzo
> 
> 
> 
> 
> 







RE: Backup TDB and named model

2017-08-30 Thread george.news


From: Andy Seaborne
Sent: Tuesday, August 29, 2017 19:31
To: users@jena.apache.org
Subject: Re: Backup TDB and named model



On 29/08/17 15:45, aj...@apache.org wrote:
> tdbdump (along with all of the TDB shell utilities) is available in the 
> Jena full distribution:
> 
> https://jena.apache.org/download/index.cgi
> 
> 
> ajs6f
> 
> George News wrote on 8/29/17 2:30 AM:
>> Hi,
>>
>> I have a named graph that is becoming very big, and therefore searches 
>> on it are quite slow. I'm planning to make a backup from time to time 
>> and reset the data in the original.
>>
>> The code that I'm currently using is the one below, which in summary 
>> consists of creating a new graph based on the original one, deleting the 
>> original, and creating it from scratch.
>>
>> public void reset() {
>>   dataset.begin(ReadWrite.WRITE);
>>   try {
>>     LocalDateTime date = LocalDateTime.now();
>>     DateTimeFormatter formatter = 
>>         DateTimeFormatter.ofPattern("MMddHHmm");
>>     String dateString = date.format(formatter);
>>     String backupModelName = modelName + "-" + dateString;
>>     dataset.addNamedModel(backupModelName, getModel());

A SPARQL Update using "MOVE" is neater.

For TDB, there is little choice but to do some kind of copy to rename. 
It is a change to the quads for the graph, with no indirection to flip 
the name in the storage.

 Andy

Thanks. Can you provide an example, please? When you say it’s neater, is it also 
quicker and more robust?