Re: [Owlim-discussion] Backup in running system

2012-08-06 Thread Blasetti, Luciano (CIOK)
Ok, solved with 5.2
Thx


Re: [Owlim-discussion] Backup in running system

2012-08-06 Thread Barry Bishop

Hello Luciano,

I'm afraid I don't remember the exact issue, but there was a recent fix 
to OWLIM (most likely included in 5.2) that would allow this method of 
making a backup to work.


Given that it works in 5.2, are you able to continue on this basis?

Regards,
barry

Barry Bishop
OWLIM Product Manager
Ontotext AD
Tel: +43 650 2000 237
email: barry.bis...@ontotext.com
skype: bazbishop
www.ontotext.com


Re: [Owlim-discussion] Backup in running system

2012-08-06 Thread Blasetti, Luciano (CIOK)
Actually it seems to work fine on 5.2 but not on 5.1.


Re: [Owlim-discussion] Backup in running system

2012-08-06 Thread Blasetti, Luciano (CIOK)
Hi Barry,
I tried with

curl -X GET -H "Accept:application/x-trig" \
  "http://localhost:8080/openrdf-sesame/repositories/repo_id/rdf-graphs/service?graph=http://www.ontotext.com/explicit" \
  > backup.trig

but it seems that the produced trig file doesn't contain the original context 
names (all the triples are exported under the same unnamed context).

Any hints?

Thx,
Luciano
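
One way to check whether an export kept its context names is to count the statements per graph label. The following is only a rough sketch under stated assumptions: it applies to the N-Quads variant of the export (backup.nq), not to TriG, the helper name graph_counts is made up for illustration, and the regular expression is a simplified approximation of the N-Quads grammar rather than a full parser.

```python
import re
from collections import Counter

# Simplified N-Quads terms (assumption: good enough for a sanity check).
TERM = r'(?:<[^>]*>|_:\S+)'
LITERAL = r'"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'
QUAD = re.compile(
    rf'^\s*{TERM}\s+<[^>]*>\s+(?:{TERM}|{LITERAL})\s*(?P<graph>{TERM})?\s*\.\s*$'
)


def graph_counts(lines):
    """Count statements per graph label; None stands for the unnamed context."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = QUAD.match(line)
        if match is None:
            raise ValueError(f"not an N-Quads statement: {line!r}")
        counts[match.group("graph")] += 1
    return counts
```

Called as graph_counts(open("backup.nq", encoding="utf-8")), a result whose only key is None would mean that every statement was exported under the unnamed context, which is the symptom described above.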



Re: [Owlim-discussion] Backup in running system

2012-07-30 Thread Barry Bishop

Hi Marek,

OWLIM implements transaction isolation, so once a query starts, its 
results will not be affected by any update that is committed while it 
is running.


(This is achieved by associating a transaction with its own copy of the 
page index(es) and implementing copy-on-write semantics for database pages.)
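
The idea can be illustrated with a toy model. This is a minimal sketch of copy-on-write snapshots in general, not OWLIM's actual implementation: the PagedStore class, its methods and the scan helper are all hypothetical names, and real database pages are of course not Python tuples.

```python
class PagedStore:
    """Toy store: an index maps page numbers to immutable page contents."""

    def __init__(self):
        self._index = {}  # page number -> tuple of statements

    def snapshot(self):
        """Give a reader its own copy of the page index (pages stay shared)."""
        return dict(self._index)

    def commit(self, page_no, statement):
        """Copy-on-write: build a new page and swap it into the live index."""
        old_page = self._index.get(page_no, ())
        # The old tuple is never modified, so existing snapshots still see it.
        self._index[page_no] = old_page + (statement,)


def scan(index):
    """Read all statements reachable from one page index, in page order."""
    return [st for page_no in sorted(index) for st in index[page_no]]
```

A reader that took its snapshot before a commit keeps seeing the old pages, which is the behaviour a long-running backup query relies on.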


getStatements() will stream results when used on a local instance of 
OWLIM, but not remotely using the HTTP interface (RemoteRepositoryManager).


However, I am told that the new SparqlRepository class will stream 
results properly and would be suitable for running your backup task. 
Note that this class does not yet fully support all Repository 
operations, so it is not yet a direct replacement for HTTPRepository.


I hope this helps,
barry


Re: [Owlim-discussion] Backup in running system

2012-07-30 Thread Marek Šurek
Thank you Barry for your answer,
I tried the third option, but one thing came to mind: is it consistency-safe? 
On a big dataset the procedure takes a large amount of time, so I would like to 
ask the following.
Let's say the backup procedure takes 300 seconds.


1. If a statement is added right after I start the backup procedure, is it 
guaranteed that it won't be included in the backup file? When I used the 
getStatements() method in Sesame, it seemed to be guaranteed because the result 
is loaded into memory (which is useless for large databases, as we can't reserve 
tens of GB of memory just for the backup procedure).

2. Or is the repository read-only during the backup?
I tried running a SELECT query during the backup and it worked fine, so I think 
the only effect is a performance drop while the backup runs. Am I right?

Thank you for answers,
Marek




Re: [Owlim-discussion] Backup in running system

2012-07-30 Thread Barry Bishop

Hi Marek,

There are a number of ways to make a backup, each with their pros and 
cons. I'll try to list them all here:


1. OWLIM-Enterprise
Using OWLIM-Enterprise, one method is indeed to take a worker node out 
of the cluster, shut it down, back up the storage files, restart it and 
add it back to the cluster. Depending on the number of updates that have 
occurred while the worker node was absent, either just the missing 
updates are replayed by the active master to the worker node or a full 
replication takes place.
pros: a complete image is taken, a worker can be recreated without any 
loading/inference required
cons: cluster query performance can drop while a worker is offline; the 
cluster could become read-only if a deep replication is required 
after adding the worker node back to the cluster


2. OWLIM-Enterprise and OWLIM-SE
Execute a query to retrieve all explicit statements using the special 
explicit graph name:


SELECT *
FROM <http://www.ontotext.com/explicit>
{
  { ?s ?p ?o }
  UNION
  { GRAPH ?g { ?s ?p ?o } }
}

pros: easy to do
cons: some programming is required to store the results in a suitable 
format; some overlap will occur, with statements appearing in both named 
graphs and the default graph; it will not work with very large databases 
over the Sesame HTTP protocol, as the results are fetched in one go


3. OWLIM-Enterprise and OWLIM-SE
Use the graph store protocol, e.g. retrieve all explicit statements and 
store them in TriG format:


curl -X GET -H "Accept:application/x-trig" \
  "http://localhost:8080/openrdf-sesame/repositories/repo_id/rdf-graphs/service?graph=http://www.ontotext.com/explicit" \
  > backup.trig


pros: can be executed on the command line against a master or worker 
node, will work with very large databases by streaming results
cons: restoring a backup requires loading it like a normal file - incurs 
the reasoning overhead


You can also use the N-Quads format which is supported in OWLIM 5.2:

curl -X GET -H "Accept:text/x-nquads" \
  "http://localhost:8080/openrdf-sesame/repositories/repo_id/rdf-graphs/service?graph=http://www.ontotext.com/explicit" \
  > backup.nq
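
The two curl commands above could also be scripted. The following is a minimal Python sketch and not part of OWLIM or Sesame: the names export_url and export_repository are made up for illustration, and the server address and repo_id are the placeholders from the commands above. It percent-encodes the graph IRI in the query string, which is assumed to be equivalent to the unencoded form used with curl, and it copies the response to disk in chunks so that the client never holds the whole export in memory.

```python
import shutil
import urllib.parse
import urllib.request

# MIME types taken from the curl commands above.
FORMATS = {"trig": "application/x-trig", "nquads": "text/x-nquads"}
EXPLICIT_GRAPH = "http://www.ontotext.com/explicit"


def export_url(base, repo_id, graph=EXPLICIT_GRAPH):
    """Build the graph store URL, percent-encoding the graph IRI."""
    query = urllib.parse.urlencode({"graph": graph})
    return f"{base.rstrip('/')}/repositories/{repo_id}/rdf-graphs/service?{query}"


def export_repository(base, repo_id, out_path, fmt="trig", chunk=1 << 20):
    """Stream the export to out_path in chunks of `chunk` bytes."""
    request = urllib.request.Request(
        export_url(base, repo_id), headers={"Accept": FORMATS[fmt]}
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out, chunk)
```

For example, export_repository("http://localhost:8080/openrdf-sesame", "repo_id", "backup.nq", fmt="nquads") would correspond to the second curl command.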



Later in the year we will implement a new OWLIM plug-in for making 
online back-ups, but this is not available yet. My choice would be to 
use the graph store protocol.
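
If the query-based option 2 is used instead, the "some programming" mentioned in its cons might look roughly like the sketch below. It assumes the query results have already been fetched in the SPARQL JSON results format and that the variables are named s, p, o and g as in the query above; the function names term and to_nquads are hypothetical. It also assumes that a default-graph row repeating a named-graph statement is the overlap described above and can be dropped, which would lose a statement that genuinely exists in both places.

```python
def term(b):
    """Serialise one SPARQL JSON binding as an N-Quads term."""
    if b["type"] == "uri":
        return f"<{b['value']}>"
    if b["type"] == "bnode":
        return f"_:{b['value']}"
    # Anything else is treated as a literal; escape the special characters.
    text = (b["value"].replace("\\", "\\\\").replace('"', '\\"')
            .replace("\n", "\\n").replace("\r", "\\r"))
    if "xml:lang" in b:
        return f'"{text}"@{b["xml:lang"]}'
    if "datatype" in b:
        return f'"{text}"^^<{b["datatype"]}>'
    return f'"{text}"'


def to_nquads(bindings):
    """Yield N-Quads lines, dropping default-graph rows that repeat a named-graph statement."""
    rows = list(bindings)  # everything is held in memory, as noted in the cons
    in_named = {(term(r["s"]), term(r["p"]), term(r["o"])) for r in rows if "g" in r}
    seen = set()
    for r in rows:
        spo = (term(r["s"]), term(r["p"]), term(r["o"]))
        if "g" not in r and spo in in_named:
            continue  # overlap between the default graph and a named graph
        quad = spo + ((term(r["g"]),) if "g" in r else ())
        if quad not in seen:
            seen.add(quad)
            yield " ".join(quad) + " ."
```

The input would be the list found under results.bindings in the JSON response, and the yielded lines can be written straight to a .nq file.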


I hope this helps,
barry

Barry Bishop
OWLIM Product Manager
Ontotext AD
Tel: +43 650 2000 237
email: barry.bis...@ontotext.com
skype: bazbishop
www.ontotext.com




___
Owlim-discussion mailing list
Owlim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/owlim-discussion


[Owlim-discussion] Backup in running system

2012-07-27 Thread Marek Šurek
Hi,
I want to ask how exactly a backup is performed on a running system. I read the 
backup part of the FAQ and everything seems fine, but I'm confused by the word 
'seamlessly'. So, from a programmatic point of view:


1. Is a backup consistency-safe on a running system? Does it differ between 
OWLIM-SE and OWLIM-EE? My best guess is that in OWLIM-EE the backup is performed 
on one worker node in the following scenario:

a. One worker node is chosen and it stops being kept up to date (replicated) 
with the other worker nodes
b. A full backup is made on this worker node
c. The worker node is added back to work/replication and is brought up to date 
with the other nodes


2. Doesn't performing a backup on a running system degrade performance to such 
a level that it becomes unusable for a high number of users?

3. Is there any way to do an incremental backup? The store in use has tens of GB, 
so the backup file size and the time needed for a backup will be enormous if a 
consistent backup cannot be made on a running system.

Thank you for your support.

Best regards,
Marek