Hi Christian,

it's been a while now, and I have tested quite a few things. Your suggestion of 
writing an intermediate XML document to disk and importing it was the fastest and 
easiest solution. Thank you for that.
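
For the archive, here is a minimal sketch of what that two-step approach can look 
like in XQuery. The database names, the output path, and the way the records are 
merged are placeholder assumptions rather than the exact code used here; the point 
is only to serialize the merged structure with file:write and create a new database 
from the file in a separate run.

(: Run 1: build the merged document and serialize it to disk.
   'db-with-data', 'db-to-insert-data' and '/tmp/merged.xml' are placeholders. :)
let $infoRecs := db:open('db-with-data')/collection/record
let $mainRecs := db:open('db-to-insert-data')/collection/record
let $merged :=
  <collection>{
    for $mainRec in $mainRecs
    let $infoRec := $infoRecs[id = $mainRec/id]
    return element record {
      $mainRec/*,
      $infoRec/*[not(name() = 'id')]
    }
  }</collection>
return file:write('/tmp/merged.xml', $merged)

(: Run 2: import the serialized file into a fresh database. :)
db:create('db-merged', '/tmp/merged.xml', 'merged.xml')

Splitting the work into two runs keeps the updating part trivial and lets the 
serialized file act as the intermediate result on disk.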


Best regards,

Michael


Mag. Michael Birkner
AK Wien - Bibliothek
1040, Prinz Eugen Straße 20-22
T: +43 1 501 65 12455
F: +43 1 501 65 142455
M: +43 664 88957669

michael.birk...@akwien.at
wien.arbeiterkammer.at



________________________________
From: Christian Grün <christian.gr...@gmail.com>
Sent: Tuesday, October 1, 2019, 15:01
To: BIRKNER Michael
Cc: BaseX
Subject: Re: [basex-talk] "Out of Memory" when inserting data from one DB to another

Hi Michael,

Your query looks pretty straightforward. As you have already guessed, it's 
simply the large number of inserted nodes that causes the memory error.

Is there any chance of assigning more memory to your BaseX Java process? If not, 
you may need to write an intermediate XML document with the desired structure 
to disk and reimport this file in a second step. You could also call your 
function multiple times and insert only part of your source data in a single 
run.
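
A minimal sketch of that last, partial-run idea (the slice size, the $start value, 
and the helper variable names are illustrative assumptions; the query would be run 
repeatedly with an adjusted $start until all records are processed):

(: Process only one slice of the source records per run; adjust $start between
   runs, e.g. 1, 100001, 200001, ... The slice size of 100000 is arbitrary. :)
let $start := 1
let $size  := 100000
let $infoRecs :=
  db:open('db-with-data')/collection/record
    [position() = $start to $start + $size - 1]
let $mainRecs := db:open('db-to-insert-data')/collection/record
for $infoRec in $infoRecs
let $mainRec := $mainRecs[id = $infoRec/id]
return insert node $infoRec/*[not(name() = 'id')] into $mainRec

Each run then only builds the pending updates for one slice of records, which keeps 
the memory footprint per execution bounded.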

Hope this helps,
Christian


On Fri, Sep 27, 2019 at 12:05 PM BIRKNER Michael 
<michael.birk...@akwien.at> wrote:

Hi to all,

I get an "Out of Memory" error (using the BaseX GUI on Ubuntu Linux) when I try 
to insert quite a lot of data into a BaseX database. The use case: I have a 
database (about 2600 MB, 13,718,400 nodes) with information in <record> 
elements that should be added to <record> elements in another database. The 
<record>s have a 1-to-1 relation, identified by an ID that is present in 
both databases.

An example (simplified) of the DB with the information I want to add to the 
other DB:

<collection>
  <record>
    <id>1</id>
    <data>Some data</data>
    <data>More data</data>
    <data>More data</data>
    ...
  </record>
  <record>
    <id>2</id>
    <data>Some data</data>
    <data>More data</data>
    <data>More data</data>
    ...
  </record>
  <record>
    <id>3</id>
    <data>Some data</data>
    <data>More data</data>
    <data>More data</data>
    ...
  </record>
  ... many many more <record>s
</collection>

Here an example (simplified) of the DB to which the above <data> elements 
should be added:

<collection>
  <record>
    <id>1</id>
    <mainData>Main data</mainData>
    <mainData>More main data</mainData>
    <mainData>More main data</mainData>
    ...
    <!-- Insert <data> elements of record with id 1 of the other database here -->
  </record>
  <record>
    <id>2</id>
    <mainData>Main data</mainData>
    <mainData>More main data</mainData>
    <mainData>More main data</mainData>
    ...
    <!-- Insert <data> elements of record with id 2 of the other database here -->
  </record>
  <record>
    <id>3</id>
    <mainData>Main data</mainData>
    <mainData>More main data</mainData>
    <mainData>More main data</mainData>
    ...
    <!-- Insert <data> elements of record with id 3 of the other database here -->
  </record>
  ... many many more <record>s
</collection>

This is the XQuery I use to insert the <data> elements from the <record>s in one 
database into the corresponding <record>s in the other database. It results 
in an "Out of Memory" error:

(: Records carrying the <data> elements to copy :)
let $infoRecs := db:open('db-with-data')/collection/record
(: Records that should receive the <data> elements :)
let $mainRecs := db:open('db-to-insert-data')/collection/record
for $infoRec in $infoRecs
  (: Find the matching target record by ID :)
  let $id := data($infoRec/id)
  let $mainRec := $mainRecs[id=$id]
  (: Copy everything except the <id> element :)
  let $dataToInsert := $infoRec/*[not(name()='id')]
  return insert node ($dataToInsert) into $mainRec

I assume that the error results from the large amount of data being processed. My 
question is whether there is a strategy for working with such an amount of 
data without running into an "Out of Memory" error.

Thanks very much in advance for any hints and advice. If you need more 
information about the DB setup or options, just let me know.

Best regards,
Michael

