Re: [basex-talk] Out of Main Memory

2024-03-15 Thread Thompson, Timothy
Hi, Greg,

Assuming you have multiple cores available, you can also execute a search in 
parallel using the (BaseX-specific) xquery:fork-join function[1]. That’s what I 
usually do when searching across databases.

All best,
Tim

[1] https://docs.basex.org/wiki/XQuery_Module#xquery:fork-join
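
A minimal sketch of the parallel search described above, assuming the databases are named books1 through books100 as in the documentation loop quoted elsewhere in this thread. The search term and the book/title path are placeholders, not details from the original message.

```xquery
(: Sketch only: database names, path, and search term are assumptions. :)
let $term := 'theology'
return xquery:fork-join(
  (: build one zero-argument function per database :)
  for $i in 1 to 100
  return function() {
    db:get('books' || $i)//book/title[text() contains text { $term }]
  }
)
```

Each function item is evaluated in its own thread, so the benefit depends on the number of available cores and on how evenly the documents are spread across the databases.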


--
Tim A. Thompson (he, him)
Librarian for Applied Metadata Research
Yale University Library
www.linkedin.com/in/timathompson



Re: [basex-talk] Out of Main Memory

2024-03-15 Thread Murray, Gregory
Thanks, Christian. Distributing documents across many databases sounds fine, as 
long as XPath expressions and full-text searching remain reasonably efficient. 
In the documentation, the example of addressing multiple databases uses a loop:

for $i in 1 to 100
return db:get('books' || $i)//book/title

Is that the preferred technique?

Also, is it possible to perform searches in the same manner without interfering 
with relevance scores?

Thanks,
Greg
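
For illustration, the loop from the documentation can be combined with a full-text predicate and a score variable, so that hits from all databases are sorted together. This is a hedged sketch: the search term is a placeholder, and whether scores computed in separate databases are directly comparable is not answered in this thread.

```xquery
(: Sketch only: 'memory' is a placeholder term; books1..books100 follow
   the naming used in the documentation example. :)
for $i in 1 to 100
for $title score $s in
  db:get('books' || $i)//book/title[text() contains text 'memory']
order by $s descending
return <hit db="books{$i}" score="{$s}">{ string($title) }</hit>
```

The single FLWOR expression orders the hits from every database in one pass, which makes it easy to inspect how the scores from different databases line up.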





Re: [basex-talk] Out of Main Memory

2024-03-15 Thread Christian Grün
Hi Greg,

I would have guessed that 12 GB is enough for 4.7 GB; but it sometimes
depends on the input. If you like, you can share a single typical document
with us, and we can have a look at it. 61 GB will be too large for a
complete full-text index, though. However, it’s always possible to
distribute documents across multiple databases and access them with a single
query [1].

The full-text index is not incremental (in contrast to the other index
structures), which means it must be re-created after updates. However,
it’s possible to re-index updated database instances and query fully
indexed databases at the same time.

Hope this helps,
Christian

[1] https://docs.basex.org/wiki/Databases
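
As a rough illustration of re-indexing a single updated database while the others remain available for queries, the full-text index of one instance can be rebuilt with db:optimize. The database name books42 is hypothetical.

```xquery
(: Sketch only: 'books42' stands for whichever database received updates.
   The second argument requests a full optimization; the options map asks
   for the full-text index to be re-created. :)
db:optimize('books42', true(), map { 'ftindex': true() })
```

Because only one database is rebuilt at a time, the memory needed for the index build should be bounded by the size of that database rather than by the whole collection.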




Re: [basex-talk] Out of Main Memory

2024-03-15 Thread Murray, Gregory
PS. I could ask the IT department here to set up a virtual server for me that 
would have ample memory and disk space. Do you have any idea how much memory 
would be needed to optimize something like 60+ GB of text?





Re: [basex-talk] Out of Main Memory

2024-03-15 Thread Murray, Gregory
Hi Bridger,

Thank you for this tip. It looks like it might apply only to adding new 
documents, whereas my main problem at the moment is reindexing existing 
documents, but I will look into it further.

Thanks,
Greg





Re: [basex-talk] Out of Main Memory

2024-03-14 Thread Bridger Dyson-Smith
Hi Greg,

Have you tried experimenting with the ADDCACHE[1] option when building your
database? While it's been a bit, I recall having good results with it,
especially in a RAM-constrained environment.
Hope that's helpful!
Best,
Bridger

[1] https://docs.basex.org/wiki/Options#ADDCACHE
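
A hedged sketch of how ADDCACHE might be applied when adding a batch of documents from XQuery. The database name, input directory, and target path are hypothetical, and passing the option in lower case through the options map of db:add is an assumption to check against the linked documentation.

```xquery
(: Sketch only: names and paths are placeholders. With the cache enabled,
   the new documents are first written to disk before being added to the
   database, which should reduce main-memory pressure for large inputs. :)
db:add('books', '/path/to/new-batch/', 'batch-2024-03',
  map { 'addcache': true() })
```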



Re: [basex-talk] Out of Main Memory

2024-03-14 Thread Murray, Gregory
Thanks, Christian. I don’t think selective indexing is applicable in my use 
case, because I need to perform full-text searches on the entirety of each 
document. Each XML document represents a physical book that was digitized, and 
the structure of each document is essentially a header with metadata and a body 
with the OCR text of the book. The OCR text is split into pages, where one 
 element contains all the words from one corresponding printed page from 
the physical book. Obviously the number of words in each  varies widely 
based on the physical dimensions of the book and the typeface.

So far, I have loaded 12,331 documents, containing a total of 2,196,771 pages. 
The total size of those XML documents on disk is 4.7GB. But that is only a 
fraction of the total number of documents I want to load into BaseX. The total 
number is more like 160,000 documents. Assuming that the documents I’ve loaded 
so far are a representative sample, and I believe that’s true, then the total 
size of the XML documents on disk, prior to loading them into BaseX, would be 
about 4.7GB * 13 = 61.1GB.
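
The estimate above can be reproduced with a few lines of XQuery. The variable names are illustrative; the figures are the ones given in this message.

```xquery
(: Figures taken from the message; variable names are illustrative. :)
let $loaded-docs  := 12331
let $loaded-pages := 2196771
let $loaded-gb    := 4.7
let $target-docs  := 160000
let $factor       := $target-docs div $loaded-docs
return (
  'scale factor: ' || round($factor, 2),
  'estimated total (GB): ' || round($loaded-gb * $factor, 1),
  'average pages per document: ' || round($loaded-pages div $loaded-docs)
)
```

The scale factor comes out just under 13, which gives an estimate of about 61 GB and roughly 178 pages per document.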

Normally the OCR text, once loaded, almost never changes. But the metadata 
fields do change as corrections are made. Also we add more XML documents 
routinely as we digitize more books over time. Therefore updates and additions 
are commonplace, such that keeping indexes up to date is important, to allow 
full-text searches to stay performant. I’m wondering if there are techniques 
for optimizing such quantities of text.

Thanks,
Greg





Re: [basex-talk] Out of Main Memory

2024-03-14 Thread Christian Grün
Hi Greg,

A quick reply: If only parts of your documents are relevant for full-text
queries, you can restrict the selection with the FTINDEX option (see [1]
for more information).

How large is the total size of your input documents?

Best,
Christian

[1] https://docs.basex.org/wiki/Indexes#Selective_Indexing
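
As an illustration of selective indexing, the full-text index can be limited to specific elements when a database is optimized. The database name and the element name page are hypothetical, and the FTINCLUDE option is taken from the selective-indexing documentation linked above rather than from this message.

```xquery
(: Sketch only: 'books' and 'page' are placeholders. Only text below
   elements with the given name would be included in the full-text index. :)
db:optimize('books', true(),
  map { 'ftindex': true(), 'ftinclude': 'page' })
```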





[basex-talk] Out of Main Memory

2024-03-12 Thread Murray, Gregory
Hello,

I’m working with a database that has a full-text index. I have found that if I 
iteratively add XML documents, then optimize, add more documents, optimize 
again, and so on, eventually the “optimize” command will fail with “Out of Main 
Memory.” I edited the basex startup script to change the memory allocation from 
-Xmx2g to -Xmx12g. My computer has 16 GB of memory, but of course the OS uses 
up some of it. I have found that if I exit memory-hungry programs (web browser, 
Oxygen), start basex, and then run the “optimize” command, I still get “Out of 
Main Memory.” I’m wondering if there are any known workarounds or strategies 
for this situation. If I understand the documentation about indexes correctly, 
index data is periodically written to disk during optimization. Does this mean 
that running optimize again will pick up where the previous attempt left off, 
such that running optimize repeatedly will eventually succeed?

Thanks,
Greg


Gregory Murray
Director of Digital Initiatives
Wright Library
Princeton Theological Seminary




Re: [basex-talk] out of main memory

2016-03-04 Thread michele . greco2
Hi Christian,

I solved it and managed to increase the memory by changing basex.bat
rather than basexgui.bat.

Thanks for your help.

Regards,
Michele




Re: [basex-talk] out of main memory

2016-03-03 Thread michele . greco2
Hi Christian, I tried it, and this is the error I get:

Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.


 




Re: [basex-talk] out of main memory

2016-03-03 Thread Christian Grün
> I Christian, i tried and this is my error:
> Error:Could not create the Java Virtual Machine

Did you proceed exactly as described in my last mail? What does the
modified line in your script look like?



Re: [basex-talk] out of main memory

2016-03-03 Thread Christian Grün
> I tried to do what i suggested Costantine with the value 1024m, but my
> memory is alwais 495m.

Please try the following:

1. Download the ZIP version of BaseX (http://files.basex.org/releases/latest/)

2. Unzip it, edit bin/basexgui.bat (provided that you are using Windows):

OLD:
set BASEX_JVM=-Xmx512m %BASEX_JVM%

NEW:
set BASEX_JVM=-Xmx4g %BASEX_JVM%

3. Start bin/basexgui.bat

4. Click on the lower right memory bar and compare it with the
attached screenshot.
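
The OLD/NEW edit in step 2 can also be made programmatically. This is a minimal sketch, not part of BaseX: the helper name set_max_heap is made up for illustration, and it assumes the script line carries a single -Xmx flag in the form shown above.

```python
import re


def set_max_heap(line, new_value):
    """Replace the -Xmx value in one script line, leaving the rest untouched."""
    return re.sub(r"-Xmx\d+[kKmMgG]?", f"-Xmx{new_value}", line)


old = "set BASEX_JVM=-Xmx512m %BASEX_JVM%"
new = set_max_heap(old, "4g")
print(new)  # set BASEX_JVM=-Xmx4g %BASEX_JVM%
```

A line without an -Xmx flag is returned unchanged, so applying the helper to every line of a script only touches the heap setting.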

Does this help?
Christian





Re: [basex-talk] out of main memory

2016-03-03 Thread michele . greco2
I tried what Constantine suggested, with the value 1024m, but my
memory is always 495m.




Re: [basex-talk] out of main memory

2016-03-03 Thread Christian Grün
> queries work. The Query on  the extraction of text from files. I 495m
> available and would like increase it to at least 1024m.

So which value did you assign via -Xmx? Do you start BaseX via basexgui.bat?




Re: [basex-talk] out of main memory

2016-03-03 Thread michele . greco2
Hi,

I created a collection of XML files, and that all went fine. The out-of-memory
error happens when I run some queries on my collection; other queries work.
The query in question extracts text from the files. I have 495m
available and would like to increase it to at least 1024m.


Michele




Re: [basex-talk] out of main memory

2016-03-03 Thread Christian Grün
Hi Michele,

How much memory do you have available on your system, and how much did
you assign? What kind of operations are triggering the out-of-memory
errors?

Christian





Re: [basex-talk] out of main memory

2016-03-03 Thread michele . greco2
Hi Constantine, thanks for your answer.

I'm trying to do what you suggested, but all my tests so far have had no
effect.

Is there anything in particular I need to be careful about when following
your suggestion?


 


Regards,


Michele




Re: [basex-talk] out of main memory

2016-03-01 Thread Hondros, Constantine (ELS-AMS)
Hi Michele,


1. Go to <BASEX_INSTALL>/bin
2. Open basexgui.bat (Windows) or basexgui (*nix)
3. Increase the value of -Xmx in the BASEX_JVM variable from 512 MB to something
   that your system can support
4. Restart the GUI

For example, here is the line from my config, where I allocate almost 5 gigabytes
of memory:

set BASEX_JVM=-Xmx5000m %BASEX_JVM%
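
To see why 5000m is "almost" 5 gigabytes: the JVM reads the k, m and g suffixes of -Xmx as multiples of 1024. The conversion can be sketched as follows; the helper name xmx_to_bytes is made up for illustration and only handles the plain forms used in this thread.

```python
import re

# The JVM treats k, m and g as powers of 1024; no suffix means bytes.
UNIT_FACTORS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def xmx_to_bytes(flag):
    """Convert a flag such as '-Xmx5000m' to a number of bytes."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", flag)
    if match is None:
        raise ValueError(f"not a -Xmx flag: {flag!r}")
    number, unit = match.groups()
    return int(number) * UNIT_FACTORS[unit.lower()]


gib = xmx_to_bytes("-Xmx5000m") / 1024**3
print(f"{gib:.2f} GiB")  # 4.88 GiB
```

So -Xmx5000m asks for slightly less than -Xmx5g would.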

Regards,
Constantine


From: basex-talk-boun...@mailman.uni-konstanz.de 
[mailto:basex-talk-boun...@mailman.uni-konstanz.de] On Behalf Of 
michele.gre...@email.it
Sent: 01 March 2016 10:48
To: basex-talk@mailman.uni-konstanz.de
Subject: [basex-talk] out of main memory

Hi, i get the error "out of main memory", when i run my query on a collection 
in basex gui.
How can i fix?
How can i increase my memory?
Regards
Michele





Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The 
Netherlands, Registration No. 33156677, Registered in The Netherlands.


[basex-talk] out of main memory

2016-03-01 Thread michele . greco2
Hi, I get the error "out of main memory" when I run my query on a
collection in the BaseX GUI.

How can I fix this?

How can I increase my memory?


Regards 


Michele




 
 

Re: [basex-talk] Out of Main Memory - Inconsistent DB

2015-09-14 Thread Christian Grün
Hi Martín,

> I think one of our DB seems to have gotten corrupt. I wonder if there's
> a command that will recover it.

Unfortunately, that's generally a difficult task. However, it may be
important to find out how the problem was caused: Can you tell if
there was any update that was manually interrupted?

Christian


> Issuing REPLACE or OPTIMIZE on it will
> result in "Out of Main Memory" errors. We have DBs that occupy more space
> and have more documents, so I assume the problem must be that this DB broke
> somehow. I've tried raising the -Xmx parameter to 1024m, but still get the
> same "Out of Main Memory" error.
>
>  Thanks,
>   Martín.
>
>  INSPECT
>  Checking main table (11292441 nodes):
> - 0 invalid node kinds
> - 0 invalid parent references
> - 87 wrong parent/descendant relationships (pre: 10932445,..)
> Warning: Database is inconsistent.
>
>
> INFO DB
> Database Properties
>  Name: OrderInquiryV2_1
>  Size: 2138 MB
>  Nodes: 11292441
>  Documents: 251980
>  Binaries: 0
>  Timestamp: 2015-09-10T16:40:29.000Z
>
> Resource Properties
>  Timestamp: 2015-09-08T23:11:28.497Z
>  Encoding: UTF-8
>  CHOP: false
>
> Indexes
>  Up-to-date: false
>  TEXTINDEX: false
>  ATTRINDEX: false
>  FTINDEX: false
>  LANGUAGE: English
>  STEMMING: false
>  CASESENS: false
>  DIACRITICS: false
>  STOPWORDS:
>  UPDINDEX: false
>  AUTOOPTIMIZE: false
>  MAXCATS: 100
>  MAXLEN: 96
>


[basex-talk] Out of Main Memory - Inconsistent DB

2015-09-10 Thread Martín Ferrari
Hi all,

I think one of our DBs has become corrupt. I wonder if there's a command
that will recover it. Issuing REPLACE or OPTIMIZE on it results in "Out of
Main Memory" errors. We have DBs that occupy more space and have more
documents, so I assume the problem must be that this DB broke somehow. I've
tried raising the -Xmx parameter to 1024m, but still get the same "Out of
Main Memory" error.

Thanks,
Martín.

INSPECT
Checking main table (11292441 nodes):
- 0 invalid node kinds
- 0 invalid parent references
- 87 wrong parent/descendant relationships (pre: 10932445,..)
Warning: Database is inconsistent.

INFO DB
Database Properties
 Name: OrderInquiryV2_1
 Size: 2138 MB
 Nodes: 11292441
 Documents: 251980
 Binaries: 0
 Timestamp: 2015-09-10T16:40:29.000Z

Resource Properties
 Timestamp: 2015-09-08T23:11:28.497Z
 Encoding: UTF-8
 CHOP: false

Indexes
 Up-to-date: false
 TEXTINDEX: false
 ATTRINDEX: false
 FTINDEX: false
 LANGUAGE: English
 STEMMING: false
 CASESENS: false
 DIACRITICS: false
 STOPWORDS:
 UPDINDEX: false
 AUTOOPTIMIZE: false
 MAXCATS: 100
 MAXLEN: 96
  

Re: [basex-talk] "Out of Main Memory" error

2014-02-04 Thread Christian Grün
Hi Geoff,

Have you assigned enough memory to the JVM on the 48GB machine (via
-Xmx)? You may need to edit the BaseX start scripts [1] for that (but
it depends on how you are starting BaseX).

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Start_Scripts


> I get an "out of main memory" error when I try an xquery in Basex GUI on a
> ~5GB xml database file on an HP 48GB RAM 12 core Z600 workstation.  I could
> run the query on my 8GB RAM laptop without a problem.  Is there some way I
> can increase my available memory on my workstation to accommodate large xml
> files?
>
> Thanks.
>
> Geoff
>
___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk


Re: [basex-talk] Out Of Main Memory error while creating BaseX DB

2013-10-23 Thread Christian Grün
Hi Elangovan,

> I am using BaseX to evaluate Xpaths (Xqueries) for larger size of XML files.
> I have created BaseX database initially using CreateDB.execute(ctx) API.

What do you mean by "CreateDB.execute(ctx) API"? Could you provide
us with a minimal example?

Best,
Christian


>
> And every time when I need to evaluate Xpath I open database using
> Open("database name") API and get a context. Using this context I evaluate
> the xpath. Once I complete my evaluation I close the context using
> Close.execute(ctx) API.
>
> My application is running without any issues. If the server runs for few
> weeks, I have observed "Out Of Main memory error" while updating the BaseX
> database.
>
> Could any one please help me in this issue. Is there any possibility of
> memory leak in BaseX in my work flow?
>
> Note: I am using same CreateDB(...) API to update / overwrite the existing
> Database.
>
> XML file Size may vary from 3Mb to 10 Mb.
>
> --
> Regards,
> Elango.
>


[basex-talk] Out Of Main Memory error while creating BaseX DB

2013-10-23 Thread elangovan MuthuSwamy
Hi All,

I am using BaseX to evaluate XPath expressions (XQuery) against large XML
files. I initially created the BaseX database using the
CreateDB.execute(ctx) API.

Every time I need to evaluate an XPath expression, I open the database using
the Open("database name") API and get a context. Using this context I
evaluate the XPath expression. Once the evaluation is complete, I close the
context using the Close.execute(ctx) API.

My application runs without any issues. However, after the server has been
running for a few weeks, I have observed an "Out Of Main Memory" error while
updating the BaseX database.

Could anyone please help me with this issue? Is there any possibility of a
memory leak in BaseX in my workflow?

Note: I am using the same CreateDB(...) API to update / overwrite the
existing database.

The XML file size varies from 3 MB to 10 MB.

-- 
Regards,
Elango.


Re: [basex-talk] Out of Main Memory; bytea type

2013-06-24 Thread Alexander von Bernuth
Hi Christian,

after further research I found one way that works:

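> let $prepared := sql:prepare($conn,
>   "INSERT INTO imagetest VALUES (?, decode(?, 'base64'))")
> let $params := <sql:parameters>
>   <sql:parameter type='…'>{$id}</sql:parameter>
>   <sql:parameter type='string'>{$image}</sql:parameter>
> </sql:parameters>
> return sql:execute-prepared($prepared, $params)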


where $image contains an xs:base64Binary value. This way PostgreSQL itself
handles the conversion to its bytea type, and there is no need for a special
parameter type. To retrieve your data from the table again, you may use

> SELECT i.id, encode(i.image, 'base64') FROM imagetest i;
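
For comparison, the same decode/encode approach can be sketched in plain JDBC. This is an illustration under assumptions, not code from this thread: the class and method names are hypothetical, and the id column is assumed to be an integer, which the messages here do not confirm. Only the imagetest table, its bytea column, and the decode(?, 'base64') call come from the query above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Base64;

// Hypothetical sketch, not BaseX code: mirrors the XQuery above in plain JDBC.
final class ByteaBase64 {

    /** Encodes raw bytes as Base64 text, the input decode(?, 'base64') expects. */
    static String toBase64(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Decodes Base64 text, e.g. the result of encode(image, 'base64'). */
    static byte[] fromBase64(String text) {
        // The MIME decoder ignores the line breaks found in long encode() output.
        return Base64.getMimeDecoder().decode(text);
    }

    /** Inserts one row; assumes an integer id column (an assumption, see above). */
    static int insertImage(Connection conn, int id, byte[] image) throws SQLException {
        String sql = "INSERT INTO imagetest VALUES (?, decode(?, 'base64'))";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setInt(1, id);
            // Sent as text; the server converts it to bytea via decode().
            stmt.setString(2, toBase64(image));
            return stmt.executeUpdate();
        }
    }
}
```

The point of the design is the same as in the XQuery version: the binary value travels as an ordinary string parameter, so no driver-specific binary parameter type is needed on the client side.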

Nonetheless, thank you very much for your support.

Best,
Alex

-- 
| Alexander von Bernuth
| alexander.von-bern...@student.uni-tuebingen.de

On 24.06.2013 at 16:58, Christian Grün wrote:

> Hi Alex,
> 
> the mapping of types is defined in the BaseX FNSql class [1]. "bytea"
> seems to be a PostgreSQL-specific data type, so I’m not sure which
> mapping would be appropriate here. Could you do some research for us
> and try to find out which SQL types may give satisfying results (see
> [2] for the existing setters)?
> 
> Thanks,
> Christian
> 
> [1] 
> https://github.com/BaseXdb/basex/blob/master/src/main/java/org/basex/query/func/FNSql.java
> [2] http://docs.oracle.com/javase/6/docs/api/java/sql/PreparedStatement.html


Re: [basex-talk] Out of Main Memory; bytea type

2013-06-24 Thread Christian Grün
Hi Alex,

the mapping of types is defined in the BaseX FNSql class [1]. "bytea"
seems to be a PostgreSQL-specific data type, so I’m not sure which
mapping would be appropriate here. Could you do some research for us
and try to find out which SQL types may give satisfying results (see
[2] for the existing setters)?

Thanks,
Christian

[1] 
https://github.com/BaseXdb/basex/blob/master/src/main/java/org/basex/query/func/FNSql.java
[2] http://docs.oracle.com/javase/6/docs/api/java/sql/PreparedStatement.html
___

> thank you very much, I am going to test this as soon as I am at home. I
> think this is going to fix my issue.
> However, do you happen to know which sql:parameter type I have to use when I
> try to insert xs:base64binary into my bytea-column in postgres?
>
> Thank you again,
> Alex
>


Re: [basex-talk] Out of Main Memory; bytea type

2013-06-24 Thread Alexander von Bernuth
Hi Christian,

thank you very much, I am going to test this as soon as I am at home. I think 
this is going to fix my issue.
However, do you happen to know which sql:parameter type I have to use when I 
try to insert xs:base64Binary into my bytea column in PostgreSQL?

Thank you again,
Alex

-- 
| Alexander von Bernuth
| alexander.von-bern...@student.uni-tuebingen.de

On 24.06.2013 at 11:48, Christian Grün wrote:

> Hi Alexander,
> 
> how does your XQuery/BaseX script look like? If you use the XQuery
> doc() function, you could try to replace it with
> parse-xml(fetch:text(...)), because the latter approach will close
> your documents and free memory if the processed document is not
> required anymore.
> 
> Best,
> Christian


Re: [basex-talk] Out of Main Memory; bytea type

2013-06-24 Thread Christian Grün
Hi Alexander,

what does your XQuery/BaseX script look like? If you use the XQuery
doc() function, you could try to replace it with
parse-xml(fetch:text(...)), because the latter approach will close
your documents and free memory once a processed document is no longer
required.

Best,
Christian
___

2013/6/24 Alexander von Bernuth:
> Hello all,
>
> my basex-script should fetch 10.000something XML-files automatically from a
> website and insert their content into a external PostgreSQL-database. After
> about 8.000 files my script stops and I get "Out of Main Memory".
> I found your discussion with "kgfhjjgrn" [1] regarding this issue, but I'm
> not sure whether these options apply to my problem - I do not build a
> basex-database but an external one. Will autoflush=false and flushing by
> myself help with this?
>
> Second, I want to insert some xs:base64Binary into my PostgreSQL database,
> but I cannot find the correct sql:parameter type for the bytea-column.
>
> Could you please help me with my issues?
>
> Thank you very much,
> Alexander
>
>
>
> [1] http://comments.gmane.org/gmane.text.xml.basex.talk/2540
>
>
>
> --
> | Alexander von Bernuth
> | alexander.von-bern...@student.uni-tuebingen.de
>
>


[basex-talk] Out of Main Memory; bytea type

2013-06-24 Thread Alexander von Bernuth
Hello all,

my BaseX script should fetch some 10,000 XML files automatically from a 
website and insert their content into an external PostgreSQL database. After 
about 8,000 files my script stops and I get "Out of Main Memory".
I found your discussion with "kgfhjjgrn" [1] regarding this issue, but I'm not 
sure whether these options apply to my problem, since I do not build a BaseX 
database but an external one. Will autoflush=false and flushing by myself 
help with this?

Second, I want to insert some xs:base64Binary into my PostgreSQL database, but 
I cannot find the correct sql:parameter type for the bytea-column.

Could you please help me with my issues?

Thank you very much,
Alexander



[1] http://comments.gmane.org/gmane.text.xml.basex.talk/2540



-- 
| Alexander von Bernuth
| alexander.von-bern...@student.uni-tuebingen.de



Re: [basex-talk] Out of Main Memory

2013-03-22 Thread Hans-Juergen Rennau

Dear Fabrice,

thank you for your response! The crash happened during db creation
(using db:create). Interesting idea to disable indexes - I will try this
out.

Hans-Juergen






From: Fabrice Etanchaud
To: Hans-Juergen Rennau
Sent: 14:08 Friday, 22 March 2013
Subject: RE: [basex-talk] Out of Main Memory
 

 
Dear Hans-Juergen,
 
Did it happen during db creation, or during index creation (in which case the 
collection is available and can be opened)?
With big collections like this, I usually disable indices before creation and 
create them later.
 
Best,
Fabrice


[basex-talk] Out of Main Memory

2013-03-22 Thread Hans-Juergen Rennau
Dear BaseX team, 


Trying to create a database from a 6.5 GB document, I get this error:
"Out of Main Memory".

I am using the latest snapshot.

Could this be expected? I suppose that data loading is done in a streaming 
fashion? Is there something I can do to upload a doc of that size?

Thank you very much for hints,

Hans-Juergen