Dear BaseX team

I am planning an update to our previous custom indexing system [1], but before 
I start I have a couple of questions. The major ones concern how to write an 
efficient custom indexing query in XQuery, but that'll be for another email. 
(In fact, we have a dual indexing system, so two index files per main file.) 
For now I am mainly interested in different documents in a single database, 
and the doc() functionality.

Intuitively, I'd say that documents that are related to each other should be 
put in the same database. E.g. one database with different documents for 
plants, and one database with different documents for animals. But when I was 
scrolling through the documentation of BaseX, I noticed that when creating 
custom indices you do not put those in the same db as the original content, so 
you have one database for the content and one for the index [2]. Is this the way 
it's typically done?

More generally, the questions that I have are the following:

* What is the actual difference in BaseX between using separate documents in 
a single database and using different databases altogether?

* Is there a performance difference between putting my index file in the same 
database as the content and using different databases altogether?

* What is the maximum allowed size of a single document in a database, and of 
a database itself? (I have files that are hundreds of GB in size. It might not 
be feasible to have a file and its index file in the same database.)


Thank you in advance
Kind regards

Bram Vanroy
Doctoral Research at Ghent University, Belgium
https://www.lt3.ugent.be/people/bram-vanroy/


[1] https://biblio.ugent.be/publication/8534144
[2] http://docs.basex.org/wiki/Indexes#Custom_Index_Structures
