RE: Impact of number of XML files versus file size; Performance;

Conal Tuohy Fri, 31 May 2002 00:28:55 -0700

Thomas Ruth wrote:

> From the application logic point of view we 
> could split the 1 XML file into 2 files, each around 800k of 
> relevant data, or we could create from this one file 1000 
> files, each containing a set of parent and child; each of these 
> smaller files is between 1k and 40k. 
> 
> In which direction should we design our system to get the 
> best performance when accessing the XML files. I know it 
> depends on what you do with XSL, but is there a general 
> direction with Cocoon2 to go: access the data using 1 big XML 
> file or work/access the data using smaller XML files.


In Cocoon you can extract elements from a large file (and cache these
fragments) using XSL, or you can aggregate many small files into one (and
cache the aggregation). There are performance issues involved in how you do
this, but you can deal with them, so the choice between one big file and
many small ones should be largely immaterial as far as Cocoon is concerned.

For instance, if you have lots of pipelines which present a small extract
from a large xml document, you can base all these pipelines on an
"extraction" pipeline, so that the extraction can be performed once, cached,
and provided as a convenient input for a set of other pipelines. You can use
a pipeline with XInclude or map:aggregate to do the inverse operation and
assemble a single document from parts. Generally speaking, the cache will
minimise a lot of the performance considerations, which gives you a lot of
freedom in how you choose to do it.
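
As a rough sitemap sketch of both approaches (the match patterns, file
names, stylesheet names and the "id" parameter are all hypothetical, and
the fragment assumes the usual "map" namespace prefix declared on the
sitemap root):

```xml
<map:pipeline>

  <!-- "Extraction" pipeline: pulls one record out of the big file.
       extract.xsl is a hypothetical stylesheet that copies only the
       element whose id matches the "id" parameter. -->
  <map:match pattern="extract/*">
    <map:generate src="data/big.xml"/>
    <map:transform src="stylesheets/extract.xsl">
      <map:parameter name="id" value="{1}"/>
    </map:transform>
    <map:serialize type="xml"/>
  </map:match>

  <!-- A presentation pipeline that takes the extraction pipeline as its
       input through the cocoon:/ protocol instead of re-reading big.xml. -->
  <map:match pattern="view/*.html">
    <map:generate src="cocoon:/extract/{1}"/>
    <map:transform src="stylesheets/view.xsl"/>
    <map:serialize type="html"/>
  </map:match>

  <!-- The inverse: assemble one document from several small files. -->
  <map:match pattern="all.xml">
    <map:aggregate element="records">
      <map:part src="data/part1.xml"/>
      <map:part src="data/part2.xml"/>
    </map:aggregate>
    <map:serialize type="xml"/>
  </map:match>

</map:pipeline>
```

Whether each stage is actually served from the cache depends on the
components used and on how caching is configured in your Cocoon version,
so treat this as a starting point rather than a tuned setup.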

So you can consider yourself free to do whatever is most convenient for you
in terms of storing and updating the data. Do you want to maintain a lot of
little files? Do you want to have to edit huge xml documents when you are
updating the data? Those are the kinds of questions to ask, I think.

Cheers!

Con


