Re: [Sedna-discussion] performance issues

Ivan Shcheklein Thu, 29 Jan 2009 02:16:33 -0800

Hi Ming,

Thank you for the scripts.


It seems that this performance drop is caused by the index.
If you want roughly the same speed throughout the load, you should create
the index after the bulk load is completed.

However, we found another anomaly while reproducing the issue: the data
expansion factor is too large. For example, it takes ~49GB to load
400,000 documents (~570MB). Have you experienced a similar problem? I hope we
will fix this as soon as possible.
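
For reference, the expansion factor implied by those numbers can be worked
out as below. This is only a back-of-the-envelope sketch in Python; it assumes
decimal units (1 GB = 1000 MB), and the variable names are just for
illustration.

```python
# Back-of-the-envelope expansion factor for the load described above.
# Assumption: decimal units (1 GB = 1000 MB = 10**9 bytes).
source_mb = 570          # size of the source documents, ~570 MB
on_disk_gb = 49          # space used after loading, ~49 GB
doc_count = 400_000      # number of documents loaded

on_disk_mb = on_disk_gb * 1000
expansion_factor = on_disk_mb / source_mb

# Average size per document, before and after loading, in bytes.
avg_source_bytes = source_mb * 10**6 / doc_count
avg_on_disk_bytes = on_disk_gb * 10**9 / doc_count

print(f"expansion factor: ~{expansion_factor:.0f}x")
print(f"average source document: ~{avg_source_bytes:.0f} bytes")
print(f"average space on disk per document: ~{avg_on_disk_bytes:.0f} bytes")
```

That is roughly 86x: about 122 KB on disk for a document that averages about
1.4 KB.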

BTW, what about the other issue:

>>In our experience concurrent access in the same collection causes
>>performance issues.
>>Access with 2 threads achieves just 20%-25% of the performance of 1 thread.

Can you provide us with a script to reproduce this?
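
A reproduction script does not need to be elaborate. As a rough sketch in
Python (not tied to any Sedna client API), something shaped like the harness
below would be enough: it runs the same workload with 1 and 2 threads and
reports inserts per second. Note that run_one_insert is a hypothetical
placeholder, and the sleep inside it only stands in for one real insert
transaction against the collection.

```python
import threading
import time


def run_one_insert(worker_id, iteration):
    # Hypothetical placeholder: replace the body with one real transaction
    # (existence query, XML document load, image load) against the collection.
    time.sleep(0.001)


def measure_throughput(thread_count, inserts_per_thread):
    """Run the workload on thread_count threads; return inserts per second."""
    def worker(worker_id):
        for iteration in range(inserts_per_thread):
            run_one_insert(worker_id, iteration)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(thread_count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return thread_count * inserts_per_thread / elapsed


if __name__ == "__main__":
    for n in (1, 2):
        print(f"{n} thread(s): {measure_throughput(n, 50):.1f} inserts/s")
```

Comparing the two printed figures shows whether total throughput really
drops when the second thread is added.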

Ivan Shcheklein,
Sedna Team

On Tue, Jan 27, 2009 at 4:57 PM, Ming Zha-zimmermann <
[email protected]> wrote:

> Hi Ivan,
>
> I posted the scripts to create our database. Just execute createDB_testdb.sh,
> and a database named "testdb" with indexes will be created (the indexes are
> defined in createDataCollection.sh).
> The database has 17 collections, "LAND00" - "LAND16". Every collection has
> the same indexes defined, with unique names.
>
> In our test, we insert 2 resources into the database within one transaction
> in each loop iteration. One resource is an XML document, the other is an
> image file. Before we store the XML document with the root element
> "observationContext", we query the database to check whether an
> observationContext already exists in the appropriate context.
>
> If it does not exist, we insert the XML document
> (observationContext.xml) into the database. If it exists, we execute an
> update insert (this case never happens in our test, because we change
> the observationUnitID in each loop iteration).
>
> Finally, we insert the image file into the database as Base64.
>
>
> The following are the queries/statements we use:
>
> **Begin sample query**
>
> declare default element namespace 'http://www.destatis.de/schema/idb/1.0';
>
> index-scan('ILAND08UNIT','002000752835','EQ')/.[observationContextKey/statisticsID='0003'][observationContextKey/referencePeriod='620070000']
>
> **End query**
>
> We observe constant speed when executing this query (16ms).
>
>
> **Begin statement to insert xml document into database**
>
> statement.loadDocument(documentAsString, docName, container);
>
> **End statement**
>
> We observe slower speed when executing this statement with a large number of
> documents in the collection.
> As mentioned: 30ms (7,000 docs) / 100ms (230,000 docs) / 730ms (660,000 docs).
>
> **Begin statement to insert image into database**
>
> statement.loadDocument(resourceAsStream, resourceName, container);
>
> **End statement**
>
> We observe constant speed when executing this statement (16ms).
>
> Thanks,
>
> Ming
>

