Re: CLOB performance

Kristian Waagan Fri, 06 Feb 2009 03:50:02 -0800

Knut Anders Hatlen wrote:

Kristian Waagan <[email protected]> writes:

Dag H. Wanvik wrote:

Thanks for the good work!

Kristian Waagan <[email protected]> writes:

testFetchLargeClobPieceByPiece               673707     624639      3370
testFetchLargeClobPieceByPieceBackwards     1138559    1059045      2863

Interesting; fetching backwards is faster than forwards? :) Test
artifact, or?

Hi Dag,

I certainly hope I haven't optimized for fetching LOBs backwards!


It looks like there are some differences between those tests. They fetch
chunks of different sizes, and one of them performs sanity checking
against a LoopingAlphabetReader whereas the other one doesn't. So I
don't think we can say that fetching backwards is faster just by looking
at those numbers.


Right.

As I mentioned before, the results from the tests are really only
comparable for runs made on the same machine, and only for one test
method at a time. Some of the tests fetch all ten Clobs (repeated five
times), whereas some tests only fetch a single Clob (repeated as well).

Looking at the tests again, I see that the backwards test does something
completely different. It just fetches a small part of the Clob!
The reason why fetching the whole Clob backwards would be a lot slower
is that Derby would have to skip parts of the data many times.

What the test really tests is the ability of UTF8Reader to go backwards
in its internal buffer, which it couldn't do before. If going backwards
in this buffer weren't possible, the test would cause Derby to skip
around 7.5 MB (size=15MB, pieceSize=10, pos=size/2-pieceSize,
intBuf=8192) over 800 times.

If you read backwards with chunks equal to or larger than the internal
buffer, Derby must reposition size / chunk times. With a 15 MB Clob and
a 32 KB chunk size, this would give 480 repositions. If my formula from
DERBY-3766 is correct, Derby has to skip approx 3.5 GB of data in this
case!
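
As a rough check of the arithmetic above, here is a back-of-envelope
sketch in Java. It is not Derby code, and it does not reproduce the
formula from DERBY-3766 (which is not quoted in this message). It simply
assumes that every reposition restarts at the beginning of the stream
and skips forward to the start of the requested chunk. The class and
method names are made up for illustration.

```java
// Back-of-envelope model only; not Derby code.
// Assumption: each reposition restarts the stream at offset 0 and skips
// forward to the first position of the requested chunk.
final class BackwardReadCost {

    /** Number of chunk-sized requests needed to cover the whole value. */
    static long repositions(long size, long chunk) {
        return (size + chunk - 1) / chunk;
    }

    /** Total amount skipped when the chunks are requested last-to-first. */
    static long totalSkipped(long size, long chunk) {
        long total = 0;
        // Start offsets are size - chunk, size - 2 * chunk, ..., chunk.
        // The chunk at offset 0 needs no skipping, so the loop stops before it.
        for (long start = size - chunk; start > 0; start -= chunk) {
            total += start;
        }
        return total;
    }

    public static void main(String[] args) {
        long size = 15L * 1024 * 1024; // 15 MB Clob
        long chunk = 32L * 1024;       // 32 KB chunk size
        System.out.println(repositions(size, chunk) + " repositions");
        double gb = totalSkipped(size, chunk) / (1024.0 * 1024 * 1024);
        System.out.println(gb + " GB skipped");
    }
}
```

Under these assumptions the model gives 480 repositions and a total
skip of roughly 3.5 GB, which matches the figures quoted above.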

One situation where the backwards repositioning could be a great
time-saver is when you are searching for a pattern in the Clob (using
Clob.position). If the pattern is relatively short compared to the
internal buffer, the Clob contains many partial matches against the
pattern, and the internal buffer boundary isn't crossed when restarting
the search, the win will be big.

Back to the results I posted originally,
'testFetchLargeClobPieceByPiece' fetches all 15 MB five times in 3370
ms, and 'testFetchLargeClobPieceByPieceBackwards' fetches around 8 K in
10-character chunks five times. A quick investigation revealed that
Derby had to reposition the stream twice: once for the first request
(skipping to read position 7864310) and once for the last request
(skipping to read position 7856120). The first position is close to the
end of the internal buffer, and the last position is on the "wrong side"
of the lower buffer boundary. For all other requests Derby was able to
go 20 characters backwards in the internal buffer in UTF8Reader, which
consists of only changing a position variable.
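
The two repositions described above can be reproduced with a toy model.
The sketch below is not Derby's UTF8Reader. It assumes the internal
buffer always holds a window aligned to a multiple of the buffer size,
and that a request spilling over a window edge counts as a reposition.
The request count of 820 is derived from the two positions quoted above,
(7864310 - 7856120) / 10 + 1. All names are hypothetical.

```java
// Toy model only; not Derby's UTF8Reader.
// Assumption: after a reposition the buffer holds the window
// [k * bufSize, (k + 1) * bufSize) that contains the requested position.
final class BackwardBufferModel {

    /**
     * Counts repositions for a series of pieceSize-long reads, each one
     * starting pieceSize positions before the previous one.
     */
    static int countRepositions(long startPos, int pieceSize, int requests,
                                int bufSize, boolean canGoBack) {
        long bufStart = -1; // -1 means no window has been loaded yet
        long cursor = -1;   // position just after the previous read
        int repositions = 0;
        long pos = startPos;
        for (int i = 0; i < requests; i++, pos -= pieceSize) {
            boolean inWindow = bufStart >= 0
                    && pos >= bufStart
                    && pos + pieceSize <= bufStart + bufSize;
            // Without backward support only positions at or after the
            // cursor can be served from the buffer.
            boolean reachable = inWindow && (canGoBack || pos >= cursor);
            if (!reachable) {
                repositions++;
                bufStart = (pos / bufSize) * bufSize; // load aligned window
            }
            cursor = pos + pieceSize;
        }
        return repositions;
    }

    public static void main(String[] args) {
        long size = 15L * 1024 * 1024;
        long start = size / 2 - 10; // pos = size/2 - pieceSize = 7864310
        System.out.println(countRepositions(start, 10, 820, 8192, true));
        System.out.println(countRepositions(start, 10, 820, 8192, false));
    }
}
```

With backward movement allowed, the model repositions only for the first
and the last request. With it disabled, every one of the 820 requests
needs a reposition, which corresponds to the "over 800 times" case
mentioned earlier.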

Hope this made things a little clearer, and please, don't optimize your
application by reading Clobs backwards ;)



--
Kristian
