Knut Anders Hatlen wrote:
Kristian Waagan <[email protected]> writes:

Dag H. Wanvik wrote:
Thanks for the good work!

Kristian Waagan <[email protected]> writes:

testFetchLargeClobPieceByPiece               673707     624639      3370
testFetchLargeClobPieceByPieceBackwards     1138559    1059045      2863
Interesting; fetching backwards is faster than forwards? :) Test
artifact, or?
Hi Dag,

I certainly hope I haven't optimized for fetching LOBs backwards!

It looks like there are some differences between those tests. They fetch
chunks of different sizes, and one of them performs sanity checking
against a LoopingAlphabetReader whereas the other one doesn't. So I
don't think we can say that fetching backwards is faster just by looking
at those numbers.


Right.

As I mentioned before, the results from the tests are really only comparable for runs made on the same machine, and only for one test method at a time. Some of the tests fetch all ten Clobs (repeated five times), whereas other tests only fetch a single Clob (also repeated).

Looking at the tests again, I see that the backwards test does something completely different. It just fetches a small part of the Clob! The reason fetching the whole Clob backwards would be a lot slower is that Derby would have to skip parts of the data many times.

What the test really tests is the ability of UTF8Reader to go backwards in its internal buffer, which it couldn't do before. If going backwards in this buffer weren't possible, the test would cause Derby to skip around 7.5 MB (size=15MB, pieceSize=10, pos=size/2-pieceSize, intBuf=8192) over 800 times. If you read backwards with chunks equal to or larger than the internal buffer, Derby must reposition size / chunk times. With a 15 MB Clob and a 32 KB chunk size, this would give 480 repositions. If my formula from DERBY-3766 is correct, Derby has to skip approximately 3.5 GB of data in this case!
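To make the arithmetic above easier to check, here is a rough sketch in Java. It is not the formula from DERBY-3766 (which isn't reproduced in this mail); it simply assumes that every reposition restarts at the beginning of the stream and skips up to the start of the requested chunk, and that sizes and skips are counted in the same unit. The class and method names are made up for illustration.

```java
public class BackwardReadCost {

    /** Number of repositions when a Clob of the given size is read backwards in fixed-size chunks. */
    public static long repositions(long size, long chunk) {
        return (size + chunk - 1) / chunk; // ceiling of size / chunk
    }

    /**
     * Total amount skipped, ASSUMING each reposition starts from offset 0
     * and skips forward to the start of the chunk being requested.
     */
    public static long totalSkipped(long size, long chunk) {
        long total = 0;
        // Chunk starts, from the last chunk down to (but not including) offset 0,
        // since reaching offset 0 requires no skipping.
        for (long start = size - chunk; start > 0; start -= chunk) {
            total += start;
        }
        return total;
    }

    public static void main(String[] args) {
        long size = 15L * 1024 * 1024; // 15 MB Clob
        long chunk = 32L * 1024;       // 32 KB chunks
        System.out.println("repositions: " + repositions(size, chunk));
        System.out.println("skipped:     " + totalSkipped(size, chunk));
    }
}
```

Under these assumptions the model gives 480 repositions and a bit over 3.5 GB skipped in total, which is in line with the estimate above.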

One situation where the backwards repositioning could be a great time-saver is when you are searching for a pattern in the Clob (using Clob.position). If the pattern is relatively short compared to the internal buffer, the Clob contains many partial matches against the pattern, and the internal buffer boundary isn't crossed when restarting the search, the win will be big.


Back to the results I posted originally: 'testFetchLargeClobPieceByPiece' fetches all 15 MB five times in 3370 ms, and 'testFetchLargeClobPieceByPieceBackwards' fetches around 8 K characters in 10-character chunks five times. A quick investigation revealed that Derby had to reposition the stream twice: once for the first request (skipping to read position 7864310) and once for the last request (skipping to read position 7856120). The first position is close to the end of the internal buffer, and the last position is on the "wrong side" of the lower buffer boundary. For all other requests Derby was able to go 20 characters backwards in the internal buffer in UTF8Reader, which only involves changing a position variable.
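The reposition count can be illustrated with a toy model. This is not UTF8Reader's actual code: it assumes the internal buffer always covers a window aligned to a multiple of the buffer size, treats the read positions as zero-based offsets, and counts a reposition whenever a request starts outside the current window (or, with the old behaviour, whenever a request lies behind the current read position). All names are hypothetical.

```java
public class BufferWindowSim {

    /**
     * Counts stream repositions for a series of piece reads going backwards
     * from firstPos down to lastPos in steps of pieceSize.
     * ASSUMPTION: buffer windows are aligned to multiples of bufSize.
     */
    public static int countRepositions(long firstPos, long lastPos, int pieceSize,
                                       int bufSize, boolean canGoBackInBuffer) {
        int repositions = 0;
        long bufStart = -1; // no window loaded yet
        long cur = -1;      // position just after the previous read
        for (long pos = firstPos; pos >= lastPos; pos -= pieceSize) {
            boolean inBuffer = bufStart >= 0 && pos >= bufStart && pos < bufStart + bufSize;
            // Without backward movement, only positions at or after 'cur' are reachable.
            boolean reachable = inBuffer && (canGoBackInBuffer || pos >= cur);
            if (!reachable) {
                repositions++;
                bufStart = (pos / bufSize) * bufSize; // load the aligned window containing pos
            }
            cur = pos + pieceSize;
        }
        return repositions;
    }

    public static void main(String[] args) {
        long first = 7864310L, last = 7856120L;
        System.out.println("with backward moves:    "
                + countRepositions(first, last, 10, 8192, true));
        System.out.println("without backward moves: "
                + countRepositions(first, last, 10, 8192, false));
    }
}
```

With the positions and sizes from the test, this model gives 2 repositions when the reader can move backwards in its buffer, and 820 (one per request) when it cannot, consistent with the "twice" and "over 800 times" figures above.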


Hope this made things a little clearer, and please, don't optimize your application by reading Clobs backwards ;)


--
Kristian
