Knut Anders Hatlen wrote:
Kristian Waagan <[email protected]> writes:
Dag H. Wanvik wrote:
Thanks for the good work!
Kristian Waagan <[email protected]> writes:
testFetchLargeClobPieceByPiece 673707 624639 3370
testFetchLargeClobPieceByPieceBackwards 1138559 1059045 2863
Interesting; fetching backwards is faster than forwards? :) Test
artifact, or?
Hi Dag,
I certainly hope I haven't optimized for fetching LOBs backwards!
It looks like there are some differences between those tests. They fetch
chunks of different sizes, and one of them performs sanity checking
against a LoopingAlphabetReader whereas the other one doesn't. So I
don't think we can say that fetching backwards is faster just by looking
at those numbers.
Right.
As I mentioned before, the results from the tests are really only
comparable for runs made on the same machine, and only within a single
test method.
Some of the tests fetch all ten Clobs (repeated five times), whereas
some tests only fetch a single Clob (repeated as well).
Looking at the tests again, I see the test does something completely
different. It just fetches a small part of the Clob!
The reason why fetching the whole Clob backwards would be a lot slower
is that Derby would have to skip parts of the data many times.
What the test really tests is the ability of UTF8Reader to go backwards
in its internal buffer, which it couldn't do before. If going backwards
in this buffer weren't possible, the test would cause Derby to skip
around 7.5 MB (size=15MB, pieceSize=10, pos=size/2-pieceSize,
intBuf=8192) over 800 times.
If you read backwards with chunks equal to or larger than the internal
buffer, Derby must reposition size / chunk times. With a 15 MB Clob and
a 32 KB chunk size, this would give 480 repositions. If my formula from
DERBY-3766 is correct, Derby has to skip approx 3.5 GB of data in this case!
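
To make the arithmetic above a bit more concrete, here is a rough sketch.
It does not reproduce the formula from DERBY-3766. It simply assumes that
every reposition restarts at the beginning of the stream and skips forward
to the start of the requested chunk, and that sizes can be treated as plain
counts (1 MB = 1024 * 1024). The class and method names are made up for
illustration and are not part of Derby.

```java
public class BackwardsSkipCost {

    /** Number of repositions when reading the whole Clob backwards. */
    static long repositions(long size, long chunk) {
        return size / chunk;
    }

    /**
     * Total amount skipped, ASSUMING each reposition starts from the
     * beginning of the stream and skips to the start of the chunk.
     */
    static long totalSkipped(long size, long chunk) {
        long total = 0;
        // Chunks are read last-to-first; the chunk at offset 'start'
        // costs 'start' units of skipping.
        for (long start = size - chunk; start >= 0; start -= chunk) {
            total += start;
        }
        return total;
    }

    public static void main(String[] args) {
        long size = 15L * 1024 * 1024; // 15 MB Clob
        long chunk = 32L * 1024;       // 32 KB chunk size
        System.out.println("repositions: " + repositions(size, chunk));
        System.out.println("skipped:     " + totalSkipped(size, chunk));
    }
}
```

Under those assumptions this gives 480 repositions and 3767009280 skipped
in total, which is roughly 3.5 GB.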
One situation where the backwards repositioning could be a great
time-saver is when you are searching for a pattern in the Clob (using
Clob.position). If the pattern is relatively short compared to the
internal buffer, the Clob contains many partial matches against the
pattern, and the internal buffer boundary isn't crossed when restarting
the search, the win will be big.
Back to the results I posted originally,
'testFetchLargeClobPieceByPiece' fetches all 15 MB five times in 3370
ms, and 'testFetchLargeClobPieceByPieceBackwards' fetches around 8 K in
10-character chunks five times. A quick investigation revealed that
Derby had to reposition the stream twice: once for the first request
(skipping to read position 7864310) and once for the last request
(skipping to read position 7856120). The first position is close to the
end of the internal buffer, and the last position is on the "wrong side"
of the lower buffer boundary. For all other requests Derby was able to
go 20 characters backwards in the internal buffer in UTF8Reader, which
only involves changing a position variable.
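
The two repositions can be reproduced with a small simulation. This is only
a sketch of the behaviour described above, not the UTF8Reader code. It
assumes that the internal buffer always covers a block aligned to a multiple
of the buffer size, that Clob positions are 1-based, and that a read running
past the end of the buffer is served by a normal forward refill rather than
a reposition. The class and method names are hypothetical, and the first and
last positions are the ones quoted above.

```java
public class BackwardsFetchSketch {

    /**
     * Counts repositions for requests at firstPos, firstPos - pieceSize,
     * ..., down to lastPos (1-based positions, as in the Clob API).
     */
    static int countRepositions(long firstPos, long lastPos,
                                int pieceSize, int bufSize) {
        long bufStart = -1; // 0-based index of first buffered char; -1 = nothing buffered
        int repositions = 0;
        for (long pos = firstPos; pos >= lastPos; pos -= pieceSize) {
            long index = pos - 1; // 0-based index of the requested position
            boolean inBuffer = bufStart >= 0
                    && index >= bufStart
                    && index < bufStart + bufSize;
            if (!inBuffer) {
                // Must reset the stream and skip forward to the position.
                repositions++;
                // ASSUMPTION: the buffer is aligned to bufSize blocks.
                bufStart = (index / bufSize) * bufSize;
            }
            // Otherwise only a position variable inside the buffer changes.
        }
        return repositions;
    }

    public static void main(String[] args) {
        long size = 15L * 1024 * 1024;        // size=15MB
        int pieceSize = 10;                   // pieceSize=10
        int intBuf = 8192;                    // intBuf=8192
        long firstPos = size / 2 - pieceSize; // 7864310
        long lastPos = 7856120L;              // last request position
        System.out.println(
                countRepositions(firstPos, lastPos, pieceSize, intBuf));
    }
}
```

With these inputs the sketch counts 2 repositions across 820 requests. If
the reader could not move backwards inside its buffer, every one of those
requests would need its own reposition.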
Hope this made things a little clearer, and please, don't optimize your
application by reading Clobs backwards ;)
--
Kristian