[
https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639082#action_12639082
]
Kristian Waagan commented on DERBY-3907:
----------------------------------------
[Header format]
Mike wrote:
-----
What does the following mean? Will the changes apply to all sql which inserts
clobs, or to only particular jdbc interfaces?
1) Clob modifications are done on a copy (i.e. TemporaryClob).
-----
By Clob modifications I mean updates of parts of an existing Clob. To get
into this state, one must first do a select to get the Clob that has already
been stored in the database. I think updating parts of the Clob can only be
done through the Clob interface. Is that correct?
The ResultSet.updateXXX-methods can be seen as inserting a new Clob.
My current hope is that all insertion will go through ReaderToUTF8Stream, which
seems like a good place to count characters (and bytes) and obtain bytes per
char statistics.
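
As a rough illustration of the kind of bookkeeping meant here, the sketch below
counts chars and encoded bytes and tracks the smallest and largest encoded
width seen. It is not Derby code: the class ClobEncodingStats and its methods
are hypothetical names, and it assumes the modified UTF-8 encoding described
for java.io.DataInput (one byte for U+0001..U+007F, two bytes up to U+07FF
including U+0000, three bytes otherwise).

```java
/** Hypothetical helper (not Derby code): per-char statistics for a Clob value. */
final class ClobEncodingStats {
    long charCount;                          // Java chars (UTF-16 code units) seen
    long byteCount;                          // bytes needed in modified UTF-8
    int minBytesPerChar = Integer.MAX_VALUE; // smallest encoded width seen
    int maxBytesPerChar = 0;                 // largest encoded width seen

    /** Encoded width of one char, assuming modified UTF-8. */
    static int encodedLength(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return 1;
        }
        if (c <= 0x07FF) {
            return 2; // includes U+0000, which takes two bytes in modified UTF-8
        }
        return 3;
    }

    /** Called once per char as the value is encoded. */
    void add(char c) {
        int len = encodedLength(c);
        charCount++;
        byteCount += len;
        minBytesPerChar = Math.min(minBytesPerChar, len);
        maxBytesPerChar = Math.max(maxBytesPerChar, len);
    }

    /** True if every char seen so far has the same encoded width. */
    boolean hasFixedWidth() {
        return charCount > 0 && minBytesPerChar == maxBytesPerChar;
    }

    /** Convenience for trying the counters out on an in-memory value. */
    static ClobEncodingStats of(CharSequence value) {
        ClobEncodingStats stats = new ClobEncodingStats();
        for (int i = 0; i < value.length(); i++) {
            stats.add(value.charAt(i));
        }
        return stats;
    }
}
```

A stream that encodes chars one at a time could call add(c) from its encoding
loop, so the statistics would be complete when the last char has been written.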
There might be a slight complication as we allow using setString on Clob
columns.
-----
What is the expected call sequence to store, and what is the goal performance
characteristic?
-----
The expected call sequence is exactly as you describe it (see Mike's comment
from 10/Oct/08 10:10 AM).
Depending on the information we need to obtain, the header can be written at
once or as the last step of insertion. Even if we only store length
information, we need to support the latter due to the lengthless JDBC methods.
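
To make the "header as the last step" case concrete, here is a minimal sketch
under loud assumptions: the header format is still undecided, so the 4-byte
char count used here is purely hypothetical, as are the class and method
names. Standard UTF-8 and an in-memory buffer stand in for Derby's encoding
and page-based store.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch (not Derby code): length header patched in after the data. */
final class DeferredHeaderSketch {
    // Assumption: the header is just the char count as a 4-byte int.
    static final int HEADER_BYTES = 4;

    /** Stores a value of unknown length: placeholder header, data, then patch. */
    static byte[] store(Reader in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(new byte[HEADER_BYTES], 0, HEADER_BYTES); // placeholder header
        int chars = 0;
        try {
            Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            int c;
            while ((c = in.read()) != -1) {
                w.write(c);
                chars++; // the length is only known once the reader is drained
            }
            w.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] stored = out.toByteArray();
        ByteBuffer.wrap(stored).putInt(0, chars); // last step: fill in the header
        return stored;
    }

    /** Reads the length from the header only, without decoding any data. */
    static int readLength(byte[] stored) {
        return ByteBuffer.wrap(stored).getInt(0);
    }
}
```

When the length is known up front, the same header could simply be written
first and the patch step skipped.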
The goal performance characteristic for the length operation is that getting
the length for the largest storable Clob should be as fast as for the shortest
one (read first page and decode stream header bytes). This is not the case
today, because the Clob data must be decoded to find the length. Apart from
Clob.length(), this also hurts us wherever other methods check their arguments
against the Clob length.
Positioning can be expressed with costs like this:
[reset stream] + decode_chars + skip_bytes
In certain cases, we can remove the decoding cost by knowing that all chars
are represented by the same number of bytes (one, two or three). In these
cases, the positioning cost should be the same as for Blob. This is the
motivation for the bytes per char information.
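
The difference between the two positioning strategies can be sketched as
follows. This is only an illustration with hypothetical names, not Derby code.
It assumes 1-based char positions, offsets counted from the first data byte
after any header, and a one- to three-byte encoding where the lead byte
determines the width of each char.

```java
/** Hypothetical sketch (not Derby code): byte offset of a char position. */
final class ClobPositioningSketch {

    /** Fixed width: the offset is plain arithmetic, as it would be for a Blob. */
    static long fixedWidthOffset(long charPos, int bytesPerChar) {
        if (charPos < 1 || bytesPerChar < 1 || bytesPerChar > 3) {
            throw new IllegalArgumentException("bad position or width");
        }
        return (charPos - 1) * bytesPerChar;
    }

    /** Variable width: every char before charPos has to be decoded. */
    static long decodedOffset(byte[] data, long charPos) {
        if (charPos < 1) {
            throw new IllegalArgumentException("positions are 1-based");
        }
        int offset = 0;
        for (long i = 1; i < charPos; i++) {
            if (offset >= data.length) {
                throw new IllegalArgumentException("position beyond end of data");
            }
            int lead = data[offset] & 0xFF;
            if ((lead & 0x80) == 0) {
                offset += 1;          // 0xxxxxxx: one byte
            } else if ((lead & 0xE0) == 0xC0) {
                offset += 2;          // 110xxxxx: two bytes
            } else if ((lead & 0xF0) == 0xE0) {
                offset += 3;          // 1110xxxx: three bytes
            } else {
                throw new IllegalArgumentException("unexpected lead byte");
            }
        }
        return offset;
    }
}
```

The first method does constant work regardless of the position, while the
second does work proportional to the position, which is the decode_chars term
in the cost expression above.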
> Save useful length information for Clobs in store
> -------------------------------------------------
>
> Key: DERBY-3907
> URL: https://issues.apache.org/jira/browse/DERBY-3907
> Project: Derby
> Issue Type: Improvement
> Components: JDBC, Store
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
>
> The store should save useful length information for Clobs. This allows the
> length to be found without decoding the whole data stream.
> The following thread raised the issue on what information to store, and also
> contains some background information:
> http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be
> discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.