Hello,

While investigating the LOB code, it occurred to me that ClobUpdatableReader and some related code could be changed for two main reasons: performance and code readability/simplification. I'll focus on the latter in this mail.

At the moment, the updatable reader functionality is tightly coupled to TemporaryClob and has special handling for the various internal Clob representations. I believe the functionality can be provided efficiently on a more general level, and the responsibilities can also be more clearly separated. Below I outline a solution to this problem. I would like some high-level feedback on whether the suggestion is sound. (The description omits details to keep it short, so please ask if anything is unclear.)

* Responsibilities
  - positioning: handled by/through UTF8Reader.
  - detecting modifications and handling them: ClobUpdatableReader.

* New classes/interfaces
  - PositionedStream (generalization of PositionedStoreStream): extends InputStream; adds getPosition() and reposition(long). The idea here is to exploit the fact that TemporaryClob is directly addressable (by byte position, *not* by char position).
  - CharacterStreamDescriptor: a class containing information about a byte stream representing characters. It will be passed in to UTF8Reader, so that the reader can configure itself appropriately (current byte/char position, byte/char length, is bufferable/positionAware, max char length, dataOffset).
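To make the PositionedStream idea a bit more concrete, here is a minimal sketch. Only the class name and the two methods getPosition() and reposition(long) come from the description above; the in-memory implementation (ByteArrayPositionedStream), the 0-based positions and the exception behaviour are assumptions made purely for illustration and are not how the prototype patch necessarily does it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Sketch of the proposed abstraction: an InputStream that knows its byte position. */
abstract class PositionedStream extends InputStream {
    /** Returns the current byte position (assumed 0-based in this sketch). */
    abstract long getPosition();

    /** Moves the stream to the requested byte position. */
    abstract void reposition(long requestedPos) throws IOException;
}

/** Hypothetical in-memory implementation, for illustration only. */
class ByteArrayPositionedStream extends PositionedStream {
    private final byte[] data;
    private int pos;

    ByteArrayPositionedStream(byte[] data) {
        this.data = data;
    }

    @Override
    public int read() {
        // Return the next byte as an unsigned value, or -1 at end of stream.
        return pos < data.length ? (data[pos++] & 0xFF) : -1;
    }

    @Override
    long getPosition() {
        return pos;
    }

    @Override
    void reposition(long requestedPos) throws IOException {
        if (requestedPos < 0 || requestedPos > data.length) {
            throw new IOException("Position out of range: " + requestedPos);
        }
        pos = (int) requestedPos;
    }
}

class PositionedStreamDemo {
    public static void main(String[] args) throws IOException {
        // 'a' is one byte, U+00E9 is two bytes in UTF-8, so 'b' sits at
        // byte position 3 even though its char position is 2.
        byte[] utf8 = "a\u00e9b".getBytes(StandardCharsets.UTF_8);
        PositionedStream in = new ByteArrayPositionedStream(utf8);
        in.reposition(3);
        System.out.println((char) in.read());
    }
}
```

The demo shows why the stream is addressed by byte position only: mapping a char position to a byte position is left to the layer above (UTF8Reader in this proposal).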

* Changes
  - InternalClob: add isReleased() and getUpdateCount() to support the updatable reader functionality. The former is used to check if the internal representation has changed, the latter to detect content modifications.
  - ClobUpdatableReader: will be simplified, practically rewritten.
  - UTF8Reader: new constructors and other minor changes. One notable change is that it will no longer be this class' responsibility to read the encoded length information in the streams from store. I'm hoping this can be done in a utility class to avoid duplicating that code.
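A rough sketch of how the two new InternalClob methods could drive the updatable reader follows. Only isReleased() and getUpdateCount() are taken from the list above. Everything else is hypothetical and exists only to make the example self-contained: the reduced VersionedClob interface, its getReader(long) method, the InMemoryClob class, the use of a Supplier to obtain the current internal representation, and the 0-based char positions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Supplier;

/** Hypothetical, reduced stand-in for InternalClob. */
interface VersionedClob {
    boolean isReleased();      // true once this representation is no longer valid
    long getUpdateCount();     // changes whenever the content is modified
    Reader getReader(long charPos) throws IOException; // hypothetical, 0-based
}

/** Hypothetical in-memory representation used only for this sketch. */
class InMemoryClob implements VersionedClob {
    private final StringBuilder content;
    private long updateCount;

    InMemoryClob(String s) { content = new StringBuilder(s); }

    void replace(int start, int end, String str) {
        content.replace(start, end, str);
        updateCount++; // every modification bumps the counter
    }

    @Override public boolean isReleased() { return false; }
    @Override public long getUpdateCount() { return updateCount; }
    @Override public Reader getReader(long charPos) throws IOException {
        StringReader r = new StringReader(content.toString());
        r.skip(charPos);
        return r;
    }
}

/** Reader that re-creates its underlying reader when the Clob has changed. */
class UpdateAwareReader extends Reader {
    private final Supplier<VersionedClob> source; // hands out the current representation
    private VersionedClob clob;
    private Reader current;
    private long lastUpdateCount;
    private long charPos; // chars delivered so far

    UpdateAwareReader(Supplier<VersionedClob> source) throws IOException {
        this.source = source;
        this.clob = source.get();
        this.lastUpdateCount = clob.getUpdateCount();
        this.current = clob.getReader(0);
    }

    private void refreshIfNeeded() throws IOException {
        if (clob.isReleased()) {
            clob = source.get();   // internal representation was swapped
            lastUpdateCount = -1;  // force a new underlying reader
        }
        if (clob.getUpdateCount() != lastUpdateCount) {
            lastUpdateCount = clob.getUpdateCount();
            current.close();
            current = clob.getReader(charPos); // resume at the same char position
        }
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        refreshIfNeeded();
        int n = current.read(buf, off, len);
        if (n > 0) {
            charPos += n;
        }
        return n;
    }

    @Override
    public void close() throws IOException { current.close(); }
}

class UpdateAwareReaderDemo {
    public static void main(String[] args) throws IOException {
        InMemoryClob clob = new InMemoryClob("hello world");
        UpdateAwareReader reader = new UpdateAwareReader(() -> clob);
        char[] buf = new char[6];
        int n = reader.read(buf, 0, 6);
        System.out.println(new String(buf, 0, n));
        clob.replace(6, 11, "derby"); // modify the part not yet read
        n = reader.read(buf, 0, 5);
        System.out.println(new String(buf, 0, n));
    }
}
```

The point of the sketch is that the reader needs no knowledge of which internal representation it is reading from; it only compares a counter and checks a flag before each read, then repositions through the general interface.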

If I don't get any pushback on this suggestion, I will create a parent Jira issue (probably describing the performance problem) and a set of subtasks under it. The diff for my current prototype patch is around 1200 lines.


--
Kristian
