[EMAIL PROTECTED] wrote:
Mike Matrigali <[EMAIL PROTECTED]> writes:


It would be fine to use an unchecked and/or an ASSERT based check for readFieldLengthAndSetStreamPosition. The "store" module owns this access and is not counting on limit checks to catch anything here.

Another question:

In the single-tuple select load, all observed calls to setLimit() come from the same method, StoredPage.readRecordFromArray(...). (setLimit() is also called from readRecordFromStream(), but that path does not seem to be taken with this type of load.)

And the argument to setLimit() is always the local variable fieldDataLength, which is the return value from StoredFieldHeader.readLengthAndSetStreamPosition().

So if readLengthAndSetStreamPosition() can update the position without
checking, presumably the return value from this method can be trusted
as well? Is it then necessary to check this value again in setLimit(),
or could we have used an unchecked version of setLimit here?
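To make the distinction in question concrete, here is a rough sketch of a checked setLimit() versus an unchecked variant that trusts a fieldDataLength already derived by the header-parsing code. The class and method names are made up for illustration and are not the actual store classes.

```java
// Illustrative sketch only -- not the actual store stream class.
import java.io.IOException;

class PageArrayStreamSketch {
    private final byte[] pageData;
    private int position;
    private int end;                 // first offset past the readable region

    PageArrayStreamSketch(byte[] pageData) {
        this.pageData = pageData;
        this.end = pageData.length;
    }

    // Checked version: validates the caller-supplied field length.
    void setLimit(int length) throws IOException {
        if (length < 0 || position + length > pageData.length)
            throw new IOException("setLimit out of range: " + length);
        end = position + length;
    }

    // Hypothetical unchecked version: trusts a fieldDataLength that the
    // header-parsing code has already derived from the same page buffer,
    // re-checking it only when assertions (sanity builds) are enabled.
    void setLimitUnchecked(int length) {
        assert length >= 0 && position + length <= pageData.length
            : "bad field length " + length;
        end = position + length;
    }

    // Reads one byte, honouring the current limit.
    int read() {
        return (position < end) ? (pageData[position++] & 0xFF) : -1;
    }
}
```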

I'm worried by this approach of removing the checks on the limit or the position; it's much like saying we don't need bounds checking on arrays because I know my code is correct.

The current code provides some protection against a software bug, a corrupted page, or a hacked page. Removing those checks may lead to hard-to-detect bugs where a position and/or limit is calculated incorrectly and subsequently leads to corrupted data or random exceptions being thrown.

My feeling is that the integrity of the store and the code is better served by keeping these checks.

I also think we need more performance numbers to justify such a change; a single set of runs from a single VM does not justify it. I will run numbers on Linux with a number of VMs when I get the chance.

Also, in these cases it is often better to try to optimize at a higher level rather than at the lowest level (especially when doing so removes such checks). In this case, see if the number of calls to setLimit() or setPosition() could be reduced, rather than micro-optimizing these methods by changing their core functionality.

As an example, take the setLimit() call around the readExternalFromArray method. Maybe this responsibility could be pushed into the data type itself, so that for some built-in types we trust their read mechanism to read the correct amount of data. E.g. reading an INTEGER will always read four bytes, so there is no need to set a limit around it. The limit is there to support types that do not know their length on disk (e.g. some character types, some binary types, user-defined types), and to support arbitrary user types where the engine cannot trust or require that the de-serialization will read the complete stream and only its data.
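A rough sketch of that idea follows; the interface and class names are hypothetical and not the store's actual type hierarchy. A type advertises a fixed on-disk length when it has one, so the record reader only needs to wrap the unknown-length types in a limit.

```java
// Illustrative sketch only: the "known length" responsibility pushed into the
// data type, so fixed-width types such as INTEGER are read without a limit.
import java.io.DataInput;
import java.io.IOException;

interface StoredValueSketch {
    /** On-disk length if known up front (e.g. 4 for INTEGER), or -1 if the
     *  type cannot be trusted to read exactly its own data. */
    int knownDiskLength();
    void readFromPage(DataInput in) throws IOException;
}

final class SQLIntegerSketch implements StoredValueSketch {
    private int value;
    public int knownDiskLength() { return 4; }               // always four bytes
    public void readFromPage(DataInput in) throws IOException {
        value = in.readInt();                                // no limit needed around this
    }
}

// In the record-reading loop, a limit would then only be set around types
// that report an unknown length, e.g.:
//
//     if (column.knownDiskLength() < 0) {
//         lim.setLimit(fieldDataLength);   // character/binary/user types still limited
//     }
//     column.readFromPage(dataIn);
```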

Dan.
