GitHub user paul-rogers opened a pull request:
https://github.com/apache/drill/pull/800
DRILL-5385: Vector serializer fails to read saved SV2
Unit testing revealed that the VectorAccessorSerializable class claims
to serialize SV2s but does not do so correctly: it writes them but
never reads them back, resulting in corrupted data on read.
Fortunately, no code appears to serialize SV2s at present. Still, it is
a bug and needs to be fixed.
The first task is to add serialization code for the SV2.
That revealed a bug in the recently added code that saves DrillBufs
using a shared buffer: it relied on the writer index to know how much
data is in the buffer. It turns out that SV2 buffers don't set this
index, so new versions of the write function take an explicit write length.
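The writer-index problem can be illustrated with a small, self-contained sketch. It is not the Drill code: it uses `java.nio.ByteBuffer` as a stand-in for DrillBuf, treats the buffer's position as a rough analogue of the writer index, and the class and method names (`BufferWriteSketch`, `write`) are hypothetical. Absolute puts leave the position untouched, so a writer that trusts it would see an empty buffer; passing the length explicitly avoids that.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical stand-in for illustration only; Drill's real code works with DrillBuf.
final class BufferWriteSketch {

  // Copies exactly `length` bytes from the start of `buf` to `out`,
  // staging them through the caller-supplied `shared` array.
  static void write(ByteBuffer buf, int length, byte[] shared, OutputStream out)
      throws IOException {
    ByteBuffer view = buf.duplicate(); // independent position and limit
    view.clear();                      // position 0; ignore the caller's position
    view.limit(length);                // the caller, not the buffer, says how much data exists
    while (view.hasRemaining()) {
      int n = Math.min(shared.length, view.remaining());
      view.get(shared, 0, n);
      out.write(shared, 0, n);
    }
  }
}
```

A caller that filled the buffer with absolute puts, as a selection vector might, would pass the byte count it tracks itself, for example `write(buf, recordCount * 2, shared, out)`.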
Then, closer inspection of the read code revealed duplicated code, so
DrillBuf allocation moved into a version of the read function that now
both allocates the DrillBuf and reads into it.
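As a rough sketch of that consolidation, the helper below allocates the buffer and fills it in one call, so callers no longer repeat the allocate-then-read sequence. It again substitutes `java.nio.ByteBuffer` for DrillBuf, and `BufferReadSketch.read` is a hypothetical name, not the function in this pull request.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical stand-in for illustration only; Drill allocates DrillBufs from its own allocator.
final class BufferReadSketch {

  // Allocates a buffer of `length` bytes and fills it completely from `in`.
  // readFully throws EOFException if the stream ends before `length` bytes arrive.
  static ByteBuffer read(InputStream in, int length) throws IOException {
    byte[] data = new byte[length];
    new DataInputStream(in).readFully(data);
    return ByteBuffer.wrap(data);
  }
}
```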
It turns out that value vectors, but not SV2s, can be built from a
DrillBuf, so this change adds a matching constructor to the SV2 class.
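The idea of building a selection vector directly from an existing buffer might look like the sketch below. This is not Drill's SelectionVector2: the class name, the `ByteBuffer` backing store, and the assumption of two-byte entries are all illustrative choices made here.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; assumes each selection entry is an unsigned two-byte index.
final class SelectionVector2Sketch {
  static final int RECORD_SIZE = 2;

  private final ByteBuffer buffer;
  private final int recordCount;

  // Wraps an already-populated buffer, such as one just read back from a stream.
  SelectionVector2Sketch(ByteBuffer buffer, int recordCount) {
    if (recordCount < 0 || recordCount * RECORD_SIZE > buffer.capacity()) {
      throw new IllegalArgumentException("buffer too small for " + recordCount + " records");
    }
    this.buffer = buffer;
    this.recordCount = recordCount;
  }

  // Returns the row index stored at position i of the selection vector.
  int getIndex(int i) {
    if (i < 0 || i >= recordCount) {
      throw new IndexOutOfBoundsException("index " + i);
    }
    return buffer.getChar(i * RECORD_SIZE); // absolute read, position is untouched
  }

  int getCount() {
    return recordCount;
  }
}
```

With a constructor like this, a deserializer can hand the freshly read buffer straight to the selection vector instead of copying entries one at a time.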
Finally, cleaned up the code a bit to make it easier to follow. Also
allowed test code to access the handy timer already present in the code.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/paul-rogers/drill DRILL-5385
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/800.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #800
----
commit 26a6cf9eed5347a06640bd72fb3720ea9369c001
Author: Paul Rogers <[email protected]>
Date: 2017-03-26T02:51:43Z
DRILL-5385: Vector serializer fails to read saved SV2
----