[ https://issues.apache.org/jira/browse/ORC-187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237167#comment-16237167 ]
ASF GitHub Bot commented on ORC-187:
------------------------------------
Github user t3rmin4t0r commented on a diff in the pull request:
https://github.com/apache/orc/pull/186#discussion_r148716091
--- Diff: java/core/src/java/org/apache/orc/impl/BitFieldReader.java ---
@@ -162,53 +92,32 @@ public void seek(PositionProvider index) throws IOException {
consumed + " in " + input);
} else if (consumed != 0) {
readByte();
- bitsLeft = 8 - consumed;
+ currentIdx = (byte) consumed;
} else {
- bitsLeft = 0;
+ currentIdx = 8;
}
}
public void skip(long items) throws IOException {
- long totalBits = bitSize * items;
- if (bitsLeft >= totalBits) {
- bitsLeft -= totalBits;
+ final long totalBits = bitSize * items;
+ final int availableBits = 8 - (currentIdx + 1);
+ if (totalBits <= availableBits) {
+ currentIdx += totalBits;
} else {
- totalBits -= bitsLeft;
- input.skip(totalBits / 8);
- current = input.next();
- bitsLeft = (int) (8 - (totalBits % 8));
+ final long bitsToSkip = (totalBits - availableBits);
+ input.skip(Math.min(1, bitsToSkip / 8));
--- End diff --
This looks a bit odd: the min(1) might be a problem. If
availableBits = 0 and items = 3, it will end up doing skip(1) where it
should really be doing skip(0).
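
To make the arithmetic in question easier to check, here is a minimal standalone sketch that evaluates the clamped expression from the diff for a few values of bitsToSkip. The class and method names (SkipArithmetic, minForm, maxForm, plainForm) are hypothetical and not part of ORC; the max and unclamped variants are included only for comparison, and bitSize = 1 is assumed, as the issue description says non-test code always uses 1.

```java
public class SkipArithmetic {
    // The expression as written in the diff: Math.min caps the result at 1.
    static long minForm(long bitsToSkip) {
        return Math.min(1, bitsToSkip / 8);
    }

    // Hypothetical variant for comparison: Math.max never returns less than 1.
    static long maxForm(long bitsToSkip) {
        return Math.max(1, bitsToSkip / 8);
    }

    // Hypothetical variant for comparison: whole bytes, no clamping.
    static long plainForm(long bitsToSkip) {
        return bitsToSkip / 8;
    }

    public static void main(String[] args) {
        // With availableBits = 0 and bitSize = 1, bitsToSkip equals items.
        long[] samples = {3, 8, 20};
        for (long bits : samples) {
            System.out.println("bitsToSkip=" + bits
                + " min=" + minForm(bits)
                + " max=" + maxForm(bits)
                + " plain=" + plainForm(bits));
        }
    }
}
```

For bitsToSkip = 3 (the availableBits = 0, items = 3 case), the min form evaluates to 0 and the max form to 1, so the skip(1) outcome corresponds to a max-style clamp. The min form instead never exceeds 1, which matters for larger skips: at bitsToSkip = 20 it yields 1 where the unclamped division yields 2. Which byte count is right for the reader also depends on how the following byte read is accounted for, which the excerpt does not show.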
> BitFieldReader has an unnecessary loop
> --------------------------------------
>
> Key: ORC-187
> URL: https://issues.apache.org/jira/browse/ORC-187
> Project: ORC
> Issue Type: Bug
> Components: Java, Reader
> Affects Versions: 1.4.0
> Reporter: Gopal V
> Assignee: Rajesh Balamohan
> Priority: Major
> Attachments: perf-top-bitReader-asm.png, perf-top-bitReader.png,
> perf_asm.png, perf_with_fix.png
>
>
> {code}
> /** The number of bits in one item. Non-test code always uses 1. */
> private final int bitSize;
> {code}
> The bitField reader was originally supposed to be extensible as an Integer
> reader with packing - but HIVE-7219 introduced parallel unpack routines which
> were better.
> This makes BitFieldReader::next() the core hotspot for integer sequences.
> !perf-top-bitReader.png!