Yiqun Zhang created ORC-931: ------------------------------- Summary: Optimize RunLengthIntegerWriterV2 code for better readability Key: ORC-931 URL: https://issues.apache.org/jira/browse/ORC-931 Project: ORC Issue Type: Improvement Reporter: Yiqun Zhang
RunLengthIntegerWriterV2.java 512-546 line {code:java} if (diffBitsLH > 1) { for (int i = 0; i < numLiterals; i++) { baseRedLiterals[i] = literals[i] - min; } brBits95p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 0.95); brBits100p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 1.0); if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) { encoding = EncodingType.PATCHED_BASE; preparePatchedBlob(); return; } else { encoding = EncodingType.DIRECT; return; } } else { // if difference in bits between 95th percentile and 100th percentile is // 0, then patch length will become 0. Hence we will fallback to direct encoding = EncodingType.DIRECT; return; } {code} All three conditional branch logics have been completed and the return statement is redundant. 691-704 line {code:java} if (fixedRunLength < MIN_REPEAT) { variableRunLength = fixedRunLength; fixedRunLength = 0; determineEncoding(); writeValues(); } else if (fixedRunLength >= MIN_REPEAT && fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) { encoding = EncodingType.SHORT_REPEAT; writeValues(); } else { encoding = EncodingType.DELTA; isFixedDelta = true; writeValues(); } {code} fixedRunLength >= MIN_REPEAT is redundant, the previous condition already ensures this. Extract the writeValues() method to the end. It seems better for conditional judgements to deal only with encoding and state. 772-781 line {code:java} if (fixedRunLength >= MIN_REPEAT) { if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) { encoding = EncodingType.SHORT_REPEAT; writeValues(); } else { encoding = EncodingType.DELTA; isFixedDelta = true; writeValues(); } } {code} Ditto -- This message was sent by Atlassian Jira (v8.3.4#803005)