[GitHub] lucene-solr pull request #525: LUCENE-8585: Index-time jump-tables for DocVa...

tokee Fri, 28 Dec 2018 06:22:00 -0800

Github user tokee commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/525#discussion_r244331405
  
    --- Diff: 
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java ---
    @@ -0,0 +1,542 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.lucene.codecs.lucene80;
    +
    +import java.io.DataInput;
    +import java.io.IOException;
    +
    +import org.apache.lucene.search.DocIdSetIterator;
    +import org.apache.lucene.store.IndexInput;
    +import org.apache.lucene.store.IndexOutput;
    +import org.apache.lucene.store.RandomAccessInput;
    +import org.apache.lucene.util.ArrayUtil;
    +import org.apache.lucene.util.BitSetIterator;
    +import org.apache.lucene.util.FixedBitSet;
    +import org.apache.lucene.util.RoaringDocIdSet;
    +
    +/**
    + * Disk-based implementation of a {@link DocIdSetIterator} which can return
    + * the index of the current document, i.e. the ordinal of the current 
document
    + * among the list of documents that this iterator can return. This is 
useful
    + * to implement sparse doc values by only having to encode values for 
documents
    + * that actually have a value.
    + * <p>Implementation-wise, this {@link DocIdSetIterator} is inspired of
    + * {@link RoaringDocIdSet roaring bitmaps} and encodes ranges of {@code 
65536}
    + * documents independently and picks between 3 encodings depending on the
    + * density of the range:<ul>
    + *   <li>{@code ALL} if the range contains 65536 documents exactly,
    + *   <li>{@code DENSE} if the range contains 4096 documents or more; in 
that
    + *       case documents are stored in a bit set,
    + *   <li>{@code SPARSE} otherwise, and the lower 16 bits of the doc IDs are
    + *       stored in a {@link DataInput#readShort() short}.
    + * </ul>
    + * <p>Only ranges that contain at least one value are encoded.
    + * <p>This implementation uses 6 bytes per document in the worst-case, 
which happens
    + * in the case that all ranges contain exactly one document.
    + *
    + * 
    + * To avoid O(n) lookup time complexity, with n being the number of 
documents, two lookup
    + * tables are used: A lookup table for block blockCache and index, and a 
rank structure
    + * for DENSE block lookups.
    + *
    + * The lookup table is an array of {@code long}s with an entry for each 
block. It allows for
    + * direct jumping to the block, as opposed to iteration from the current 
position and forward
    + * one block at a time.
    + *
    + * Each long entry consists of 2 logical parts:
    + *
    + * The first 31 bits hold the index (number of set bits in the blocks) up 
to just before the
    + * wanted block. The next 33 bits holds the offset in bytes into the 
underlying slice.
    + * As there is a maximum of 2^16 blocks, it follows that the maximum size 
of any block must
    + * not exceed 2^17 bits to avoid overflow. This is currently the case, 
with the largest
    + * block being DENSE and using 2^16 + 288 bits, and is likely to continue 
to hold as using
    + * more than double the amount of bits is unlikely to be an efficient 
representation.
    + * The cache overhead is numDocs/1024 bytes.
    + *
    + * Note: There are 4 types of blocks: ALL, DENSE, SPARSE and non-existing 
(0 set bits).
    + * In the case of non-existing blocks, the entry in the lookup table has 
index equal to the
    + * previous entry and offset equal to the next non-empty block.
    + *
    + * The block lookup table is stored at the end of the total block 
structure.
    + *
    + *
    + * The rank structure for DENSE blocks is an array of unsigned {@code 
short}s with an entry
    + * for each sub-block of 512 bits out of the 65536 bits in the outer DENSE 
block.
    + *
    + * Each rank-entry states the number of set bits within the block up to 
the bit before the
    + * bit positioned at the start of the sub-block.
    + * Note that that the rank entry of the first sub-block is always 0 and 
that the last entry can
    + * at most be 65536-512 = 65024 and thus will always fit into an unsigned 
short.
    + *
    + * The rank structure for a given DENSE block is stored at the beginning 
of the DENSE block.
    + * This ensures locality and keeps logistics simple.
    + *
    + * @lucene.internal
    + */
    +final class IndexedDISI extends DocIdSetIterator {
    +
    +  // jump-table time/space trade-offs to consider:
    +  // The block offsets and the block indexes could be stored in more 
compressed form with
    +  // two PackedInts or two MonotonicDirectReaders.
    +  // The DENSE ranks (128 shorts = 256 bytes) could likewise be 
compressed. As there is at least
    +  // 4096 set bits in DENSE blocks, there will be at least one rank with 
2^12 bits, so it is
    +  // doubtful if there is much to gain here.
    +  
    +  private static final int BLOCK_SIZE = 65536;   // The number of docIDs 
that a single block represents
    +  static final int BLOCK_BITS = 16;
    +  private static final long BLOCK_INDEX_SHIFT = 33; // Number of bits to 
shift a lookup entry to get the index
    +  private static final long BLOCK_INDEX_MASK = ~0L << BLOCK_INDEX_SHIFT; 
// The index bits in a lookup entry
    +  private static final long BLOCK_LOOKUP_MASK = ~BLOCK_INDEX_MASK; // The 
offset bits in a lookup entry
    +
    +  private static final int DENSE_BLOCK_LONGS = BLOCK_SIZE/Long.SIZE; // 
1024
    +  private static final int RANK_BLOCK_SIZE = 512; // The number of 
docIDs/bits in each rank-sub-block within a DENSE block
    +  private static final int RANK_BLOCK_LONGS = RANK_BLOCK_SIZE/Long.SIZE; 
// The number of longs making up a rank-block (8)
    +  private static final int RANK_BLOCK_BITS = 9;
    +  private static final int RANKS_PER_BLOCK = BLOCK_SIZE / RANK_BLOCK_SIZE; 
// 128
    +
    +  static final int MAX_ARRAY_LENGTH = (1 << 12) - 1;
    +
    +  private static void flush(int block, FixedBitSet buffer, int 
cardinality, IndexOutput out) throws IOException {
    +    assert block >= 0 && block < 65536;
    +    out.writeShort((short) block);
    +    assert cardinality > 0 && cardinality <= 65536;
    +    out.writeShort((short) (cardinality - 1));
    +    if (cardinality > MAX_ARRAY_LENGTH) {
    +      if (cardinality != 65536) { // all docs are set
    +        final byte[] rank = createRank(buffer);
    +        out.writeBytes(rank, rank.length);
    +        for (long word : buffer.getBits()) {
    +          out.writeLong(word);
    +        }
    +      }
    +    } else {
    +      BitSetIterator it = new BitSetIterator(buffer, cardinality);
    +      for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; 
doc = it.nextDoc()) {
    +        out.writeShort((short) doc);
    +      }
    +    }
    +  }
    +
    +  // Creates a DENSE rank-entry (the number of set bits up to a given 
point) for the buffer.
    +  // One rank-entry for every 512 bits/8 longs for a total of 128 * 16 
bits.
    +  // Represented as a byte[] for fast flushing and mirroring of the 
retrieval representation.
    +  private static byte[] createRank(FixedBitSet buffer) {
    +    final byte[] rank = new byte[RANKS_PER_BLOCK*2];
    +    final long[] bits = buffer.getBits();
    +    int bitCount = 0;
    +    for (int word = 0 ; word < DENSE_BLOCK_LONGS ; word++) {
    +      if ((word & 0x07) == 0) { // Every 8 longs
    +        rank[word >> 2] = (byte)(bitCount>>8);
    +        rank[(word >> 2)+1] = (byte)(bitCount & 0xFF);
    +      }
    +      bitCount += Long.bitCount(bits[word]);
    +    }
    +    return rank;
    +  }
    +
    +  /**
    +   * Writes the docIDs from it to out, in logical blocks, one for each 
65536 docIDs in monotonically
    +   * increasing gap-less order.
    +   * @param it  the document IDs.
    +   * @param out destination for the blocks.
    +   * @throws IOException if there was an error writing to out.
    +   * @return the number of jump-table entries following the blocks, -1 for 
no entries. This should be stored in meta.
    +   */
    +  static short writeBitSet(DocIdSetIterator it, IndexOutput out) throws 
IOException {
    +    final long origo = out.getFilePointer(); // All jumps are relative to 
the origo
    +    int totalCardinality = 0;
    +    int blockCardinality = 0;
    +    final FixedBitSet buffer = new FixedBitSet(1<<16);
    +    long[] jumps = new long[ArrayUtil.oversize(1, Long.BYTES)];
    +    jumps[0] = out.getFilePointer()-origo; // First block starts at index 0
    +    int prevBlock = -1;
    +    int jumpBlockIndex = 0;
    +
    +    for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc 
= it.nextDoc()) {
    +      final int block = doc >>> 16;
    +      if (prevBlock != -1 && block != prevBlock) {
    +        // Track offset+index from previous block up to current
    +        jumps = addJumps(jumps, out.getFilePointer()-origo, 
totalCardinality, jumpBlockIndex, prevBlock+1);
    +        jumpBlockIndex = prevBlock+1;
    +        // Flush block
    +        flush(prevBlock, buffer, blockCardinality, out);
    +        // Reset for next block
    +        buffer.clear(0, buffer.length());
    +        totalCardinality += blockCardinality;
    +        blockCardinality = 0;
    +      }
    +      buffer.set(doc & 0xFFFF);
    +      blockCardinality++;
    +      prevBlock = block;
    +    }
    +    if (blockCardinality > 0) {
    +      jumps = addJumps(jumps, out.getFilePointer()-origo, 
totalCardinality, jumpBlockIndex, prevBlock+1);
    +      flush(prevBlock, buffer, blockCardinality, out);
    +      buffer.clear(0, buffer.length());
    +      prevBlock++;
    +    }
    +    final int lastBlock = prevBlock == -1 ? 0 : prevBlock;
    +    // NO_MORE_DOCS is stored explicitly
    +    buffer.set(DocIdSetIterator.NO_MORE_DOCS & 0xFFFF);
    +    flush(DocIdSetIterator.NO_MORE_DOCS >>> 16, buffer, 1, out);
    +    // offset+index jump-table stored at the end
    +    return flushBlockJumps(jumps, lastBlock, out, origo);
    +  }
    +
    +  // Adds entries to the offset & index jump-table for blocks
    +  private static long[] addJumps(long[] jumps, long offset, long index, 
int startBlock, int endBlock) {
    +    jumps = ArrayUtil.grow(jumps, endBlock +1);
    +    final long jump = (index << BLOCK_INDEX_SHIFT) | offset;
    +    for (int b = startBlock; b < endBlock; b++) {
    +      jumps[b] = jump;
    +    }
    +    return jumps;
    +  }
    +
    +  // Flushes the offet & index jump-table for blocks. This should be the 
last data written to out
    +  // This method returns the blockCount for the blocks reachable for the 
jump_table or -1 for no jump-table
    +  private static short flushBlockJumps(long[] jumps, int blockCount, 
IndexOutput out, long origo) throws IOException {
    +    if (blockCount == 1) { // A single jump is just wasted space so we 
ignore that
    +      blockCount = 0;
    +    }
    +    for (int i = 0 ; i < blockCount ; i++) {
    +      out.writeLong(jumps[i]);
    +    }
    +    // As there are at most 32k blocks, the count is a short
    +    // The jumpTableOffset will be at lastPos - (blockCount * Long.BYTES)
    +    return (short)blockCount;
    +  }
    +
    +  /** The slice that stores the {@link DocIdSetIterator}. */
    +  private final IndexInput slice;
    +  private final int jumpTableEntryCount;
    +  private final RandomAccessInput jumpTable; // Skip blocks of 64K bits
    +  private final byte[] rankTable = new byte[256]; // rank-table (logically 
128 * 16 bit) for DENSE blocks
    +  private final long cost;
    +
    +  /**
    +   * @param in backing data.
    +   * @param offset starting offset for blocks in backing data.
    +   * @param length the number of bytes in the backing data.
    +   * @param jumpTableEntryCount the number of blocks convered by the 
jump-table.
    +   * @param cost normally the number of logical docIDs.
    +   */
    +  IndexedDISI(IndexInput in, long offset, long length, int 
jumpTableEntryCount, long cost) throws IOException {
    +    this(in.slice("docs", offset, length), jumpTableEntryCount, cost);
    +  }
    +
    +  /**
    +   * This constructor allows to pass the slice directly in case it helps 
reuse.
    +   * see eg. Lucene80 norms producer's merge instance.
    +   * @param slice backing data, starting from position 0 to slice.length
    +   * @param jumpTableEntryCount the number of blocks convered by the 
jump-table.
    +   * @param cost normally the number of logical docIDs.
    +   */
    +  IndexedDISI(IndexInput slice, int jumpTableEntryCount, long cost) throws 
IOException {
    +    this.slice = slice;
    +    this.cost = cost;
    +
    +    this.jumpTableEntryCount = jumpTableEntryCount;
    +    //slice.seek(slice.length()-Short.BYTES);
    +    if (jumpTableEntryCount <= 0) {
    +      jumpTable = null;
    +    } else {
    +      long jumpTableOffset = slice.length()-jumpTableEntryCount*Long.BYTES;
    +      jumpTable = slice.randomAccessSlice(jumpTableOffset, 
slice.length()-jumpTableOffset);
    +    }
    --- End diff --
    
    It worked before because jumpTables are relative to the length of the 
slice, not the current position. Anyway, now I understand what is happening. 
Re-use of jumpTables is in the next commit.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] lucene-solr pull request #525: LUCENE-8585: Index-time jump-tables for DocVa...

Reply via email to