itschrispeck commented on code in PR #11739:
URL: https://github.com/apache/pinot/pull/11739#discussion_r1346462826
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/json/ImmutableJsonIndexReader.java:
##########
@@ -76,6 +77,63 @@ public ImmutableJsonIndexReader(PinotDataBuffer dataBuffer,
int numDocs) {
_docIdMapping = dataBuffer.view(invertedIndexEndOffset,
docIdMappingEndOffset, ByteOrder.LITTLE_ENDIAN);
}
+ /**
+ * Accepts a JSON key and array of docIds used to filter the response
+ * return a String[] where String[i] gives the value of $.key for document i
+ */
+ @Override
+ public String[] getValuesForKeyAndDocs(String key, int[] docIds) {
+ ImmutableRoaringBitmap docIdMask = ImmutableRoaringBitmap.bitmapOf(docIds);
Review Comment:
In flame graphs, building this bitmap from docIds generally accounts for
30-50% of total time. I didn't find a simple way to optimize it; input is
appreciated.
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/json/ImmutableJsonIndexReader.java:
##########
@@ -76,6 +77,63 @@ public ImmutableJsonIndexReader(PinotDataBuffer dataBuffer,
int numDocs) {
_docIdMapping = dataBuffer.view(invertedIndexEndOffset,
docIdMappingEndOffset, ByteOrder.LITTLE_ENDIAN);
}
+ /**
+ * Accepts a JSON key and array of docIds used to filter the response
+ * return a String[] where String[i] gives the value of $.key for document i
+ */
+ @Override
+ public String[] getValuesForKeyAndDocs(String key, int[] docIds) {
+ ImmutableRoaringBitmap docIdMask = ImmutableRoaringBitmap.bitmapOf(docIds);
+ int[] dictIds = getDictIdsForKey(key);
+ String[] values = new String[(int) _numDocs];
+ for (int dictId = dictIds[0]; dictId < dictIds[1]; dictId++) {
+ // get docIds from posting list, convert these to the actual docIds
+ ImmutableRoaringBitmap flattenedDocIds =
_invertedIndex.getDocIds(dictId);
+ PeekableIntIterator it = flattenedDocIds.getIntIterator();
+ MutableRoaringBitmap postingList = new MutableRoaringBitmap();
+ while (it.hasNext()) {
+ postingList.add(getDocId(it.next()));
Review Comment:
This line (mapping each flattened docId through getDocId and adding the
result to the posting list) also generally accounts for 30-50% of total time.
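
For readers following along, here is a minimal, stdlib-only sketch of the pattern in this loop: flattened docIds from a posting list are mapped one at a time to real docIds, and the result is then restricted to the requested docIds. It is an illustration under assumptions, not the PR's code. `java.util.BitSet` stands in for the Roaring bitmaps, the `flattenedToDocId` array stands in for `getDocId`, and the final intersection with the mask is a guess at how `docIdMask` is used later in the method, since the hunk is cut off before that point.

```java
import java.util.BitSet;

public class FlattenedDocIdSketch {

    // Hypothetical stand-in for getDocId: flattenedToDocId[f] is the real
    // docId that flattened docId f belongs to.
    static BitSet toRealDocIds(BitSet flattenedDocIds, int[] flattenedToDocId) {
        BitSet real = new BitSet();
        // One lookup and one bitmap insert per flattened docId; this is the
        // per-element work the review comment points at.
        for (int f = flattenedDocIds.nextSetBit(0); f >= 0;
             f = flattenedDocIds.nextSetBit(f + 1)) {
            real.set(flattenedToDocId[f]);
        }
        return real;
    }

    // Stand-in for building the docId mask from the requested docIds.
    static BitSet maskOf(int[] docIds) {
        BitSet mask = new BitSet();
        for (int docId : docIds) {
            mask.set(docId);
        }
        return mask;
    }

    public static void main(String[] args) {
        // Three documents flattened into six records: doc 0 has two
        // records, doc 1 has one, doc 2 has three.
        int[] flattenedToDocId = {0, 0, 1, 2, 2, 2};

        // Posting list for one dictId, expressed in flattened docIds.
        BitSet flattened = new BitSet();
        flattened.set(1);
        flattened.set(3);
        flattened.set(4);

        BitSet postingList = toRealDocIds(flattened, flattenedToDocId);
        // Assumed step: keep only the docIds the caller asked for.
        postingList.and(maskOf(new int[] {2, 5}));
        System.out.println(postingList);
    }
}
```

Note that several flattened docIds can collapse to the same real docId (3 and 4 both map to doc 2 here), so the mapped posting list can be much smaller than the number of inserts performed to build it.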
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]