Jackie-Jiang commented on code in PR #11826:
URL: https://github.com/apache/pinot/pull/11826#discussion_r1364618944


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/readers/LazyRow.java:
##########
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.local.segment.readers;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Set;
+import org.apache.pinot.segment.spi.IndexSegment;
+
+
+/**
+ * A wrapper class to read column values of a row for a given {@link IndexSegment} and docId.<br>
+ * The advantage of having wrapper over segment and docId is column values are read only when
+ * {@link LazyRow#getValue(String)} is invoked.
+ * This is useful to reduce the disk reads incurred due to loading the previous row during merge step.
+ * There isn't any advantage to have a LazyRow wrap a GenericRow but has been kept for syntactic sugar.
+ */
+public class LazyRow {
+  private IndexSegment _segment;

Review Comment:
   Suggest making `_segment` and `_fieldToValueMap` final, and only allowing `_docId` to change.
   
   The constructor would always take an `IndexSegment`, and `init(int docId)` would set `_docId` and clear the value map.
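
   A minimal sketch of the shape suggested above, not the actual patch. `ColumnSource` and `ReusableLazyRow` are hypothetical stand-ins so the snippet compiles without Pinot on the classpath; the real class would hold an `IndexSegment` and read through it instead:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for IndexSegment (an assumption for this sketch only).
interface ColumnSource {
  Object readValue(String column, int docId);
}

class ReusableLazyRow {
  // Both fields are assigned exactly once, so they can be final.
  private final ColumnSource _segment;
  private final Map<String, Object> _fieldToValueMap = new HashMap<>();
  // The only mutable state: which document the row currently points at.
  private int _docId;

  ReusableLazyRow(ColumnSource segment) {
    _segment = segment;
  }

  // Re-points the row at another document of the same segment and drops cached values.
  void init(int docId) {
    _docId = docId;
    _fieldToValueMap.clear();
  }

  Object getValue(String column) {
    // Values are still read lazily, on first access per document.
    return _fieldToValueMap.computeIfAbsent(column, col -> _segment.readValue(col, _docId));
  }
}
```

   With this shape one instance per segment can be reused across documents by calling `init(docId)` before each row.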



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/readers/LazyRow.java:
##########
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.local.segment.readers;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Set;
+import org.apache.pinot.segment.spi.IndexSegment;
+
+
+/**
+ * A wrapper class to read column values of a row for a given {@link IndexSegment} and docId.<br>
+ * The advantage of having wrapper over segment and docId is column values are read only when
+ * {@link LazyRow#getValue(String)} is invoked.
+ * This is useful to reduce the disk reads incurred due to loading the previous row during merge step.
+ * There isn't any advantage to have a LazyRow wrap a GenericRow but has been kept for syntactic sugar.
+ */
+public class LazyRow {
+  private IndexSegment _segment;
+  private int _docId;
+  private HashMap<String, Object> _fieldToValueMap = new HashMap<>();
+
+  public LazyRow() {
+  }
+
+  public LazyRow(IndexSegment segment, int docId) {
+    _segment = segment;
+    _docId = docId;
+  }
+
+
+  public void init(IndexSegment segment, int docId) {
+    this.clear();
+    _segment = segment;
+    _docId = docId;
+  }
+
+
+  public Object getValue(String column) {
+    if (_segment == null) {
+      throw new IllegalStateException("Index segment for Lazy row is uninitialized.");
+    }
+    return _fieldToValueMap.computeIfAbsent(column, col -> {

Review Comment:
   Currently this re-reads the value on every call when the value is `null`, because `computeIfAbsent` records no mapping when the mapping function returns `null`. We should enhance it so that even `null` values are read only once.
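
   To illustrate the behaviour described above, here is a small standalone sketch (plain `java.util`, no Pinot types). `NullRereadDemo` and `readsForTwoLookups` are made-up names for this example, and the counter stands in for a disk read:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class NullRereadDemo {
  // Looks up the same key twice and returns how many times the loader ran.
  static int readsForTwoLookups(Object storedValue) {
    Map<String, Object> cache = new HashMap<>();
    AtomicInteger reads = new AtomicInteger();
    for (int i = 0; i < 2; i++) {
      cache.computeIfAbsent("col", c -> {
        reads.incrementAndGet();  // stands in for a read from the segment
        return storedValue;       // a null result is not recorded in the map
      });
    }
    return reads.get();
  }

  public static void main(String[] args) {
    System.out.println(readsForTwoLookups("x"));   // prints 1: second lookup hits the cache
    System.out.println(readsForTwoLookups(null));  // prints 2: null is never cached
  }
}
```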



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/readers/LazyRow.java:
##########
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.local.segment.readers;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Set;
+import org.apache.pinot.segment.spi.IndexSegment;
+
+
+/**
+ * A wrapper class to read column values of a row for a given {@link IndexSegment} and docId.<br>
+ * The advantage of having wrapper over segment and docId is column values are read only when
+ * {@link LazyRow#getValue(String)} is invoked.
+ * This is useful to reduce the disk reads incurred due to loading the previous row during merge step.
+ * There isn't any advantage to have a LazyRow wrap a GenericRow but has been kept for syntactic sugar.
+ */
+public class LazyRow {
+  private IndexSegment _segment;
+  private int _docId;
+  private HashMap<String, Object> _fieldToValueMap = new HashMap<>();
+
+  public LazyRow() {
+  }
+
+  public LazyRow(IndexSegment segment, int docId) {
+    _segment = segment;
+    _docId = docId;
+  }
+
+
+  public void init(IndexSegment segment, int docId) {
+    this.clear();
+    _segment = segment;
+    _docId = docId;
+  }
+
+
+  public Object getValue(String column) {
+    if (_segment == null) {
+      throw new IllegalStateException("Index segment for Lazy row is uninitialized.");
+    }
+    return _fieldToValueMap.computeIfAbsent(column, col -> {

Review Comment:
   To solve this problem, we can potentially add another set to track `null` columns, and add an API to check whether a column's value is `null`.
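
   One possible way to do this, as a rough sketch only. `NullableColumnSource`, `NullTrackingLazyRow`, `_nullValueFields` and `isNullValue` are hypothetical names chosen for illustration; the interface stands in for `IndexSegment` so the snippet is self-contained:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for IndexSegment (an assumption for this sketch only).
interface NullableColumnSource {
  Object readValue(String column, int docId);
}

class NullTrackingLazyRow {
  private final NullableColumnSource _segment;
  private final Map<String, Object> _fieldToValueMap = new HashMap<>();
  // Columns already read for the current document whose value turned out to be null.
  private final Set<String> _nullValueFields = new HashSet<>();
  private int _docId;

  NullTrackingLazyRow(NullableColumnSource segment) {
    _segment = segment;
  }

  void init(int docId) {
    _docId = docId;
    _fieldToValueMap.clear();
    _nullValueFields.clear();
  }

  Object getValue(String column) {
    if (_nullValueFields.contains(column)) {
      return null;  // known null from an earlier read; no second read
    }
    Object value = _fieldToValueMap.get(column);
    if (value == null) {
      value = _segment.readValue(column, _docId);
      if (value == null) {
        _nullValueFields.add(column);
      } else {
        _fieldToValueMap.put(column, value);
      }
    }
    return value;
  }

  // Reads the column at most once, then answers from the cached state.
  boolean isNullValue(String column) {
    return getValue(column) == null;
  }
}
```

   The separate set keeps the value map free of sentinel objects; storing a marker value in the map would be an alternative with the same effect.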



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/readers/LazyRow.java:
##########
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.local.segment.readers;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Set;
+import org.apache.pinot.segment.spi.IndexSegment;
+
+
+/**
+ * A wrapper class to read column values of a row for a given {@link IndexSegment} and docId.<br>
+ * The advantage of having wrapper over segment and docId is column values are read only when
+ * {@link LazyRow#getValue(String)} is invoked.
+ * This is useful to reduce the disk reads incurred due to loading the previous row during merge step.
+ * There isn't any advantage to have a LazyRow wrap a GenericRow but has been kept for syntactic sugar.
+ */
+public class LazyRow {
+  private IndexSegment _segment;
+  private int _docId;
+  private HashMap<String, Object> _fieldToValueMap = new HashMap<>();
+
+  public LazyRow() {
+  }
+
+  public LazyRow(IndexSegment segment, int docId) {
+    _segment = segment;
+    _docId = docId;
+  }
+
+
+  public void init(IndexSegment segment, int docId) {
+    this.clear();
+    _segment = segment;
+    _docId = docId;
+  }
+
+
+  public Object getValue(String column) {
+    if (_segment == null) {

Review Comment:
   This check is no longer needed once `_segment` is final and always set by the constructor.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

