imply-cheddar commented on code in PR #14408:
URL: https://github.com/apache/druid/pull/14408#discussion_r1234712369


##########
processing/src/main/java/org/apache/druid/query/aggregation/first/DoubleFirstVectorAggregator.java:
##########
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.first;
+
+import org.apache.druid.collections.SerializablePair;
+import org.apache.druid.segment.vector.VectorValueSelector;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+public class DoubleFirstVectorAggregator extends NumericFirstVectorAggregator
+{
+  double firstValue;
+
+  public DoubleFirstVectorAggregator(VectorValueSelector timeSelector, VectorValueSelector valueSelector)
+  {
+    super(timeSelector, valueSelector);
+    firstValue = 0;
+  }
+
+  @Override
+  public void initValue(ByteBuffer buf, int position)
+  {
+    buf.putDouble(position, 0);
+  }
+
+
+  @Override
+  void putValue(ByteBuffer buf, int position, int index)
+  {
+    firstValue = valueSelector.getDoubleVector()[index];
+    buf.putDouble(position, firstValue);
+  }
+
+
+  /**
+   * @return The primitive object stored at the position in the buffer.

Review Comment:
   This comment says that it's returning a primitive, but the method is 
returning a SerializablePair.  Which one is supposed to be correct?



##########
processing/src/main/java/org/apache/druid/query/aggregation/first/DoubleFirstAggregatorFactory.java:
##########
@@ -125,6 +138,23 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
     }
   }
 
+  @Override
+  public VectorAggregator factorizeVector(
+      VectorColumnSelectorFactory columnSelectorFactory
+  )
+  {
+    ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(fieldName);
+    VectorValueSelector valueSelector = columnSelectorFactory.makeValueSelector(fieldName);
+    //time is always long
+    BaseLongVectorValueSelector timeSelector = (BaseLongVectorValueSelector) columnSelectorFactory.makeValueSelector(
+        timeColumn);

Review Comment:
   Two things:
   
   1) You don't need either of these selectors until after you've checked 
capabilities.  Don't bother creating them if you don't need them.
   2) This is casting to `BaseLongVectorValueSelector`, but the arguments on 
`DoubleFirstVectorAggregator` don't seem to care about the cast at all.  Either 
it's important that we cast and we force the cast, OR it's not important and we 
shouldn't force the cast.  The current code makes me think that it's not 
important.
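   
   A minimal sketch of point (1), using hypothetical stand-in types rather than 
Druid's real classes (`Capabilities`, `FirstAggregator`, `NilAggregator` and the 
`selectorsCreated` counter are all invented for illustration).  It assumes the 
same condition as the Float factory in this PR (`capabilities == null || 
capabilities.isNumeric()`) and only shows the ordering: decide from the 
capabilities first, then create the selectors only on the path that uses them.

```java
// Hypothetical stand-ins for illustration; these are NOT Druid's real types.
final class FactorizeVectorSketch
{
  /** Counts selector creations so the ordering is observable. */
  static int selectorsCreated = 0;

  static final class Capabilities
  {
    final boolean numeric;

    Capabilities(boolean numeric)
    {
      this.numeric = numeric;
    }
  }

  interface Aggregator
  {
  }

  static final class FirstAggregator implements Aggregator
  {
    final Object timeSelector;
    final Object valueSelector;

    FirstAggregator(Object timeSelector, Object valueSelector)
    {
      this.timeSelector = timeSelector;
      this.valueSelector = valueSelector;
    }
  }

  enum NilAggregator implements Aggregator
  {
    INSTANCE
  }

  /** Stand-in for a selector factory call; each call counts as one creation. */
  static Object makeValueSelector(String column)
  {
    selectorsCreated++;
    return new Object();
  }

  static Aggregator factorizeVector(Capabilities capabilities, String fieldName, String timeColumn)
  {
    // Decide first: a known, non-numeric column needs no selectors at all.
    if (capabilities != null && !capabilities.numeric) {
      return NilAggregator.INSTANCE;
    }
    // Only now pay for the selectors, on the path that actually uses them.
    Object valueSelector = makeValueSelector(fieldName);
    Object timeSelector = makeValueSelector(timeColumn);
    return new FirstAggregator(timeSelector, valueSelector);
  }
}
```

   With this ordering the nil path creates no selectors, and the time selector 
is passed through without a cast, which is what point (2) leans towards.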



##########
processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstAggregatorFactory.java:
##########
@@ -154,6 +160,26 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
     }
   }
 
+  @Override
+  public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory)
+  {
+    ColumnCapabilities capabilities = selectorFactory.getColumnCapabilities(fieldName);
+    VectorObjectSelector vSelector = selectorFactory.makeObjectSelector(fieldName);
+    BaseLongVectorValueSelector timeSelector = (BaseLongVectorValueSelector) selectorFactory.makeValueSelector(
+        timeColumn);
+    if (capabilities != null) {
+      return new StringFirstVectorAggregator(timeSelector, vSelector, maxStringBytes);
+    } else {
+      return new StringFirstVectorAggregator(null, vSelector, maxStringBytes);
+    }

Review Comment:
   We can/should do this a bit more intelligently.  Specifically, there are 3 
different types of vector selectors that could be needed here and you will need 
to check column capabilities ahead of time to tell the difference:
   
   1. If it is a STRING and multi-valued, use the multivalue-dimension version
   2. If it is a STRING and single-valued, use the single value dimension 
version
   3. Otherwise use a VectorObjectSelector
   
   Your implementation for (3) is in this PR already.  For (1) and (2), you can 
read only the dictionary ids and keep track of only the dictionaryId of the 
earliest row (not the string, the dictionary id).  Then, when `get()` is 
called, convert the dictionary id into the String and truncate it if 
necessary.
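   
   A rough sketch of the dictionary-id idea for the single-valued case (2).  
Everything here is a hypothetical stand-in, not Druid's API: the dictionary is 
a plain `String[]`, the ids and times arrive as arrays, and `maxLength` 
truncates by characters as a simplification of `maxStringBytes`.  The 
multi-valued case (1) would additionally need to pick an id out of each row.

```java
// Hypothetical sketch; a plain String[] stands in for the column dictionary.
final class DictionaryIdFirstSketch
{
  private final String[] dictionary;
  private final int maxLength;
  private long firstTime = Long.MAX_VALUE;
  private int firstId = -1; // -1 means "nothing aggregated yet"

  DictionaryIdFirstSketch(String[] dictionary, int maxLength)
  {
    this.dictionary = dictionary;
    this.maxLength = maxLength;
  }

  /** Scans rows [startRow, endRow) and remembers only the id of the earliest row. */
  void aggregate(long[] times, int[] dictionaryIds, int startRow, int endRow)
  {
    for (int i = startRow; i < endRow; i++) {
      if (times[i] < firstTime) {
        firstTime = times[i];
        firstId = dictionaryIds[i];
      }
    }
  }

  /** The dictionary lookup and the truncation happen once, at read time. */
  String get()
  {
    if (firstId < 0) {
      return null;
    }
    String value = dictionary[firstId];
    return value.length() > maxLength ? value.substring(0, maxLength) : value;
  }
}
```

   The point of the design is that the per-row loop touches only a `long` and 
an `int`; no String is materialized until `get()` is called.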



##########
processing/src/main/java/org/apache/druid/query/aggregation/first/FloatFirstAggregatorFactory.java:
##########
@@ -123,6 +130,27 @@ public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
     }
   }
 
+  @Override
+  public VectorAggregator factorizeVector(VectorColumnSelectorFactory columnSelectorFactory)
+  {
+    ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(fieldName);
+    VectorValueSelector valueSelector = columnSelectorFactory.makeValueSelector(fieldName);
+    //time is always long
+    BaseLongVectorValueSelector timeSelector = (BaseLongVectorValueSelector) columnSelectorFactory.makeValueSelector(
+        timeColumn);
+    if (capabilities == null || capabilities.isNumeric()) {
+      return new FloatFirstVectorAggregator(timeSelector, valueSelector);
+    } else {
+      return NumericNilVectorAggregator.floatNilVectorAggregator();
+    }

Review Comment:
   This looks like the Double one, which I had comments on; please apply them 
here too.



##########
processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstVectorAggregator.java:
##########
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.first;
+
+import org.apache.druid.java.util.common.DateTimes;
+import org.apache.druid.query.aggregation.SerializablePairLongString;
+import org.apache.druid.query.aggregation.VectorAggregator;
+import org.apache.druid.segment.DimensionHandlerUtils;
+import org.apache.druid.segment.vector.BaseLongVectorValueSelector;
+import org.apache.druid.segment.vector.VectorObjectSelector;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+public class StringFirstVectorAggregator implements VectorAggregator
+{
+  private static final SerializablePairLongString INIT = new SerializablePairLongString(
+      DateTimes.MAX.getMillis(),
+      null
+  );
+  private final BaseLongVectorValueSelector timeSelector;
+  private final VectorObjectSelector valueSelector;
+  private final int maxStringBytes;
+  //protected long firstTime;

Review Comment:
   Commented-out code alert; please remove this line.



##########
processing/src/test/java/org/apache/druid/query/aggregation/first/DoubleFirstVectorAggregationTest.java:
##########
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.first;
+
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.query.aggregation.VectorAggregator;
+import org.apache.druid.segment.vector.BaseLongVectorValueSelector;
+import org.apache.druid.segment.vector.VectorColumnSelectorFactory;
+import org.apache.druid.segment.vector.VectorValueSelector;
+import org.apache.druid.testing.InitializedNullHandlingTest;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.mockito.Answers;
+import org.mockito.Mock;
+import org.mockito.Mockito;
+import org.mockito.junit.MockitoJUnitRunner;
+
+import java.nio.ByteBuffer;
+import java.util.concurrent.ThreadLocalRandom;
+
+@RunWith(MockitoJUnitRunner.class)
+public class DoubleFirstVectorAggregationTest extends InitializedNullHandlingTest
+{
+  private static final double EPSILON = 1e-5;
+  private static final double[] VALUES = new double[]{7.8d, 11, 23.67, 60};
+  private static final boolean[] NULLS = new boolean[]{false, false, true, false};
+  private long[] times = {2436, 6879, 7888, 8224};
+
+  private static final String NAME = "NAME";
+  private static final String FIELD_NAME = "FIELD_NAME";
+  private static final String TIME_COL = "__time";
+
+  @Mock
+  private VectorValueSelector selector;
+  @Mock
+  private BaseLongVectorValueSelector timeSelector;

Review Comment:
   These are both interfaces.  If test-oriented implementations of these 
interfaces don't already exist, please create them instead of mocking things.
   
   1) Mockito needs to be killed from the codebase; it should not be used.
   2) The tests will always be easier to understand and debug if there is a 
test class implementation of the interface instead of using mocks.
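   
   As a sketch of what a hand-written test implementation could look like: the 
`DoubleVectorSelector` interface below is a cut-down, hypothetical stand-in 
(the real selector interfaces have more methods), and `ArrayBackedSelector` is 
an invented name.  It just serves fixed arrays such as the `VALUES` and `NULLS` 
constants from this test.

```java
// Hypothetical sketch; DoubleVectorSelector is a reduced stand-in interface.
final class ArrayBackedSelectorSketch
{
  interface DoubleVectorSelector
  {
    double[] getDoubleVector();

    boolean[] getNullVector();

    int getCurrentVectorSize();
  }

  /** Plain test implementation: returns exactly the arrays it was given. */
  static final class ArrayBackedSelector implements DoubleVectorSelector
  {
    private final double[] values;
    private final boolean[] nulls;

    ArrayBackedSelector(double[] values, boolean[] nulls)
    {
      if (values.length != nulls.length) {
        throw new IllegalArgumentException("values and nulls must have the same length");
      }
      this.values = values;
      this.nulls = nulls;
    }

    @Override
    public double[] getDoubleVector()
    {
      return values;
    }

    @Override
    public boolean[] getNullVector()
    {
      return nulls;
    }

    @Override
    public int getCurrentVectorSize()
    {
      return values.length;
    }
  }
}
```

   A failing test then steps straight into these few lines under a debugger, 
rather than into mock stubbing.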



##########
processing/src/test/java/org/apache/druid/query/aggregation/first/DoubleFirstVectorAggregationTest.java:
##########
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.first;
+
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.query.aggregation.VectorAggregator;
+import org.apache.druid.segment.vector.BaseLongVectorValueSelector;
+import org.apache.druid.segment.vector.VectorColumnSelectorFactory;
+import org.apache.druid.segment.vector.VectorValueSelector;
+import org.apache.druid.testing.InitializedNullHandlingTest;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.mockito.Answers;
+import org.mockito.Mock;
+import org.mockito.Mockito;
+import org.mockito.junit.MockitoJUnitRunner;
+
+import java.nio.ByteBuffer;
+import java.util.concurrent.ThreadLocalRandom;
+
+@RunWith(MockitoJUnitRunner.class)

Review Comment:
   Please re-write this to not use Mockito.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

