[GitHub] [incubator-pinot] haibow commented on a change in pull request #5345: Create PULL_REQUEST_TEMPLATE.md

2020-05-09 Thread GitBox


haibow commented on a change in pull request #5345:
URL: https://github.com/apache/incubator-pinot/pull/5345#discussion_r422594460



##
File path: .github/PULL_REQUEST_TEMPLATE.md
##
@@ -0,0 +1,13 @@
+## Upgrade Notes
+Does this PR prevent a zero down-time upgrade? (Assume upgrade order: 
Controller, Broker, Server, Minion)
+* [ ] Yes (Please label as **backward-incompat**)
+
+Does this PR fix a zero-downtime upgrade introduced earlier?
+* [ ] Yes (Please label this as **backward-incompat**)
+
+Does this PR otherwise need attention when creating release notes? Things to 
consider:
+- New configuration options
+- Deprecation of configurations
+- Signature changes to public methods/interfaces
+- New plugins added or old plugins removed
+* [ ] Yes (Please label this PR as **release-notes**)

Review comment:
   Even with the "Things to consider", the release notes section is still 
just a Yes/No question. When the answer is Yes, better explicitly ask the 
author to prepare some release notes, to make the release master's life easier.
   
   e.g. (borrowing the norm from Presto)
   
   ```
   == RELEASE NOTES ==
   
   General Changes
   * ...
   * ...
   ```
   
   If release note is NOT required, use:
   
   ```
   == NO RELEASE NOTE ==
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (9246d7e -> 44b0a91)

2020-05-09 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 9246d7e  Update package.json to resolve sharp version to fix the local 
build issue. (#5215)
 add 44b0a91  Fix NoSuchMethodError on ServerConfiguration.addHttpHandler 
(#5357)

No new revisions were added by this update.

Summary of changes:
 pom.xml | 6 --
 1 file changed, 6 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] codecov-io commented on pull request #5357: Fix NoSuchMethodError on ServerConfiguration.addHttpHandler

2020-05-09 Thread GitBox


codecov-io commented on pull request #5357:
URL: https://github.com/apache/incubator-pinot/pull/5357#issuecomment-626278291


   # 
[Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5357?src=pr&el=h1) 
Report
   > Merging 
[#5357](https://codecov.io/gh/apache/incubator-pinot/pull/5357?src=pr&el=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-pinot/commit/9246d7efc7ce282f35215c887a96a1bca1a2a6a8&el=desc)
 will **decrease** coverage by `9.40%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-pinot/pull/5357/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/5357?src=pr&el=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#5357  +/-   ##
   ==
   - Coverage   66.36%   56.96%   -9.41% 
   ==
 Files1075 1075  
 Lines   5477354773  
 Branches 8168 8168  
   ==
   - Hits3635131199-5152 
   - Misses  1575521129+5374 
   + Partials 2667 2445 -222 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-pinot/pull/5357?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...a/org/apache/pinot/minion/metrics/MinionMeter.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../apache/pinot/minion/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...he/pinot/core/query/reduce/ComparisonFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvQ29tcGFyaXNvbkZ1bmN0aW9uLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...pinot/minion/exception/TaskCancelledException.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vZXhjZXB0aW9uL1Rhc2tDYW5jZWxsZWRFeGNlcHRpb24uamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...ot/minion/events/EventObserverFactoryRegistry.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vZXZlbnRzL0V2ZW50T2JzZXJ2ZXJGYWN0b3J5UmVnaXN0cnkuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...t/minion/executor/TaskExecutorFactoryRegistry.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vZXhlY3V0b3IvVGFza0V4ZWN1dG9yRmFjdG9yeVJlZ2lzdHJ5LmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...minion/executor/ConvertToRawIndexTaskExecutor.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vZXhlY3V0b3IvQ29udmVydFRvUmF3SW5kZXhUYXNrRXhlY3V0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...nion/events/DefaultMinionEventObserverFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vZXZlbnRzL0RlZmF1bHRNaW5pb25FdmVudE9ic2VydmVyRmFjdG9yeS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...startree/executor/StarTreeAggregationExecutor.java](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zdGFydHJlZS9leGVjdXRvci9TdGFyVHJlZUFnZ3JlZ2F0aW9uRXhlY3V0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [312 
more](https://codecov.io/gh/apache/incubator-pinot/pull/5357/diff?src=pr&el=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5357?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/a

[GitHub] [incubator-pinot] jackjlli opened a new pull request #5357: Fix NoSuchMethodError on ServerConfiguration.addHttpHandler

2020-05-09 Thread GitBox


jackjlli opened a new pull request #5357:
URL: https://github.com/apache/incubator-pinot/pull/5357


   We encounter the following exception when running any command for 
`PerfBenchmarkRunner`:
   ```
   Exception in thread "main" java.lang.NoSuchMethodError: 
org.glassfish.grizzly.http.server.ServerConfiguration.addHttpHandler(Lorg/glassfish/grizzly/http/server/HttpHandler;[Lorg/glassfish/grizzly/http/server/HttpHandlerRegistration;)V
at 
org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpServerFactory.createHttpServer(GrizzlyHttpServerFactory.java:258)
at 
org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpServerFactory.createHttpServer(GrizzlyHttpServerFactory.java:93)
at 
org.apache.pinot.controller.api.ControllerAdminApiApplication.start(ControllerAdminApiApplication.java:85)
at 
org.apache.pinot.controller.ControllerStarter.setUpPinotController(ControllerStarter.java:359)
at 
org.apache.pinot.controller.ControllerStarter.start(ControllerStarter.java:231)
at 
org.apache.pinot.tools.perf.PerfBenchmarkDriver.startController(PerfBenchmarkDriver.java:200)
at 
org.apache.pinot.tools.perf.PerfBenchmarkDriver.run(PerfBenchmarkDriver.java:172)
at 
org.apache.pinot.tools.perf.PerfBenchmarkRunner.startAllButServer(PerfBenchmarkRunner.java:110)
at 
org.apache.pinot.tools.perf.PerfBenchmarkRunner.execute(PerfBenchmarkRunner.java:96)
at 
org.apache.pinot.tools.PinotToolLauncher.execute(PinotToolLauncher.java:67)
at 
org.apache.pinot.tools.PinotToolLauncher.main(PinotToolLauncher.java:79)
   ```
   This PR removes the exclusion that caused the exception.
   
   Tested by running `mvn clean install -DskipTests -Pbin-dist` and tests 
passed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5345: Create PULL_REQUEST_TEMPLATE.md

2020-05-09 Thread GitBox


mcvsubbu commented on a change in pull request #5345:
URL: https://github.com/apache/incubator-pinot/pull/5345#discussion_r422568460



##
File path: .github/PULL_REQUEST_TEMPLATE.md
##
@@ -0,0 +1,13 @@
+## Upgrade Notes
+Does this PR prevent a zero down-time upgrade? (Assume upgrade order: 
Controller, Broker, Server, Minion)
+* [ ] Yes (Please label as **backward-incompat**)
+
+Does this PR fix a zero-downtime upgrade introduced earlier?

Review comment:
   1. LinkedIn pulls weekly from open source and deploys into production. 
LinkedIn (and others who may pull regularly from open source pinot) can get 
some information of what may break the deployment (even if retro-fixed).
   2. It is a good practice to have release notes that describe the upgrade 
scenarios that may break. If we have good labels on PRs, whoever cuts the 
release can compile the release notes easily.
   3. It hopefully makes the reviewers think whether something may break 
compatibility.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5345: Create PULL_REQUEST_TEMPLATE.md

2020-05-09 Thread GitBox


mcvsubbu commented on a change in pull request #5345:
URL: https://github.com/apache/incubator-pinot/pull/5345#discussion_r422568206



##
File path: .github/PULL_REQUEST_TEMPLATE.md
##
@@ -0,0 +1,13 @@
+## Upgrade Notes

Review comment:
   I had the before, but the problem is that when someone creates a PR, the 
description shows up _before_ the Description section. I can add it if that is 
blocking merge of this PR





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] reallocf commented on a change in pull request #5326: Add toDateTime DateTimeFunction (#5313)

2020-05-09 Thread GitBox


reallocf commented on a change in pull request #5326:
URL: https://github.com/apache/incubator-pinot/pull/5326#discussion_r422564834



##
File path: 
pinot-core/src/test/java/org/apache/pinot/core/data/function/DateTimeFunctionEvaluatorTest.java
##
@@ -52,12 +54,14 @@ public void testDateTimeTransformFunctions(String 
transformFunction, List

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5240: Supporting range queries using indexes

2020-05-09 Thread GitBox


siddharthteotia commented on a change in pull request #5240:
URL: https://github.com/apache/incubator-pinot/pull/5240#discussion_r422563677



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java
##
@@ -0,0 +1,471 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.segment.creator.impl.inv;
+
+import com.google.common.base.Preconditions;
+import it.unimi.dsi.fastutil.Arrays;
+import it.unimi.dsi.fastutil.Swapper;
+import it.unimi.dsi.fastutil.ints.IntComparator;
+import java.io.BufferedOutputStream;
+import java.io.Closeable;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.core.common.Constants;
+import org.apache.pinot.core.query.utils.Pair;
+import org.apache.pinot.core.segment.creator.InvertedIndexCreator;
+import org.apache.pinot.core.segment.memory.PinotDataBuffer;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.MetricFieldSpec;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+import static com.google.common.base.Charsets.UTF_8;
+import static 
org.apache.pinot.core.segment.creator.impl.V1Constants.Indexes.BITMAP_RANGE_INDEX_FILE_EXTENSION;
+
+
+/**
+ * Implementation of {@link InvertedIndexCreator} that uses off-heap memory.
+ * We use 2 passes to create the range index.
+ * 
+ *
+ *   
+ * A
+ *   
+ *   
+ * In the first pass (adding values phase), when add() method is called, 
store the raw values into the forward index
+ * value buffer (for multi-valued column also store number of values for 
each docId into forward index length
+ * buffer). We also compute the inverted index length for each dictId 
while adding values.
+ *   
+ *   
+ * In the second pass (processing values phase), when seal() method is 
called, all the dictIds should already been
+ * added. We first reorder the values into the inverted index buffers by 
going over the dictIds in forward index
+ * value buffer (for multi-valued column we also need forward index length 
buffer to get the docId for each dictId).
+ * Once we have the inverted index buffers, we simply go over them and 
create the bitmap for each dictId and
+ * serialize them into a file.
+ *   
+ * 
+ * Based on the number of values we need to store, we use direct memory or 
MMap file to allocate the buffer.
+ */
+public final class RangeIndexCreator implements InvertedIndexCreator {
+
+  private static final int RANGE_INDEX_VERSION = 1;
+
+  // Use MMapBuffer if the value buffer size is larger than 2G
+  private static final int NUM_VALUES_THRESHOLD_FOR_MMAP_BUFFER = 500_000_000;
+  private static final int DEFAULT_NUM_RANGES = 20;
+
+  private static final String FORWARD_INDEX_VALUE_BUFFER_SUFFIX = 
".fwd.idx.val.buf";
+  private static final String DOC_ID_VALUE_BUFFER_SUFFIX = ".doc.id.val.buf";
+
+  private final File _invertedIndexFile;
+  private final File _forwardIndexValueBufferFile;
+  private final File _docIdBufferFileForSorting;
+  private final int _numValues;
+  private final boolean _useMMapBuffer;
+
+  // Forward index buffers (from docId to dictId)
+  private int _nextDocId;
+  private PinotDataBuffer _forwardIndexValueBuffer;
+  private NumberValueBuffer _valueBuffer;
+  //for sorting
+  private PinotDataBuffer _docIdValueBuffer;
+  private IntValueBuffer _docIdBuffer;
+  // For multi-valued column only because each docId can have multiple dictIds
+  private int _nextValueId;
+  private int _numDocsPerRange;
+  private FieldSpec.DataType _valueType;
+
+  public RangeIndexCreator(File indexDir, FieldSpec fieldSpec, 
FieldSpec.DataType valueType, int numRanges, int numDocsPerRange, int numDocs, 
int numValues)
+  throws IOException {
+_valueType = valueType;
+String columnName = fieldSpec.getName();
+_invertedIndexFile = new File(i

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5240: Supporting range queries using indexes

2020-05-09 Thread GitBox


siddharthteotia commented on a change in pull request #5240:
URL: https://github.com/apache/incubator-pinot/pull/5240#discussion_r422563310



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java
##
@@ -0,0 +1,471 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.segment.creator.impl.inv;
+
+import com.google.common.base.Preconditions;
+import it.unimi.dsi.fastutil.Arrays;
+import it.unimi.dsi.fastutil.Swapper;
+import it.unimi.dsi.fastutil.ints.IntComparator;
+import java.io.BufferedOutputStream;
+import java.io.Closeable;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.core.common.Constants;
+import org.apache.pinot.core.query.utils.Pair;
+import org.apache.pinot.core.segment.creator.InvertedIndexCreator;
+import org.apache.pinot.core.segment.memory.PinotDataBuffer;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.MetricFieldSpec;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+import static com.google.common.base.Charsets.UTF_8;
+import static 
org.apache.pinot.core.segment.creator.impl.V1Constants.Indexes.BITMAP_RANGE_INDEX_FILE_EXTENSION;
+
+
+/**
+ * Implementation of {@link InvertedIndexCreator} that uses off-heap memory.
+ * We use 2 passes to create the range index.
+ * 
+ *
+ *   
+ * A
+ *   
+ *   
+ * In the first pass (adding values phase), when add() method is called, 
store the raw values into the forward index
+ * value buffer (for multi-valued column also store number of values for 
each docId into forward index length
+ * buffer). We also compute the inverted index length for each dictId 
while adding values.
+ *   
+ *   
+ * In the second pass (processing values phase), when seal() method is 
called, all the dictIds should already been
+ * added. We first reorder the values into the inverted index buffers by 
going over the dictIds in forward index
+ * value buffer (for multi-valued column we also need forward index length 
buffer to get the docId for each dictId).
+ * Once we have the inverted index buffers, we simply go over them and 
create the bitmap for each dictId and
+ * serialize them into a file.
+ *   
+ * 
+ * Based on the number of values we need to store, we use direct memory or 
MMap file to allocate the buffer.
+ */
+public final class RangeIndexCreator implements InvertedIndexCreator {
+
+  private static final int RANGE_INDEX_VERSION = 1;
+
+  // Use MMapBuffer if the value buffer size is larger than 2G
+  private static final int NUM_VALUES_THRESHOLD_FOR_MMAP_BUFFER = 500_000_000;
+  private static final int DEFAULT_NUM_RANGES = 20;
+
+  private static final String FORWARD_INDEX_VALUE_BUFFER_SUFFIX = 
".fwd.idx.val.buf";
+  private static final String DOC_ID_VALUE_BUFFER_SUFFIX = ".doc.id.val.buf";
+
+  private final File _invertedIndexFile;
+  private final File _forwardIndexValueBufferFile;
+  private final File _docIdBufferFileForSorting;
+  private final int _numValues;
+  private final boolean _useMMapBuffer;
+
+  // Forward index buffers (from docId to dictId)
+  private int _nextDocId;
+  private PinotDataBuffer _forwardIndexValueBuffer;
+  private NumberValueBuffer _valueBuffer;
+  //for sorting
+  private PinotDataBuffer _docIdValueBuffer;
+  private IntValueBuffer _docIdBuffer;
+  // For multi-valued column only because each docId can have multiple dictIds
+  private int _nextValueId;
+  private int _numDocsPerRange;
+  private FieldSpec.DataType _valueType;
+
+  public RangeIndexCreator(File indexDir, FieldSpec fieldSpec, 
FieldSpec.DataType valueType, int numRanges, int numDocsPerRange, int numDocs, 
int numValues)
+  throws IOException {
+_valueType = valueType;
+String columnName = fieldSpec.getName();
+_invertedIndexFile = new File(i

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5240: Supporting range queries using indexes

2020-05-09 Thread GitBox


siddharthteotia commented on a change in pull request #5240:
URL: https://github.com/apache/incubator-pinot/pull/5240#discussion_r422562613



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java
##
@@ -0,0 +1,471 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.segment.creator.impl.inv;
+
+import com.google.common.base.Preconditions;
+import it.unimi.dsi.fastutil.Arrays;
+import it.unimi.dsi.fastutil.Swapper;
+import it.unimi.dsi.fastutil.ints.IntComparator;
+import java.io.BufferedOutputStream;
+import java.io.Closeable;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.core.common.Constants;
+import org.apache.pinot.core.query.utils.Pair;
+import org.apache.pinot.core.segment.creator.InvertedIndexCreator;
+import org.apache.pinot.core.segment.memory.PinotDataBuffer;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.MetricFieldSpec;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+import static com.google.common.base.Charsets.UTF_8;
+import static 
org.apache.pinot.core.segment.creator.impl.V1Constants.Indexes.BITMAP_RANGE_INDEX_FILE_EXTENSION;
+
+
+/**
+ * Implementation of {@link InvertedIndexCreator} that uses off-heap memory.
+ * We use 2 passes to create the range index.
+ * 
+ *
+ *   
+ * A
+ *   
+ *   
+ * In the first pass (adding values phase), when add() method is called, 
store the raw values into the forward index
+ * value buffer (for multi-valued column also store number of values for 
each docId into forward index length
+ * buffer). We also compute the inverted index length for each dictId 
while adding values.
+ *   
+ *   
+ * In the second pass (processing values phase), when seal() method is 
called, all the dictIds should already been
+ * added. We first reorder the values into the inverted index buffers by 
going over the dictIds in forward index
+ * value buffer (for multi-valued column we also need forward index length 
buffer to get the docId for each dictId).
+ * Once we have the inverted index buffers, we simply go over them and 
create the bitmap for each dictId and
+ * serialize them into a file.
+ *   
+ * 
+ * Based on the number of values we need to store, we use direct memory or 
MMap file to allocate the buffer.
+ */
+public final class RangeIndexCreator implements InvertedIndexCreator {
+
+  private static final int RANGE_INDEX_VERSION = 1;
+
+  // Use MMapBuffer if the value buffer size is larger than 2G
+  private static final int NUM_VALUES_THRESHOLD_FOR_MMAP_BUFFER = 500_000_000;
+  private static final int DEFAULT_NUM_RANGES = 20;
+
+  private static final String FORWARD_INDEX_VALUE_BUFFER_SUFFIX = 
".fwd.idx.val.buf";
+  private static final String DOC_ID_VALUE_BUFFER_SUFFIX = ".doc.id.val.buf";
+
+  private final File _invertedIndexFile;
+  private final File _forwardIndexValueBufferFile;
+  private final File _docIdBufferFileForSorting;
+  private final int _numValues;
+  private final boolean _useMMapBuffer;
+
+  // Forward index buffers (from docId to dictId)
+  private int _nextDocId;
+  private PinotDataBuffer _forwardIndexValueBuffer;
+  private NumberValueBuffer _valueBuffer;

Review comment:
   Let's please rename it indexKeyBuffer? 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5240: Supporting range queries using indexes

2020-05-09 Thread GitBox


siddharthteotia commented on a change in pull request #5240:
URL: https://github.com/apache/incubator-pinot/pull/5240#discussion_r422562477



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java
##
@@ -0,0 +1,471 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.segment.creator.impl.inv;
+
+import com.google.common.base.Preconditions;
+import it.unimi.dsi.fastutil.Arrays;
+import it.unimi.dsi.fastutil.Swapper;
+import it.unimi.dsi.fastutil.ints.IntComparator;
+import java.io.BufferedOutputStream;
+import java.io.Closeable;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.core.common.Constants;
+import org.apache.pinot.core.query.utils.Pair;
+import org.apache.pinot.core.segment.creator.InvertedIndexCreator;
+import org.apache.pinot.core.segment.memory.PinotDataBuffer;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.MetricFieldSpec;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+import static com.google.common.base.Charsets.UTF_8;
+import static 
org.apache.pinot.core.segment.creator.impl.V1Constants.Indexes.BITMAP_RANGE_INDEX_FILE_EXTENSION;
+
+
+/**
+ * Implementation of {@link InvertedIndexCreator} that uses off-heap memory.
+ * We use 2 passes to create the range index.
+ * 
+ *
+ *   
+ * A
+ *   
+ *   
+ * In the first pass (adding values phase), when add() method is called, 
store the raw values into the forward index
+ * value buffer (for multi-valued column also store number of values for 
each docId into forward index length
+ * buffer). We also compute the inverted index length for each dictId 
while adding values.
+ *   
+ *   
+ * In the second pass (processing values phase), when seal() method is 
called, all the dictIds should already been
+ * added. We first reorder the values into the inverted index buffers by 
going over the dictIds in forward index
+ * value buffer (for multi-valued column we also need forward index length 
buffer to get the docId for each dictId).
+ * Once we have the inverted index buffers, we simply go over them and 
create the bitmap for each dictId and
+ * serialize them into a file.
+ *   
+ * 
+ * Based on the number of values we need to store, we use direct memory or 
MMap file to allocate the buffer.
+ */
+public final class RangeIndexCreator implements InvertedIndexCreator {
+
+  private static final int RANGE_INDEX_VERSION = 1;
+
+  // Use MMapBuffer if the value buffer size is larger than 2G
+  private static final int NUM_VALUES_THRESHOLD_FOR_MMAP_BUFFER = 500_000_000;
+  private static final int DEFAULT_NUM_RANGES = 20;
+
+  private static final String FORWARD_INDEX_VALUE_BUFFER_SUFFIX = 
".fwd.idx.val.buf";
+  private static final String DOC_ID_VALUE_BUFFER_SUFFIX = ".doc.id.val.buf";
+
+  private final File _invertedIndexFile;
+  private final File _forwardIndexValueBufferFile;
+  private final File _docIdBufferFileForSorting;
+  private final int _numValues;
+  private final boolean _useMMapBuffer;
+
+  // Forward index buffers (from docId to dictId)
+  private int _nextDocId;
+  private PinotDataBuffer _forwardIndexValueBuffer;
+  private NumberValueBuffer _valueBuffer;
+  //for sorting
+  private PinotDataBuffer _docIdValueBuffer;
+  private IntValueBuffer _docIdBuffer;

Review comment:
   This is very confusing. _docIdBuffer stores docIds. What does 
_docIdValueBuffer store?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5240: Supporting range queries using indexes

2020-05-09 Thread GitBox


siddharthteotia commented on a change in pull request #5240:
URL: https://github.com/apache/incubator-pinot/pull/5240#discussion_r422561957



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java
##
@@ -0,0 +1,471 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.segment.creator.impl.inv;
+
+import com.google.common.base.Preconditions;
+import it.unimi.dsi.fastutil.Arrays;
+import it.unimi.dsi.fastutil.Swapper;
+import it.unimi.dsi.fastutil.ints.IntComparator;
+import java.io.BufferedOutputStream;
+import java.io.Closeable;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.core.common.Constants;
+import org.apache.pinot.core.query.utils.Pair;
+import org.apache.pinot.core.segment.creator.InvertedIndexCreator;
+import org.apache.pinot.core.segment.memory.PinotDataBuffer;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.data.MetricFieldSpec;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+import static com.google.common.base.Charsets.UTF_8;
+import static 
org.apache.pinot.core.segment.creator.impl.V1Constants.Indexes.BITMAP_RANGE_INDEX_FILE_EXTENSION;
+
+
+/**
+ * Implementation of {@link InvertedIndexCreator} that uses off-heap memory.
+ * We use 2 passes to create the range index.
+ * 
+ *
+ *   
+ * A
+ *   
+ *   
+ * In the first pass (adding values phase), when add() method is called, 
store the raw values into the forward index
+ * value buffer (for multi-valued column also store number of values for 
each docId into forward index length
+ * buffer). We also compute the inverted index length for each dictId 
while adding values.
+ *   
+ *   
+ * In the second pass (processing values phase), when seal() method is 
called, all the dictIds should already been
+ * added. We first reorder the values into the inverted index buffers by 
going over the dictIds in forward index
+ * value buffer (for multi-valued column we also need forward index length 
buffer to get the docId for each dictId).
+ * Once we have the inverted index buffers, we simply go over them and 
create the bitmap for each dictId and
+ * serialize them into a file.
+ *   
+ * 
+ * Based on the number of values we need to store, we use direct memory or 
MMap file to allocate the buffer.
+ */
+public final class RangeIndexCreator implements InvertedIndexCreator {
+
+  private static final int RANGE_INDEX_VERSION = 1;
+
+  // Use MMapBuffer if the value buffer size is larger than 2G
+  private static final int NUM_VALUES_THRESHOLD_FOR_MMAP_BUFFER = 500_000_000;
+  private static final int DEFAULT_NUM_RANGES = 20;
+
+  private static final String FORWARD_INDEX_VALUE_BUFFER_SUFFIX = 
".fwd.idx.val.buf";
+  private static final String DOC_ID_VALUE_BUFFER_SUFFIX = ".doc.id.val.buf";
+
+  private final File _invertedIndexFile;
+  private final File _forwardIndexValueBufferFile;
+  private final File _docIdBufferFileForSorting;
+  private final int _numValues;
+  private final boolean _useMMapBuffer;
+
+  // Forward index buffers (from docId to dictId)

Review comment:
   Let's not use the term forward index inside range index creator. Can we 
please call it keyBuffer. The key that are stored could be dictIds or raw 
values.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (af5a901 -> 9246d7e)

2020-05-09 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from af5a901  Optimize RealtimeDictionaryBasedRangePredicateEvaluator by 
not scanning the dictionary when cardinality is high (#5331)
 add 9246d7e  Update package.json to resolve sharp version to fix the local 
build issue. (#5215)

No new revisions were added by this update.

Summary of changes:
 website/package.json | 3 +++
 1 file changed, 3 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] reallocf commented on a change in pull request #5326: Add toDateTime DateTimeFunction (#5313)

2020-05-09 Thread GitBox


reallocf commented on a change in pull request #5326:
URL: https://github.com/apache/incubator-pinot/pull/5326#discussion_r422550932



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/function/DefaultFunctionEvaluator.java
##
@@ -50,11 +50,16 @@
 
   public DefaultFunctionEvaluator(String expression)
   throws Exception {
+this(expression, FunctionRegistryFactory.getFunctionRegistry());

Review comment:
   I think I'm creating individual `DefaultFunctionRegistry`s in the right 
place now, but still not 100% sure. Let me know - thanks! :)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] reallocf commented on a change in pull request #5326: Add toDateTime DateTimeFunction (#5313)

2020-05-09 Thread GitBox


reallocf commented on a change in pull request #5326:
URL: https://github.com/apache/incubator-pinot/pull/5326#discussion_r422551022



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/function/DateTimeFunctions.java
##
@@ -40,4 +47,14 @@ static Long toEpochHours(Long millis) {
   static Long toEpochMinutes(Long millis, String bucket) {
 return TimeUnit.MILLISECONDS.toMinutes(millis) / Integer.parseInt(bucket);
   }
+
+  DateTime toDateTime(String dateTimeString, String pattern) {
+if (!_dateTimeFormatterMap.containsKey(pattern)) {
+  _dateTimeFormatterMap.put(pattern, DateTimeFormat.forPattern(pattern));
+}
+
+DateTimeFormatter dateTimeFormatter = _dateTimeFormatterMap.get(pattern);
+
+return dateTimeFormatter.parseDateTime(dateTimeString);

Review comment:
   Okay, adjusted this so toDateTime takes in a Long millis and returns a 
String formattedDateTime and fromDateTime takes in a String formattedDateTime 
and outputs a Long millis - does that seem right to you?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] reallocf commented on a change in pull request #5326: Add toDateTime DateTimeFunction (#5313)

2020-05-09 Thread GitBox


reallocf commented on a change in pull request #5326:
URL: https://github.com/apache/incubator-pinot/pull/5326#discussion_r422550841



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/function/DateTimeFunctions.java
##
@@ -27,6 +32,8 @@
  */
 public class DateTimeFunctions {
 
+  private Map _dateTimeFormatterMap = new 
HashMap<>();

Review comment:
   Good catch! Yeah, I'm still new to the concurrency model here - thanks 
for keeping an eye out for this! :)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5345: Create PULL_REQUEST_TEMPLATE.md

2020-05-09 Thread GitBox


kishoreg commented on a change in pull request #5345:
URL: https://github.com/apache/incubator-pinot/pull/5345#discussion_r422304502



##
File path: .github/PULL_REQUEST_TEMPLATE.md
##
@@ -0,0 +1,13 @@
+## Upgrade Notes
+Does this PR prevent a zero down-time upgrade? (Assume upgrade order: 
Controller, Broker, Server, Minion)
+* [ ] Yes (Please label as **backward-incompat**)
+
+Does this PR fix a zero-downtime upgrade introduced earlier?

Review comment:
   whats the purpose of this question?

##
File path: .github/PULL_REQUEST_TEMPLATE.md
##
@@ -0,0 +1,13 @@
+## Upgrade Notes

Review comment:
   Please add a description section. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] reallocf commented on a change in pull request #5326: Add toDateTime DateTimeFunction (#5313)

2020-05-09 Thread GitBox


reallocf commented on a change in pull request #5326:
URL: https://github.com/apache/incubator-pinot/pull/5326#discussion_r422548708



##
File path: 
pinot-core/src/test/java/org/apache/pinot/core/data/function/DefaultFunctionEvaluatorTest.java
##
@@ -54,49 +54,73 @@ public void testExpressionWithColumn()
   @Test
   public void testExpressionWithConstant()
   throws Exception {
-FunctionRegistry
-
.registerStaticFunction(MyFunc.class.getDeclaredMethod("daysSinceEpoch", 
String.class, String.class));
+MyFunc myFunc = new MyFunc();
+FunctionRegistry functionRegistry = new FunctionRegistry(
+
Lists.newArrayList(myFunc.getClass().getDeclaredMethod("daysSinceEpoch", 
String.class, String.class)));
 String input = "1980-01-01";
 String format = "-MM-dd";
 String expression = String.format("daysSinceEpoch('%s', '%s')", input, 
format);
-DefaultFunctionEvaluator evaluator = new 
DefaultFunctionEvaluator(expression);
+DefaultFunctionEvaluator evaluator = new 
DefaultFunctionEvaluator(expression, functionRegistry);
 Assert.assertTrue(evaluator.getArguments().isEmpty());
 GenericRow row = new GenericRow();
 Object result = evaluator.evaluate(row);
-Assert.assertEquals(result, MyFunc.daysSinceEpoch(input, format));
+Assert.assertEquals(result, myFunc.daysSinceEpoch(input, format));
   }
 
   @Test
   public void testMultiFunctionExpression()
   throws Exception {
-
FunctionRegistry.registerStaticFunction(MyFunc.class.getDeclaredMethod("reverseString",
 String.class));
-FunctionRegistry
-
.registerStaticFunction(MyFunc.class.getDeclaredMethod("daysSinceEpoch", 
String.class, String.class));
+MyFunc myFunc = new MyFunc();
+FunctionRegistry functionRegistry = new FunctionRegistry(Lists
+.newArrayList(myFunc.getClass().getDeclaredMethod("reverseString", 
String.class),
+myFunc.getClass().getDeclaredMethod("daysSinceEpoch", 
String.class, String.class)));
 String input = "1980-01-01";
-String reversedInput = MyFunc.reverseString(input);
+String reversedInput = myFunc.reverseString(input);
 String format = "-MM-dd";
 String expression = String.format("daysSinceEpoch(reverseString('%s'), 
'%s')", reversedInput, format);
-DefaultFunctionEvaluator evaluator = new 
DefaultFunctionEvaluator(expression);
+DefaultFunctionEvaluator evaluator = new 
DefaultFunctionEvaluator(expression, functionRegistry);
 Assert.assertTrue(evaluator.getArguments().isEmpty());
 GenericRow row = new GenericRow();
 Object result = evaluator.evaluate(row);
-Assert.assertEquals(result, MyFunc.daysSinceEpoch(input, format));
+Assert.assertEquals(result, myFunc.daysSinceEpoch(input, format));
   }
 
-  private static class MyFunc {
-static String reverseString(String input) {
-  return new StringBuilder(input).reverse().toString();
-}
+  @Test
+  public void testStateSharedBetweenRowsForExecution()
+  throws Exception {

Review comment:
   This test basically just confirms that the internal state of the 
FunctionRegistry is shared between each row. I agree with the current 
implementation it's fairly self-explanatory, but you can imagine an 
implementation where the internals of the FunctionRegistry are different for 
each row. This is to make sure we don't somehow regress to that, because then 
we'd see a big performance hit for creating a SDF for each row.
   
   But can definitely remove if it just feels like clutter to you ☺️ 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5240: Supporting range queries using indexes

2020-05-09 Thread GitBox


kishoreg commented on a change in pull request #5240:
URL: https://github.com/apache/incubator-pinot/pull/5240#discussion_r422547267



##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/IndexingConfig.java
##
@@ -27,6 +27,7 @@
 
 public class IndexingConfig extends BaseJsonConfig {
   private List _invertedIndexColumns;
+  private List _rangeIndexColumns;

Review comment:
   I agree. will have to do this as part of another PR. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] kishoreg commented on pull request #5307: Admin UI

2020-05-09 Thread GitBox


kishoreg commented on pull request #5307:
URL: https://github.com/apache/incubator-pinot/pull/5307#issuecomment-626232744


   @xxsacxx lets resolve conflicts?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] damianoporta opened a new issue #5356: Is DataType.BYTES control in aggregate() aggregateGroupBySV() aggregateGroupByMV() needed?

2020-05-09 Thread GitBox


damianoporta opened a new issue #5356:
URL: https://github.com/apache/incubator-pinot/issues/5356


   Hello, in different aggregators like the following:
   - 
https://github.com/apache/incubator-pinot/blob/master/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AvgAggregationFunction.java#L100
   
   - 
https://github.com/apache/incubator-pinot/blob/master/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/MinMaxRangeAggregationFunction.java#L104
   
   I see a "pseudo duplication" of the code because of the column DataType. 
But, is this really needed?
   I think that serialization/deserialization is done in 
**aggregationResultHolder** object only, no?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] KKcorps commented on a change in pull request #5293: Adding support for Protobuf input format

2020-05-09 Thread GitBox


KKcorps commented on a change in pull request #5293:
URL: https://github.com/apache/incubator-pinot/pull/5293#discussion_r422464033



##
File path: 
pinot-plugins/pinot-input-format/pinot-protobuf/src/main/java/org/apache/pinot/plugin/inputformat/protobuf/ProtoBufRecordReaderConfig.java
##
@@ -0,0 +1,34 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.plugin.inputformat.protobuf;
+
+import org.apache.pinot.spi.data.readers.RecordReaderConfig;
+
+
+public class ProtoBufRecordReaderConfig implements RecordReaderConfig {
+  private String _descriptorFile;
+
+  public String getDescriptorFile() {

Review comment:
   done

##
File path: 
pinot-plugins/pinot-input-format/pinot-protobuf/src/main/java/org/apache/pinot/plugin/inputformat/protobuf/ProtoBufRecordReader.java
##
@@ -0,0 +1,136 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.plugin.inputformat.protobuf;
+
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DynamicMessage;
+import com.google.protobuf.Message;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Set;
+import javax.annotation.Nullable;
+import org.apache.pinot.spi.data.Schema;
+import org.apache.pinot.spi.data.readers.GenericRow;
+import org.apache.pinot.spi.data.readers.RecordReader;
+import org.apache.pinot.spi.data.readers.RecordReaderConfig;
+import org.apache.pinot.spi.data.readers.RecordReaderUtils;
+import org.apache.pinot.spi.utils.SchemaFieldExtractorUtils;
+
+
+public class ProtoBufRecordReader implements RecordReader {
+  private File _dataFile;
+  private Schema _schema;
+  private ProtoBufRecordExtractor _recordExtractor;
+
+  private InputStream _inputStream;
+  private boolean _hasNext;
+  private Descriptors.Descriptor _descriptor;
+
+  private boolean hasMoreToRead()
+  throws IOException {
+_inputStream.mark(1);
+int nextByte = _inputStream.read();
+_inputStream.reset();
+return nextByte != -1;
+  }
+
+  private void init()
+  throws IOException {
+_inputStream = RecordReaderUtils.getBufferedInputStream(_dataFile);
+try {
+  _hasNext = hasMoreToRead();
+} catch (Exception e) {
+  _inputStream.close();
+  throw e;
+}
+  }
+
+  @Override
+  public void init(File dataFile, Schema schema, @Nullable RecordReaderConfig 
recordReaderConfig)
+  throws IOException {
+_dataFile = dataFile;
+_schema = schema;
+Set sourceFields = SchemaFieldExtractorUtils.extract(schema);
+ProtoBufRecordExtractorConfig recordExtractorConfig = new 
ProtoBufRecordExtractorConfig();
+ProtoBufRecordReaderConfig protoBufRecordReaderConfig = 
(ProtoBufRecordReaderConfig) recordReaderConfig;
+String descriptorFile = protoBufRecordReaderConfig.getDescriptorFile();
+FileInputStream fin = new FileInputStream(descriptorFile);

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: comm

[GitHub] [incubator-pinot] KKcorps commented on pull request #5293: Adding support for Protobuf input format

2020-05-09 Thread GitBox


KKcorps commented on pull request #5293:
URL: https://github.com/apache/incubator-pinot/pull/5293#issuecomment-626121988


   @kishoreg Yes, it is complete from my end



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org