#general


@kautsshukla: Hi All, I'm getting `"_error": "Permission is denied for access type 'READ' to the endpoint"` while trying to connect through the pinot-jdbc 0.8.0 client using a username and password.
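  That error comes from Pinot's access control check, which suggests the authenticated user lacks READ permission rather than the credentials failing outright. A minimal connection sketch for reference, assuming the 0.8.0 driver forwards the standard JDBC user/password properties as basic-auth credentials (that property handling, plus the host and table names, are assumptions, not confirmed by the thread):
```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PinotJdbcExample {
  public static void main(String[] args) throws Exception {
    // Register the Pinot JDBC driver and point the URL at the controller.
    Class.forName("org.apache.pinot.client.PinotDriver");
    String url = "jdbc:pinot://localhost:9000";
    // user/password handling is assumed; also verify this user has READ
    // access in the broker/controller ACL config.
    try (Connection conn = DriverManager.getConnection(url, "myUser", "myPassword");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT count(*) FROM myTable")) {
      while (rs.next()) {
        System.out.println(rs.getLong(1));
      }
    }
  }
}
```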
@julien.picard: @julien.picard has joined the channel
@ken: Not sure what channel is best for this, but I was looking into Pinot-related issues (from other projects) on the Apache Jira site, and noticed that there’s a Pinot project with a handful of old issues. See
  @ken: Wondering if it’s possible to add a single open issue that says “file issues using GitHub at xxx”, and migrate/remove all the other issues, and prevent new issues from being created?
  @ken: Right now if someone stumbles on this, it looks like Pinot is a dead project :disappointed:
  @mayanks: Thanks @ken, agree it would be a good idea to point folks to GH issues.
  @ken: I don’t have sufficient Jira-fu to do the above, but it should be pretty easy (other than determining, for each real issue that’s still open, whether it’s been replicated to GitHub)
@shantanoo.sinha: @shantanoo.sinha has joined the channel

#random


@julien.picard: @julien.picard has joined the channel
@shantanoo.sinha: @shantanoo.sinha has joined the channel

#feat-upsert


@elon.azoulay: @elon.azoulay has joined the channel

#pinot-helix


@elon.azoulay: @elon.azoulay has joined the channel

#group-by-refactor


@elon.azoulay: @elon.azoulay has joined the channel

#inconsistent-segment


@elon.azoulay: @elon.azoulay has joined the channel

#minion-star-tree


@elon.azoulay: @elon.azoulay has joined the channel

#troubleshooting


@yash.agarwal: We have a Pinot cluster, and some of our users are running very heavy queries, which results in ```java.lang.OutOfMemoryError: Java heap space``` That much is expected, but as a result the server instance becomes unhealthy, i.e. the live instance config becomes ```{ "_code": 404, "_error": "ZKPath /PinotCluster/LIVEINSTANCES/Server_node_8098 does not exist:" }``` How can we solve this?
  @g.kishore: you can set the maxQueryLimit and maxGroupBy limits
  @yash.agarwal: Sure, but even then there are cases where the limits are quite small yet the query is doing a count distinct on a large column.
  @yash.agarwal: Ideally we prevent all these queries in our middle layer and convert them to optimized versions, but just in case, we want to keep our nodes from going down.
  @g.kishore: how large is it?
  @g.kishore: did you try partitionedDistinct?
  @yash.agarwal: Yes, we have that implemented, but these are very fringe cases, hence trying to understand how to keep the node from going down.
  @yash.agarwal: Our memory settings are -Xms4G -Xmx8G on 16G nodes. Should we lower our Xmx even further?
  @g.kishore: Two options: • increase the memory to ensure that the distinct values fit in memory, and add configuration to limit the max distinct values, or • enhance the distinct operator to start using HLL when the number of unique values goes beyond a certain size; this will require a code change
  @yash.agarwal: I am more worried about making sure the node is able to fix itself after such a query.
  @yash.agarwal: Currently the only option for us is to restart the server instance
  @g.kishore: Can you please file an issue?
  @yash.agarwal: I think there is a similar issue already created, hence not creating a duplicate.
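  Two notes on the recovery concern, neither from the thread: on the query side, Pinot already ships an approximate alternative, e.g. `SELECT DISTINCTCOUNTHLL(col) FROM myTable`, which bounds memory for high-cardinality distinct counts; on the process side, standard HotSpot flags can make the server exit cleanly on heap exhaustion so a supervisor (systemd/Kubernetes) restarts it instead of leaving a wedged live instance. A sketch (sizing values are illustrative):
```
-Xms4G -Xmx8G
-XX:+ExitOnOutOfMemoryError        # exit instead of limping along after OOM
-XX:+HeapDumpOnOutOfMemoryError    # keep a dump to identify the offending query
-XX:HeapDumpPath=/tmp/pinot-server-oom.hprof
```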
@alihaydar.atil: Hello everyone, I am using version 0.8.0. When i run the RealtimeProvisioningHelper command below, it gives me an exception. Any idea why it happens? I have put one realtime table segment in sampleCompletedSegmentDir directory. Command: ```root@pinot-controller-0:/opt/pinot# bin/pinot-admin.sh RealtimeProvisioningHelper -tableConfigFile /opt/pinot/denizTableConfig.json -numPartitions 1 -numHosts 2 -numHours 6,12,18,24 -sampleCompletedSegmentDir /opt/pinot/samplesegment/realtime/ -ingestionRate 100``` Exception: ```Executing command: RealtimeProvisioningHelper -tableConfigFile /opt/pinot/denizTableConfig.json -numPartitions 1 -pushFrequency null -numHosts 2 -numHours 6,12,18,24 -sampleCompletedSegmentDir /opt/pinot/samplesegment/realtime/ -ingestionRate 100 -maxUsableHostMemory 48G -retentionHours 0 Exception caught: java.lang.RuntimeException: Caught exception when reading segment index dir at org.apache.pinot.controller.recommender.realtime.provisioning.MemoryEstimator.<init>(MemoryEstimator.java:117) ~[pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at org.apache.pinot.tools.admin.command.RealtimeProvisioningHelperCommand.execute(RealtimeProvisioningHelperCommand.java:268) ~[pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:169) [pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:189) [pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] Caused by: java.lang.NullPointerException: Cannot find segment metadata file under directory: /opt/pinot/samplesegment/realtime at shaded.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:864) ~[pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at org.apache.pinot.segment.spi.index.metadata.SegmentMetadataImpl.getPropertiesConfiguration(SegmentMetadataImpl.java:144) ~[pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at org.apache.pinot.segment.spi.index.metadata.SegmentMetadataImpl.<init>(SegmentMetadataImpl.java:117) ~[pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] at org.apache.pinot.controller.recommender.realtime.provisioning.MemoryEstimator.<init>(MemoryEstimator.java:115) ~[pinot-all-0.9.0-SNAPSHOT-jar-with-dependencies.jar:0.9.0-SNAPSHOT-517a0dcea48a7dcb8616addc403c20e0fc23484a] ... 
3 more``` realtime table config file [-tableConfigFile /opt/pinot/denizTableConfig.json] ```{ "tableName": "denizhybrid", "tableType": "REALTIME", "segmentsConfig": { "timeColumnName": "messageTime", "timeType": "MILLISECONDS", "schemaName": "deniz", "replicasPerPartition": "1", "retentionTimeUnit":"DAYS", "retentionTimeValue":"2" }, "tenants": {}, "fieldConfigList": [ { "name": "location_st_point", "encodingType":"RAW", "indexType":"H3", "properties": { "resolutions": "5" } } ], "tableIndexConfig": { "loadMode": "MMAP", "rangeIndexColumns": [ "latitude", "longitude" ], "noDictionaryColumns": [ "location_st_point" ], "streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type": "lowlevel", "stream.kafka.topic.name": "kafkadeniztest2", "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.broker.list": "kafka:9092", "realtime.segment.flush.threshold.size": "0", "realtime.segment.flush.threshold.time": "24h", "realtime.segment.flush.desired.size": "50M", "stream.kafka.consumer.prop.auto.offset.reset": "smallest" } }, "query": { "timeoutMs": 60000 }, "metadata": { "customConfigs": {} }, "task": { "taskTypeConfigsMap": { "RealtimeToOfflineSegmentsTask": { "bucketTimePeriod":"6h", "bufferTimePeriod":"9h", "maxNumRecordsPerSegment":"1000000" } } } }``` Thanks in Advance.
  @mayanks: Can you list files inside of segment dir that you provided?
  @alihaydar.atil: there is only one segment file inside -sampleCompletedSegmentDir, which is named denizhybrid__0__23__20211114T2333Z; it is around 30MB. I also tried putting only one offline segment file named denizhybrid_1636480800506_1636502392443_0 in that folder but got the same exception
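  The `Cannot find segment metadata file under directory` NPE, plus the fact that the directory holds a single 30MB file, suggests the sample segment is still tarred; judging by the stack trace, the helper treats -sampleCompletedSegmentDir itself as a segment directory and looks for metadata.properties under it. A sketch of the likely fix (the tar format and paths are assumptions):
```
# unpack the completed segment, then point the helper at the unpacked directory
mkdir -p /opt/pinot/samplesegment/untarred
tar -xzf /opt/pinot/samplesegment/realtime/denizhybrid__0__23__20211114T2333Z \
    -C /opt/pinot/samplesegment/untarred
bin/pinot-admin.sh RealtimeProvisioningHelper \
  -tableConfigFile /opt/pinot/denizTableConfig.json \
  -numPartitions 1 -numHosts 2 -numHours 6,12,18,24 \
  -sampleCompletedSegmentDir /opt/pinot/samplesegment/untarred/denizhybrid__0__23__20211114T2333Z \
  -ingestionRate 100
```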
@kchavda: Hi All, had a few questions about using `Pinot managed offline flows`. Any help would be greatly appreciated! 1. Does the OFFLINE table config need to have the `RealtimeToOfflineSegmentsTask` match the one added to the REALTIME table config? 2. I'm seeing this `TASK_ERROR to DROPPED` in the minion log. What does this signify? ```20 START:INVOKE /PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932 type: CALLBACK Resubscribe change listener to path: /PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES, for listener: org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932, watchChild: false Subscribing changes listener to path: /PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932 Subscribing child change listener to path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES Subscribing to path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES took:0 The latency of message 6a8ac921-3913-43e8-a777-b15c16185245 is 7 ms Scheduling message 6a8ac921-3913-43e8-a777-b15c16185245: TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0, TASK_ERROR->DROPPED Submit task: 6a8ac921-3913-43e8-a777-b15c16185245 to pool: java.util.concurrent.ThreadPoolExecutor@67024f54[Running, pool size = 40, active threads = 0, queued tasks = 0, completed tasks = 221] Message: 6a8ac921-3913-43e8-a777-b15c16185245 handling task scheduled 20 END:INVOKE /PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932 type: CALLBACK Took: 8ms handling task: 6a8ac921-3913-43e8-a777-b15c16185245 begin, at: 1636993355435 handling message: 6a8ac921-3913-43e8-a777-b15c16185245 transit TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945.TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0|[] from:TASK_ERROR to:DROPPED, relayedFrom: null Merging with delta list, recordId = TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945 other:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945 Instance Minion_172.19.0.6_9514, partition TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0 received state transition from TASK_ERROR to DROPPED on session 1005c465f540008, message id: 6a8ac921-3913-43e8-a777-b15c16185245 Merging with delta list, recordId = TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945 other:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945 Removed /PinotCluster/INSTANCES/Minion_172.19.0.6_9514/CURRENTSTATES/1005c465f540008/TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945 Message 6a8ac921-3913-43e8-a777-b15c16185245 completed. Delete message 6a8ac921-3913-43e8-a777-b15c16185245 from zk! message finished: 6a8ac921-3913-43e8-a777-b15c16185245, took 14 Message: 6a8ac921-3913-43e8-a777-b15c16185245 (parent: null) handling task for TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945:TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1636993325945_0 completed at: 1636993355449, results: true. 
FrameworkTime: 1 ms; HandlerTime: 13 ms. Subscribing changes listener to path: /PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@157c6932 Subscribing child change listener to path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES Subscribing to path:/PinotCluster/INSTANCES/Minion_172.19.0.6_9514/MESSAGES took:0``` 3. The tasks/scheduler/information API endpoint returns "Task scheduler is disabled". I've added entry to controller config `"controller.task.frequencyInSeconds": 3600` is there some other setting I need to configure? 4. The tasks/task/taskname/state is giving a `500 Index 1 out of bounds for length 1"` but tasks/tasktype/taskstates shows completed. I'm not seeing any segments added to my OFFLINE table though. Any idea on what's missing?
  @npawar: 1. No need to set anything in the offline table 2. Looks like the task had some exceptions; there should be some more logs about why the task failed and went into TASK_ERROR state (and then from TASK_ERROR to DROPPED). Any exception/error logs from before what you’ve pasted? 3. Not sure why it says disabled. As long as you’re seeing the controller create tasks in the logs, and the minion pick up the tasks, you’re good. If you’re not seeing that, try adding this `controller.task.scheduler.enabled: true` to the controller config 4. let’s see more logs from controller/minion?
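  For reference, the two controller properties mentioned in this thread, in controller.conf form (a sketch; names and values come straight from the messages above):
```
controller.task.scheduler.enabled=true
controller.task.frequencyInSeconds=3600
```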
  @kchavda: 1 :white_check_mark: 2 - I didn't go far enough back in the log. There is an exception ```Caught exception while fetching segment from: to: /tmp/PinotMinion/data/RealtimeToOfflineSegmentsTask/tmp-106dfc56-8986-48a1-98cb-c97c7b2bc767/tarredSegmentFile_0 java.lang.IllegalStateException: PinotFS for scheme: s3 has not been initialized``` I am using S3 for deep storage. I do see segments being written there. I'm guessing I need to pass in `env` values for the access key and secret key? 3/4 - Maybe fixing the above will fix these? @npawar
  @npawar: ah.. you need to add deep store properties to your minion components. you must’ve added some deep store configs to controller/server?
  @kchavda: Yes, I added the configs to controller/server and the segments are being written to S3. Are minion specific configs documented?
  @npawar: no.. let me add it
  @npawar: would love some detailed feedback about the docs from you after this :stuck_out_tongue:
  @kchavda: Great! Thank you Neha!
  @kchavda: For sure :slightly_smiling_face:
  @kchavda: I'm about to watch your presentation on this topic from back in July!
  @npawar:
  @npawar: added it for all the filesystems. it’s the same as server/controller, except the prefix is shorter
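  A sketch of what that minion-side S3 deep store config looks like, mirroring the documented controller/server properties under the `pinot.minion.` prefix (the region value is a placeholder; credentials can come from the usual AWS environment variables or instance roles):
```
pinot.minion.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.minion.storage.factory.s3.region=us-east-1
pinot.minion.segment.fetcher.protocols=file,http,s3
pinot.minion.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```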
  @kchavda: Great! Thank you!
  @kchavda: Comparing with the config for the controller, for my working version I had to add the following: ```
controller.helix.cluster.name=PinotCluster
controller.zk.str=pinot-zookeeper:2181
controller.host=
controller.port=9000
``` which is referenced in the tutorial
  @kchavda: For server.conf, the following (present in the tutorial): ```
pinot.server.netty.port=8098
pinot.server.adminapi.port=8097
pinot.server.instance.dataDir=/tmp/pinot-tmp/server/index
pinot.server.instance.segmentTarDir=/tmp/pinot-tmp/server/segmentTars
```
  @kchavda: So I restarted the Minion and that resolved the S3 error
  @kchavda: However, I'm now seeing errors on the controller/server, and the OFFLINE table is in a bad status.
  @kchavda: Server log excerpt ```2021/11/15 20:59:35.301 ERROR [SegmentFetcherAndLoader] [HelixTaskExecutor-message_handle_thread] Attempts exceeded when downloading segment: consolidations_1550859587018_1550861765735_0 for table: consolidations_OFFLINE from: to: /tmp/PinotServer/segmentTar/consolidations_OFFLINE/tmp-consolidations_1550859587018_1550861765735_0-fb7cfa39-13a5-4f3e-813b-f4e36a505290/consolidations_1550859587018_1550861765735_0.tar.gz 2021/11/15 20:59:35.302 ERROR [SegmentFetcherAndLoader] [HelixTaskExecutor-message_handle_thread] Cannot load segment : consolidations_1550859587018_1550861765735_0 for table consolidations_OFFLINE org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed after 3 attempts at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:61) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.BaseSegmentFetcher.fetchSegmentToLocal(BaseSegmentFetcher.java:72) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocalInternal(SegmentFetcherFactory.java:146) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocal(SegmentFetcherFactory.java:141) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.downloadSegmentToLocal(SegmentFetcherAndLoader.java:198) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.addOrReplaceOfflineSegment(SegmentFetcherAndLoader.java:154) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:166) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?] at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?] at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:829) [?:?] 2021/11/15 20:59:35.302 ERROR [SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel] [HelixTaskExecutor-message_handle_thread] Caught exception in state transition from OFFLINE -> ONLINE for resource: consolidations_OFFLINE, partition: consolidations_1550859587018_1550861765735_0 org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed after 3 attempts at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:61) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.BaseSegmentFetcher.fetchSegmentToLocal(BaseSegmentFetcher.java:72) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocalInternal(SegmentFetcherFactory.java:146) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocal(SegmentFetcherFactory.java:141) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.downloadSegmentToLocal(SegmentFetcherAndLoader.java:198) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.addOrReplaceOfflineSegment(SegmentFetcherAndLoader.java:154) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:166) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?] at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?] at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:829) [?:?] 
2021/11/15 20:59:35.303 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Exception while executing a state transition task consolidations_1550859587018_1550861765735_0 java.lang.reflect.InvocationTargetException: null at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?] at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?] at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:829) [?:?] Caused by: org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed after 3 attempts at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:61) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.BaseSegmentFetcher.fetchSegmentToLocal(BaseSegmentFetcher.java:72) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocalInternal(SegmentFetcherFactory.java:146) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.common.utils.fetcher.SegmentFetcherFactory.fetchSegmentToLocal(SegmentFetcherFactory.java:141) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.downloadSegmentToLocal(SegmentFetcherAndLoader.java:198) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.addOrReplaceOfflineSegment(SegmentFetcherAndLoader.java:154) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:166) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808] ... 12 more 2021/11/15 20:59:35.312 ERROR [StateModel] [HelixTaskExecutor-message_handle_thread] Default rollback method invoked on error. Error Code: ERROR``` I see a new segment on S3 though.
  @kchavda: controller error: ```2021/11/15 20:59:27.429 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(1ee314dc_DEFAULT)] Event 1ee314dc_DEFAULT : Unable to find a next state for resource: profiles_OFFLINE partition: profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:27.448 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(9f672776_DEFAULT)] Event 9f672776_DEFAULT : Unable to find a next state for resource: profiles_OFFLINE partition: profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:35.340 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(2bdf94fa_DEFAULT)] Event 2bdf94fa_DEFAULT : Unable to find a next state for resource: consolidations_OFFLINE partition: consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:35.340 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(2bdf94fa_DEFAULT)] Event 2bdf94fa_DEFAULT : Unable to find a next state for resource: profiles_OFFLINE partition: profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:35.362 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(a26a1dc5_DEFAULT)] Event a26a1dc5_DEFAULT : Unable to find a next state for resource: consolidations_OFFLINE partition: consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:35.363 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(a26a1dc5_DEFAULT)] Event a26a1dc5_DEFAULT : Unable to find a next state for resource: profiles_OFFLINE partition: profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:35.378 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(76e13678_DEFAULT)] Event 76e13678_DEFAULT : Unable to find a next state for resource: consolidations_OFFLINE partition: consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:35.378 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(76e13678_DEFAULT)] Event 76e13678_DEFAULT : Unable to find a next state for resource: profiles_OFFLINE partition: profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 20:59:55.384 ERROR [CompletionServiceHelper] [grizzly-http-server-15] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 20:59:55.432 ERROR [CompletionServiceHelper] [grizzly-http-server-4] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:00:04.680 ERROR [CompletionServiceHelper] [grizzly-http-server-7] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:00:04.730 ERROR [CompletionServiceHelper] [grizzly-http-server-0] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:00:07.529 ERROR [CompletionServiceHelper] [grizzly-http-server-6] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 
21:00:07.578 ERROR [CompletionServiceHelper] [grizzly-http-server-7] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:43.084 ERROR [CompletionServiceHelper] [grizzly-http-server-3] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:43.179 ERROR [CompletionServiceHelper] [grizzly-http-server-6] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:46.233 ERROR [CompletionServiceHelper] [grizzly-http-server-11] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:46.284 ERROR [CompletionServiceHelper] [grizzly-http-server-12] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:01:52.885 ERROR [CompletionServiceHelper] [grizzly-http-server-6] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:14.731 ERROR [CompletionServiceHelper] [grizzly-http-server-11] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:17.036 ERROR [CompletionServiceHelper] [grizzly-http-server-11] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:17.081 ERROR [CompletionServiceHelper] [grizzly-http-server-15] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:20.184 ERROR [CompletionServiceHelper] [grizzly-http-server-4] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:02:20.276 ERROR [CompletionServiceHelper] [grizzly-http-server-6] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:03:44.386 ERROR [CompletionServiceHelper] [grizzly-http-server-15] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:03:44.438 ERROR [CompletionServiceHelper] [grizzly-http-server-10] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:06:56.253 ERROR [CompletionServiceHelper] [grizzly-http-server-12] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:06:56.287 ERROR [CompletionServiceHelper] [grizzly-http-server-1] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:10:43.545 ERROR [CompletionServiceHelper] [grizzly-http-server-9] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:10:43.575 ERROR [CompletionServiceHelper] [grizzly-http-server-13] Server: Server_172.19.0.5_8098 returned error: 404 2021/11/15 21:12:16.096 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(d838849b_DEFAULT)] Event d838849b_DEFAULT : Unable to find a next state for resource: consolidations_OFFLINE partition: consolidations_1550859587018_1550861765735_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE 2021/11/15 21:12:16.097 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-PinotCluster-(d838849b_DEFAULT)] Event d838849b_DEFAULT : Unable to find a next state for resource: profiles_OFFLINE partition: profiles_1413387486771_1413405745431_0 from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:ONLINE```
  @npawar: were any changes made to the controller/server side deep store configs?
  @kchavda: Nope. I just added config to minion and restarted the docker container.
@tony: Based on a thread from a few days ago, I changed our Pinot deployment from 6 controllers to 3. Now I am seeing three controllers as "dead" in Cluster Manager, and I am getting `segments ... unavailable` errors (though I am not sure these two issues are related). 1. How do I get rid of "dead" controllers when I reduce the number of controllers? 2. Could this cause `segment ... unavailable`?
  @mayanks: For 2, could you share the output of debug api (from swagger)?
  @tony: For 2, the segments eventually changed to a good state - I think I had not given the servers enough time to recover after restarting (for something else). So (1) is not related to (2) - but I would still like to fix (1)
  @mayanks: Can you paste the screenshot for 1)?
  @mayanks: Also, is it just a UI issue, or does the ZK browser also show it? If the former, can you help file a GH issue?
  @tony:
  @tony: Not sure where to look in ZK browser
  @mayanks: You can check if there's any reference to the removed controllers in the `CONTROLLER`, `INSTANCES` or `LIVEINSTANCES` nodes. If not, then this may be a UI issue.
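  One way to check from the ZooKeeper CLI rather than the UI browser, as a sketch (the PinotCluster cluster name and ZK address are borrowed from other threads in this digest; adjust to your deployment):
```
zkCli.sh -server pinot-zookeeper:2181 ls /PinotCluster/CONTROLLER
zkCli.sh -server pinot-zookeeper:2181 ls /PinotCluster/INSTANCES
zkCli.sh -server pinot-zookeeper:2181 ls /PinotCluster/LIVEINSTANCES
```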
  @tony: There are references in INSTANCES but not LIVEINSTANCES
  @mayanks: Thanks @tony, could you file a GH issue (with as much info as you can in terms of symptoms and repro steps)?
  @tony: Will do
@elon.azoulay: Hi, we observed that increasing the ZK client timeout in the Pinot ZooKeeper config does not prevent a ZK client timeout from Helix, which is hardcoded. We see these errors when the brokers are under heavy GC pressure, GC pauses, etc.
  @elon.azoulay: ```
package org.apache.helix.manager.zk;

public class ZkClient extends org.apache.helix.manager.zk.zookeeper.ZkClient
    implements HelixZkClient {
  ...
  public static final int DEFAULT_SESSION_TIMEOUT = 30 * 1000;
```
  @elon.azoulay: @jxue @g.kishore
  @elon.azoulay: Would it make sense to make this configurable?
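  For concreteness, a hypothetical sketch of the configurability being proposed (not current Helix code; the property name is invented for illustration):
```java
// Fall back to the existing hardcoded 30s default when no override is set.
public static final int DEFAULT_SESSION_TIMEOUT =
    Integer.parseInt(System.getProperty("helix.zk.session.timeout.ms",
        String.valueOf(30 * 1000)));
```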
  @elon.azoulay: I do agree w @g.kishore’s concern that the longer the timeout the longer an outage is not detected
  @mayanks: Not to side track the conversation @elon.azoulay, but are you seeing 30s GCs?
  @elon.azoulay: Thankfully we didn't, but we did see a spike in GC activity/duration
  @elon.azoulay: Still investigating, could be a readiness probe failing
@julien.picard: @julien.picard has joined the channel
@docchial: @docchial has joined the channel
@sandeep.hadoopadmn: Hi team, Can we join two tables and query?
  @g.kishore: depends on the join type, only lookup join is supported as of now
  @kchavda:
  @sandeep.hadoopadmn: thank you
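  For context on the lookup join mentioned above: it is expressed with Pinot's `LOOKUP` transform against a dimension table that has a primary key, rather than with SQL JOIN syntax. A sketch (table and column names are hypothetical):
```sql
SELECT orderId,
       LOOKUP('customerDim', 'customerName', 'customerId', customerId) AS customerName
FROM orders
LIMIT 10
```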
@shantanoo.sinha: @shantanoo.sinha has joined the channel

#pinot-k8s-operator


@elon.azoulay: @elon.azoulay has joined the channel

#multi-region-setup


@elon.azoulay: @elon.azoulay has joined the channel

#metadata-push-api


@elon.azoulay: @elon.azoulay has joined the channel

#pinot-perf-tuning


@rohit: @rohit has joined the channel

#thirdeye-pinot


@shreya.chakraborty: @shreya.chakraborty has joined the channel
@shreya.chakraborty: Hi everyone :wave: What's the process for pushing a new release to the repo? I don't see any release history/issues.
@rohit: @rohit has joined the channel

#getting-started


@bagi.priyank: hello. i started two pinot clusters with both of them consuming from the same kafka cluster and same topic. one pinot cluster is using inverted index on the same set of fields that the other one uses for star-tree index. so basically two pinot tables where the only difference is that first one uses inverted index while second one uses star-tree index. i created tables at the same time so i am assuming that both start consuming from the kafka topic at the same time. when i issue same query to both tables one after another, i see that `totalDocs` is 2x/3x for table with inverted index in comparison to table with star-tree index. if it matters, i started querying tables after ~5-10 mins of creating them. i also confirmed this by running ```select count(*) from <table_name>``` is this expected?
@bagi.priyank: i noticed that `group.id =` (basically empty), so maybe both pinot tables are using the same group id.
@bagi.priyank: i tried using ``` "streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type": "lowLevel", "stream.kafka.topic.name": <topic_name>, "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.broker.list": <broker_list>, "realtime.segment.flush.threshold.size": "0", "realtime.segment.flush.threshold.time": "24h", "realtime.segment.flush.desired.size": "50M", "stream.kafka.consumer.prop.auto.offset.reset": "largest", "stream.kafka.consumer.prop.group.id": <group_id>, "stream.kafka.decoder.prop.schema": <schema> }``` and ``` "streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type": "highLevel", "stream.kafka.topic.name": <topic_name>, "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.hlc.bootstrap.server": <broker_list>, "realtime.segment.flush.threshold.size": "0", "realtime.segment.flush.threshold.time": "24h", "realtime.segment.flush.desired.size": "50M", "stream.kafka.consumer.prop.auto.offset.reset": "largest", "stream.kafka.consumer.prop.group.id": <group_id>, "stream.kafka.decoder.prop.schema": <schema> }``` and ``` "streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type": "highLevel", "stream.kafka.topic.name": <topic_name>, "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.hlc.bootstrap.server": <broker_list>, "realtime.segment.flush.threshold.size": "0", "realtime.segment.flush.threshold.time": "24h", "realtime.segment.flush.desired.size": "50M", "stream.kafka.consumer.prop.auto.offset.reset": "largest", "stream.kafka.consumer.prop.hlc.group.id": <group_id>, "stream.kafka.decoder.prop.schema": <schema> }``` and none of those worked. finally after looking at code i tried ``` "streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type": "lowLevel", "stream.kafka.topic.name": <topic_name>, "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.broker.list": <broker_list>, "stream.kafka.consumer.prop.auto.offset.reset": "largest", "stream.kafka.group.id": <group_id>, "stream.kafka.decoder.prop.schema": <schema>, "realtime.segment.flush.threshold.size": "0", "realtime.segment.flush.threshold.time": "24h", "realtime.segment.flush.desired.size": "50M" },``` and that was able to consume from kafka but i don't see it in the list of kafka consumer groups. logs still say group.id is empty. any help / pointers are appreciated.
@bagi.priyank: also tried ``` "streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type": "highLevel", "stream.kafka.topic.name": <topic_name>, "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.hlc.bootstrap.server": <broker_list>, "stream.kafka.consumer.prop.auto.offset.reset": "smallest", "stream.kafka.hlc.group.id": <group_id>, "stream.kafka.decoder.prop.schema": <schema>, "realtime.segment.flush.threshold.size": "0", "realtime.segment.flush.threshold.time": "24h", "realtime.segment.flush.desired.size": "50M" },``` but it doesn't consume any events from kafka at all.
@npawar: You don't need the group id or any of the properties that say "hlc". Your tables might be out of sync because you've set offset criteria "largest". Each table will start consuming from the last message in the topic, so if your rate of events is high, second table will miss out on events that were emitted between creation of first and second table
  @bagi.priyank: I tried with smallest instead of largest first and that's where I was seeing the difference and then I started using largest after that. I did see in the code that Pinot uses <table_name>_<timestamp> as a default group id. I am still confused why I don't see it in the list of consumer groups. I'll try again today.
  @npawar: The concept of consumer group is not used in low level consumer
  @bagi.priyank: And I am creating tables in both clusters at the same time using the same topic. if anything I would expect the difference to be smaller and not 2-3x of one another as event rate is low.
  @bagi.priyank: i see. also forgot to mention that i am using 0.7.1 and kafka 2.x
  @bagi.priyank: how do i use consumer group with high level consumer? clearly i am missing something when configuring that as well.
  @bagi.priyank: do i need to use `stream.kafka.hlc.zk.connect.string` and `stream.kafka.zk.broker.url` ? i see those in the example table configs in the github repo for high level consumer. kafka cluster has its own zookeeper, and each pinot cluster have their own zookeeper as well.
  @npawar: you shouldn’t be using high level, and hence shouldn’t have to worry about consumer group
  @bagi.priyank: I see. Could you please go into a little bit about why you recommend that?
  @npawar: we’ve stopped actively developing high-level consumer and would likely deprecate it soon. All you need for properties is
  @bagi.priyank: Got it. Thank you so much once again for your help and time.
  @npawar: still doesn’t solve your missing events issue though..is there a way for you to run some queries (like min/max timestamps, or count(*) group by timestamp) to verify that you’re indeed seeing events being missed?
  @bagi.priyank: Yeah let me try those queries and share results with you.
  @bagi.priyank: i am setting up everything to be able to run those queries. in the meantime i have a few more questions. does the low level consumer use a group id by itself? or am i wrong in understanding that it uses a default group id based on table name and timestamp? if it is doing that, would merely using a different table name help? if it is using a group id internally i don't understand why kafka-consumer-groups doesn't show it? i do see empty space as one of the consumer groups. if it is not using a group id, then wouldn't the two tables compete with each other to consume from the same topic in the same kafka cluster?
  @bagi.priyank: output for `select min(upload_time), max(upload_time) from table` for table with inverted index
  @bagi.priyank: output for `select min(upload_time), max(upload_time) from table` for table with star-tree index
  @bagi.priyank: looks like the one with `star-tree` index is lagging behind.
  @bagi.priyank: used `largest` instead of `smallest` and they tend to be more or less doing similarly well. i think it also helped that i used different table names for the table with inverted index v/s the table with star-tree index. i don't have any proof other than what i am seeing :joy: . thank you neha for all the help, your time and patience. much appreciated!
  @npawar: oh cool..
  @npawar: regarding `does the low level consumer use a group id by itself? or am i wrong in understanding that it uses a default group id based on table name and timestamp? if it is doing that, would merely using a different table name help? if it is using a group id internally i don't understand why kafka-consumer-groups doesn't show it? i do see empty space as one of the consumer groups.` - we don't use a group id even internally.
  @npawar: `if it is not using a group id, then wouldn't the two tables compete with each other to consume from the same topic in the same kafka cluster?` - Not sure what you mean by 2 tables competing with each other. If you're saying that 2 tables will interfere with each other, such that the messages they each receive are exclusive of the other - then no, that is not what happens. We consume directly from offsets inside the pinot-server, maintaining our own checkpointing
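  To illustrate why no consumer group shows up, a sketch of the low-level pattern in plain Kafka client terms (illustrative, not Pinot source; the topic name and `checkpointedOffset` are placeholders for what Pinot tracks in its own segment metadata):
```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class NoGroupConsumerSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "kafka:9092");
    props.put("key.deserializer",
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    props.put("value.deserializer",
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    long checkpointedOffset = 0L; // placeholder: restored from Pinot's own checkpoint
    try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
      // assign() + seek() needs no group.id and registers no consumer group
      // with the broker, unlike subscribe(), so kafka-consumer-groups has
      // nothing to list.
      TopicPartition tp = new TopicPartition("myTopic", 0);
      consumer.assign(Collections.singletonList(tp));
      consumer.seek(tp, checkpointedOffset);
      ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(1000));
      System.out.println("fetched " + records.count() + " records");
    }
  }
}
```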
  @npawar: this might help: This talks about how and why we moved away from high-level to low level, and how it works internally
  @bagi.priyank: thank you. i do have questions around consuming from kafka and offset management. i'll go through this case study first.
@npawar: @bagi.priyank ^
@rohit: @rohit has joined the channel
@shreya.chakraborty: @shreya.chakraborty has joined the channel
@docchial: @docchial has joined the channel
@bagi.priyank: The link for `Transform Function in Aggregation Grouping` is broken on . I am guessing it should be pointing to .
  @mark.needham: thanks - will fix
@bagi.priyank: Also, the example uses `DATETIME_CONVERT` instead of `DATETIMECONVERT`.
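For reference, a usage sketch with the correct function name (the column, formats, and table are hypothetical):
```sql
SELECT DATETIMECONVERT(ts, '1:MILLISECONDS:EPOCH', '1:DAYS:EPOCH', '1:DAYS') AS tsDay,
       count(*)
FROM myTable
GROUP BY tsDay
```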

#flink-pinot-connector


@elon.azoulay: @elon.azoulay has joined the channel

#minion-improvements


@elon.azoulay: @elon.azoulay has joined the channel

#fix_llc_segment_upload


@elon.azoulay: @elon.azoulay has joined the channel