#general
@ellie.shen98: @ellie.shen98 has joined the channel
@mathieu.druart: @mathieu.druart has joined the channel
@mathieu.druart: Hi! Does anyone know if the pinot-pulsar module supports Pulsar authentication?
@mayanks: Doesn't seem so atm: ``` _pulsarClient = PulsarClient.builder().serviceUrl(_config.getBootstrapServers()).build();```
@mathieu.druart: thank you @mayanks That's what I thought
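For reference, a minimal sketch of what token-based authentication could look like if the module wired it in, using Pulsar's standard client API; the `authToken` parameter and its config plumbing are assumptions, not the module's actual surface:
```java
import org.apache.pulsar.client.api.AuthenticationFactory;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.PulsarClientException;

public class AuthenticatedPulsarClientSketch {
  // Hypothetical: how pinot-pulsar could accept a token alongside the service URL.
  // Token auth is shown; TLS or OAuth2 would use other AuthenticationFactory methods.
  public static PulsarClient build(String serviceUrl, String authToken)
      throws PulsarClientException {
    return PulsarClient.builder()
        .serviceUrl(serviceUrl)
        .authentication(AuthenticationFactory.token(authToken))
        .build();
  }
}
```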
@contact933: @contact933 has joined the channel
@tiger: Is there an efficient way to get the latest rows (based on a timestamp column, returning all the columns in the row) grouped by a column? I tried using LASTWITHTIME, but it only takes one column at a time. Specifying LASTWITHTIME for each column in the query increases latency significantly, which seems to imply that it does a linear scan for each column?
@g.kishore: Can you please file a GitHub issue with the details (sample queries, schema, and latency numbers)? There is a possibility to optimize this heavily based on the metadata.
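To make the report concrete, the per-column pattern being described looks roughly like this (table and column names are illustrative):
```sql
-- One LASTWITHTIME(dataColumn, timeColumn, 'dataType') call per column wanted back;
-- each added aggregation appears to cost another scan, hence the latency growth.
SELECT userId,
       LASTWITHTIME(colA, ts, 'STRING') AS latestColA,
       LASTWITHTIME(colB, ts, 'STRING') AS latestColB,
       LASTWITHTIME(colC, ts, 'LONG')   AS latestColC
FROM myTable
GROUP BY userId
```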
@krivoire: @krivoire has joined the channel
#random
@ellie.shen98: @ellie.shen98 has joined the channel
@mathieu.druart: @mathieu.druart has joined the channel
@contact933: @contact933 has joined the channel
@krivoire: @krivoire has joined the channel
#troubleshooting
@ellie.shen98: @ellie.shen98 has joined the channel
@mathieu.druart: @mathieu.druart has joined the channel
@contact933: @contact933 has joined the channel
@alihaydar.atil: Hello, is there any way to run a Pinot cluster inside IntelliJ for debugging purposes?
@xiangfu0: try quickstart?
@alihaydar.atil: do you mean running the 'pinot-admin.sh QuickStart' command?
@xiangfu0: there is a class called QuickStart; it should have a main method
@mark.needham: what do you want to debug?
@xiangfu0: You can run/debug it from ide
@alihaydar.atil: @xiangfu.1234 thank you
@mark.needham: if you want to remotely debug a cluster you can do that as well using the 'remote debug' feature in Intellij. But you'd need to set these JAVA_OPTS on the component that you want to debug: ```JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005"```
@mark.needham: To do the debugging that Xiang suggested, you can also go via the `PinotAdministrator` class if you want to simulate what pinot-admin is doing
@richard892: @alihaydar.atil I use HybridQuickStart when I want to do that
@alihaydar.atil: @mark.needham actually I wanted to see my changes quickly, without doing a build and running the pinot-admin.sh script every time I change something
@mark.needham: ah cool then do what Xiang/Richard suggested
@alihaydar.atil: @mark.needham @richard892 thank you a lot
@mark.needham: you can launch any of the components individually as well - search for `StartBrokerCommand`, `StartServerCommand`, etc.
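A hedged sketch of that IDE-driven route: a tiny launcher that calls `PinotAdministrator` with the same arguments `pinot-admin.sh` would forward (the flag values below are illustrative, not verified defaults):
```java
import org.apache.pinot.tools.admin.PinotAdministrator;

// Run this main from IntelliJ with breakpoints set anywhere in Pinot code.
// "StartBroker" mirrors `bin/pinot-admin.sh StartBroker`; flag values are illustrative.
public class DebugBrokerLauncher {
  public static void main(String[] args) throws Exception {
    PinotAdministrator.main(new String[] {
        "StartBroker",
        "-zkAddress", "localhost:2181",
        "-clusterName", "PinotCluster"
    });
  }
}
```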
@karinwolok1: Hi everyone! :heart: :star: :star: :star: :speaker::speaker::speaker: :date: :point_right: *This MONDAY (December 6th)* 10:00 PST | 13:00 EST | 18:00 UTC We're inviting you all to join us for the 2021 annual recap of Apache Pinot. We will also be opening up the floor to discuss the future roadmap, and we'd love for you to join the discussion! Please RSVP via the meetup link below. We will be doing it on Zoom, so you'll be able to chime in with voice/video or text with your thoughts and feedback. Thank you so much for being part of this movement with us, and we look forward to an awesome 2022!!!! :star-struck:
@diana.arnos: So, I'm trying to configure a table with partial upsert on `Pinot 0.9.0` that consumes from a Kafka topic, and I'm seeing weird behaviour. Once the second message gets consumed, Pinot does a full upsert instead of a partial one: every field present in the second message gets updated, and all the others are set to null (I believe because they are not present in the second message, so the full upsert uses the default null values). Here are the table and schema configs.
Schema:
```
{
  "schemaName": "responseCount",
  "dimensionFieldSpecs": [
    { "name": "responseId", "dataType": "STRING" },
    { "name": "formId", "dataType": "STRING" },
    { "name": "channelId", "dataType": "STRING" },
    { "name": "channelPlatform", "dataType": "STRING" },
    { "name": "companyId", "dataType": "STRING" },
    { "name": "submitted", "dataType": "BOOLEAN" },
    { "name": "deleted", "dataType": "BOOLEAN" }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "operationDate",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "createdAt",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "deletedAt",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    }
  ],
  "primaryKeyColumns": [ "responseId" ]
}
```
Table:
```
{
  "REALTIME": {
    "tableName": "responseCount_REALTIME",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "allowNullTimeValue": false,
      "replication": "1",
      "replicasPerPartition": "1",
      "timeColumnName": "operationDate",
      "schemaName": "responseCount"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "rangeIndexVersion": 1,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "streamConfigs": {
        "streamType": "kafka",
        "stream.kafka.topic.name": "response-count.aggregation.source",
        "stream.kafka.broker.list": "kafka:9092",
        "stream.kafka.consumer.type": "lowlevel",
        "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
        "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
        "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
        "realtime.segment.flush.threshold.rows": "0",
        "realtime.segment.flush.threshold.time": "24h",
        "realtime.segment.flush.segment.size": "100M"
      },
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": true
    },
    "metadata": {},
    "routing": {
      "instanceSelectorType": "strictReplicaGroup"
    },
    "upsertConfig": {
      "mode": "PARTIAL",
      "partialUpsertStrategies": {
        "deleted": "OVERWRITE",
        "deletedAt": "OVERWRITE"
      },
      "hashFunction": "NONE"
    },
    "isDimTable": false
  }
}
```
Here's the first message consumed:
```
Key: {"responseId": "52d96a0d-92ea-4103-9ea9-536252324481"}
Value: {
  "responseId": "52d96a0d-92ea-4103-9ea9-536252324481",
  "formId": "7bd28941-f9e4-45f1-a801-5c7d647cc6cd",
  "channelId": "60d11312-0e01-48d8-acce-4871b8d2365b",
  "channelPlatform": "app",
  "companyId": "00ca0142-5634-57e6-8d44-61427ea4b13d",
  "submitted": true,
  "deleted": "false",
  "createdAt": "2021-05-21T12:55:54.000+0000",
  "operationDate": "2021-05-21T12:55:54.000+0000"
}
```
Here's the second message consumed:
```
Key: {"responseId": "52d96a0d-92ea-4103-9ea9-536252324481"}
Value: {
  "responseId": "52d96a0d-92ea-4103-9ea9-536252324481",
  "deleted": "true",
  "deletedAt": "2021-10-21T12:55:54.000+0000",
  "operationDate": "2021-05-21T12:55:54.000+0000"
}
```
@diana.arnos: If I try the exact same configs on Pinot 0.8.0 and send the exact same messages to Kafka, Pinot shows me an error when consuming and does not ingest the data at all. Here's the error in the server log:
```
pinot-server_1 | 2021/12/03 17:06:41.902 ERROR [LLRealtimeSegmentDataManager_responseCount__0__0__20211203T1704Z] [responseCount__0__0__20211203T1704Z] Caught exception while transforming the record: {
pinot-server_1 |   "fieldToValueMap" : {
pinot-server_1 |     "formId" : "7bd28941-f9e4-45f1-a801-5c7d647cc6cd",
pinot-server_1 |     "operationDate" : "2021-05-21T12:55:54.000+0000",
pinot-server_1 |     "createdAt" : "2021-05-21T12:55:54.000+0000",
pinot-server_1 |     "companyId" : "00ca0142-5634-57e6-8d44-61427ea4b13d",
pinot-server_1 |     "deletedAt" : "null",
pinot-server_1 |     "submitted" : 1,
pinot-server_1 |     "deleted" : 0,
pinot-server_1 |     "channelPlatform" : "app",
pinot-server_1 |     "channelId" : "60d11312-0e01-48d8-acce-4871b8d2365b",
pinot-server_1 |     "responseId" : "52d96a0d-92ea-4103-9ea9-536252324481"
pinot-server_1 |   },
pinot-server_1 |   "nullValueFields" : [ ]
pinot-server_1 | }
pinot-server_1 | java.lang.NullPointerException: null
pinot-server_1 |   at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.handleUpsert(MutableSegmentImpl.java:512) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
pinot-server_1 |   at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.index(MutableSegmentImpl.java:469) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
pinot-server_1 |   at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.processStreamEvents(LLRealtimeSegmentDataManager.java:516) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
pinot-server_1 |   at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.consumeLoop(LLRealtimeSegmentDataManager.java:417) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
pinot-server_1 |   at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:560) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
pinot-server_1 |   at java.lang.Thread.run(Thread.java:829) [?:?]
```
@krivoire: @krivoire has joined the channel
#pinot-dev
@lipicsbarna: @lipicsbarna has joined the channel
@atri.sharma: Folks, please review:
#getting-started
@lipicsbarna: @lipicsbarna has joined the channel