#general


@ysuo: I uploaded a realtime table config and Pinot created some local data, but there are no records in the table.
@erik.bergsten: We have configured our server to use s3 deepstore. Does anyone know where the documentation on how to restore a server from the deepstore backup is located?
  @g.kishore: are you referring to the event where a server crashes and you bring up a new server?
  @erik.bergsten: Yes! If the server disk is corrupted but the segments are all on s3 and we want to restore it
  @g.kishore: you simply replace the disk and/or delete the segments from the data dir on the server and restart it
  @g.kishore: Pinot stores the segment location in Helix and knows how to pull it
  @erik.bergsten: Nice! Thanks.
@erik.bergsten: Also: does anyone know if using NFS as storage works well with Pinot (as a secondary storage tier, not for real time writes)?
  @g.kishore: we haven't tested this but if the query load is not high it might work.
@tanmaykrishna266: Hello, what would be the impact on the storage footprint if we set maxLength of a string column (SV) to 1MB?
  @tanmaykrishna266: Wanted to understand whether Pinot always allots 1 MB to that column per row, or whether the size of each row is variable depending on the actual length of the value.
  @tanmaykrishna266: We recently onboarded a table with multiple columns having maxLength of 1MB, and we saw our servers crash due to disks running out of space, as each segment of the table was ~80GB (throughput of 1-2k events/sec, consumed for a few hrs). Wanted to know if we are following best practices.
  @tanmaykrishna266: Also, if we have a column which is basically an array of strings of length 0-200, should we define it as a string column with a high enough maxLength or as an MV string column? What would be the difference?
  @richard892: dictionarized strings will be padded to that length
  @richard892: so not a good idea if you use dictionaries
  @richard892: but if you really have 1MB strings, there's a good chance they aren't repeated, so having a dictionary would be wasteful
  @richard892: if that's the case, you can add those columns to the "noDictionaryColumns" in "tableIndexConfig"
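For reference, a minimal sketch of where that setting goes in the table config (the column names here are hypothetical; only `noDictionaryColumns` under `tableIndexConfig` is the point):
```{
  "tableIndexConfig": {
    "noDictionaryColumns": ["column1", "column2"],
    "loadMode": "MMAP"
  }
}```
Columns listed there are stored raw (no dictionary), so the per-value padding that applies to dictionary-encoded strings is avoided.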
  @tanmaykrishna266: > so having a dictionary would be wasteful Yes, dictionaries are not needed; they won’t be used in filters or groupBy in queries.
  @tanmaykrishna266: One doubt. To test this we created two schemas(and tables). Schema1 ```{ "schemaName": "ztest_schema_max", "dimensionFieldSpecs": [ { "name": "column1", "dataType": "STRING", "maxLength": 1000000 }, { "name": "column2", "dataType": "STRING", "maxLength": 1000000 } ], "dateTimeFieldSpecs": [ { "name": "producer_timestamp", "dataType": "LONG", "format": "1:SECONDS:EPOCH", "granularity": "1:SECONDS" } ] }``` Schema2 ```{ "schemaName": "ztest_schema", "dimensionFieldSpecs": [ { "name": "column1", "dataType": "STRING" }, { "name": "column2", "dataType": "STRING" } ], "dateTimeFieldSpecs": [ { "name": "producer_timestamp", "dataType": "LONG", "format": "1:SECONDS:EPOCH", "granularity": "1:SECONDS" } ] }``` And inserted same data into both these tables. On pinot UI the size is reported as 730kb for both these tables.
  @tanmaykrishna266: TableIndexConfig for both tables ```"tableIndexConfig": { "enableDefaultStarTree": false, "enableDynamicStarTreeCreation": false, "aggregateMetrics": false, "nullHandlingEnabled": false, "rangeIndexVersion": 1, "autoGeneratedInvertedIndex": false, "createInvertedIndexDuringSegmentGeneration": false, "streamConfigs": { "streamType": "kafka", "stream.kafka.consumer.type": "LowLevel", "stream.kafka.topic.name": "events.router.v2.live", "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.broker.list": "kafka-kafka-bootstrap.kafka.svc.cluster.local:9092", "stream.kafka.consumer.prop.auto.offset.reset": "smallest", "realtime.segment.flush.threshold.rows": "1", "realtime.segment.flush.threshold.time": "1d", "realtime.segment.flush.threshold.segment.size": "1m" }, "loadMode": "MMAP" }``` Total number of rows ingested = 206, Each row is just a duplicate of this. ```column1 column2 producer_timestamp router.optimizer_visibility_event router 1648643989```
  @tanmaykrishna266: > dictionarized strings will be padded to that length Going by this I would expect table with 1MB columns should be significantly larger than the other. But that doesn’t seem to be the case here. Can you please help me understand why?
  @tanmaykrishna266: Not sure if this might matter but each segment has only 1 row(was testing).
@diana.arnos: Hey there :wave: Is there a way to change the log level for the instances?
  @mayanks: Log4j settings?
@tozhang: @tozhang has joined the channel
@jt: General question: Why are the Pinot broker/controller StatefulSets in the helm chart? Those could be Deployments, couldn't they? They don't have any persistence AFAIK.
  @mayanks: Pinot is a distributed system. Data is one component, but there is also metadata (cluster state), and these components need persistent metadata from ZK. Also cc: @xiangfu0 in case of more comments.
  @xiangfu0: The Pinot controller can use local disk as the deep store for segment download; in that case, the registered download URI has to have a fixed controller pod name, so the controller should be a StatefulSet
  @jt: I see. I thought metadata is only in zookeeper not persisted by the controller/broker
  @xiangfu0: For the Pinot broker, the default setup puts everything on the default tenant; however, users have the ability to assign a tenant to each broker, and Pinot internally maintains the mapping of which broker serves which tables.
  @xiangfu0: Metadata is in ZooKeeper, but it references the internal controller/broker pod names
  @jt: Ya ofc..that makes sense
@weixiang.sun: If I want to move pinot table from one tenant to another, can I just change the tenant name inside table config? If yes, is there any downtime?
  @mayanks: Yes, you can change tenant name and perform rebalance. You have the option of specifying whether you want to avoid downtime in the rebalance api.
  @weixiang.sun: @mayanks How does it work for realtime table? Will the consuming segments be sealed? The offline segments will be loaded to new tenant? How does query work? Will the broker send the query to old tenant or new tenant?
  @mayanks: Broker will send the query to servers (it does not look at tenant info):
  @mayanks: For real-time, there is an option to enable rebalance of consuming segments. I don’t recall if they are sealed or just dropped with consumption restarting from the checkpoint @jackie.jxt
  @jackie.jxt: @weixiang.sun If consuming segments are configured to be moved, the new server will re-consume the already consumed events, and the original consuming segment is dropped
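To make the change being discussed concrete: the tenant names live in the `tenants` block of the table config (the names below are hypothetical), and after updating it you trigger a rebalance through the controller's `POST /tables/{tableName}/rebalance` endpoint, which, if memory serves, takes query parameters such as `downtime=false` to keep serving during the move and `includeConsuming=true` to also move consuming segments for realtime tables.
```{
  "tenants": {
    "broker": "myNewBrokerTenant",
    "server": "myNewServerTenant"
  }
}```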
@singhal.prateek3: Hi team, I believe it is possible to apply an inverted index on multiple columns. Any idea on how it is stored? All the documentation I have seen so far gives examples of an inverted index on only 1 column. I would like to apply inverted indexes on multiple columns in an optimized way.
  @ssubrama: Pinot is a columnar store, so inverted index or any other index on the column is stored on a per-column basis. An inverted index of one column is independent of inverted index of another column.
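In config terms that just means listing every column that should get its own inverted index; a minimal sketch with hypothetical column names:
```{
  "tableIndexConfig": {
    "invertedIndexColumns": ["userId", "country"]
  }
}```
Each listed column gets an independent, per-column inverted index; there is no combined multi-column index structure.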

#random


@tozhang: @tozhang has joined the channel

#troubleshooting


@grace.lu: Hi team, if each of our Pinot components has multiple resolvable DNS names, I wonder if there is any recommended config for us to define/pick the hostname
  @mayanks: instance id?
  @grace.lu: we don’t mind that instance_id is constructed from hostname + port, but we would like to change the hostname itself from the default
  @grace.lu: seems like there is `controller.host` and `pinot.server.netty.host` for controller and server, but don’t see the equivalent for broker?
  @grace.lu: also is this how pinot get its hostname? So the hostname needs to be either `InetAddress.getLocalHost().getCanonicalHostName();` or the ip address?
  @mayanks: Yes
@mehtashailee21: Hello there, I am trying to validate this ~schema~ tableConfig using the schema validate API. It keeps *returning 404 with reason null.* Can someone help me find the issue with this schema ```{ "tableName": "lineorder_star_OFFLINE", "tableType": "OFFLINE", "segmentsConfig": { "timeColumnName": "LO_ORDERDATE", //date field with day-granularity "timeType": "DAYS", "replication": "1", "schemaName": "lineorder" }, "tenants": { "broker": "DefaultTenant", "server": "DefaultTenant" }, "metadata": { "customConfigs": {} }, "tableIndexConfig": { "starTreeIndexConfigs": [ { "dimensionsSplitOrder": [ "LO_ORDERDATE", //date "LO_SUPPKEY", // dim field "LO_PARTKEY", // dim field "LO_DISCOUNT", // measure "LO_QUANTITY", //measure "LO_REVENUE", // dim field "LO_ORDERPRIORITY" //measure ], "skipStarNodeCreationForDimensions": [], "functionColumnPairs": [ "SUM__LO_QUANTITY", "COUNT__LO_ORDERKEY", "SUM__LO_REVENUE" ] } ] } }```
  @npawar: Did you mean you are trying to validate this table config? Or did you post the table config by mistake? Are you using Swagger? Can you share the entire request and the logs from the controller?
  @mehtashailee21: Yes, I am using Swagger to validate this config
  @mehtashailee21: Curl: ```curl -X POST "" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"tableName\": \"lineorder_star_OFFLINE\", \"tableType\": \"OFFLINE\", \"segmentsConfig\": { \"timeColumnName\": \"LO_ORDERDATE\", \"timeType\": \"DAYS\", \"replication\": \"1\", \"schemaName\": \"lineorder\" }, \"tenants\": { \"broker\": \"DefaultTenant\", \"server\": \"DefaultTenant\" }, \"metadata\": { \"customConfigs\": {} }, \"tableIndexConfig\": { \"starTreeIndexConfigs\": [ { \"dimensionsSplitOrder\": [ \"LO_ORDERDATE\", \"LO_SUPPKEY\", \"LO_PARTKEY\", \"LO_DISCOUNT\", \"LO_QUANTITY\", \"LO_REVENUE\", \"LO_ORDERPRIORITY\" ], \"skipStarNodeCreationForDimensions\": [], \"functionColumnPairs\": [ \"SUM__LO_QUANTITY\", \"COUNT__LO_ORDERKEY\", \"SUM__LO_REVENUE\" ] } ] }}"``` Controller logs attached:
  @mehtashailee21: oh I was checking the wrong API: tested it `` Yet it just says invalid JSON. I am not sure what field is causing it
  @mark.needham: ```} exception: Unexpected character ('/' (code 47)): maybe a (non-standard) comment? (not recognized as one since Feature 'ALLOW_COMMENTS' not enabled for parser) at [Source: (String)"{ "tableName": "lineorder_star_OFFLINE", "tableType": "OFFLINE", "segmentsConfig": {```
  @mark.needham: it doesn't like the comments
  @mark.needham: if you take those out it'll be fine
  @diogo.baeder: In general JSON decoders don't accept comments.
  @npawar: The tableConfigs API needs schema + table config in the JSON. You need to try with just the tables/validate API
  @mehtashailee21: okay thank you
  @mehtashailee21: @mark.needham I haven't used them in the API call; they were just for explanation.
@tozhang: @tozhang has joined the channel
@diogo.baeder: Hi guys! Got a question: is it possible to, somehow, have a batch ingestion pipeline for daily ingestions (therefore meaning daily segments being created), but then, on a monthly basis, combine all segments for the previous month and delete the daily segments for it? I'll continue in this thread.
  @diogo.baeder: For example: suppose that I was creating one segment per day during last March, and then today I wanted to combine all the March daily segments into a single month segment for March, then delete the daily March segments. Would this be possible somehow?
  @diogo.baeder: The reasoning behind this is: I need to have fresh daily data, but it's very likely that each day will end up with too-small segments, and the queries I would be running against them would very frequently involve data for full months - therefore hitting too many segments, e.g. 31 segments for March. So I thought that "smashing" them into monthly segments would optimize the segment sizes.
  @luisfernandez: something like this? we haven’t tried it but we were also thinking about merging segments
  @mayanks: There’s a minion job for that
  @mayanks: Yep, @luisfernandez beat me to it
  @diogo.baeder: Ah, nice, I'll take a look, thanks! I was taking a look at RealtimeToOfflineTask, but it didn't seem like what I was looking for...
  @mark.needham: We wrote up a recipe showing how to use it - - I think it'll do what you want
  @diogo.baeder: Nice, thanks man!
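The minion job referred to above is presumably the MergeRollupTask; a hedged sketch of the table-level task config it expects is below (the bucket/buffer periods and merge levels are illustrative, and a minion plus the controller's task scheduling must be running for it to execute):
```{
  "task": {
    "taskTypeConfigsMap": {
      "MergeRollupTask": {
        "1day.mergeType": "concat",
        "1day.bucketTimePeriod": "1d",
        "1day.bufferTimePeriod": "1d",
        "1month.mergeType": "concat",
        "1month.bucketTimePeriod": "30d",
        "1month.bufferTimePeriod": "30d"
      }
    }
  }
}```
Merged segments replace the originals once the task completes, which is effectively the "smash daily segments into monthly ones" behavior asked about above.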
@diogo.baeder: Another question, somewhat related to the previous one: is it possible to have a date column? And to set it up as the `timeColumn`? I was looking at using `DAYS` as the type for this, but it doesn't seem correct - I want to use real dates and not "days since Epoch".
  @mayanks: You can use DateTimeFieldSpec
  @diogo.baeder: Is this for display purposes only? Or does it also affect the storing of values?
  @npawar: if you use the SQL standard date (yyyy-MM-dd HH:mm:ss), then you can use dataType `TIMESTAMP`, and internally it'll be stored as a long but displayed as your date string. If you want to use any other format (like ISO), then you would use dataType `STRING`, and it will be stored as a string.
  @npawar: for performance reasons, usually it's better to have millisSinceEpoch (rounded to the nearest hour/day granularity as needed), so we can store it as a long
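A minimal sketch of the dateTimeFieldSpec that recommendation corresponds to (the field name is hypothetical):
```{
  "dateTimeFieldSpecs": [
    {
      "name": "event_time_millis",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:DAYS"
    }
  ]
}```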
  @diogo.baeder: Ah, thanks @npawar, that's what I wanted to know! Like, what's the best way to store dates for performance, partitioning, rollups etc. But what if I define the schema as data type `LONG`, and format `1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd`? Does this work, too?
  @npawar: yup this is fine. this uses LONG and also has coarse granularity :thumbsup:
  @diogo.baeder: Would it work with `INT` , too, for less memory/space usage?
  @npawar: it should work for yyyyMMdd with INT
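And for completeness, a sketch of the `INT` + `yyyyMMdd` variant being discussed (field name hypothetical, format string taken from the thread above):
```{
  "dateTimeFieldSpecs": [
    {
      "name": "event_date",
      "dataType": "INT",
      "format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd",
      "granularity": "1:DAYS"
    }
  ]
}```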
  @npawar: for performance, partitioning, and rollups, what you have will work fine. For extensibility, millisSinceEpoch works best. You have day granularity right now, so you would have values rolled up to the nearest day. In the future, if you want to make the granularity finer, it helps to have millisSinceEpoch. You could go from day granularity to finer granularity anytime, without having to change anything else
  @diogo.baeder: Got it. But there's absolutely zero chance we'll need finer granularity; the data we have has always been and will always be daily - it's a large ecosystem that depends on this data being daily. So I guess that would be the best approach for this case, right?
  @npawar: yes it would in that case
  @diogo.baeder: Awesome, thanks! (And sorry for the long delay)
@ahsen.m: so i am getting following error any idea’s? zookeeper is running fine and kafka is using it. i am using existing zookeeper connection url in pinot. ```Opening socket connection to server kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181. Will not attempt to authenticate using SASL (unknown error) 53 Socket connection established, initiating session, client: /10.48.12.58:54236, server: kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181 52 Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 51 Opening socket connection to server kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181. Will not attempt to authenticate using SASL (unknown error) 50 Socket connection established, initiating session, client: /10.48.12.58:54248, server: kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181 49 Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 48 Opening socket connection to server kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181. Will not attempt to authenticate using SASL (unknown error) 47 Socket connection established, initiating session, client: /10.48.12.58:54268, server: kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181 46 Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 45 Opening socket connection to server kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181. Will not attempt to authenticate using SASL (unknown error) 44 Socket connection established, initiating session, client: /10.48.12.58:54274, server: kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181 43 Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 42 Opening socket connection to server kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181. 
Will not attempt to authenticate using SASL (unknown error) 41 Socket connection established, initiating session, client: /10.48.12.58:54278, server: kafka-cluster-zookeeper-client.kafka-cluster.svc.cluster.local/10.52.8.240:2181 40 Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 39 Failed to initialize Pinot Broker Starter 38 java.lang.NullPointerException: null 37 at org.apache.helix.manager.zk.client.ZkConnectionManager.cleanupInactiveWatchers(ZkConnectionManager.java:112) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 36 at org.apache.helix.manager.zk.client.ZkConnectionManager.close(ZkConnectionManager.java:95) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 35 at org.apache.helix.manager.zk.client.ZkConnectionManager.close(ZkConnectionManager.java:91) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 34 at org.apache.helix.manager.zk.zookeeper.ZkClient.connect(ZkClient.java:1620) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 33 at org.apache.helix.manager.zk.zookeeper.ZkClient.<init>(ZkClient.java:186) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 32 at org.apache.helix.manager.zk.ZkClient.<init>(ZkClient.java:87) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 31 at org.apache.helix.manager.zk.client.ZkConnectionManager.<init>(ZkConnectionManager.java:41) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 30 at org.apache.helix.manager.zk.client.SharedZkClientFactory.getOrCreateZkConnectionNamanger(SharedZkClientFactory.java:60) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 29 at org.apache.helix.manager.zk.client.SharedZkClientFactory.buildZkClient(SharedZkClientFactory.java:40) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 28 at org.apache.pinot.common.utils.ServiceStartableUtils.applyClusterConfig(ServiceStartableUtils.java:54) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 27 at org.apache.pinot.broker.broker.helix.BaseBrokerStarter.init(BaseBrokerStarter.java:118) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 26 at org.apache.pinot.tools.service.PinotServiceManager.startBroker(PinotServiceManager.java:137) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 25 at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:92) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 24 at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.lambda$run$0(StartServiceManagerCommand.java:275) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 23 at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:301) 
[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 22 at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.run(StartServiceManagerCommand.java:275) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 21 Failed to start a Pinot [BROKER] at 31.159 since launch 20 java.lang.NullPointerException: null 19 at org.apache.helix.manager.zk.client.ZkConnectionManager.cleanupInactiveWatchers(ZkConnectionManager.java:112) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 18 at org.apache.helix.manager.zk.client.ZkConnectionManager.close(ZkConnectionManager.java:95) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 17 at org.apache.helix.manager.zk.client.ZkConnectionManager.close(ZkConnectionManager.java:91) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 16 at org.apache.helix.manager.zk.zookeeper.ZkClient.connect(ZkClient.java:1620) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 15 at org.apache.helix.manager.zk.zookeeper.ZkClient.<init>(ZkClient.java:186) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 14 at org.apache.helix.manager.zk.ZkClient.<init>(ZkClient.java:87) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 13 at org.apache.helix.manager.zk.client.ZkConnectionManager.<init>(ZkConnectionManager.java:41) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 12 at org.apache.helix.manager.zk.client.SharedZkClientFactory.getOrCreateZkConnectionNamanger(SharedZkClientFactory.java:60) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 11 at org.apache.helix.manager.zk.client.SharedZkClientFactory.buildZkClient(SharedZkClientFactory.java:40) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 10 at org.apache.pinot.common.utils.ServiceStartableUtils.applyClusterConfig(ServiceStartableUtils.java:54) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 9 at org.apache.pinot.broker.broker.helix.BaseBrokerStarter.init(BaseBrokerStarter.java:118) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 8 at org.apache.pinot.tools.service.PinotServiceManager.startBroker(PinotServiceManager.java:137) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 7 at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:92) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 6 at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.lambda$run$0(StartServiceManagerCommand.java:275) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 5 at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:301) 
[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 4 at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.run(StartServiceManagerCommand.java:275) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7] 3 Shutting down Pinot Service Manager with all running Pinot instances... 2 Shutting down Pinot Service Manager admin application... 1 Deregistering service status handler```
  @mayanks: What does your broker config look like? Seems like an NPE
  @ahsen.m: ``` # ------------------------------------------------------------------------------ # Pinot Broker: # ------------------------------------------------------------------------------ broker: name: broker replicaCount: 1 podManagementPolicy: Parallel podSecurityContext: {} # fsGroup: 2000 securityContext: {} jvmOpts: "-Xms256M -Xmx1G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=/opt/pinot/gc-pinot-broker.log" log4j2ConfFile: /opt/pinot/conf/log4j2.xml pluginsDir: /opt/pinot/plugins routingTable: builderClass: random probes: endpoint: "/health" livenessEnabled: true readinessEnabled: true persistence: extraVolumes: [] extraVolumeMounts: [] service: annotations: {} clusterIP: "None" externalIPs: [] loadBalancerIP: "" loadBalancerSourceRanges: [] type: ClusterIP protocol: TCP port: 8099 name: broker nodePort: "" external: enabled: false type: LoadBalancer port: 8099 # For example, in private GKE cluster, you might add : Internal annotations: {} ingress: v1beta1: enabled: false v1: enabled: false resources: {} # resources: # requests: # cpu: 500m # limits: # cpu: 1000m nodeSelector: {} # nodeSelector: # service: pinot-cluster # affinity: {} affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: scope operator: In values: - highcpu # podAntiAffinity: # preferredDuringSchedulingIgnoredDuringExecution: # - podAffinityTerm: # labelSelector: # matchLabels: # : worker # : trino # : trino # namespaces: # - trino-cluster # topologyKey: # weight: 1 tolerations: [] podAnnotations: {} updateStrategy: type: RollingUpdate # Use envFrom to define all of the ConfigMap or Secret data as container environment variables. # ref: # ref: envFrom: [] # - configMapRef: # name: special-config # - secretRef: # name: test-secret # Use extraEnv to add individual key value pairs as container environment variables. # ref: extraEnv: [] # - name: PINOT_CUSTOM_ENV # value: custom-value # Extra configs will be appended to pinot-broker.conf file extra: configs: |- pinot.set.instance.id.to.hostname=true # --------------```
  @ahsen.m: also in zookeeper when pinot tries to make connection i see following ```2022-04-02 01:53:24,604 ERROR Unsuccessful handshake with session 0x0 (org.apache.zookeeper.server.NettyServerCnxnFactory) [nioEventLoopGroup-7-1] 55 2022-04-02 01:53:24,604 WARN Exception caught (org.apache.zookeeper.server.NettyServerCnxnFactory) [nioEventLoopGroup-7-1] 54 io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 0000002d000000000000000000000000000075300000000000000000000000100000000000000000000000000000000000 53 at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:477) 52 at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) 51 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) 50 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) 49 at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) 48 at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) 47 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) 46 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) 45 at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) 44 at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) 43 at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) 42 at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) 41 at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) 40 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 39 at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) 38 at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 37 at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) 36 at java.base/java.lang.Thread.run(Thread.java:829) 35 Caused by: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 0000002d000000000000000000000000000075300000000000000000000000100000000000000000000000000000000000 34 at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1214) 33 at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284) 32 at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507) 31 at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446) 30 ... 17 more```

#pinot-k8s-operator


@tozhang: @tozhang has joined the channel
@tozhang: Any guide to install pinot components in a k8s cluster with 3 nodes?

#getting-started


@tozhang: @tozhang has joined the channel