Apache Pinot Daily Email Digest (2021-09-08)

Pinot Slack Email Digest Wed, 08 Sep 2021 19:00:39 -0700

#general

@chandanchoudhary716: @chandanchoudhary716 has joined the channel
@jmint: @jmint has joined the channel
@aravindchoutpally: @aravindchoutpally has joined the channel
@qiuxgm: @qiuxgm has joined the channel
@mgjain: @mgjain has joined the channel

#random

#troubleshooting

@kangren.chia: hello, i could use some help with query tuning: ```# schema: user(int) | location(int) | time(long) # 1st query (filter): "select user, count(*) from {table} where time between {start} and {end} and location between 500 and 550 group by user having count(user) >= 24 limit 1000000" # 2nd query (combiner): "select time, count(distinct(user)) as count from {table} where user in ({users}) and time between {start} and {end} and location between 300 and 350 group by time limit 10000000"``` the query time scales linearly with the number of selected userids from the first query
@kangren.chia: segment size = 300mb number of records = ~500 million per day (10 segments) using the python client takes around: ```query took 1.5208384820143692 found 132164 userids query took 1.2651426389929838``` using the java client, with roaring bitmaps to serialize userids from the first query takes longer: ```query took 1.601 found 132164 userids query took 2.586``` both clients were measured solely on the execution time of the pinot query
@kangren.chia: i don’t use the `in_subquery` or `in_id_set` construct from because i was advised by @jackie.jxt to build the idset client side due to the nature of userid filtering in my first query
@kangren.chia: broker logs: ```timeMs=1159, docs=5175479/449912692, entries=456484558/5175479, segments(queried/processed/matched/consuming/unavailable):10/10/10/0/0,consumingFreshnessTimeMs=0, servers=10/10, groupLimitReached=false, brokerReduceTimeMs=516, exceptions=0, serverStats=(Server=SubmitDelayMs,ResponseDelayMs,ResponseSize,DeserializationTimeMs,RequestSentDelayMs); pinot-server-3_O=0,592,382743,0,-1; pinot-server-5_O=0,580,373923,0,-1; pinot-server-1_O=0,573,372135,0,-1; pinot-server-7_O=0,583,378483,0,-1; pinot-server-0_O=0,579,374595,0,-1; pinot-server-6_O=0,640,379359,0,-1; pinot-server-9_O=0,589,386235,0,-1; pinot-server-4_O=0,517,381771,0,-1; pinot-server-2_O=0,542,383007,0,1; pinot-server-8_O=0,590,379035,0,-1, offlineThreadCpuTimeNs=5758172708, realtimeThreadCpuTimeNs=0```
@mayanks: What’s the latency without having clause
@mayanks: Also what’s the indexing you have right now
@kangren.chia: i don’t see anything unusual on the grafana metrics - • pinot cluster is deployed on k8s • server pods: 4000cpu, 10GB mem • broker pods: 1000cpu, 1GB mem • requests and limits are the same for both pods, as followed the default metrics in
@kangren.chia: indexing: ``` dictionary | forward-index | inverted-index user Y Y Y location Y Y Y time Y Y Y```
@kangren.chia: segments were generated from parquet files that are physically sorted by user, then time
@kangren.chia: i don’t know if star tree index will help here, since the doc says > *Unsupported functions* > DISTINCT_COUNT > Intermediate result _Set_ is unbounded
@jackie.jxt: Broker reduce takes quite long
@kangren.chia: > What’s the latency without having clause checking now
@jackie.jxt: One think that can potentially help is to add an order by on count for the first query and reduce the limit
@jackie.jxt: With such high limit, broker might run into heavy GCs
@kangren.chia: latency without HAVING clause is just as bad, ~2s with brokerReduceTimeMs=~700
@kangren.chia: query time also can vary between 1.5s to 3s for either query 1 or query 2
@kangren.chia: when you say “add an order by on count”, do you mean this? ```# 1st query (old): "select user, count(*) from {table} where time between {start} and {end} and location between 500 and 550 group by user having count(user) >= 24 limit 1000000" # 1st query (new): "select user, count(*) as count from {table} where time between {start} and {end} and location between 500 and 550 group by user having count(user) >= 24 order by count limit 1000000"```
@kangren.chia: i’ve tried that, and also reducing the limit (to a point where i dont get truncated results). still slow :confused:
@kangren.chia: cardinality of the items in schema: ```users = ~5-6 million time = 96 cells = ~2500```
@kangren.chia: here are other types of queries that scale linearly with the amount of users fetched from the first step: ```# get relevant users ... # step 2 (reducer type 2) "select location, count(user) as count from {table} where user in ({users}) and time between {start} and {end} group by location limit 10000000" # step 2 (reducer type 3) "select count(user) from {table} where user in ({users}) and time_rounded between {start} and {end} and location between 300 and 505 group by user limit 10000000" # step 2 (reducer type 4) "select location, count(distinct(user)) from {table} where user in ({users}) and time_rounded between {start} and {end} and location between 300 and 310 group by location limit 10000000"```
@kangren.chia: would making segments smaller so that the computation can be further parallelized help? they’re currently at 300mb untarred/uncompressed per segment, or would throwing more resources at the broker help?
@humengyuk18: Getting the following exceptions when try to upgrade to 0.8, its complaining for found instance name with the same host name: ```Exception when connecting the instance Controller_pinot-controller-0.pinot-controller-headless.pinot.svc.cluster.local_9000 as Participant role to Helix. org.apache.helix.HelixException: instance: Controller_pinot-controller-0.pinot-controller-headless.pinot.svc.cluster.local_9000 already has a live-instance in cluster pinot at org.apache.helix.manager.zk.ParticipantManager.createLiveInstance(ParticipantManager.java:251) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.helix.manager.zk.ParticipantManager.handleNewSession(ParticipantManager.java:115) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.helix.manager.zk.ZKHelixManager.handleNewSessionAsParticipant(ZKHelixManager.java:1171) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.helix.manager.zk.ZKHelixManager.handleNewSession(ZKHelixManager.java:1131) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.helix.manager.zk.ZKHelixManager.createClient(ZKHelixManager.java:701) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.helix.manager.zk.ZKHelixManager.connect(ZKHelixManager.java:738) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.controller.ControllerStarter.registerAndConnectAsHelixParticipant(ControllerStarter.java:524) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.controller.ControllerStarter.setUpPinotController(ControllerStarter.java:343) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.controller.ControllerStarter.start(ControllerStarter.java:287) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.service.PinotServiceManager.startController(PinotServiceManager.java:116) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:91) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.lambda$startBootstrapServices$0(StartServiceManagerCommand.java:234) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:286) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startBootstrapServices(StartServiceManagerCommand.java:233) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.execute(StartServiceManagerCommand.java:183) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.admin.command.StartControllerCommand.execute(StartControllerCommand.java:130) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:164) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274] at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:184) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274]```
@chandanchoudhary716: @chandanchoudhary716 has joined the channel
@deemish2: Hi everyone , In pinot-0.8.0 , i have added controller.task.frequencyPeriod=4h in controller config. and it is working fine in local setup on machine. While in using kubernetes , it gives
@deemish2:
@deemish2: while setting up this config in local machine , it is working fine .
@andruszd: Hi can you remove the enable slider button from the web UI
@andruszd:
@andruszd: also having an issue that the disk space size is not being reported
@andruszd:
@andruszd: always shows -1 doe both the sizes ...
@mrpringle: Are there any guides for sizing pinot, e.g. number of servers, brokers, controllers. Seems my setup has stopped working well with most queries only getting a response from one server instead of expected 2. I have added memory to servers, clustered zookeeper, ran recommendation engine for tables but seems not much is making a good improvement.
@kangren.chia: i have the same question! i tried throwing more resources at my brokers/servers, but the cpu/mem utilization on grafana looks like it will not help
@kangren.chia: setup is: • pinot cluster is deployed on k8s • server pods: 4000cpu, 10GB mem • broker pods: 1000cpu, 1GB mem • requests and limits are the same for both pods, as followed the default metrics in
@richard892: Hi :wave: - it would be interesting to get a cpu profile by installing async-profiler as a native agent as outlined , setting ```-agentpath:/path/to/libasyncProfiler.so=start,event=cpu,file=cpu.html``` or ```-agentpath:/path/to/libasyncProfiler.so=start,event=alloc,file=alloc.html``` to get an allocation profile. If you copy the html files back here I would be keen to take a look at them. This will help pinpoint any unexpected issues or problems potentially exacerbated by inappropriate sizing.
@kangren.chia: thanks for the suggestion, i will try that @richard892 and report back (probably not soon). cool blog post on roaring bitmaps btw :slightly_smiling_face:
@richard892: ok, thanks, I look forward to it
@mayanks: @mrpringle It is also a function of things like read/write qps, expected latency, data size, and query type. If you can share those, we can help suggest a good size.
@jmint: @jmint has joined the channel
@aravindchoutpally: @aravindchoutpally has joined the channel
@qiuxgm: @qiuxgm has joined the channel
@mgjain: @mgjain has joined the channel