#general


@rootshellz: @rootshellz has joined the channel
@yves.kurz: @yves.kurz has joined the channel
@songxinbai: @songxinbai has joined the channel
@ricky: @ricky has joined the channel
@anantha.sharma4: @anantha.sharma4 has joined the channel
@brad: @brad has joined the channel
@luys8611: @luys8611 has joined the channel
@tim.spann: I found a typo in this page
  @tim.spann: under the Launch Pinot Cluster section
  @tim.spann: This command will run a single instance of the Pinot Controller, Pinot Server, Pinot Broker, Kafka, and Zookeeper. You can find the file on GitHub.
  @tim.spann: should read: Pinot Broker, Pulsar, and Zookeeper
  @tim.spann: thanks
  @mayanks: @mark.needham ^^
  @richard892: I love an eye for detail
  @mark.needham: ta - have updated
@luys8611: It would be great if there were some detailed docs on how to set up the whole Pinot & ThirdEye environment for this on my machine.
  @mayanks: @pyne.suvodeep ^^
  @pyne.suvodeep: Hi @luys8611 Here's a TE OSS public fork of @cyril that might help. There is a dockerized quickstart that uses MySQL as a datasource:
  @pyne.suvodeep: This doc should provide a good guideline
  @luys8611: I have data in a CSV file.
  @luys8611: And I've installed the Pinot Docker containers.
  @luys8611: Now I'm trying to create a Pinot table from the CSV data.
  @pyne.suvodeep: There are docs for this. Example:
  @luys8611: Ok, let me follow it. Thanks
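For readers landing here later, a minimal sketch of that CSV flow following the standalone batch ingestion docs; the table name `myTable`, file paths, and the controller address are placeholders, not details from this thread:
```
# Create the table from a schema + table config you've already written
# (file names here are illustrative).
bin/pinot-admin.sh AddTable \
  -schemaFile myTable_schema.json \
  -tableConfigFile myTable_table.json \
  -exec

# Describe a standalone ingestion job that reads CSVs and pushes segments.
cat > job-spec.yml <<'EOF'
executionFrameworkSpec:
  name: standalone
  segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
  segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
jobType: SegmentCreationAndTarPush
inputDirURI: '/data/csv'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/data/segments'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
tableSpec:
  tableName: 'myTable'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
EOF

# Generate segments from the CSVs and push them to the controller.
bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile job-spec.yml
```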
@luys8611: I started Apache Zookeeper, Pinot Controller, Pinot Broker, and Pinot Server. Now how can I create and add a new table in the cluster manager?
  @mayanks:
  @luys8611: Does it work completely in a local env?
  @mayanks: Sorry wrong link
  @mayanks: If you just want to play around, here's a local step-by-step guide:
  @luys8611: I'm getting this error when I try to add a new table: `java.net.UnknownHostException: manual-pinot-controller: Temporary failure in name resolution`
  @luys8611: I'm trying to run this
  @mark.needham: did you start those up with docker?
  @mark.needham: if you have them all running outside docker then it's not gonna have a controller called `manual-pinot-controller` available
  @luys8611: ```
pinot-broker-run     | May 11, 2022 8:52:35 PM org.glassfish.grizzly.http.server.NetworkListener start
pinot-broker-run     | INFO: Started listener bound to [0.0.0.0:8099]
pinot-broker-run     | May 11, 2022 8:52:35 PM org.glassfish.grizzly.http.server.HttpServer start
pinot-broker-run     | INFO: [HttpServer] Started.
pinot-server-run     | May 11, 2022 8:52:38 PM org.glassfish.grizzly.http.server.NetworkListener start
pinot-server-run     | INFO: Started listener bound to [0.0.0.0:8097]
pinot-server-run     | May 11, 2022 8:52:38 PM org.glassfish.grizzly.http.server.HttpServer start
pinot-server-run     | INFO: [HttpServer] Started.
pinot-controller-run | May 11, 2022 8:52:40 PM org.glassfish.grizzly.http.server.NetworkListener start
pinot-controller-run | INFO: Started listener bound to [0.0.0.0:9000]
pinot-controller-run | May 11, 2022 8:52:40 PM org.glassfish.grizzly.http.server.HttpServer start
pinot-controller-run | INFO: [HttpServer] Started.
pinot-broker-run     | 2022/05/11 20:52:40.923 INFO [StartServiceManagerCommand] [Start a Pinot [BROKER]] Started Pinot [BROKER] instance [Broker_172.19.0.5_8099] at 14.235s since launch
pinot-server-run     | 2022/05/11 20:52:43.533 INFO [StartServiceManagerCommand] [Start a Pinot [SERVER]] Started Pinot [SERVER] instance [Server_172.19.0.6_8098] at 14.637s since launch
pinot-controller-run | 2022/05/11 20:52:45.017 INFO [StartServiceManagerCommand] [main] Started Pinot [CONTROLLER] instance [Controller_172.19.0.4_9000] at 20.284s since launch
zookeeper-run        | 2022-05-11 20:52:54,061 [myid:1] - INFO [SessionTracker:ZooKeeperServer@398] - Expiring session 0x100027422aa0005, timeout of 30000ms exceeded
zookeeper-run        | 2022-05-11 20:52:54,062 [myid:1] - INFO [SessionTracker:ZooKeeperServer@398] - Expiring session 0x100027422aa000a, timeout of 30000ms exceeded
```
  @luys8611: I have all running in docker
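The `UnknownHostException` above is consistent with the containers not sharing a user-defined Docker network: a container name like `manual-pinot-controller` only resolves for other containers attached to the same network. A sketch along the lines of the manual Docker setup in the docs (image tags and ports are illustrative):
```
docker network create pinot-demo

docker run -d --name manual-zookeeper --network pinot-demo \
  zookeeper:3.5.6

# The container name is what other containers resolve, so it must match
# what the cluster config expects (manual-pinot-controller here).
docker run -d --name manual-pinot-controller --network pinot-demo \
  -p 9000:9000 apachepinot/pinot:latest StartController \
  -zkAddress manual-zookeeper:2181

docker run -d --name manual-pinot-broker --network pinot-demo \
  -p 8099:8099 apachepinot/pinot:latest StartBroker \
  -zkAddress manual-zookeeper:2181

docker run -d --name manual-pinot-server --network pinot-demo \
  apachepinot/pinot:latest StartServer \
  -zkAddress manual-zookeeper:2181
```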
@eddyreynoso: @eddyreynoso has joined the channel
@anli: Hi team, we're using Presto with Pinot and would like to support pushdown of functions like `COALESCE` or multi-column `CASE` statements on the Pinot side. This seems reasonable for predicates, since the current pushdown logic covers aggregations and predicates. However, we're looking for some performance improvements by making this a `SELECT` pushdown instead of returning all data to Presto for processing, since we can "aggregate" row-wise for various operators and take advantage of certain indexes (e.g. bloom filters) for `COALESCE`, `CONCAT`, etc. Are there concerns or pointers around this? @xiangfu0
  @xiangfu0: So far, pushdown for different functions may need to be implemented separately.
  @xiangfu0: If the semantics are the same across Presto and Pinot, you can have one general way to push down, e.g. sum/count/min/max;
  @xiangfu0: otherwise it may require an extra override, e.g. count(distinct xx).
  @xiangfu0: For row-level expressions, there is already pushdown for arithmetic; you can follow that to support more transform functions.
  @xiangfu0: You can check PinotPushdownUtils and PinotAggregationProjectConverter for more details
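As a quick way to see where an expression actually executes, the query plan shows whether a projection stays in Presto or gets folded into the Pinot scan; a sketch assuming the standard Presto CLI (server address, catalog name, and table are placeholders):
```
# Compare the plans with and without the expression of interest; if the
# COALESCE survives as a Presto project node above the table scan, it was
# not pushed down to Pinot.
presto --server localhost:8080 --catalog pinot --schema default \
  --execute "EXPLAIN SELECT COALESCE(colA, colB) FROM myTable"
```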
@yzhou86: @yzhou86 has joined the channel

#random


@rootshellz: @rootshellz has joined the channel
@yves.kurz: @yves.kurz has joined the channel
@songxinbai: @songxinbai has joined the channel
@ricky: @ricky has joined the channel
@anantha.sharma4: @anantha.sharma4 has joined the channel
@brad: @brad has joined the channel
@luys8611: @luys8611 has joined the channel
@eddyreynoso: @eddyreynoso has joined the channel
@yzhou86: @yzhou86 has joined the channel

#feat-text-search


@rootshellz: @rootshellz has joined the channel

#feat-presto-connector


@rootshellz: @rootshellz has joined the channel

#troubleshooting


@rootshellz: @rootshellz has joined the channel
@yves.kurz: @yves.kurz has joined the channel
@hjj8645561: Hi guys, I am trying to use the `LastWithTime` function to get the latest data of each group, but unfortunately I got an NPE. The original intention is to get the top N of a group after grouping by an ad hoc column. Wondering if I have a grammar issue in my PQL: ```select hostname, lastWithTime('alertId', 'issued', 'STRING') from uas_nomalized_alert where JSON_EXTRACT_SCALAR(attributes, '$.graphQL-businessService', 'STRING', '') = 'Meeting Centers' group by hostname``` `alertId` is a 'STRING' column, `issued` is a 'TIMESTAMP' column
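One thing worth checking before digging into the NPE: per the documented signature `LASTWITHTIME(dataColumn, timeColumn, 'dataType')`, the first two arguments are column identifiers rather than quoted string literals, so `'alertId'` and `'issued'` may be parsed as literals here. A sketch of the rewritten query sent to the broker (the broker address is a placeholder, and this assumes the running version already ships `LASTWITHTIME`):
```
# Only the dataType stays a string literal; the column args are bare identifiers.
curl -s -X POST http://localhost:8099/query/sql \
  -H 'Content-Type: application/json' \
  -d @- <<'EOF'
{"sql": "select hostname, lastWithTime(alertId, issued, 'STRING') from uas_nomalized_alert where JSON_EXTRACT_SCALAR(attributes, '$.graphQL-businessService', 'STRING', '') = 'Meeting Centers' group by hostname"}
EOF
```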
@fizza.abid: Hi, after setting up authentication, my ingestion stopped.
@fizza.abid: Used this link for enabling authentication
  @xiangfu0: @xiaobing do you know if the auth is carried over for scheduling minion tasks? @fizza.abid can you paste the controller and minion logs here?
  @fizza.abid:
  @fizza.abid: We want to enable with authentication
  @fizza.abid: ```
SegmentGenerationAndPushTask_Task_SegmentGenerationAndPushTask_1652256960190_0 completed at: 1652257150076, results: true. FrameworkTime: 2 ms; HandlerTime: 22 ms.
Caught exception while executing task: Task_SegmentGenerationAndPushTask_1652256960190_0
java.lang.RuntimeException: Failed to execute SegmentGenerationAndPushTask
    at org.apache.pinot.plugin.minion.tasks.segmentgenerationandpush.SegmentGenerationAndPushTaskExecutor.executeTask(SegmentGenerationAndPushTaskExecutor.java:120) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.plugin.minion.tasks.segmentgenerationandpush.SegmentGenerationAndPushTaskExecutor.generateTaskSpec(SegmentGenerationAndPushTaskExecutor.java:269) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.plugin.minion.tasks.segmentgenerationandpush.SegmentGenerationAndPushTaskExecutor.executeTask(SegmentGenerationAndPushTaskExecutor.java:117) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
Task: Task_SegmentGenerationAndPushTask_1652256960190_0 completed in: 59ms
```
  @xiaobing: the auth token should have been carried over. I can dig a bit more into this with the logs.
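For context, the basic-auth guide has each component present a static token when talking to a secured controller; a sketch of the server-side properties from that guide (the token value is just base64 of a placeholder user:password, and the minion would presumably need an equivalent token configured, which may be the gap here):
```
# Appended to conf/pinot-server.conf; "YWRtaW46dmVyeXNlY3JldA" is
# base64("admin:verysecret"), a placeholder credential.
cat >> conf/pinot-server.conf <<'EOF'
pinot.server.segment.fetcher.auth.token=Basic YWRtaW46dmVyeXNlY3JldA
pinot.server.segment.uploader.auth.token=Basic YWRtaW46dmVyeXNlY3JldA
pinot.server.instance.auth.token=Basic YWRtaW46dmVyeXNlY3JldA
EOF
```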
@songxinbai: @songxinbai has joined the channel
@ysuo: Hi team, I'm trying to use the Pinot upsert feature. Part of my table config is like below:
```
{
  "tableName": "upsert_test_local",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "upsert_test_local",
    "timeColumnName": "created_on",
    "timeType": "MILLISECONDS",
    "allowNullTimeValue": true,
    "replicasPerPartition": "1",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "30",
    "segmentPushType": "APPEND",
    "completionConfig": { "completionMode": "DOWNLOAD" }
  },
  "tenants": {},
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "aggregateMetrics": true,
    "nullHandlingEnabled": true,
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.consumer.prop.auto.offset.reset": "largest",
      "realtime.segment.flush.threshold.time": "30m",
      "realtime.segment.flush.threshold.rows": "0",
      "realtime.segment.flush.threshold.segment.size": "100M",
      "realtime.segment.flush.autotune.initialRows": "1000000"
    }
  },
  "ingestionConfig": {
    "filterConfig": {
      "filterFunction": "Groovy({tablename != \"test_table_name\"}, tablename)"
    },
    "transformConfigs": [
      { "columnName": "id", "transformFunction": "Groovy({UUID.randomUUID().toString()}, tablename)" },
      { "columnName": "timestamp", "transformFunction": "jsonPathString(metrics, '$.timestamp')" },
      { "columnName": "created_on", "transformFunction": "Groovy({System.currentTimeMillis()}, tablename)" },
      { "columnName": "updated_on", "transformFunction": "Groovy({System.currentTimeMillis()}, tablename)" }
    ]
  },
  "metadata": { "customConfigs": {} },
  "routing": { "instanceSelectorType": "strictReplicaGroup" },
  "upsertConfig": {
    "mode": "PARTIAL",
    "defaultPartialUpsertStrategy": "OVERWRITE",
    "partialUpsertStrategies": { "created_on": "IGNORE" }
  }
}
```
And part of the schema is like this:
```
"dateTimeFieldSpecs": [
  { "name": "timestamp", "dataType": "LONG", "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS" },
  { "name": "created_on", "dataType": "LONG", "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS" },
  { "name": "updated_on", "dataType": "LONG", "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS" }
],
"primaryKeyColumns": ["timestamp"]
```
At first, upsert works as expected. But after a while, like 30 minutes later, when I query this table there are no records, although totalDocs in the query response stats is not 0. Then I write some data to the same Kafka topic, query the table again, and there are some records, but the value of the created_on field is 0 instead of the current timestamp. Any idea which property is not set right here? Is it the timeColumnName property?
  @kharekartik: Hi, I see you are using the timestamp column as the primaryKey; is that expected? Ideally, for upsert to work, your primaryKey should be unique per record and your input Kafka stream should be partitioned on the primary key.
  @kharekartik: @jackie.jxt
  @ysuo: Hi, I've replicated this error. The records are there (using the skipUpsert option), but they can't be queried.
  @kharekartik: Hi, when you apply `skipUpsert` option in query, do you get both old data + updated data or just old data?
  @ysuo: And when I add another 2 records to the same Kafka topic, they are ingested into Pinot, judging from the info in the following picture. The only problem is that created_on is 0 here; it should be a long-type timestamp.
  @ysuo: When I apply the skipUpsert option, both old data and new data are shown.
  @jackie.jxt: How many servers do you have? Since it returns 2 records with the same timestamp, I suspect the source stream is not partitioned on the primary key
  @ysuo: Hi, only one Kafka topic partition and 6 servers.
  @ysuo: My confusion is why 0 is stored for the created_on field here, since I used System.currentTimeMillis() or now(). And why can't the existing records be queried?
  @ysuo: The following picture may explain this issue more clearly. After a while, no records are found; with the skipUpsert option, the records are actually there.
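For anyone reproducing this, the two views being compared in this thread look like the following against the broker (the address is a placeholder; the `OPTION(...)` suffix is the query-option syntax of this era):
```
# Upsert-merged view (default): only the latest record per primary key.
curl -s -X POST http://localhost:8099/query/sql \
  -H 'Content-Type: application/json' \
  -d '{"sql": "SELECT * FROM upsert_test_local LIMIT 10"}'

# Raw view: every ingested record, bypassing upsert resolution.
curl -s -X POST http://localhost:8099/query/sql \
  -H 'Content-Type: application/json' \
  -d '{"sql": "SELECT * FROM upsert_test_local LIMIT 10 OPTION(skipUpsert=true)"}'
```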
@mariums82: I was trying to connect Pinot with Trino. The connection looks good and I'm getting the pinot catalog, but when I execute a query I get `{"code":403,"error":"Permission is denied for access type 'READ' to the endpoint ''"}`. Any idea how to resolve this?
  @kharekartik: Hi, do you have auth enabled on your cluster?
  @mariums82: yes, and I already added the auth config in the Trino cluster: ```
pinot.controller.authentication.type=PASSWORD
pinot.controller.authentication.user=XXXXX
pinot.controller.authentication.password=XXXXXXX$Z
pinot.broker.authentication.type=PASSWORD
pinot.broker.authentication.user=XXXXXX
pinot.broker.authentication.password=XXXXXX$Z
```
  @kharekartik: And does this user have READ permission on the table? You might need to update the user's permissions if it's missing.
  @mariums82: No, this is the master user.
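For reference, per-principal grants in the basic-auth guide look roughly like this on the broker side; the user, password, and table names below are placeholders, not values from this thread:
```
# A principal without a matching permissions/tables grant gets the 403
# "Permission is denied for access type 'READ'" seen above.
cat >> conf/pinot-broker.conf <<'EOF'
pinot.broker.access.control.principals=admin,trino
pinot.broker.access.control.principals.admin.password=verysecret
pinot.broker.access.control.principals.trino.password=secret
pinot.broker.access.control.principals.trino.tables=myTable
pinot.broker.access.control.principals.trino.permissions=READ
EOF
```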
@ricky: @ricky has joined the channel
@anantha.sharma4: @anantha.sharma4 has joined the channel
@brad: @brad has joined the channel
@facundo.bianco: Hi All, we're running ingestion and the Push Job Spec () is configured like ```
pushJobSpec:
  pushParallelism: 20
  pushAttempts: 2
  segmentUriPrefix: ""
  segmentUriSuffix: ""
``` And got this error message: > 2022/05/11 14:28:50.531 ERROR [BaseTableDataManager] [HelixTaskExecutor-message_handle_thread] Attempts exceeded when downloading segment: foo_OFFLINE_2022-05-03_2022-05-03_11 for table: foo_OFFLINE from: to: /tmp/pinot-tmp/server/index/foo_OFFLINE/tmp/tmp-foo_OFFLINE_2022-05-03_2022-05-03_11-b2f3a97c-9c14-4b4c-9874-fb028597a237/foo_OFFLINE_2022-05-03_2022-05-03_11.tar.gz ava:72) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-f It adds 'null' at the end of the file's URI. Any idea how to resolve this? Thanks in advance.
  @mayanks:
  @mayanks: Seems like it is expecting segmentUriSuffix, @xiangfu0?
  @xiaobing: based on the exception msg, the wrong URI actually comes from SegmentZKMetadata.getDownloadUrl(), so the wrong URI was set there when pushing the segment to Pinot. It looks like if either prefix or suffix is empty, those three parts are simply stitched together (still, I'd assume the suffix should be an empty string instead of null; perhaps a bug somewhere): ```return URI.create(String.format("%s%s%s", prefix, fileURI.getRawPath(), suffix));``` Perhaps give it a quick try and simply leave both prefix and suffix empty, and see if it works as expected. You can confirm by checking the downloadUrl set in the segment metadata in ZK.
  @facundo.bianco: We solved it removing _segmentUriPrefix_ and _segmentUriSuffix_ from pushJobSpec -- thank you!
  @mayanks: @xiaobing Perhaps we need to document this or fix this?
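A quick way to verify the fix is to read the downloadUrl straight out of the segment's ZK metadata via the controller REST API; the host and segment name below are taken from the error above, with the endpoint per the controller's segment API:
```
# The downloadUrl field should be a resolvable URI with no stray "null" suffix.
curl -s "http://localhost:9000/segments/foo_OFFLINE/foo_OFFLINE_2022-05-03_2022-05-03_11/metadata" \
  | grep -i downloadUrl
```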
@luys8611: @luys8611 has joined the channel
@eddyreynoso: @eddyreynoso has joined the channel
@yzhou86: @yzhou86 has joined the channel

#pinot-s3


@rootshellz: @rootshellz has joined the channel

#pinot-dev


@rootshellz: @rootshellz has joined the channel
@songxinbai: @songxinbai has joined the channel
@songxinbai: @walterddr Hi Rong. Sorry to bother you. I'm trying to build Pinot on the 'pr-query-integration' branch, following the instructions from this conversation (). But when I query the broker, I run into some problems. A query like "SELECT * FROM baseballStats_OFFLINE" returns results, but "SELECT playerId FROM baseballStats_OFFLINE" takes almost 10 seconds and returns an empty result. The same happens when I use 'inner join': no exception or error is reported, it just returns an empty result. I started Pinot like this: 'bin/pinot-admin.sh StartServiceManager -bootstrapConfigPaths conf/pinot-controller.conf conf/pinot-broker.conf conf/pinot-server.conf' (I had already started Zookeeper), and all components are on the same machine (1 controller, 1 broker, 1 server). Now I don't know what happened and would like to know how to track down the problem. Could you please give me some advice? Thanks.

#community


@rootshellz: @rootshellz has joined the channel

#announcements


@rootshellz: @rootshellz has joined the channel

#presto-pinot-connector


@anli: @anli has joined the channel

#getting-started


@rootshellz: @rootshellz has joined the channel
@yves.kurz: @yves.kurz has joined the channel
@songxinbai: @songxinbai has joined the channel
@ricky: @ricky has joined the channel
@anantha.sharma4: @anantha.sharma4 has joined the channel
@brad: @brad has joined the channel
@luys8611: @luys8611 has joined the channel
@eddyreynoso: @eddyreynoso has joined the channel
@yzhou86: @yzhou86 has joined the channel

#releases


@rootshellz: @rootshellz has joined the channel

#pinot-docsrus


@atri.sharma: Folks, please help review
  @mayanks: Approved with minor comments
  @atri.sharma: Thanks. Could you please merge the same?
  @mayanks: Done
  @atri.sharma: You rock, thanks!
@amrish.k.lal: Hello, this PR () adds more details to the JSON queries document.

#introductions


@rootshellz: @rootshellz has joined the channel
@yves.kurz: @yves.kurz has joined the channel
@songxinbai: @songxinbai has joined the channel
@ricky: @ricky has joined the channel
@anantha.sharma4: @anantha.sharma4 has joined the channel
@brad: @brad has joined the channel
@luys8611: @luys8611 has joined the channel
@luys8611: Hi, I'm Luy and I want to detect anomalies in my data pipeline.
@luys8611: It would be great if there were some detailed docs on how to set up the whole Pinot & ThirdEye environment for this on my machine.
  @mayanks: Let's ask this in #general and I'll tag the TE folks
@luys8611: This is what I'm following from scratch now.
@mayanks: Hi Luy welcome to the community
@eddyreynoso: @eddyreynoso has joined the channel
@yzhou86: @yzhou86 has joined the channel

#linen_dev


@kam: yeah try `add to Slack`
@kam: it looks like it got fixed