#general
@ankitgupta4894: @ankitgupta4894 has joined the channel
@singitamkr: @singitamkr has joined the channel
@andruszd: @andruszd has joined the channel
@shish: @shish has joined the channel
@richard892: @richard892 has joined the channel
@nishad.sadasivan: @nishad.sadasivan has joined the channel
@debabrata: @debabrata has joined the channel
#random
@ankitgupta4894: @ankitgupta4894 has joined the channel
@singitamkr: @singitamkr has joined the channel
@andruszd: @andruszd has joined the channel
@shish: @shish has joined the channel
@richard892: @richard892 has joined the channel
@nishad.sadasivan: @nishad.sadasivan has joined the channel
@debabrata: @debabrata has joined the channel
#troubleshooting
@ankitgupta4894: @ankitgupta4894 has joined the channel
@singitamkr: @singitamkr has joined the channel
@andruszd: @andruszd has joined the channel
@shish: @shish has joined the channel
@richard892: @richard892 has joined the channel
@david.cyze: I have a realtime table ingesting from Kafka and an application that writes events to the appropriate Kafka topic. My table originally had `realtime.segment.flush.threshold.rows=30`, and I ran my application to the point where I had pushed around 100k rows to the Kafka topic before realizing that this was much too small a segment size. I stopped my app, deleted the table, changed `realtime.segment.flush.threshold.rows=100000`, and recreated it. Then, I ran my app to push 3 million rows to the Kafka topic.
At some point in Pinot's ingestion process, the status of my table changed to `BAD`. I looked in the controller logs and noticed this error: ```2021/09/02 16:02:48.585 ERROR [SegmentCompletionFSM_simplejson__0__632__20210902T1602Z] [grizzly-http-server-21] Caught exception while committing segment metadata for segment: simplejson__0__632__20210902T1602Z java.lang.IllegalStateException: Failed to find IdealState for table: simplejson_REALTIME```
In the web UI for the Pinot Controller, under the Cluster Manager for the affected table, I sorted the `SEGMENTS` list by Status and noticed that I had two `Bad` segments. Inspecting the bad segments, I noticed that each had a total of 30 documents. I checked a handful of `Good` segments, and each had 100k documents. I'm not sure how to bring these segments into a `Good` state, or why they entered a `Bad` state in the first place. I was unable to find anything in Pinot's documentation on what causes this error or how to resolve it.
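As an aside, some rough arithmetic shows why a row threshold of 30 is far too small. This is just a sketch assuming a single Kafka partition and row-based flushing only (the helper name is illustrative, not a Pinot API):

```python
# Rough segment-count arithmetic for the two flush thresholds discussed
# above. Assumes one Kafka partition and row-count-based flushing only.

def segments_needed(total_rows: int, flush_threshold_rows: int) -> int:
    """Number of completed segments a single partition would produce."""
    return total_rows // flush_threshold_rows

# With the original threshold of 30 rows, ~100k rows produce thousands
# of tiny segments (note the sequence number 632 in the error above):
print(segments_needed(100_000, 30))         # 3333

# With the corrected threshold of 100k rows, 3 million rows produce a
# manageable count per partition:
print(segments_needed(3_000_000, 100_000))  # 30
```

Thousands of tiny segments per partition add significant metadata and query-fanout overhead, which is why larger flush thresholds are generally preferred.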
@jackie.jxt: The exception posted here seems to be a transient ZK issue.
@jackie.jxt: I suspect the issue to be that the segments are not properly cleaned up when deleting the table
@jackie.jxt: Can you please check the ideal state of the table and see if the old segments are still there? Did the table deletion succeed?
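One way to check the ideal state is through the controller's REST API. The commands below are a sketch; the controller host/port and table name are placeholders based on this thread, so adjust them for your deployment:

```bash
# Controller address is an assumption; adjust for your deployment.
CONTROLLER=http://localhost:9000

# Ideal state for the table -- segments left over from the deleted
# table would still be listed here:
curl -s "$CONTROLLER/tables/simplejson/idealstate"

# External view, for comparison against the ideal state:
curl -s "$CONTROLLER/tables/simplejson/externalview"
```

The same information is visible in the Zookeeper browser built into the controller UI, as discussed below.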
@david.cyze: How can I do that?
@david.cyze: As far as whether or not the deletion succeeded, I deleted the table through the WebUI and did not see any errors after doing so
@jackie.jxt: If it is okay, can you try deleting the table again so we can check whether the segments are cleaned up properly?
@jackie.jxt: As the source of truth, you can use the Zookeeper browser to check whether the segments have been cleaned up from the cluster
@david.cyze: I deleted the table and `PinotCluster/PropertyStore/Segments` is reported empty from the zookeeper browser
@jackie.jxt: How about the IdealState? It should also be empty
@jackie.jxt: If so, we can go ahead and recreate the table, and it should create segments from sequenceId 0 (e.g. `simplejson__0__0__20210902T...`)
@david.cyze: PinotCluster/IdealStates has two entries:
• brokerResource
• leadControllerResource
I'm not sure what the entries should have inside them. Forgive my ignorance :sweat_smile:
@jackie.jxt: That is normal. The entry for the table `simplejson_REALTIME` is removed
@david.cyze: Yes it is
@david.cyze: I'll recreate
@jackie.jxt: After recreating the table, the idealstate should show up with the initial consuming segments
@david.cyze: That is right. I see that now.
@david.cyze: Should the table begin consuming from the first offset? The ideal state still shows `CONSUMING`, but I would have expected at least one segment to have completed by now
@jackie.jxt: In the table config, did you set the `"stream.kafka.consumer.prop.auto.offset.reset"` to be `"smallest"`?
@david.cyze: Ah ha, I did not. I assume the default is `largest`
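For reference, the offset-reset property lives in the table config's `streamConfigs` block alongside the flush threshold discussed earlier. This is a sketch of just the relevant fragment; the topic name is an assumption based on this thread, and a real config carries many more stream properties:

```json
{
  "streamConfigs": {
    "streamType": "kafka",
    "stream.kafka.topic.name": "simplejson",
    "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
    "realtime.segment.flush.threshold.rows": "100000"
  }
}
```

With `smallest`, a newly created table starts consuming from the earliest available offset in the topic; with the default behavior described above, it only picks up events produced after the table is created.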
@david.cyze: Thank you again for your help. I'll be sure to give all the configuration settings a closer read through.
@nishad.sadasivan: @nishad.sadasivan has joined the channel
@debabrata: @debabrata has joined the channel
@gqian3: Hi team, we are seeing lots of “BrokerResourceMissingError”. Based on the source code, this can happen only when the queried table is not found? Is there any other case that can result in this exception, e.g. the server or broker being under heavy load?
@jackie.jxt: This exception means the broker cannot find the routing table for a given query. In most cases, it is caused by querying a table that does not exist
@jackie.jxt: Can you please check the external view of the `brokerResource` and see if there is any `ERROR` table?
@gqian3: You mean if there is a table with name “ERROR” in the external view?
@jackie.jxt: There should be an entry called `brokerResource` in the external view
@jackie.jxt: You need to use zookeeper browser to find it
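If you have direct access to Zookeeper, the same entry can be read with the ZooKeeper CLI. This is a sketch: the Zookeeper host is a placeholder, and the path assumes the Helix cluster name (`PinotCluster`) seen earlier in this log:

```bash
# Read the brokerResource external view from Zookeeper directly.
# Host and cluster name are assumptions; adjust for your deployment.
zkCli.sh -server localhost:2181 get /PinotCluster/EXTERNALVIEW/brokerResource
```

A table whose brokers are in `ERROR` state in this map would explain routing failures for that table's queries.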
#getting-started
@vibhor.jain: @vibhor.jain has joined the channel
@david.cyze: @david.cyze has joined the channel
@tiger: Is Pinot able to efficiently run queries that use REGEXP_LIKE? I'm not sure if there is any indexing or pre-aggregation that would make that fast.
@g.kishore: Text index
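For context, a text index is declared per column in the table config's `fieldConfigList`. The fragment below is a sketch with a hypothetical column name (`description`), not taken from this thread:

```json
{
  "fieldConfigList": [
    {
      "name": "description",
      "encodingType": "RAW",
      "indexTypes": ["TEXT"]
    }
  ]
}
```

Queries then use `TEXT_MATCH` to hit the index, e.g. `SELECT * FROM mytable WHERE TEXT_MATCH(description, 'error')`, rather than scanning with `REGEXP_LIKE`.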