#general


@dunithd: Hello folks, I’m looking for any articles, blog posts, or conference talks on using Pinot for real-time personalization. For example, how to use Pinot to do real-time content recommendations, product recommendations, etc. If you come across anything like that, please post them here :)
  @mayanks: This one touches upon LinkedIn feed use case
  @mayanks:
  @dunithd: Thanks for the resources! :+1:
@gqian3: Hi, does Pinot support queries like these: “select count(case when boolean_field then id else null end) as cnt” and “select * from table where nvl(field1, field2) < 100”?
  @jackie.jxt: For the first one, Pinot does not support `null` as a value (it supports a special null index), so you can do `select sum(case when boolean_field then 1 else 0 end) as cnt` instead. And I think you can directly do `select sum(boolean_field) as cnt` because the boolean type is stored as 1 and 0 internally.
  @jackie.jxt: For the second one, because Pinot does not support `null` as a value, you should be able to use a `CASE` statement to compare the value with the column's default value.
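A minimal sketch of the two rewrites suggested above, run against an in-memory SQLite table as a stand-in (this is not Pinot, and Pinot's SQL dialect may differ). The table names `t` and `t2`, the sample rows, and the sentinel `-1` used as the "default value" for a missing `field1` are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for illustration (not Pinot).
conn = sqlite3.connect(":memory:")

# --- First question: counting rows where a boolean flag is set ---
conn.execute("CREATE TABLE t (id INTEGER, boolean_field INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 1), (4, 0), (5, 1)],
)

# Original form: relies on COUNT skipping NULLs.
count_form = conn.execute(
    "SELECT COUNT(CASE WHEN boolean_field THEN id ELSE NULL END) FROM t"
).fetchone()[0]

# Suggested rewrite: no NULL literal needed.
sum_case_form = conn.execute(
    "SELECT SUM(CASE WHEN boolean_field THEN 1 ELSE 0 END) FROM t"
).fetchone()[0]

# Shortcut: valid when the flag is stored as 1 and 0.
sum_form = conn.execute("SELECT SUM(boolean_field) FROM t").fetchone()[0]

# --- Second question: nvl(field1, field2) < 100 without NULLs ---
DEFAULT_VALUE = -1  # hypothetical sentinel standing in for the column's default value
conn.execute("CREATE TABLE t2 (field1 INTEGER, field2 INTEGER)")
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?)",
    [(50, 500), (DEFAULT_VALUE, 70), (DEFAULT_VALUE, 300), (200, 10)],
)

# Fall back to field2 whenever field1 holds the sentinel.
fallback_rows = conn.execute(
    "SELECT field1, field2 FROM t2 "
    "WHERE (CASE WHEN field1 = ? THEN field2 ELSE field1 END) < 100 "
    "ORDER BY field2",
    (DEFAULT_VALUE,),
).fetchall()

print(count_form, sum_case_form, sum_form)
print(fallback_rows)
```

The three counting forms agree only when `id` itself is never missing and the flag is strictly 1 or 0. The `CASE` fallback depends on knowing which default value the column actually uses.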
@bowenwan: Hi, I was trying to use Spark to do batch ingestion. From this doc, it seems Pinot supports Spark 2.X, at least for Pinot 0.4. There seems to be a dependency issue when I use Spark 2.2.3 as in the tutorial, but I was able to use Spark 2.4.8 to do the ingestion. Since the latest version of Spark is 3.X and Pinot is already at 0.8, I'm wondering what the currently recommended compatible Spark version is?
  @mayanks: @kulbir.nijjer ^^
  @kulbir.nijjer: @bowenwan I would recommend staying with Spark 2.x for now, which has been well tested with the latest Pinot versions. There are some known issues w.r.t. Spark 3 that the team is still working on. Once those are addressed, we will update the docs to indicate Spark 3.x support.
  @bowenwan: @kulbir.nijjer Thanks for clarification !

#random


@zainamro1: @zainamro1 has joined the channel
@yuchaoran2011: @yuchaoran2011 has joined the channel
@cechovsky.jozef: @cechovsky.jozef has joined the channel
@bvarunpy: @bvarunpy has joined the channel
@rupesh_raghavan: @rupesh_raghavan has joined the channel

#troubleshooting


@valentin: Hello, I have an OFFLINE table with a configured retention: ```"segmentsConfig": { "replication": "1", "timeColumnName": "timestamp", "retentionTimeUnit": "DAYS", "retentionTimeValue": "90", "timeType": "MILLISECONDS", "schemaName": "schema_607ec74cd839000300105f34" }``` But my segments aren’t deleted, like this one: ```{ "segment.offline.download.url": "", "segment.start.time": "1618921448331", "segment.time.unit": "MILLISECONDS", "segment.end.time": "1618947468383", "segment.total.docs": "390000", "segment.table.name": "datasource_607ec74cd839000300105f34", "segment.creation.time": "1619049989622", "segment.name": "datasource_607ec74cd839000300105f34_1618921448331_1618947468383_1", "segment.index.version": "v3", "segment.offline.refresh.time": "-9223372036854775808", "segment.type": "OFFLINE", "segment.offline.push.time": "1619050008800", "segment.crc": "1089143347" }``` I see the following message in the logs: ```2021/09/17 03:45:12.752 INFO [RetentionManager] [pool-9-thread-6] Start managing retention for table: datasource_607ec74cd839000300105f34_OFFLINE``` Apart from this one, no other log message mentions the RetentionManager. Do you know why, or how I can debug this? Thank you
  @mayanks: Are no segments getting deleted or just a few? If none are getting deleted, perhaps a config issue.
  @mayanks: Also, does the controller have the correct permissions to delete from S3 buckets?
  @mayanks: And if the segment deletion is successful, you should see something like `Deleting {} segments from table: {}` in the log
  @valentin: No segments are getting deleted from what I see, and I don’t see any other logs about the retention manager.
  @mayanks: That seems to indicate that retention manager does not think the segments should be deleted, which points to some misconfiguration (time unit). But from your configs above, I don't see an issue
  @valentin: yep I thought the same, and in my online tables, segments are getting deleted ```"segmentsConfig": { "replication": "1", "timeType": "MILLISECONDS", "timeColumnName": "timestamp", "retentionTimeUnit": "DAYS", "retentionTimeValue": "3", "segmentPushFrequency": "HOURLY", "segmentPushType": "APPEND", "replicasPerPartition": "1", "completionConfig": { "completionMode": "DOWNLOAD" }, "schemaName": "schema_607ec7511360000300516e44" },```
  @valentin: Maybe it’s because it’s a hybrid table?
  @valentin: In my offline table I don’t have the ```"segmentPushType": "APPEND",```
  @valentin: But this is the default value isn’t it?
  @mayanks: No, hybrid table should still have retention
  @mayanks: Is the retention config from the offline table or the real-time one?
  @valentin: 1st config from the offline
  @valentin: 2nd one from the online (which works)
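A quick arithmetic check of the numbers posted above, as a sketch only. It assumes retention is evaluated against the segment's end time and that the log timestamp is in UTC; neither is confirmed in this thread. The helper name `is_past_retention` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_past_retention(segment_end_ms, retention_days, now):
    """Return True if the segment's end time is older than the retention window.

    Assumption: retention is measured from the segment end time.
    """
    end = datetime.fromtimestamp(segment_end_ms / 1000, tz=timezone.utc)
    return now - end > timedelta(days=retention_days)


# Values copied from the segment metadata and table config in the thread.
segment_end_ms = 1618947468383
retention_days = 90

# Timestamp of the RetentionManager log line, assumed to be UTC.
log_time = datetime(2021, 9, 17, 3, 45, 12, tzinfo=timezone.utc)

segment_end = datetime.fromtimestamp(segment_end_ms / 1000, tz=timezone.utc)
age = log_time - segment_end
expired = is_past_retention(segment_end_ms, retention_days, log_time)

print("segment end:", segment_end.isoformat())
print("age in days:", age.days)
print("past retention:", expired)
```

Under these assumptions the segment ended on 2021-04-20 and was roughly 149 days old when the retention run was logged, well beyond the 90-day window. That agrees with the observation that the config values themselves look fine.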
@prashant.pandey: Hi Pinot experts, we had a sudden ingestion lag (> 45m) in one of our tables today without any increase in incoming throughput. This happened suddenly to just that one table (the rest of the tables had very little lag in seconds). We had to scale up our Kafka partitions for it to catch up. What might be the possible reason behind this sudden lag? @tanmay.movva
  @mayanks: Seems like the issue was on the Kafka side then?
  @tanmay.movva: Metrics related to kafka were normal. And all the tables are consuming from the same cluster. We saw this issue only on one of the tables.
  @tanmay.movva: Any pointers on how we can confirm/dismiss if this was a kafka related issue?
  @tanmay.movva: We also didn’t see anything abnormal in Pinot’s metrics, but the ingestion slowed down. We increased partitions and scaled up the Pinot realtime servers to mitigate the lag, and it has helped.
  @mayanks: What did you change on the Pinot side?
  @tanmay.movva: Nothing has changed on Pinot or Kafka for at least 2 weeks.
  @mayanks: Oh, you did increase Pinot servers
  @tanmay.movva: Yes, we did after we started observing lag in the Pinot table.
  @mayanks: Did the ingestion on the Pinot side slow down, or were there periods of stopped consumption for some partitions?
  @mayanks: Also did read qps increase for those servers
  @tanmay.movva: No. QPS was very low.
  @tanmay.movva: CPU utilisation for those servers was also normal. (60-70% like usual).
  @tanmay.movva: This is the ingestion rate graph for the table which had lag. The spikes at the right end are when we restarted the servers after increasing kafka topic partitions and scaling realtime servers from 2 to 4.
  @mayanks: Were a lot of segments being generated? Also, was there GC?
@luisfernandez: does anyone have an example of making the recommendation engine work with realtime tables for segmentSizeRecommendations? I keep trying to run it but I keep getting: ```"segmentSizeRecommendations": { "numRowsPerSegment": 0, "numSegments": 0, "segmentSize": 0, "message": "Segment sizing for realtime-only tables is done via Realtime Provisioning Rule" },```
  @luisfernandez: is there something i’m missing in the body of the request?
@will.gan: Hi, does anyone know why the `/version` REST API endpoint would return an empty object? The other endpoints work for me.
  @mayanks: Hmm that should not be the case. What version of Pinot are you running?
  @will.gan: @mayanks 0.8.0
  @will.gan: I recently upgraded it from 0.7
  @mayanks: Seems like it might be a new issue. Do you mind filing a GitHub issue?
  @will.gan: ok done!
  @mayanks: Thanks
@chxing: Hi all, can Superset support auth for Pinot? I can’t find any auth configurations in Superset. Thanks

#announcements


@anu110195: @anu110195 has joined the channel

#getting-started


@sanipindi: @sanipindi has joined the channel

#metrics-plugin-impl


@ashish: @ashish has joined the channel