#general


@surajkmth29: Hi Team, I was looking at the ID_SET and IN_SUBQUERY provisions in Pinot for handling subqueries, referring to the video below: Here I have a few questions: 1. Is ID_SET only supported for integer values? 2. Is there support for alphanumeric strings? Any pointers would be helpful.
  @g.kishore: @jackie.jxt ^^
  @jackie.jxt: @surajkmth29 ID_SET supports all data types. For non-integer types (types other than INT and LONG), it stores the values in a bloom filter
  @jackie.jxt: The `expectedInsertions` and `fpp` are configurable for the bloom filter to tune the accuracy. You may read more here:
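
To illustrate how `expectedInsertions` and `fpp` trade accuracy for memory, here is a minimal sketch using the textbook bloom filter sizing formulas. It assumes a standard bloom filter; the function name `bloom_filter_size` is hypothetical, and Pinot's internal sizing may differ.

```python
import math


def bloom_filter_size(expected_insertions, fpp):
    """Return (num_bits, num_hashes) for a textbook bloom filter.

    Assumption: standard sizing formulas, not Pinot's actual code.
    """
    if expected_insertions <= 0 or not (0.0 < fpp < 1.0):
        raise ValueError("expected_insertions must be > 0 and 0 < fpp < 1")
    # Optimal bit count: m = -n * ln(p) / (ln 2)^2
    num_bits = math.ceil(-expected_insertions * math.log(fpp) / (math.log(2) ** 2))
    # Optimal hash count: k = (m / n) * ln 2, at least one
    num_hashes = max(1, round(num_bits / expected_insertions * math.log(2)))
    return num_bits, num_hashes


# Illustrative values only: one million ids at two false-positive targets.
bits_3pct, hashes_3pct = bloom_filter_size(1_000_000, 0.03)
bits_1pct, hashes_1pct = bloom_filter_size(1_000_000, 0.01)
```

A lower `fpp` costs more bits per value, so a tighter false-positive rate makes the serialized set larger.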
@msoni6226: Hi Team, Is there any document available where I can get the definition of counters/metrics exposed from Pinot for Prometheus?
  @adireddijagadesh: @msoni6226 Please refer to this document:
@vibhor.jain: Hi Team, As part of handling duplicates in our hybrid table, we thought of using "mergeType": "dedup" for moving data from the realtime to the offline table. The problem we are facing is that one of our columns stores an encrypted value, and even for duplicate rows this value changes every time. Is there a way to perform "dedup" on a subset of columns when moving data to the offline table via minion?
  @mayanks: Won’t that cause data loss due to incorrect dedup?
  @vibhor.jain: Hi @mayanks, by a subset of columns I mean using only the primary key columns. Currently, the "mergeType": "dedup" config compares the entire row. Is there any option to restrict it to the primary key columns somehow?
  @mayanks: There isn't one right now, afaik. But I am still unclear. Let's say you have two rows with same primary key values, but different on other dimensions, which ones do you expect the dedup to drop?
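
The difference between the two behaviours discussed above can be sketched as follows. This is an illustration in plain Python, not Pinot's implementation; the function names, column names, and the "keep the first row seen" policy are all assumptions made for the example.

```python
def dedup_full_row(rows):
    """Drop a row only if every column matches an earlier row."""
    seen = set()
    kept = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            kept.append(row)
    return kept


def dedup_by_key(rows, key_columns):
    """Drop rows whose key columns match an earlier row.

    Keeps the first row seen per key (an arbitrary, hypothetical policy).
    """
    kept = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        kept.setdefault(key, row)
    return list(kept.values())


# Hypothetical data: "token" stands in for the encrypted column that
# changes on every write, even for logically duplicate rows.
rows = [
    {"id": 1, "ts": 100, "token": "a9f3"},
    {"id": 1, "ts": 100, "token": "77c1"},
    {"id": 2, "ts": 200, "token": "b002"},
]

full_row_result = dedup_full_row(rows)            # all three rows survive
by_key_result = dedup_by_key(rows, ["id", "ts"])  # one row per (id, ts)
```

Key-based dedup has to pick which `token` survives, which is exactly the open question raised in the thread.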
@valentin.richer: @valentin.richer has joined the channel
@kchavda: Hi All, Any advice/suggestions on how to handle null values in a date column where valid values can be the same as the default `1970-01-01` in Pinot (e.g. date of birth)? In my real-time table schema I have the date defined as below under dateTimeFieldSpecs: ```{ "name": "date_of_birth", "dataType": "TIMESTAMP", "format": "1:DAYS:TIMESTAMP", "granularity": "1:DAYS" }```
  @mayanks: `date_of_birth` is not a time column, right, but a regular dimension?
  @kchavda: Right. I have a created_at date which I'm using as the primary time column in the table segment config.
  @kchavda: I'm formatting the field to show as date when querying the data.

#random


@valentin.richer: @valentin.richer has joined the channel

#troubleshooting


@valentin.richer: @valentin.richer has joined the channel
@vibhor.jain: Hi Team, As part of handling duplicates in our hybrid table, we thought of using "mergeType": "dedup" for moving data from the realtime to the offline table. The problem we are facing is that one of our columns stores an encrypted value, and even for duplicate rows this value changes every time. Since "dedup" works on the entire row, it's not removing the duplicates. Is there a way to perform "dedup" on a subset of columns when moving data to the offline table via minion?
  @jackie.jxt: @vibhor.jain Currently that is not supported. If the value keeps changing, we won't know which value to keep during the `dedup`. Is it possible to model the use case as `rollup`, where we can merge different values into one?
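
As a rough illustration of the `rollup` idea suggested above, the sketch below groups rows by key and merges the remaining columns with an explicit reducer per column. It is plain Python, not Pinot's rollup implementation; the function name, the column names, and the choice of `max` as the reducer are assumptions made for the example.

```python
def rollup(rows, key_columns, reducers):
    """Merge rows sharing the same key into one row.

    reducers maps a column name to a two-argument function that
    combines the existing value with the incoming one.
    """
    groups = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in groups:
            groups[key] = dict(row)  # copy so the input is not mutated
            continue
        merged = groups[key]
        for col, reduce_fn in reducers.items():
            merged[col] = reduce_fn(merged[col], row[col])
    return list(groups.values())


# Hypothetical data: "token" is the changing encrypted column.
events = [
    {"id": 1, "clicks": 3, "token": "a9f3"},
    {"id": 1, "clicks": 3, "token": "77c1"},
    {"id": 2, "clicks": 5, "token": "b002"},
]

# max is an arbitrary but deterministic rule for picking one value.
rolled_up = rollup(events, ["id"], {"clicks": max, "token": max})
```

Unlike dedup, the merge rule here is stated explicitly, so which value survives is no longer ambiguous.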

#thirdeye-pinot


@hardik.chheda: @hardik.chheda has joined the channel