#general
@linux.hust: @linux.hust has joined the channel
@msoni6226: Hi Team, I have some questions regarding the quota-related configuration of a dimension table: We are running a Pinot cluster with 4 servers and have created a dimension table with the storage quota configured to be "200 MB". We are populating this dimension table with segments of 100k records each, and the uncompressed size of such a segment is around ~23 MB (~232 bytes per record). So as per that calculation, we were expecting the table to be able to hold around ~900k records. However, we see only 200k records in the table, and when we push the 3rd segment of 100k records, we get a 403 error. Can someone please help us understand why we are getting the 403 error and are not able to push any more segments of 100k records?
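For context, a minimal sketch of how such a storage quota is typically declared in the table config (the table name here is a placeholder, and the exact fields should be verified against the docs for your Pinot version):
```
{
  "tableName": "myDimTable",
  "tableType": "OFFLINE",
  "isDimTable": true,
  "quota": {
    "storage": "200M"
  }
}
```
With ~23 MB per 100k-record segment, 200 MB would naively fit ~8 segments (~800-900k records). One possible explanation for hitting the quota after only 2 segments, worth verifying: dimension tables are replicated to every server, so the quota check may count each segment once per replica (4 × ~23 MB ≈ 92 MB per push on a 4-server cluster), which would trip the 200 MB limit on the third push.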
@mark.needham: Which command are you running that returns the 403 error?
@mayanks: Probably the offline segment push.
@mark.needham: ah
@msoni6226: We have defined a batch ingestion job which pushes a CSV of 100k records to the offline table. We are getting a 403 error with a message saying the total size exceeds the quota size.
@mayanks: Can you check the dim table size using the rest api? And also the uncompressed segment size you are trying to push?
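A hedged sketch of those two checks (controller host, port, table name, and paths are placeholders; the controller exposes a table size endpoint, though the exact response fields vary by version):
```
# Reported table size that Pinot counts against the quota
curl -s "http://<controller-host>:9000/tables/myDimTable/size?detailed=true"

# Uncompressed size of the segment being pushed (untarred segment directory)
du -sh /path/to/generated/segment/dir
```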
@karinwolok1: :wave: A big, warm, welcome to all our new Apache Pinot community members! :wave: We'd love to know who you are and what brought you here! :smiley: @shubhendu.goswami @linux.hust @prashanth.rv @sanketjoshi4 @matteobovetti @arun.rajan.work @hljpzz1982 @kanishka @pachirao7 @prashanth.rv @xiaoman @manish.sharma1 @praneelgavali @momento.corto @diana.arnos @nicholas.yu2 @chris.jayakumar @ayush.jha @derek.p.moore @joseph.kolko @angelina.teneva @ashok.rex.2009 @troy @sam
@karinwolok1: :star: :loud_sound: *If you're interested in being a conference speaker this year, here's your chance! Submit your Apache Pinot use case to an open call-for-papers.* :loud_sound: :star: These conferences are currently accepting talk submissions! :partying_face: (PS. very often, these conferences cover the cost of travel as well!)
> Kafka Summit London
@karinwolok1: Hey all! :star: :star: :star: *We're hosting a community town hall to overview what's happened in 2021 with Apache Pinot features and capabilities, PLUS an open discussion of what's ahead in the Apache Pinot future roadmap!!* :star: :star: :star: We'd love for you to join!
@pavel.stejskal650: @pavel.stejskal650 has joined the channel
@flora.fong: @flora.fong has joined the channel
#random
@linux.hust: @linux.hust has joined the channel
@pavel.stejskal650: @pavel.stejskal650 has joined the channel
@flora.fong: @flora.fong has joined the channel
#troubleshooting
@linux.hust: @linux.hust has joined the channel
@mapshen: We run Pinot 0.8.0. When ingesting a table in `FULL` `upsert` mode, we notice that the number of rows returned for the same query varies from run to run, but it is supposed to remain consistent. For example, there are 1000 unique values keyed on column `A`, which we use as the primary key for the Pinot table `table1`. A query like `select count(1) from table1` can return values such as 2000 or 789, in addition to 1000. In the case of 2000, you can find duplicated rows with different timestamps, such as
```
| A | currenttime |
| - | ----------- |
| a | 1:00:00     |
| a | 1:00:01     |
| b | 1:00:00     |
| b | 1:00:03     |
...
```
In the case of 789, many rows are simply missing… We suspect this is related to the process of updating the index for the upserted table. Has anyone seen this before?
@mayanks: @yupeng ^^
@mayanks: I suspect the 789 might be because of partial result or some other reason not necessarily related to upsert.
@yupeng: can you check if the ingested topic is partitioned correctly?
@mapshen: the topic only has 1 partition and 1 replica per partition
@mapshen: 789 is one example. Sometimes it could be 999 or 998, which i doubt is due to a partial result
@mapshen: Could it be a regression in 0.8.0 because of the partial update support?
@mapshen: Rolled it back to 0.7.1 in dev; the numbers returned are a bit more consistent (?): mostly 1000, seldom 999 or 998, and rarely 95x. We don't see extremes like 2000 or 789 yet.
@mapshen: It seems the index updates are done in batches, and the primary keys are first removed and then added, which leads to inconsistent results? This can have serious consequences if we cannot rely on the data snapshot in Pinot…
@mapshen: @yupeng Any insights on this?
@yupeng: if there are duplicates, that means the upsert is not configured right
@yupeng: select the virtual column `$hostname` and see which hosts return the duplicates
@mapshen: @yupeng we only have 1 host for this testing in dev
@yupeng: and `$segmentName` to see which segments
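A hedged sketch of the kind of query being suggested, assuming the standard virtual column names `$segmentName` and `$hostName` (exact casing may differ by version), to see where the duplicated keys live:
```
SELECT A, $segmentName, $hostName, COUNT(*)
FROM table1
GROUP BY A, $segmentName, $hostName
ORDER BY COUNT(*) DESC
LIMIT 20
```
Any key with a count greater than 1 in the result, or appearing under more than one segment, is one of the duplicates being returned.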
@mapshen: @yupeng they are from different segments
@yupeng: it'll be helpful to see your table config and schema
@yupeng: cc @jackie.jxt
@mapshen: @yupeng here is the minimal tableconfig:
```
{
  "tableName": "table1",
  "schema": {
    "metricFieldSpecs": [
      { "name": "B", "dataType": "DOUBLE" }
    ],
    "dimensionFieldSpecs": [
      { "name": "A", "dataType": "STRING" }
    ],
    "dateTimeFieldSpecs": [
      { "name": "EPOCH", "dataType": "INT", "format": "1:SECONDS:EPOCH", "granularity": "1:SECONDS" }
    ],
    "primaryKeyColumns": [ "A" ],
    "schemaName": "schema1"
  },
  "realtime": {
    "tableName": "table1",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "schemaName": "schema1",
      "timeColumnName": "EPOCH",
      "replicasPerPartition": "1",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "4",
      "segmentPushType": "APPEND",
      "completionConfig": { "completionMode": "DOWNLOAD" }
    },
    "tableIndexConfig": {
      "invertedIndexColumns": [ "A" ],
      "loadMode": "MMAP",
      "nullHandlingEnabled": false,
      "streamConfigs": {
        "realtime.segment.flush.threshold.rows": "10000",
        "realtime.segment.flush.threshold.time": "96h",
        "streamType": "kafka",
        "stream.kafka.consumer.type": "lowLevel",
        "stream.kafka.topic.name": "topic1",
        "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
        "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
        "stream.kafka.broker.list": "kafka:9092",
        "stream.kafka.consumer.prop.auto.offset.reset": "largest"
      }
    },
    "tenants": {},
    "metadata": {},
    "routing": {
      "instanceSelectorType": "strictReplicaGroup"
    },
    "upsertConfig": {
      "mode": "FULL"
    }
  }
}
```
@mapshen: Also, how do you explain the case where the returned rows are fewer than 1000? @yupeng
@mapshen: Actually, it seems that you only get duplicates (>1000 rows) when querying via Trino (when a segment build is triggered?). When querying Pinot directly, you only get fewer than 1000 rows.
@mapshen: @elon.azoulay thoughts regarding the duplicates returned when querying via Trino?
@elon.azoulay: Do you get correct results if you do a "passthrough" query? i.e. ```select * from pinot.default."select count(*) from <table>"```
@mapshen: The message feed has stopped publishing, so I will have to test tomorrow. Just wanted to point out that it's not limited to `count()` - a regular `select * from table1` query without pushdowns can return both more than and fewer than 1000 rows.
@pavel.stejskal650: @pavel.stejskal650 has joined the channel
@flora.fong: @flora.fong has joined the channel
#getting-started
@pavel.stejskal650: @pavel.stejskal650 has joined the channel
@pavel.stejskal650: Hello! I've got a question related to a simple use case. Currently we have a Hadoop cluster for netflow ingestion, ~320 TB of data. Ingestion is from Kafka via a Spark app directly to Hive (external table, simple Parquet files). Searching the stored data is done via Spark. The table is partitioned by hour, but we're still missing indexes. I'd like to replace the current flow with Apache Pinot, but I'm not sure about the segment store. We need to keep HDFS as the data backend, and from the documentation it seems like Pinot needs to store data locally. We're targeting a hybrid table, e.g. keep 1 hour of data from the real-time Kafka topics and have older data pulled from HDFS. My questions are:
a) The real-time part of the data needs local disks - every Pinot server holds a part of the data from Kafka (consumer in a group), right?
b) Hour+1 data are stored "optimized" and indexed locally, then pushed to HDFS?
c) When I query data, is current data pulled from local segments and older data pulled lazily from HDFS/S3?
d) Is it possible to host a 200 TB table with ~12 columns (half numbers, half strings) on ~6 Pinot servers and get some benefit from indexes, i.e. be more efficient than Spark with partition pruning?
@mayanks: Hi @pavel.stejskal650, welcome to the community:
```
a) As of now, Pinot serving nodes store a local copy on the attached disk (both realtime as well as offline). The persistent storage can be in HDFS/S3 or similar such deepstore. For realtime, each Pinot server is assigned a sub-set of partitions from the topic to consume and store.
b) RT nodes periodically flush the in-memory index to persistent store (HDFS in your case). But note that it will need to maintain a copy in the local disk as well, for serving.
c) No, all data currently is local to the serving nodes.
d) 200TB size is in what format? As I mentioned, serving nodes need local storage to serve the data from.
```
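For the HDFS part, a rough sketch of the controller-side settings based on the Pinot docs' HDFS deep-store setup (hostnames and paths are placeholders, and the exact property names should be verified against the docs for your version):
```
# controller.conf (sketch)
controller.data.dir=hdfs://namenode:8020/pinot/segments
controller.local.temp.dir=/tmp/pinot/controller
pinot.controller.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS
pinot.controller.storage.factory.hdfs.hadoop.conf.path=/path/to/hadoop/conf
pinot.controller.segment.fetcher.protocols=file,http,hdfs
pinot.controller.segment.fetcher.hdfs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```
Servers take analogous `pinot.server.storage.factory.*` / `pinot.server.segment.fetcher.*` properties. As noted above, HDFS here only holds the durable copy; segments are still served from local disk.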
@mayanks: If your concern is on cost of local storage, then you can also explore tiered storage:
@pavel.stejskal650: Thank you for the clarification. The 200 TB is compressed Parquet. So the deep store is strictly for backup/replication purposes and cannot be used for any kind of lazy loading of data for querying, right? Even if I'm OK with the latency. And in the case where I've got a server with 12 disks in JBOD and 12 dirs/mounts, is the Pinot server able to split segments evenly across all drives?
@mayanks: I see, so the Pinot segments might be around the same size. As of now, the deep store is strictly for backup purposes. And currently, the dataDir in the Pinot server is a single directory.