#general
@rajathlikeslemons: @rajathlikeslemons has joined the channel
@kmvb.tau: Hello, for real-time tables, is there a requirement that one Kafka topic holds data for a single table only? In our case, data for multiple tables is produced into a single topic, i.e., we have a group of topics in Kafka and each topic serves a set of tables based on the use case. Is it possible to consume events for multiple tables from a single topic in Pinot?
@g.kishore: Each table in Pinot is independent of the others.. you can set up multiple tables to consume from the same Kafka topic.. You can also use the filter config to filter out rows that don't belong to that table
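(For reference, the filter goes in the table config's `ingestionConfig`. A minimal sketch, assuming a hypothetical `eventType` column that tags which logical table each row belongs to:
```
"ingestionConfig": {
  "filterConfig": {
    "filterFunction": "Groovy({eventType != \"orders\"}, eventType)"
  }
}
```
Rows for which the function evaluates to true are dropped during ingestion, so this table would keep only the `orders` events.)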
@ssubrama: Watch where you are headed though. It is easy to throw all the data into one topic and create multiple Pinot tables, but each table will consume all the data and discard the rows it doesn't need. This can generate a lot of garbage and get in the way of high-throughput use cases.
@kmvb.tau: @ssubrama we have a group of topics. 1. Since we have both user-level and table-level sharding, maintaining a separate topic for every table is not scalable for us. 2. Configuring the same topic for multiple tables would add unnecessary I/O load on the Kafka servers.
@jmeyer: Hello :wave: Has anyone got experience with Pinot on ADLS (Gen 2)? Specifically: • Any idea of the minimum IOPS for running Pinot smoothly under lowish load? (i.e. is a standard Storage account "enough"? If so, how "far" can we push it?) • Is it recommended to create a dedicated PVC for `controller.local.temp.dir`?
@fx19880617: • Pinot uses ADLS as the deep store (for backup), so it's not on your query path; all segments are copied to the Pinot servers' local disks. • `controller.local.temp.dir` is mostly used as a temporary data store for segments uploaded to the controller; a dedicated PVC is not required if you use ADLS as the deep store, so keeping it on local temp should be ok. • Pinot servers use a PVC to serve data, so please ensure you give them enough disk space, and SSDs for query performance. On Azure, by default we use a standard AzureDisk as the PVC. You can also try AzureFile if you don't care much about perf.
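(For anyone setting this up, a rough sketch of the controller-side settings for ADLS Gen2 as the deep store; the account, filesystem, and path names are placeholders, and the exact keys should be verified against the Pinot docs for your version:
```
controller.data.dir=adl2://path/in/filesystem
pinot.controller.storage.factory.class.adl2=org.apache.pinot.plugin.filesystem.ADLSGen2PinotFS
pinot.controller.storage.factory.adl2.accountName=myaccount
pinot.controller.storage.factory.adl2.accessKey=<access-key>
pinot.controller.storage.factory.adl2.fileSystemName=myfilesystem
pinot.controller.segment.fetcher.protocols=file,http,adl2
pinot.controller.segment.fetcher.adl2.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```
The servers need the matching `pinot.server.storage.factory.*` and segment fetcher settings so they can download segments from the deep store.)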
@jmeyer: Thanks @fx19880617
@jmeyer: Does a Standard SSD with 500 IOPS seem enough for lowish loads, or is it not a good idea?
@fx19880617: depends on your use cases, standard SSD is good for typical use cases.
@jmeyer: We'll start with that, thanks again
@kmvb.tau: A few doubts regarding streaming data: 1. Pinot supports data ingestion via a streaming (Kafka) or batch (Hadoop) process. Is there any direct API available for pushing data into Pinot? 2. Does Pinot have a segment compaction process like HBase compaction? Won't creating a lot of small segments affect query performance?
@mayanks: The offline push is via HTTP POST. There isn't a write API right now, if that's what you are asking.
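(For context, the offline push is a multipart POST of a built segment tarball to the controller. A hedged one-liner; the host, port, and file name are illustrative, so check the endpoint against your controller's Swagger UI:
```
curl -X POST -F segment=@myTable_segment_0.tar.gz "http://localhost:9000/v2/segments"
```
)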
@mayanks: Segment compaction is in progress and will be available shortly
@kmvb.tau: Based on my understanding, Pinot's streaming processor can pull data from Kafka, Kinesis, etc., but Pinot doesn't have a REST layer that accepts data requests (POST) directly. Is there any plan to support a write REST API in the future?
@mayanks: We have discussed it, but there's no concrete timeline right now. May I ask what's your use case that would need the write API?
@pedro.cls93: Hello, Pinot docs related to deep-storage in K8s seem to be broken:
@mayanks:
@mayanks:
@pedro.cls93: The last link is for file import, is that relevant?
@pedro.cls93: I've configured the controller & server to connect to ADLS. I get the following exception: ```2021/05/03 16:20:48.309 ERROR [StartServiceManagerCommand] [Start a Pinot [SERVER]] Failed to start a Pinot [SERVER] at 1.844 since launch
java.lang.RuntimeException: com.azure.storage.file.datalake.models.DataLakeStorageException: Status code 400, "<?xml version="1.0" encoding="utf-8"?>
<Error><Code>OutOfRangeInput</Code><Message>One of the request inputs is out of range. RequestId:45faebbd-b01e-0064-2438-40f291000000 Time:2021-05-03T16:20:48.1516232Z</Message></Error>"
	at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:58) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.spi.filesystem.PinotFSFactory.init(PinotFSFactory.java:74) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.server.starter.helix.SegmentFetcherAndLoader.<init>(SegmentFetcherAndLoader.java:71) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.server.starter.helix.HelixServerStarter.start(HelixServerStarter.java:324) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.tools.service.PinotServiceManager.startServer(PinotServiceManager.java:150) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:95) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.lambda$run$0(StartServiceManagerCommand.java:260) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:286) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.access$000(StartServiceManagerCommand.java:57) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.run(StartServiceManagerCommand.java:260) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
Caused by: com.azure.storage.file.datalake.models.DataLakeStorageException: Status code 400, "<?xml version="1.0" encoding="utf-8"?>``` Does this ring any bells?
@mayanks: @rkanumul ^^
@rkanumul: I haven’t seen this error before.. But will spend some time on it..
@mayanks: Also, @pedro.cls93 are you using ADLS gen1 or gen2?
@pedro.cls93: Hello again, are the Pinot Helm charts published to any hub? They don't exist in
@fx19880617: right now it’s only on github
#random
@rajathlikeslemons: @rajathlikeslemons has joined the channel
#feat-presto-connector
@j.wise.hunter: @j.wise.hunter has joined the channel
#troubleshooting
@rajathlikeslemons: @rajathlikeslemons has joined the channel
@pedro.cls93: Hello, has anyone successfully configured Pinot to use Azure Storage Containers for deep storage?
@mayanks: I know of ADLS gen2 based deployments
@mayanks: cc @dlavoie
#pinot-dev
@syedakram93: @syedakram93 has joined the channel
@j.wise.hunter: @j.wise.hunter has joined the channel
