#general


@sandeep.hadoopadmn: @sandeep.hadoopadmn has joined the channel
@girishpatel.techbiz: @girishpatel.techbiz has joined the channel
@jain.arpit6: Hi, I would like to set up a Pinot cluster with multiple controllers, servers, and brokers on different hosts. I can see in the documentation that the controller should have a shared volume. Do servers, brokers, and controllers running on different hosts need to be reachable from each other?
  @mayanks: Brokers and servers need network connectivity for query scatter/gather (between broker and server)
  @mayanks: One broker does not need to know about other brokers
  @mayanks: Same for servers
  @jain.arpit6: All brokers should be able to reach all servers. What about the connection with the controller?
  @ssubrama: Brokers need to reach the servers that host the tables served by that broker. In general, Pinot depends on Helix, so all nodes need to reach Zookeeper and vice versa. I suppose by "reach" you mean setting up iptables to block each other? It is good if the controllers can reach the brokers; if not, your query console will not work. It is also good if the controllers can reach the servers, otherwise some debug commands and features won't work. It is required that servers can reach the controllers. What exactly is your reachability constraint?
  @jain.arpit6: I meant network connectivity, and Mayank clarified it for me as you also mentioned: between controllers only file system sharing is needed, controllers need connectivity to both servers and brokers, and servers need connectivity to brokers.
  @ken: Not sure about controllers needing a shared file system - that depends on how you’ve configured your segment deep store. And I think having multiple controllers using a shared fs for deep store could be problematic (which would be the case if you were pushing segments to the controller, vs pushing metadata).
  @ssubrama: LinkedIn runs Pinot in production with multiple controllers sharing a common NFS (and pushing data through controllers, yes).
  @ken: Hi @subbareddydagumati - so you rely on each controller getting a distinct set of segments (by name), so they don’t step on each other when writing data?
  @g.kishore: Arpit, if you have the option, my recommendation is to avoid the LinkedIn model and use a metadata/URI based push (suggested by Ken)
  @ssubrama: @ken no. each controller can receive messages for any table. Not sure why you mention that they need to get a distinct set of segments by name. Are you thinking of pushing two segments with the same name (but different contents) to two different controllers and somehow expecting a consistent result?
  @jain.arpit6: @g.kishore I am very new to Pinot and I am not aware of the metadata push vs. data push models. For a start, I was trying to set up a simple multi-node cluster with 1 controller, server A, and broker A on host 1, having network connectivity with another host 2 running server B and broker B. I was getting some errors with the above setup, but what I understood is that it should be possible. My plan is to use HDFS for deep storage once I make the above setup work.
  @ken: @subbareddydagumati in a past life we had a painful bug, due to two servers (behind a LB) that used a shared disk. The LB was configured to auto-retry to the other server if the initial request took too long, but that timeout was sometimes too short, so then we’d wind up with processes on two different servers stepping on each other’s data (writing/updating the same file).
  @ken: The deep scars from that experience made me (probably too) afraid of having multiple servers using a shared disk, without strict partitioning of the data in the file system.
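  For context on the metadata push recommended above: the push mode is selected by the `jobType` field of the batch ingestion job spec. A rough sketch, assuming a standalone job and an HDFS deep store (all class names, host names, and paths below are placeholders to verify against the batch ingestion docs, not values from this thread):
```
# Batch ingestion job spec (sketch). SegmentCreationAndMetadataPush builds segments,
# writes them to the deep store, and sends only segment metadata to the controller,
# so segment bytes never stream through the controller (unlike a tar push).
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentMetadataPushJobRunner'
jobType: SegmentCreationAndMetadataPush   # alternatives: SegmentCreationAndTarPush, SegmentCreationAndUriPush
inputDirURI: 'hdfs://namenode/pinot/rawdata/'
includeFileNamePattern: 'glob:**/*.json'
outputDirURI: 'hdfs://namenode/pinot/segments/myTable/'   # deep store location servers download from
pinotFSSpecs:
  - scheme: hdfs
    className: org.apache.pinot.plugin.filesystem.HadoopPinotFS
recordReaderSpec:
  dataFormat: 'json'
  className: 'org.apache.pinot.plugin.inputformat.json.JSONRecordReader'
tableSpec:
  tableName: 'myTable'
pinotClusterSpecs:
  - controllerURI: 'http://controller-host:9000'
```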
@jacob.medal: @jacob.medal has joined the channel
@greyson: Is it possible to query Pinot from a database IDE like Datagrip?
  @g.kishore: Haven't used DataGrip. Did you find any issues using the JDBC connector?
  @greyson: Unfortunately it doesn't seem like I can use just a JDBC connector. It only offers me drivers from this list
  @ebyhry: DataGrip allows you to register a new JDBC driver.
  @g.kishore: Is DataGrip a popular tool that folks use as an IDE?
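  A quick way to sanity-check the JDBC route outside of DataGrip is a small sketch like the one below, assuming the `pinot-jdbc-client` jar is on the classpath and the controller listens on localhost:9000; the same driver class (`org.apache.pinot.client.PinotDriver`) and `jdbc:pinot://` URL can then be registered in DataGrip as a custom driver (table name below is just an example):
```
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PinotJdbcCheck {
  public static void main(String[] args) throws Exception {
    // Connect via the Pinot controller; brokers are discovered from cluster metadata.
    try (Connection conn = DriverManager.getConnection("jdbc:pinot://localhost:9000");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM airlineStats")) {
      while (rs.next()) {
        System.out.println(rs.getLong(1));
      }
    }
  }
}
```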
@stuart.millholland: @stuart.millholland has joined the channel
@rionmonster: @rionmonster has joined the channel

#random


@sandeep.hadoopadmn: @sandeep.hadoopadmn has joined the channel
@girishpatel.techbiz: @girishpatel.techbiz has joined the channel
@jacob.medal: @jacob.medal has joined the channel
@stuart.millholland: @stuart.millholland has joined the channel
@rionmonster: @rionmonster has joined the channel

#feat-upsert


@stuart.millholland: @stuart.millholland has joined the channel

#troubleshooting


@sandeep.hadoopadmn: @sandeep.hadoopadmn has joined the channel
@chxing: Hi @jackie.jxt @mayanks Filed 2 issues in git:
  @mayanks: Thanks @chxing
@nadeemsadim: We have Pinot *realtime* data backed up on the deep store (GCS) as tars. The data was published to Kafka as JSON, but it is pushed to GCS as tars. How can we restore that data back into a realtime/offline table?
1. Can we restore the tar segments present on GCS directly into a realtime table?
2. If restoring into a realtime table is not possible as of now (since the Zookeeper metadata is also needed and the feature may not be ready), can we restore the data stored in the deep store as tars into an offline table, and then create a hybrid table so we don't lose old data when *disaster recovery* happens?
3. How do we restore into an offline table: using a *segment metadata push or segment URI push* job? Where can I find the steps to restore into an offline table? (Also, should we use *org.apache.pinot.plugin.inputformat.json.JSONRecordReader* in the config, since the data was published to Kafka as JSON?) *Some links already explored:*
  @nadeemsadim: @mayanks @xiangfu0 @g.kishore
  @agsherrick: I would create a little standalone application that is a Kafka producer that reads from GCS and writes your data back out to Kafka. If you are only updating part of your records, then this should help:
  @mayanks: 1) is not available. 2) You can simply push the segment tar in GCS to Pinot via a curl command. Or, if the ingestion job has a skip-build option, you can just do the metadata + URI push.
  @mayanks: You don’t need to use Kafka or redo segment generation for pushing to offline
  @mayanks: For RT, you need to have ZK state restored currently, and that would work
  @nadeemsadim: cc: @hussain
  @nadeemsadim: What is the curl command for pushing a tar in GCS to Pinot @mayanks? Any link or reference? Is it available on the Pinot controller UI in the Swagger API section?
  @nadeemsadim: For RT, you need to have ZK state restored currently, and that would work --> We don't need to restore into the RT table. We will restore the tars generated from the realtime table into an offline table and use a hybrid table, so the compliance dashboards keep the historical data as well when we restore.
  @nadeemsadim:
  @mayanks: Yeah, you can just push the tar.gz files to offline table
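  For reference, a segment tar push can be done with plain curl against the controller's segment upload endpoint. A rough sketch (controller host/port, bucket path, and table name are placeholders; the exact parameters can be checked in the controller Swagger UI under the "Upload a segment" endpoint):
```
# Copy the backed-up segment tar out of GCS, then POST it to the offline table.
gsutil cp gs://my-bucket/backups/myTable/segment-0.tar.gz .
curl -X POST -F "segment=@segment-0.tar.gz" \
  "http://controller-host:9000/v2/segments?tableName=myTable"
```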
@girishpatel.techbiz: @girishpatel.techbiz has joined the channel
@jacob.medal: @jacob.medal has joined the channel
@tiger: Hi, I'm trying to run the RealtimeToOfflineSegmentsTask on a minion. I'm using S3 as a deepstore and the minion logs are showing this exception: `java.lang.IllegalStateException: PinotFS for scheme: s3 has not been initialized` . Is there a way to configure the minions to be able to read from S3? I couldn't find anything in the docs. Thanks!
  @xiangfu0: did you configure pinot fs inside minion job?
  @tiger: Ah I'm not sure how to set that in a minion job. Is there a doc for that?
  @tiger: Would I set it in the offline table corresponding to the realtime table under "ingestionConfig"?
  @xiangfu0: yes, it should be there with the pinot fs configs
  @xiangfu0: let me see if I can find some ref docs
  @tiger: I see. Currently I'm setting all the S3 details in the controller/server config files. Is it recommended to set them in each table instead?
  @xiangfu0: ```{ "tableName": "airlineStats", "tableType": "OFFLINE", "segmentsConfig": { "timeColumnName": "DaysSinceEpoch", "timeType": "DAYS", "segmentPushType": "APPEND", "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy", "replication": "1" }, "tenants": {}, "tableIndexConfig": { "loadMode": "MMAP" }, "metadata": { "customConfigs": {} }, "ingestionConfig": { "batchIngestionConfig": { "segmentIngestionType": "APPEND", "segmentIngestionFrequency": "DAILY", "batchConfigMaps": [ { "inputDirURI": "", "input.fs.className": "org.apache.pinot.plugin.filesystem.S3PinotFS", "input.fs.prop.region": "us-west-2", "includeFileNamePattern": "glob:**/*.avro", "excludeFileNamePattern": "glob:**/*.tmp", "inputFormat": "avro", "outputDirURI": "", "push.mode": "metadata" } ], "segmentNameSpec": {}, "pushSpec": {} } }, "task": { "taskTypeConfigsMap": { "SegmentGenerationAndPushTask": { } } } }```
  @xiangfu0: this is one example table conf
  @xiangfu0: you can configure it in the minion config files
  @xiangfu0: the controller/servers have it since Pinot uses S3 for the deep store; however, you may have different access credentials for the ingestion data bucket
  @xiangfu0: you can configure your accessKey and secret alongside ```"input.fs.prop.region": "us-west-2"```
  @xiangfu0: if not set in env variables
  @tiger: Got it, makes sense. For configuring in the minion config files, what keys do I set? I tried using something like `pinot.minion.storage.factory.s3.accessKey` but that doesn't seem to work
  @xiangfu0: try configs without `pinot.minion` prefix. cc @jackie.jxt to confirm
  @tiger: removing the pinot.minion prefix seems to have worked. thanks!
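  For anyone else hitting this, the working minion config from this thread would look roughly like the following sketch; note the keys carry no `pinot.minion.` prefix. The region and credentials are placeholders, and the segment-fetcher keys are an assumption to verify against the S3 deep store docs:
```
storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
storage.factory.s3.region=us-west-2
storage.factory.s3.accessKey=<access-key>
storage.factory.s3.secretKey=<secret-key>
# Let the minion download/upload segments over s3:// as well as http
segment.fetcher.protocols=file,http,s3
segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```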
  @tiger: I have another quick question about the RealtimeToOfflineSegmentsTask @xiangfu0. When I set the bucketTimePeriod to something like 1d, does Pinot round it to whole time periods? For example, would it strictly separate data such that one segment would only contain data for 2021-10-25? Or does it use a time period relative to whenever the task is run, so a segment could contain data from both 2021-10-25 and 2021-10-24 if it ran mid-day?
  @xiangfu0: @jackie.jxt ^^
  @jackie.jxt: @tiger It does strict partition based on the epoch time
  @jackie.jxt: So each segment only contains data from the same epoch day
@stuart.millholland: @stuart.millholland has joined the channel
@rionmonster: @rionmonster has joined the channel
@bcwong: My queries don’t seem to span both OFFLINE and REALTIME tables. How do I debug that? Here’s what I did:
1. Added the OFFLINE table via `AddTable`. Loaded data from Oct 1 via `ImportData`.
2. The query `select count(1) from tbl where ds = '2021-10-01'` ran successfully.
3. Added the REALTIME table via the web UI. Kafka ingested a bunch of data for `ds = '2021-10-03'`. The query shows the new data.
4. *But* the query from #2 now returns no rows. I have to query against `tbl_OFFLINE` to see the offline records.
Many thanks!
  @xiangfu0: a hybrid table computes a time boundary and excludes data beyond it from the offline table:
  @xiangfu0:
  @bcwong: My offline and online data have non-overlapping time ranges.
  @bcwong: The timestamp column is `create_time`, and the min(create_time) of the realtime table is greater than the max(create_time) of the offline table. This looks like a bug, though I’d like someone to confirm: ```-- 1633219200
select min(create_time) from tbl_REALTIME

-- 0 (no rows)
select count(1) from tbl where create_time < 1633219200

-- 1414034
select count(1) from tbl_OFFLINE where create_time < 1633219200```
  @xiangfu0: the time boundary is max(offline_ts)
  @xiangfu0: Pinot will append `where create_time < time_boundary` to the offline part of your query
  @xiangfu0: so it will filter the offline table data by that boundary when you query with the hybrid table name
  @xiangfu0: see the docs on how the time boundary is determined and how the query split works
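  In other words, a query against the hybrid table name is expected to be split roughly as below before the results are merged (the exact inclusive/exclusive handling of the boundary, and how it is derived from the offline segments' end times, is described in the time boundary docs):
```
-- time_boundary is derived from the max end time of the OFFLINE segments
SELECT COUNT(1) FROM tbl_OFFLINE  WHERE create_time <  time_boundary
SELECT COUNT(1) FROM tbl_REALTIME WHERE create_time >= time_boundary
```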
  @bcwong: I read that page, but that doesn’t explain what I’m seeing.
> pinot will append `where create_time < time_boundary` to your query
That should return something if Pinot queries the offline table, right? My realtime data comes *after* the offline data, with non-overlapping time ranges.
> so it will filter out the offline table data when you query with hybrid table name
Could you elaborate? The doc seems to suggest that the hybrid table query should expand into 2 queries with a merge, but that’s not happening. Many thanks again.
  @bcwong: Even a simple `select count(1) from tbl` excludes the data from the offline table. It almost looks like Pinot doesn’t recognize the hybrid table.

#pinot-dev


@jacob.medal: @jacob.medal has joined the channel
@atri.sharma: Where will I find the code style sheet for Pinot in the repo?
  @xiangfu0:
@dadelcas: I've made some progress with the new big decimal type. It seems this will be a large change, and I've got questions with regard to serialisation and backwards compatibility. For example, most aggregations rely on double primitives; perhaps this should be configurable via some table property.
@dadelcas: For serialisation I need to make sure the dictionary understands that BigDecimal is a variable-length type. This seems to be detected by the indexer, but I've not seen where to specify that.
@tyler773: @tyler773 has joined the channel
@stuart.millholland: @stuart.millholland has joined the channel

#getting-started


@tyler773: @tyler773 has joined the channel
@greyson: @greyson has joined the channel

#releases


@stuart.millholland: @stuart.millholland has joined the channel

#pinot-trino


@hardik.chheda: @hardik.chheda has joined the channel