#general


@knowledgeisstrengthfo: @knowledgeisstrengthfo has joined the channel
@knowledgeisstrengthfo: Hi everyone, we are evaluating Apache Pinot for our analytical use case. We have run into some scenarios for which we haven't found a clear explanation yet. Please help us understand the reasoning behind them and how to address them:
1. Why is insert into a Pinot table via the Presto connector not supported, when almost all other SQL commands are?
2. Why is updating records with an update query not allowed on a Pinot table via Presto?
3. How can we currently replicate the same set of data values in a Pinot table without Kafka ingestion? Example: we want to multiply the existing 1M records with `insert into TableA ( select * from TableA )`. Since the Presto connector does not allow inserts and Pinot itself doesn't support subqueries, neither option is available.
4. If we make a mistake in a column name during schema creation and later update the schema, will the previously ingested values for that column be picked up automatically? Example: a realtime table has a column called "NAME" that should have been "name". The previously ingested Kafka stream data has values for the "name" attribute, so after the schema change will Pinot automatically update the values for all rows, or do we need to retrofit the "name" values? If we need to retrofit, what is the best way to do it?
5. Can a single query read from both REALTIME and OFFLINE tables? Since subqueries and joins are not directly supported by Pinot, is there any way to achieve that?
  @mayanks: ```
  1. Ingestion in Pinot has traditionally been via offline and realtime streams. Could you elaborate on the use case that requires inserting rows via Presto?
  2. Upsert in Pinot is a newer feature and requires a primary key to identify the row to be updated. While we may definitely explore update via Presto, it might still be primary-key based (as opposed to any generic condition).
  3. If you don't want to use Kafka ingestion, you can push the data via the offline pipeline.
  4. Schema changes have to be backward compatible, which your example isn't.
  5. Offline and realtime tables are internal to Pinot. The client side only sees a single hybrid table, and Pinot answers the query using both the offline and the realtime data.
  ```
  @xiangfu0: For the Presto-Pinot integration, we only connect the query path. There is no support for table or data operations.
  @mayanks: @knowledgeisstrengthfo Based on your questions, I am really curious about what your use case is and how you are trying to use Pinot. Could you please share some details about that?
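
To illustrate answer 5 above: a client addresses the hybrid table by its logical name, and the broker combines the offline and realtime data. The following is a minimal sketch, not a definitive client. It assumes a broker reachable at `localhost:8099` and a hypothetical hybrid table called `myTable`; both are placeholders that do not come from the thread. It posts the query as JSON to the broker's `/query/sql` endpoint.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class HybridTableQuery {
    // Assumption: broker host and port are placeholders for your own cluster.
    static final String BROKER_URL = "http://localhost:8099/query/sql";

    // Build the JSON request body: {"sql": "<query>"}.
    static String buildPayload(String sql) {
        String escaped = sql.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"sql\": \"" + escaped + "\"}";
    }

    // Build the POST request carrying the query.
    static HttpRequest buildRequest(String sql) {
        return HttpRequest.newBuilder(URI.create(BROKER_URL))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(buildPayload(sql)))
            .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // "myTable" is hypothetical. Note the logical table name carries
        // no _OFFLINE or _REALTIME suffix.
        String sql = "SELECT COUNT(*) FROM myTable";
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(sql), HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        } catch (IOException e) {
            System.out.println("Broker not reachable: " + e.getMessage());
        }
    }
}
```

The query itself needs no join or subquery, since the split between offline and realtime data is handled inside Pinot.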

#random


@knowledgeisstrengthfo: @knowledgeisstrengthfo has joined the channel

#troubleshooting


@azri: Hi, I am trying to push data from GCS to Pinot. After submitting the job, it seems to do nothing and produces no output at all. This is my job spec:
```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndUriPush
inputDirURI: ''
outputDirURI: '/tmp/ais-pinot/sentences/'
includeFileNamePattern: 'glob:**/**.parquet'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
  - scheme: gs
    className: org.apache.pinot.plugin.filesystem.GcsPinotFS
    configs:
      projectId: 'aton-analytics'
      gcpKey: '/var/pinot/controller/config/gcs-datalake-key.json'
recordReaderSpec:
  dataFormat: 'parquet'
  className: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader'
tableSpec:
  tableName: 'sentence'
pinotClusterSpecs:
  - controllerURI: ''
```
  @ken: This looks odd to me `includeFileNamePattern: 'glob:**/**.parquet'`. I think it should be `includeFileNamePattern: 'glob:**/*.parquet'`
  @azri: I tried that one before, but got the same result: no output.
  @azri: Is it because the data was too big?
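
One way to check whether the two patterns discussed above behave differently is to run them through Java's `PathMatcher`. This is a sketch under the assumption that the ingestion job evaluates `includeFileNamePattern` with `java.nio` glob rules; the file paths below are hypothetical and only serve as test inputs.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobPatternCheck {
    // Returns true if the path matches the pattern under java.nio glob rules.
    static boolean matches(String pattern, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(pattern);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        String[] patterns = {"glob:**/**.parquet", "glob:**/*.parquet"};
        // Hypothetical file paths, for illustration only.
        String[] paths = {
            "/data/2021/part-0.parquet",
            "data/part-1.parquet",
            "part-2.parquet",
            "/data/readme.txt"
        };
        for (String pattern : patterns) {
            for (String path : paths) {
                System.out.println(pattern + "  " + path + "  " + matches(pattern, path));
            }
        }
    }
}
```

On a Unix-like system both patterns should accept the same nested `.parquet` paths, which is consistent with the report that switching patterns made no difference. Both also require at least one directory separator, so neither accepts a bare file name such as `part-2.parquet`.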
@knowledgeisstrengthfo: @knowledgeisstrengthfo has joined the channel