#general


@arunkumarc2010: @arunkumarc2010 has joined the channel
@sriramdas.sivasai: Hello everyone, I have a doubt about the storage and query parts of Pinot. Suppose we have 6 months of data as Pinot segments in deep storage (500GB in size) and I want to run an aggregate query over the full 6 months. 1. Does my offline data server need 500GB of memory (RAM) to process the query, or will queries still work efficiently with 100GB of RAM and 500GB of disk? 2. Also, will my query work if I don't have 500GB of storage? 3. Is the memory required for loading a segment file from disk the same as the size of the file? I ask because loading a compressed file into memory could blow up RAM usage 3-4x. Also, if I want to read a single record from the previous 6 months, will Pinot do on-demand segment loading from deep storage?
  @mayanks: Servers maintain a local copy of segments on their disk. You do need local disk big enough to store the per-server data, but segments are memory mapped, so you don’t need a lot of RAM; you can get by with 64GB of RAM, for example.
  @sriramdas.sivasai: Understood. If my query is a group-by query over the 6-month dataset, will it work with less memory than the size of the whole dataset?
  @mayanks: Yes. You don't need large ram to match the data size.
  @sriramdas.sivasai: Ok, thanks. Just to understand Pinot's performance: were the Pinot benchmarks shown in various places run with memory-mapped segments from disk, or in HEAP mode?
  @mayanks: MMAP mode only
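For reference, the segment load mode discussed above is set per table under `tableIndexConfig` in the table config: `MMAP` (memory-mapped, the default) versus `HEAP` (fully loaded on-heap). The fragment below is a minimal sketch with made-up table and column names, not a complete config:
```
{
  "tableName": "events",
  "tableType": "OFFLINE",
  "segmentsConfig": {
    "timeColumnName": "ts",
    "replication": "1"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP"
  },
  "tenants": {},
  "metadata": {}
}
```
With `MMAP`, the OS pages segment data in and out on demand, which is why query processing works with far less RAM than the total on-disk data size.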
@liranbri: Hi everyone, we are evaluating Pinot and one of our requirements is to be able to encrypt our client's data on the disk (in memory it can be decrypted). is such a thing possible? and if so, we may also need to encrypt it with a different encryption key per client (each client's data would be encrypted with a unique key dedicated to that client). is there a way to achieve that? thank you so much
  @mayanks: Pinot does support encryption of the data copy in the deep store. However, the local server copies on disk need to be decrypted to maintain low latency. The per-client encryption requirement is an interesting one that I came across in the past and opened an issue to track.
  @liranbri: It would be a great feature! TBH I’m not sure what you mean by “deep store”. Is that the storage consumed by Pinot, or the source of data owned by us and ingested into Pinot?
  @mayanks: Pinot uses deep store to maintain a golden copy of the data ingested. It supports deep stores like S3/ADLS/GCP/etc. That copy can be encrypted.
  @mayanks: Pinot servers store a copy of the data on local disk for faster serving (today), that copy does not support encryption.
  @liranbri: Thanks for the explanation. Are those copies of all the data, or just subsets of it?
  @liranbri: Because I’m trying to understand what the actual value of deep-store encryption is, if the same data sits decrypted on other disks?
@savingoyal: @savingoyal has joined the channel
@orbit2: @orbit2 has joined the channel
@carlos: Hi guys! I have a question regarding Kafka integration with Pinot. If I'm using a secured Kafka with SASL_SSL, is there any way of configuring that and using those credentials? Or is there another way of setting up security from Pinot to Kafka for data ingestion? Thanks in advance!
  @xiangfu0: I think we have an issue open for SASL_SSL support:
@carlos: Posted in troubleshooting as well
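For anyone following along: if and when the SASL_SSL pass-through tracked in that issue is available, the Kafka security settings would be supplied alongside the other `streamConfigs` of the realtime table. The fragment below is a hedged sketch using standard Kafka client properties; the broker, mechanism, and JAAS values are placeholders, and whether Pinot forwards them to the consumer was still being tracked at the time:
```
"streamConfigs": {
  "streamType": "kafka",
  "stream.kafka.topic.name": "my-topic",
  "stream.kafka.broker.list": "broker-1:9093",
  "stream.kafka.consumer.type": "lowlevel",
  "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
  "security.protocol": "SASL_SSL",
  "sasl.mechanism": "PLAIN",
  "sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"<user>\" password=\"<password>\";"
}
```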
@rkabir: @rkabir has joined the channel

#random


@arunkumarc2010: @arunkumarc2010 has joined the channel
@savingoyal: @savingoyal has joined the channel
@rkabir: @rkabir has joined the channel

#feat-upsert


@kchavda: @kchavda has joined the channel

#troubleshooting


@chxing: Hi all, can Pinot transfer realtime table data to an offline table directly? Thx
  @mayanks: Yes
  @chxing: Thx Mayank
@arunkumarc2010: @arunkumarc2010 has joined the channel
@ruslanrodriquez: Hi everyone! I am researching realtime table schema evolution. After updating the Pinot schema and reloading segments, I see the new columns in the table and null values in the old data. But after consuming new data with the newly added fields populated, the new data is also ingested with null values in the new columns. The Kafka messages are in Avro format. When I debug the code I see that AvroRecordExtractor still uses the old set of fields. Can I refresh the field set in AvroRecordExtractor and start consuming messages with the new columns?
  @jackie.jxt: The current consuming segment won't be able to pick up the new values immediately because the writers for the newly added columns are not set up. The next consuming segment will pick the new fields up.
  @jackie.jxt: Can you please file an issue for this requirement? One workaround is to drop the current consuming segment and replace it with a new one with all fields set up properly. But that also means the data already consumed within the consuming segment is dropped and will be re-consumed in the replacing segment.
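To make the flow above concrete: adding a column is just appending a field to the schema and then reloading segments; completed segments serve the column's default null value, while rows with real values only appear once a new consuming segment starts. A minimal sketch of the schema change, with made-up field names:
```
{
  "schemaName": "events",
  "dimensionFieldSpecs": [
    { "name": "existingCol", "dataType": "STRING" },
    { "name": "newCol", "dataType": "STRING", "defaultNullValue": "null" }
  ],
  "dateTimeFieldSpecs": [
    { "name": "ts", "dataType": "LONG", "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS" }
  ]
}
```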
@deemish2: Hello everyone, I would like to understand how I can backfill offline data which contains multiple segments.
  @jackie.jxt: Do you mean replacing the current segments with a new set of segments? Pinot automatically replaces a segment with the same name when a new segment is pushed. One approach is to replace segments one by one, pushing each with the same name as the segment it replaces. We are also working on a feature to do atomic batch replacement.
@savingoyal: @savingoyal has joined the channel
@orbit2: @orbit2 has joined the channel
@carlos: Hi guys!
@carlos: I have a question regarding Kafka integration with Pinot
@carlos: If I’m using a secured Kafka with SASL_SSL, is there any way of configuring that and using those credentials? Or is there another way of setting up security from Pinot to Kafka for data ingestion?
  @jackie.jxt: @slack1 Can you please help answering this?
  @jackie.jxt: nvm.. Xiang already replied: I think we have an issue open for SASL_SSL support: 
@carlos: Thanks in advance!
@rkabir: @rkabir has joined the channel

#pinot-s3


@kchavda: @kchavda has joined the channel

#aggregators


@kchavda: @kchavda has joined the channel

#pinot-dev


@agnihotrisuryansh55: Under the section `setting up pinot cluster`, the manual cluster setup link is not accessible.
  @xiangfu0: fixed
@atri.sharma: @mayanks @jackie.jxt please review the distinct PR and let me know if it looks ok
  @mayanks: Will do

#pinot-docs


@kchavda: @kchavda has joined the channel

#getting-started


@kchavda: @kchavda has joined the channel

#debug_upsert


@deemish2: Hello everyone, I would like to understand how I can backfill offline data which contains multiple segments.
@yupeng: An upsert table takes realtime data only.
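For reference, upsert is enabled on a REALTIME table via `upsertConfig` in the table config (the schema must also declare `primaryKeyColumns`). The fragment below is a minimal sketch with placeholder names, not a complete working config:
```
{
  "tableName": "events",
  "tableType": "REALTIME",
  "upsertConfig": {
    "mode": "FULL"
  },
  "routing": {
    "instanceSelectorType": "strictReplicaGroup"
  }
}
```
Which is why, as noted above, backfilling such a table means replaying the corrected records through the realtime stream rather than pushing offline segments.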