#general
@tymm: Hello, is it possible to use flink to sink data into pinot?
@kharekartik: Hi Tymm, currently there is no native Flink sink. You can publish the data from the Flink job to Kafka or S3 and then have Pinot consume from there.
@pugarg: @pugarg has joined the channel
@pugarg: Hi team, I am using the official Docker image of Apache Pinot, referring to this doc
@nimesh.mittal: @nimesh.mittal has joined the channel
@igarcia: @igarcia has joined the channel
#random
@pugarg: @pugarg has joined the channel
@nimesh.mittal: @nimesh.mittal has joined the channel
@igarcia: @igarcia has joined the channel
#troubleshooting
@pugarg: @pugarg has joined the channel
@pugarg: Hi team, I am using the official Docker image of Apache Pinot, referring to this doc
@g.kishore: @pugarg what’s the container spec?
@pugarg: I don't follow, @g.kishore. I am new to Apache Pinot; could you clarify what you are looking for?
@g.kishore: are you running this on your local machine?
@pugarg: I am creating the Docker container on my local machine, and the broker and server are going down. When I run Pinot locally, without Docker, it works properly
@pugarg: I am not sure why it is failing inside container
@g.kishore: did you build your own docker image?
@pugarg: I tried both ways; one is the official image
@pugarg: This is the quick start, where controller, ZK, broker, and server are all running in one container
@g.kishore: @fx19880617 ^^
@fx19880617: Can you check if your container has enough disk space? Typically this means ZK is out of disk
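One quick way to check the disk-space theory from inside the container (besides running `df -h` in a shell) is a small Java probe; the path to the ZooKeeper/Pinot data directory below is a placeholder, not a path confirmed in this thread:

```java
import java.io.File;

public class DiskCheck {
    // Usable space in MiB for the filesystem containing the given path.
    static long freeMiB(String path) {
        return new File(path).getUsableSpace() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Hypothetical data dir; substitute whatever your container actually mounts.
        String dir = args.length > 0 ? args[0] : "/";
        System.out.println("free MiB under " + dir + ": " + freeMiB(dir));
    }
}
```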
@nimesh.mittal: @nimesh.mittal has joined the channel
@igarcia: @igarcia has joined the channel
@elon.azoulay: Hi, anyone familiar with setting generic Kafka properties, e.g. `stream.kafka.consumer.prop.isolation.level` or `group_id`, `client_id`, etc.? It looks like only a specific list of properties is honored, like `stream.kafka.topic.name`, `stream.kafka.decoder.class.name` ... - I can create a github issue, lmk.
@fx19880617: I think all the configs will be passed to the Kafka consumer
@fx19880617: E.g. you can put ```"ssl.truststore.password": "${KAFKA_TRUSTSTORE_PASSWORD}", "ssl.keystore.password": "${KAFKA_KEYSTORE_PASSWORD}",``` into your stream configs
@fx19880617: this is a full sample table conf just fyi
@fx19880617: ```
"tableIndexConfig": {
  "streamConfigs": {
    "streamType": "kafka",
    "stream.kafka.consumer.type": "LowLevel",
    "stream.kafka.topic.name": "my-events",
    "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder",
    "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
    "stream.kafka.broker.list": "my-kafka:9090",
    "security.protocol": "SSL",
    "ssl.truststore.location": "/opt/pinot/kafka.client.truststore.jks",
    "ssl.keystore.location": "/opt/pinot/kafka.client.keystore.jks",
    "ssl.truststore.password": "$KAFKA_TRUSTSTORE_PASSWORD",
    "ssl.keystore.password": "$KAFKA_KEYSTORE_PASSWORD",
    "ssl.endpoint.identification.algorithm": "",
    "stream.kafka.decoder.prop.schema.registry.url": "
```
@elon.azoulay: Thanks @fx19880617! So I see code that strips the prefix, but it doesn't seem to honor the following properties (unless this is incorrect):
```
"streamConfigs": {
  ...
  "stream.kafka.consumer.prop.isolation.level": "read_committed",
  "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
  "stream.kafka.consumer.prop.group.id": "a2938a5b-747c-4a2a-90e6-2eaddf81164d",
  "stream.kafka.consumer.prop.client.id": "7100f4b4-f15e-4624-881c-7949c807addf",
  ...
}
```
@elon.azoulay: those specific properties do not seem to be set when we check the `ConsumerConfig` 's in the server logs
@fx19880617: then I think you can set them directly, without the prefix?
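To illustrate what the prefix-stripping discussed above would amount to, here is a small sketch. This is not Pinot's actual implementation; the `PREFIX` constant and `toConsumerProps` helper are hypothetical names for the idea of copying `stream.kafka.consumer.prop.*` entries into plain Kafka consumer properties while passing other keys through unchanged:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class StreamConfigPrefix {
    // Hypothetical: the prefix from the messages above.
    static final String PREFIX = "stream.kafka.consumer.prop.";

    // Strip the prefix where present; pass every other key through as-is.
    static Properties toConsumerProps(Map<String, String> streamConfigs) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : streamConfigs.entrySet()) {
            String key = e.getKey();
            String stripped = key.startsWith(PREFIX) ? key.substring(PREFIX.length()) : key;
            props.put(stripped, e.getValue());
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> cfg = new HashMap<>();
        cfg.put("stream.kafka.consumer.prop.isolation.level", "read_committed");
        cfg.put("security.protocol", "SSL");
        Properties p = toConsumerProps(cfg);
        System.out.println(p.getProperty("isolation.level"));   // read_committed
        System.out.println(p.getProperty("security.protocol")); // SSL
    }
}
```

Checking the resulting `ConsumerConfig` entries in the server logs, as mentioned above, is a reasonable way to verify which of the two key shapes actually reaches the consumer.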
@elon.azoulay: Ah, I'll try that, thanks!
#pinot-perf-tuning
@samarth: @samarth has joined the channel
@samarth: Are there any default perf tests in the Pinot repo that I can use to measure performance? I am starting pinot-server with `-XX:ActiveProcessorCount` Java opts, so that I can override the number of available processors and work around the need to request a high number of CPUs in the k8s deployment. I wanted to run some perf tests to find the optimal value.
```
/**
 * Use at most 10 or half of the processors threads for each query. If there are less than 2 processors, use 1 thread.
 * <p>NOTE: Runtime.getRuntime().availableProcessors() may return value < 2 in container based environment, e.g.
 * Kubernetes.
 */
public static final int MAX_NUM_THREADS_PER_QUERY =
    Math.max(1, Math.min(10, Runtime.getRuntime().availableProcessors() / 2));
```
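For picking an `-XX:ActiveProcessorCount` value, it may help to see how the quoted formula responds to different processor counts. A minimal sketch, with the formula pulled out into a hypothetical helper parameterized by the processor count (which is what the JVM flag effectively overrides):

```java
public class QueryThreads {
    // Mirrors the quoted formula: at most 10, at most half the processors, at least 1.
    static int maxThreadsPerQuery(int availableProcessors) {
        return Math.max(1, Math.min(10, availableProcessors / 2));
    }

    public static void main(String[] args) {
        // With -XX:ActiveProcessorCount=N, availableProcessors() reports N:
        System.out.println(maxThreadsPerQuery(1));  // 1
        System.out.println(maxThreadsPerQuery(8));  // 4
        System.out.println(maxThreadsPerQuery(64)); // 10 (capped)
    }
}
```

So the per-query thread count stops growing once the reported processor count reaches 20; values beyond that only affect other thread pools.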
