#general
@nurcahyopujo: @nurcahyopujo has joined the channel
@prachiprakash80: @prachiprakash80 has joined the channel
@ravi.maddi: @1705ayush -- I am also doing the same; you are further ahead. I am planning this flow: MySQL --> Python Stats (code already exists) --> JSON --> Avro conversion --> Kafka Producer --> Pinot Kafka Connector --> Pinot Session Store --> Superset. I have completed up to the Kafka Avro producer and consumer, and I am doing everything in Python. I have a few doubts: 1. Is my flow feasible, or are any changes needed? 2. Any pointers on Pinot Avro ingestion would help. 3. In my case the JSON contains a lot of nested entities -- will Pinot's Avro ingestion handle that, or should I flatten the JSON into flat records? Need help :slightly_smiling_face:
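For the third doubt above, a minimal sketch of flattening nested JSON into a flat record before it goes to Avro/Kafka, assuming dot-separated column names (the sample payload and field names are hypothetical):

```python
def flatten(obj, parent_key="", sep="."):
    """Recursively flatten a nested dict into a single-level dict
    with dot-separated keys, e.g. {"user": {"id": 1}} -> {"user.id": 1}."""
    flat = {}
    for key, value in obj.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat

# Hypothetical nested payload coming out of the Python stats step.
record = {"user": {"id": 42, "geo": {"country": "ID"}}, "score": 0.97}
print(flatten(record))  # {'user.id': 42, 'user.geo.country': 'ID', 'score': 0.97}
```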
@ravi.maddi: @All -- A few doubts: 1. Is the Pinot Kafka connector with Avro possible? 2. If so, is there any detailed documentation available online? I have been searching for a day with no luck. Need help :slightly_smiling_face:
@falexvr: Why a Kafka connector? Pinot supports streaming directly from Kafka
@npawar: Have you looked at the Avro example on this page?
@npawar: If you're not using Confluent Kafka, you can change the decoder to
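For reference, a sketch of where the decoder is set, in the streamConfigs section of a REALTIME table config (shown here as a Python dict for context; topic and broker values are hypothetical, and the decoder class names are the standard Pinot Avro decoder plugins -- verify them against your Pinot version):

```python
# Sketch of the Avro-related entries in a Pinot REALTIME table's streamConfigs.
stream_configs = {
    "streamType": "kafka",
    "stream.kafka.topic.name": "stats-avro",     # hypothetical topic
    "stream.kafka.broker.list": "kafka:9092",    # hypothetical broker list
    # With Confluent Schema Registry:
    "stream.kafka.decoder.class.name":
        "org.apache.pinot.plugin.inputformat.avro.confluent"
        ".KafkaConfluentSchemaRegistryAvroMessageDecoder",
    # Without Confluent, a plain Avro decoder such as
    # org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder
    # may be used instead.
}
```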
@mayanks: @ravi.maddi if you could let us know where in the docs you would have looked to find this info, it would help us arrange our docs better
@orajason: @orajason has joined the channel
@satish: @satish has joined the channel
@terodeakshay: @terodeakshay has joined the channel
@abprakash2003: @abprakash2003 has joined the channel
#random
@nurcahyopujo: @nurcahyopujo has joined the channel
@prachiprakash80: @prachiprakash80 has joined the channel
@orajason: @orajason has joined the channel
@satish: @satish has joined the channel
@terodeakshay: @terodeakshay has joined the channel
@abprakash2003: @abprakash2003 has joined the channel
#troubleshooting
@nurcahyopujo: @nurcahyopujo has joined the channel
@elon.azoulay: Hi, we noticed that for a segment on a table 2 out of 3 servers have the same data, but 1 of the servers has less data in the segment. External view == ideal state and in the cluster manager, when I click on the table it says it's in a "good" state. What would cause that? It's an offline table.
@g.kishore: Real-time segment or offline
@elon.azoulay: offline only
@elon.azoulay: I looked in the data directory to confirm the segment directories and one of them differs from the other 2
@g.kishore: Why do you say they have less data?
@elon.azoulay: Only one of them does (out of 3). But when I look at the cluster manager it says the table is in a good state
@elon.azoulay: And ideal state == external view for all segments. All online, with the 3 servers for each segment (replication factor is 3).
@elon.azoulay:
@elon.azoulay: Here's the table def ^^
@elon.azoulay: A lot of bloom filter columns, not sure if that affects anything? This is pinot 6
@elon.azoulay: Could it be due to zookeeper being messed up? We had an issue with an istio deployment that killed zookeeper. It came back up but had to reread snapshots. I can check there
@g.kishore: I don’t think so.. how are you determining that one of them has less data?
@elon.azoulay: Yes, when I do select count where $segmentName = ...
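For context, a sketch of how that per-segment count can be checked from a script against the broker's SQL endpoint (broker address, table, and segment names are hypothetical; 8099 is the default broker port):

```python
import requests

BROKER = "http://pinot-broker:8099"   # hypothetical broker address
TABLE = "myTable"                     # hypothetical table name
SEGMENT = "myTable_OFFLINE_0"         # hypothetical segment name

# $segmentName is Pinot's built-in virtual column holding the name of the
# segment each row was read from, so this counts rows in one segment only.
sql = f"SELECT COUNT(*) FROM {TABLE} WHERE $segmentName = '{SEGMENT}'"
resp = requests.post(f"{BROKER}/query/sql", json={"sql": sql})
print(resp.json()["resultTable"]["rows"])
```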
@g.kishore: Can you paste the results?
@elon.azoulay: Here's from the server with less data:
```
-rw-r--r-- 1 root 1337 38270 Mar 12 00:26 columns.psf
-rw-r--r-- 1 root 1337 16 Mar 12 00:26 creation.meta
-rw-r--r-- 1 root 1337 25203 Mar 12 00:26 index_map
-rw-r--r-- 1 root 1337 104475 Mar 12 00:26 metadata.properties
```
@elon.azoulay: And the other 2 servers have this:
```
-rw-r--r-- 1 root 1337 829877 Mar 13 00:05 columns.psf
-rw-r--r-- 1 root 1337 16 Mar 13 00:05 creation.meta
-rw-r--r-- 1 root 1337 25694 Mar 13 00:05 index_map
-rw-r--r-- 1 root 1337 105129 Mar 13 00:05 metadata.properties
```
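One way to compare what each replica actually holds, without shelling into the data directories, is to ask each server for its local copy's segment metadata. A sketch, assuming the server admin API exposes it at the path below (host names, table, and segment are hypothetical; 8097 is the default server admin port; verify the endpoint against your Pinot version):

```python
import requests

SERVERS = ["pinot-server-0", "pinot-server-1", "pinot-server-2"]  # hypothetical hosts
TABLE = "myTable_OFFLINE"      # hypothetical table name (with type suffix)
SEGMENT = "myTable_OFFLINE_0"  # hypothetical segment name

for host in SERVERS:
    # Each server reports metadata for its own local copy of the segment;
    # a stale replica shows up as a differing doc count / CRC here.
    url = f"http://{host}:8097/tables/{TABLE}/segments/{SEGMENT}/metadata"
    print(host, requests.get(url).json())
```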
@g.kishore: That’s bizarre
@jackie.jxt: @elon.azoulay Did you replace this segment? I think what might have happened is that one server somehow didn't receive the message to re-download the segment, and thus still keeps the old segment. Restarting the server should pick up the new segment
@prachiprakash80: @prachiprakash80 has joined the channel
@orajason: @orajason has joined the channel
@satish: @satish has joined the channel
@terodeakshay: @terodeakshay has joined the channel
@abprakash2003: @abprakash2003 has joined the channel
#pinot-dev
@prachiprakash80: @prachiprakash80 has joined the channel
#getting-started
@prachiprakash80: @prachiprakash80 has joined the channel
#debug_upsert
@matteo.santero: Hello, thank you very much to all of you for the info. For the moment we bypassed the case by using a rank & rownum (in case of exact duplicates) = 1 on Presto; the strange thing is that the web interface was showing it in the desired way. Thank you very much again for all the info. I'm pasting the note from Yupeng here so we have everything in the same chat. Yupeng Fu: “yes, upsert is only for realtime table now. there is an ongoing PR to address upsert table with longer retention” Design doc:
