jhyao commented on issue #11948:
URL: https://github.com/apache/pinot/issues/11948#issuecomment-1797769758
After publishing the first 1M ids, the producer continued to send another 2M
records with the same ids as the first 1M, so those 2M are upserts.
The producer code looks like this:
```python
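import json

# producer, TOPIC, CONTENT_LENGTH, get_time, generate_random_string and
# generate_random_number are assumed to be defined elsewhere in the script.

def generate_record(id):
    record = {
        'UID': id,
        'UpdatedTime': get_time(),
        'Content': generate_random_string(CONTENT_LENGTH)
    }
    for i in range(1, 11):
        record[f'JTD{i}'] = generate_random_number()
    return json.dumps(record)

# Publish the same 1M ids three times; rounds after the first are upserts.
for i in range(3):
    for id in range(1_000_000):
        record = generate_record(id)
        producer.send(TOPIC, key=str(id).encode(), value=record.encode())
        if id % 10000 == 0:
            print(f'Published {i} round, {id} messages')
            producer.flush()
# Flush whatever is still buffered.
producer.flush()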
```
I tested again without upsert and the issue did not occur, so it only affects
upsert tables.
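
For reference, the row counts this test should produce can be sketched as below. This is a minimal illustration of primary-key upsert semantics, not Pinot code. It assumes `UID` is the primary key and `UpdatedTime` is the comparison column, and that a record replaces the stored one when its comparison value is not older. The helper name `apply_upserts` is hypothetical, and the run is scaled down to 1,000 ids.

```python
def apply_upserts(records, key="UID", comparison="UpdatedTime"):
    """Keep only the latest record per primary key (assumed semantics)."""
    latest = {}
    for record in records:
        current = latest.get(record[key])
        # Replace when there is no stored record or the new one is not older.
        if current is None or record[comparison] >= current[comparison]:
            latest[record[key]] = record
    return latest


# Scaled-down version of the test: 3 rounds over the same 1,000 ids.
NUM_IDS = 1_000
ROUNDS = 3
stream = [
    {"UID": uid, "UpdatedTime": round_no, "Content": f"r{round_no}"}
    for round_no in range(ROUNDS)
    for uid in range(NUM_IDS)
]

table = apply_upserts(stream)
print(len(stream))  # rows a non-upsert table would hold
print(len(table))   # rows an upsert table should expose
```

Under these assumptions, a non-upsert table ends up with `ROUNDS * NUM_IDS` rows, while an upsert table exposes `NUM_IDS` rows, each from the last round.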
