izhangzhihao opened a new issue, #3776:
URL: https://github.com/apache/paimon/issues/3776

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.
   
   
   ### Paimon version
   
   0.9.0
   
   ### Compute Engine
   
   flink 1.17.2
   
   ### Minimal reproduce step
   
   ```sql
   CREATE TABLE IF NOT EXISTS a_one_billion_table (
       id STRING,
       ... ...
       PRIMARY KEY (id) NOT ENFORCED
   ) WITH ('bucket' = '-1');

   -- Make sure you have already inserted more than 1 billion rows into table
   -- `a_one_billion_table`, then run the new Flink job below. Even though
   -- `source_table` may only have 10000 records, the checkpoint will fail.

   INSERT INTO a_one_billion_table
   SELECT * FROM source_table;
   ```
   
   ### What doesn't meet your expectations?
   
   The checkpoint failed with the following error:
   
   ```
   2024-07-18 16:09:32,895 WARN  org.apache.flink.runtime.taskmanager.Task [] - dynamic-bucket-assigner (1/1)#0 switched from RUNNING to FAILED with failure cause:
   java.lang.IllegalArgumentException: Too large (1466616922 expected elements with load factor 0.75)
        at org.apache.paimon.shade.it.unimi.dsi.fastutil.HashCommon.arraySize(HashCommon.java:208)
        at org.apache.paimon.shade.it.unimi.dsi.fastutil.ints.Int2ShortOpenHashMap.<init>(Int2ShortOpenHashMap.java:103)
        at org.apache.paimon.shade.it.unimi.dsi.fastutil.ints.Int2ShortOpenHashMap.<init>(Int2ShortOpenHashMap.java:116)
        at org.apache.paimon.utils.Int2ShortHashMap.<init>(Int2ShortHashMap.java:35)
        at org.apache.paimon.utils.Int2ShortHashMap$Builder.build(Int2ShortHashMap.java:70)
        at org.apache.paimon.index.PartitionIndex.loadIndex(PartitionIndex.java:138)
        at org.apache.paimon.index.HashBucketAssigner.loadIndex(HashBucketAssigner.java:166)
        at org.apache.paimon.index.HashBucketAssigner.assign(HashBucketAssigner.java:83)
        at org.apache.paimon.flink.sink.HashBucketAssignerOperator.processElement(HashBucketAssignerOperator.java:98)
        at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:246)
        at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:217)
        at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:169)
        at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:68)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:616)
        at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:1080)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:1029)
        at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:959)
        at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:938)
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:751)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:567)
        at java.lang.Thread.run(Thread.java:879) [?:1.8.0_372]
   ```
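
   For context, the exception comes from fastutil refusing to allocate a hash table large enough for ~1.47 billion expected keys: the backing array must be a power of two at least `ceil(expected / loadFactor)` entries, and fastutil caps it at 2^30. Below is a minimal Python sketch of that size check (an illustration of `HashCommon.arraySize`'s documented behavior, not the shaded Paimon source):

   ```python
   import math

   # fastutil's largest allowed power-of-two table size (1 << 30 entries).
   MAX_CAPACITY = 1 << 30

   def array_size(expected: int, load_factor: float = 0.75) -> int:
       """Smallest power of two >= ceil(expected / load_factor), capped at 2**30."""
       needed = max(2, 1 << (math.ceil(expected / load_factor) - 1).bit_length())
       if needed > MAX_CAPACITY:
           # Mirrors fastutil's IllegalArgumentException message.
           raise ValueError(
               f"Too large ({expected} expected elements with load factor {load_factor})"
           )
       return needed

   # 1466616922 / 0.75 is about 1.96e9; the next power of two is 2**31,
   # which exceeds the 2**30 cap, so the check raises -- matching the log above.
   ```

   In other words, once a single partition's index grows past roughly `0.75 * 2^30` (~805 million) keys, `Int2ShortOpenHashMap` cannot be constructed at all, so `PartitionIndex.loadIndex` fails before any record is processed.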
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] I'm willing to submit a PR!
