Kabir,

Could you share the contents of your commit metadata? You can list the timeline, find the latest commit in it, run cat on that commit file, and paste whatever part of the output you are able to share.
Thanks,
Nishith

On Tue, Jul 2, 2019 at 4:53 PM Kabeer Ahmed <kab...@linuxmail.org> wrote:

> Hi Vinoth and other HUDI Experts,
>
> I am stuck while processing inserts into HUDI. The process picks up CSV
> files and loads them into HUDI. The process seems to be stuck at:
> https://github.com/apache/incubator-hudi/blob/master/hoodie-client/src/main/java/com/uber/hoodie/table/HoodieCopyOnWriteTable.java#L679
> Log is below:
>
> 2019-07-02 22:43:31,875 [main] INFO
> com.uber.hoodie.table.HoodieCopyOnWriteTable - AvgRecordSize =>
> 9223372036854775807
> 2019-07-02 22:43:31,969 [main] INFO
> com.uber.hoodie.table.HoodieCopyOnWriteTable - For partitionPath :
> 2018/05/30 Small Files => [SmallFile {location=HoodieRecordLocation
> {commitTime=20190702161750, fileId=39cff0df-24e4-45b8-bff5-9b4f41c4096a},
> sizeBytes=435362}]
> 2019-07-02 22:43:31,969 [main] INFO
> com.uber.hoodie.table.HoodieCopyOnWriteTable - After small file assignment:
> unassignedInserts => 8, totalInsertBuckets => 2147483647, recordsPerBucket
> => 0
>
> Looking at the last line in the log: "unassignedInserts => 8,
> totalInsertBuckets => 2147483647, recordsPerBucket => 0", this causes the
> below code to loop for quite long causing heap issues.
>
>     logger.info(
>         "After small file assignment: unassignedInserts => " + totalUnassignedInserts
>             + ", totalInsertBuckets => " + insertBuckets + ", recordsPerBucket => "
>             + insertRecordsPerBucket);
>     for (int b = 0; b < insertBuckets; b++) {
>       bucketNumbers.add(totalBuckets);
>       recordsPerBucket.add(totalUnassignedInserts / insertBuckets);
>       BucketInfo bucketInfo = new BucketInfo();
>       bucketInfo.bucketType = BucketType.INSERT;
>       bucketInfoMap.put(totalBuckets, bucketInfo);
>       totalBuckets++;
>     }
>
> Has someone seen the issue? Do I need to file a bug or it is something to
> do with my misconfiguration?
>
> Any help is highly appreciated.
>
> Thanks
> Kabeer.
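
For readers following the thread, the three numbers in the quoted log are consistent with a simple arithmetic chain. The sketch below is only an illustration of that chain, not the actual HoodieCopyOnWriteTable source. The class InsertBucketSketch, its method names, the 120 MB maximum file size, and the assumption that the bucket count comes from a floating-point division are all hypothetical. What the log does confirm is AvgRecordSize = 9223372036854775807 (Long.MAX_VALUE), recordsPerBucket = 0, and totalInsertBuckets = 2147483647 (Integer.MAX_VALUE).

```java
public class InsertBucketSketch {

    // Hypothetical helper: records per bucket as max file size divided by the
    // average record size. With avgRecordSizeBytes == Long.MAX_VALUE the
    // integer division truncates to 0.
    static long recordsPerBucket(long maxFileSizeBytes, long avgRecordSizeBytes) {
        return maxFileSizeBytes / avgRecordSizeBytes;
    }

    // Assumed form of the bucket computation: a floating-point division.
    // Dividing by 0 gives Infinity, and casting Infinity to int yields
    // Integer.MAX_VALUE instead of throwing an exception.
    static int insertBuckets(long unassignedInserts, long recordsPerBucket) {
        return (int) Math.ceil((1.0 * unassignedInserts) / recordsPerBucket);
    }

    // One possible guard (an illustration, not a proposed patch): never let
    // the divisor drop below 1, so the bucket count stays bounded by the
    // number of unassigned inserts.
    static int guardedInsertBuckets(long unassignedInserts, long recordsPerBucket) {
        long safe = Math.max(recordsPerBucket, 1L);
        return (int) Math.max(1L, (unassignedInserts + safe - 1) / safe);
    }

    public static void main(String[] args) {
        long avgRecordSize = Long.MAX_VALUE;    // value taken from the log
        long maxFileSize = 120L * 1024 * 1024;  // assumed limit, not from the thread
        long perBucket = recordsPerBucket(maxFileSize, avgRecordSize);

        System.out.println("recordsPerBucket => " + perBucket);
        System.out.println("totalInsertBuckets => " + insertBuckets(8, perBucket));
        System.out.println("guarded buckets => " + guardedInsertBuckets(8, perBucket));
    }
}
```

If this reading is right, the real question is why the average record size came out as Long.MAX_VALUE in the first place, which is what the commit metadata requested above should help answer.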