manoj-mathivanan commented on code in PR #18391:
URL: https://github.com/apache/kafka/pull/18391#discussion_r1904160619
##########
tools/src/main/java/org/apache/kafka/tools/ProducerPerformance.java:
##########
@@ -194,9 +195,16 @@ static List<byte[]> readPayloadFile(String payloadFilePath, String payloadDelimi
             throw new IllegalArgumentException("File does not exist or empty file provided.");
         }
-        String[] payloadList = Files.readString(path).split(payloadDelimiter);
+        List<String> payloadList = new ArrayList<>();
Review Comment:
   @m1a2st Thanks for looking into this. The problem here is that the entire file is loaded into a single String object. Even if the machine has enough memory, a String is backed by a single array and is therefore capped at the maximum array size, as seen here:
   https://github.com/openjdk/jdk/blob/f1d85ab3e61f923b4e120cf30e16109e04505b53/src/java.base/share/classes/java/lang/String.java#L568
   So it is better to split the content while reading it and store each delimited payload in an ArrayList. The ArrayList is also bounded by the maximum array size, but since it holds one reference per payload (rather than one slot per character), it can handle much larger files.
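   For illustration, here is a minimal sketch of the streaming approach described above. The method name, the Scanner-based split, and the UTF-8 charset are assumptions for the sketch, not the actual change in this PR:
   ```java
   import java.io.IOException;
   import java.nio.charset.StandardCharsets;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.Paths;
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Scanner;

   // Hypothetical helper illustrating the streaming split: the file is consumed
   // token by token, so no single String ever holds the whole file contents.
   static List<byte[]> readPayloadsIncrementally(String payloadFilePath, String payloadDelimiter) throws IOException {
       Path path = Paths.get(payloadFilePath);
       if (Files.notExists(path) || Files.size(path) == 0) {
           throw new IllegalArgumentException("File does not exist or empty file provided.");
       }
       List<byte[]> payloads = new ArrayList<>();
       try (Scanner scanner = new Scanner(Files.newBufferedReader(path))) {
           // Like String.split, Scanner.useDelimiter interprets the delimiter as a regex.
           scanner.useDelimiter(payloadDelimiter);
           while (scanner.hasNext()) {
               // Only the current token is materialized as a String before being
               // converted to bytes (UTF-8 here is an assumption for the sketch).
               payloads.add(scanner.next().getBytes(StandardCharsets.UTF_8));
           }
       }
       return payloads;
   }
   ```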