manoj-mathivanan commented on code in PR #18391:
URL: https://github.com/apache/kafka/pull/18391#discussion_r1904160619
##########
tools/src/main/java/org/apache/kafka/tools/ProducerPerformance.java:
##########
@@ -194,9 +195,16 @@ static List<byte[]> readPayloadFile(String payloadFilePath, String payloadDelimi
             throw new IllegalArgumentException("File does not exist or empty file provided.");
         }
-        String[] payloadList = Files.readString(path).split(payloadDelimiter);
+        List<String> payloadList = new ArrayList<>();
Review Comment:
   @m1a2st Thanks for looking into this. The problem here is that the entire file is loaded into a single String object. Even if the machine has enough memory, a String is backed by a single array and is therefore capped at the maximum array size, as seen here:
   https://github.com/openjdk/jdk/blob/f1d85ab3e61f923b4e120cf30e16109e04505b53/src/java.base/share/classes/java/lang/String.java#L568
   So it is better to split the content while reading it and store each delimited payload in an ArrayList. The ArrayList is also bounded by the maximum array size, but since it holds one reference per payload (rather than one slot per character), it can handle much larger files.
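   For illustration, here is a minimal sketch of the streaming approach described above. The method name, the Scanner-based split, and the UTF-8 charset are assumptions for the sketch, not the actual change in this PR:
   ```java
   import java.io.IOException;
   import java.nio.charset.StandardCharsets;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.Paths;
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Scanner;

   // Hypothetical helper illustrating the streaming split: the file is consumed
   // token by token, so no single String ever holds the whole file contents.
   static List<byte[]> readPayloadsIncrementally(String payloadFilePath, String payloadDelimiter) throws IOException {
       Path path = Paths.get(payloadFilePath);
       if (Files.notExists(path) || Files.size(path) == 0) {
           throw new IllegalArgumentException("File does not exist or empty file provided.");
       }
       List<byte[]> payloads = new ArrayList<>();
       try (Scanner scanner = new Scanner(Files.newBufferedReader(path))) {
           // Like String.split, Scanner.useDelimiter interprets the delimiter as a regex.
           scanner.useDelimiter(payloadDelimiter);
           while (scanner.hasNext()) {
               // Only the current token is materialized as a String before being
               // converted to bytes (UTF-8 here is an assumption for the sketch).
               payloads.add(scanner.next().getBytes(StandardCharsets.UTF_8));
           }
       }
       return payloads;
   }
   ```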