cxzl25 commented on code in PR #2371:
URL: https://github.com/apache/orc/pull/2371#discussion_r2344221223
##########
java/core/src/java/org/apache/orc/OrcConf.java:
##########
@@ -182,6 +187,9 @@ public enum OrcConf {
"added to all of the writers. Valid range is [1,10000] and is primarily meant for" +
"testing. Setting this too low may negatively affect performance."
+ " Use orc.stripe.row.count instead if the value larger than orc.stripe.row.count."),
+ STRIPE_SIZE_CHECK("orc.stripe.size.check", "hive.exec.orc.default.stripe.size.check",
+ 128L * 1024 * 1024,
Review Comment:
Could you consider deriving this threshold from the stripe size instead of
using a fixed default? For example: `orc.stripe.size` * ratio.
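To illustrate the idea, here is a minimal sketch of a threshold that scales with the stripe size. The class name, the `sizeCheckThreshold` helper, and the ratio parameter are hypothetical and are not existing ORC options or APIs; the 64 MiB stripe size is only an example value.

```java
public class StripeSizeCheckSketch {

  // Hypothetical helper: derive the size-check threshold from the stripe size.
  // A non-positive ratio is treated as "size check disabled" and returns 0.
  static long sizeCheckThreshold(long stripeSize, double ratio) {
    if (ratio <= 0) {
      return 0L;
    }
    return (long) (stripeSize * ratio);
  }

  public static void main(String[] args) {
    long stripeSize = 64L * 1024 * 1024; // example stripe size, 64 MiB
    // With a ratio of 0.5 the check threshold is half of the stripe size.
    System.out.println(sizeCheckThreshold(stripeSize, 0.5));
  }
}
```

With this shape, changing `orc.stripe.size` would move the check threshold along with it, so the two settings cannot drift apart.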
##########
java/core/src/java/org/apache/orc/impl/WriterImpl.java:
##########
@@ -325,9 +327,9 @@ public boolean checkMemory(double newScale) throws IOException {
}
private boolean checkMemory() throws IOException {
- if (rowsSinceCheck >= ROWS_PER_CHECK) {
+ long size = treeWriter.estimateMemory();
Review Comment:
I suggest calling `estimateMemory` less often: when the row-count check is
not yet due and the size check is disabled, skip the call.
```suggestion
long size = rowsSinceCheck < ROWS_PER_CHECK && STRIPE_SIZE_CHECK <= 0
? 0 : treeWriter.estimateMemory();
```
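The same gating can be tried outside the writer with a self-contained sketch. All names here (`CheckMemoryGateSketch`, `sizeForCheck`, the `LongSupplier` standing in for `treeWriter.estimateMemory()`) are stand-ins for illustration and are not ORC's actual classes; the numeric values are arbitrary.

```java
import java.util.function.LongSupplier;

public class CheckMemoryGateSketch {

  // Counts how often the (stand-in) estimator is actually invoked.
  static int estimateCalls = 0;

  // Returns the estimated size, or 0 without calling the estimator when the
  // row-count check is not due and the size check is disabled (<= 0).
  static long sizeForCheck(long rowsSinceCheck, long rowsPerCheck,
                           long stripeSizeCheck, LongSupplier estimator) {
    boolean rowCheckDue = rowsSinceCheck >= rowsPerCheck;
    boolean sizeCheckEnabled = stripeSizeCheck > 0;
    return (rowCheckDue || sizeCheckEnabled) ? estimator.getAsLong() : 0L;
  }

  public static void main(String[] args) {
    LongSupplier estimator = () -> {
      estimateCalls++;
      return 1024L; // arbitrary stand-in for an estimated size in bytes
    };
    sizeForCheck(10, 5000, 0, estimator);                  // skipped
    sizeForCheck(5000, 5000, 0, estimator);                // row check due
    sizeForCheck(10, 5000, 128L * 1024 * 1024, estimator); // size check enabled
    System.out.println(estimateCalls);
  }
}
```

Note that when the size check is enabled, the estimator still runs on every call; the saving only applies when the size check is disabled.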