bhisevishal commented on code in PR #25219:
URL: https://github.com/apache/beam/pull/25219#discussion_r1096131692
##########
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy:
##########
@@ -1553,7 +1553,7 @@ class BeamModulePlugin implements Plugin<Project> {
if (project.file("/opt/cprof/profiler_java_agent.so").exists()) {
def gcpProject = project.findProperty('gcpProject') ?:
'apache-beam-testing'
def userName =
System.getProperty("user.name").toLowerCase().replaceAll(" ", "_")
- jvmArgs
'-agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=' + userName + "_"
+ project.getProperty("benchmark").toLowerCase() + '_' +
System.currentTimeMillis() + ',-cprof_project_id=' + gcpProject +
',-cprof_zone_name=us-central1-a'
+ jvmArgs
'-agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=' + userName + "_"
+ project.getProperty("benchmark").toLowerCase() + '_' +
String.format('%1$tm%1$td%1$tY_%1$tH%1$tM%1$tS%1$tL',
System.currentTimeMillis()) + ',-cprof_project_id=' + gcpProject +
',-cprof_zone_name=us-central1-a'
Review Comment:
Should we start with year so that its always ascending order?
Separations between data and time be more readable.
And if ':' supported then we can use
'%1$tY%1$tm%1$td_%1$tH:%1$tM:%1$tS.%1$tL'
but if its overkill please ignore this comment.
##########
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/Caches.java:
##########
@@ -73,6 +77,25 @@ public static long weigh(Object o) {
}
}
+ /**
+ * Returns whether the cache should be updated in the case where the objects
size has changed.
+ *
+ * <p>Note that this should only be used in the case where the cache is
being updated very often
+ * in a tight loop and is not a good fit for cases where the object being
cached is the result of
+ * an expensive operation like a disk read or remote service call.
+ */
+ public static boolean shouldUpdateOnSizeChange(long oldSize, long newSize) {
+ /*
+ Our strategy is three fold:
+ - tiny objects (<= 2^WEIGHT_RATIO) don't change the amount being weighed
+ - large changes (>= CACHE_SIZE_CHANGE_LIMIT_BYTES) should always update
the size
+ - all others if the size changed by a factor of 2
+ */
+ return (oldSize > 1 << WEIGHT_RATIO || newSize > 1 << WEIGHT_RATIO)
Review Comment:
can we also convert this to cont so that more readable e.g lower limit or
lowe_detla_change some thing like it.
##########
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/Caches.java:
##########
@@ -53,12 +53,16 @@ public final class Caches {
*/
@VisibleForTesting static final int WEIGHT_RATIO = 6;
+ /** Objects which change in this amount should always update the cache. */
+ private static final long CACHE_SIZE_CHANGE_LIMIT_BYTES = 1 << 16;
Review Comment:
Can you add comment why its 2^16 why not less or why not more. Should this
value be const or should configurable based on total memory available.
Will this help to in avoiding OOMs we observed on n1-starandrad-1 machine
for certain workload? or this is unrelated change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]