[ https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15463720#comment-15463720 ]
Chris Douglas commented on HDFS-10742: -------------------------------------- bq. Removed per-method measurement and aggregate stats to minimize overhead for cheap operations Recording stack traces, etc. may be useful. I raised those overheads for two reasons. First, that diagnostic data should only be built when necessary. Previous versions of the patch generated diagnostic data that was ultimately discarded. Second, minutiae like "GC-friendliness", whether a wrapper class on {{Lock}} adds dispatch overhead, class loading overhead, etc. when this JIRA logs critical sections longer than *300ms* are irrelevant, literally by orders of magnitude. I'll ask again after the motivation. Is the intent to identify subsystems that do too much work (and/or I/O) in a critical section? To measure where the DN is spending cycles? In some of these cases, stack traces could be very helpful. As long as this telemetry can be tuned by configuration, it shouldn't create new problems for operators. To that point, instead of reusing the {{FsDatasetImpl}} log, this may want to use its own (so disabling/redirecting it doesn't affect other, critical information written to that log). bq. Was your concern was with exposing AutoCloseableLock in hadoop utils or just an overall objection to this idiom even if it's not exposed outside the DN. I have two objections, one to the idiom and one to the design. {{AutoCloseableLock}} is a new abstraction that doesn't elide any details of the thing it obscures. Every path for {{Lock}} objects is possible, so it doesn't make it easier to reason about lock correctness. It creates a special type for the _simplest_ possible case. This case is so simple, we have automated tooling to detect it. Further, instead of every Java developer looking at a {{Lock}} and scanning for well-worn expectations, we introduce an abstraction that requires investigation. No advantage beyond developer convenience has been raised, and it is a demonstrably _inconvenient_ abstraction. If the interface were limited so it could _only_ be used in the resource idiom (as [proposed|https://s.apache.org/Eyaw] earlier), then a developer could know that the {{Lock}} acquisition was limited to scope. As semantic sugar it would still be pointless, because Java programmers also know how to use {{synchronized}}. This could be redeemed by changing the design. If the {{AutoCloseableLock}} constructor accepted a {{Lock}} instance and cut its API to scope-limited changes, then telemetry like this could be added as implementations of the {{Lock}} type with all the perceived advantages of ACL. Devs can reason about the simpler API of ACL (scope-based acquisition), but still monitor all acquisitions as required by this JIRA (impractical to implement using {{synchronized}}). I remain skeptical of {{AutoCloseableLock}}, since one could just use the {{Lock}} in interface and existing tooling to implement the same. But if its virtues are obvious to everyone but me I won't stand in the way of including it. > Measurement of lock held time in FsDatasetImpl > ---------------------------------------------- > > Key: HDFS-10742 > URL: https://issues.apache.org/jira/browse/HDFS-10742 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode > Affects Versions: 3.0.0-alpha2 > Reporter: Chen Liang > Assignee: Chen Liang > Attachments: HDFS-10742.001.patch, HDFS-10742.002.patch, > HDFS-10742.003.patch, HDFS-10742.004.patch, HDFS-10742.005.patch, > HDFS-10742.006.patch, HDFS-10742.007.patch, HDFS-10742.008.patch, > HDFS-10742.009.patch, HDFS-10742.010.patch, HDFS-10742.011.patch > > > This JIRA proposes to measure the time the of lock of {{FsDatasetImpl}} is > held by a thread. Doing so will allow us to measure lock statistics. > This can be done by extending the {{AutoCloseableLock}} lock object in > {{FsDatasetImpl}}. In the future we can also consider replacing the lock with > a read-write lock. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org