Why did you not change the exception caught in SubSystem.java:getStringValue to 
PrivilegedActionException from IOException
so it’s consistent with the other get functions?

Bob.


> On Dec 6, 2019, at 8:41 PM, Daniil Titov <daniil.x.ti...@oracle.com> wrote:
> 
> Hi David, Mandy, and Bob,
> 
> Thank you for reviewing this fix.
> 
> Please review a new version of the fix [1] that includes the following 
> changes comparing to the previous version of the webrev ( webrev.04)
> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to 
> CSR [3] were discarded.
> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, 
> getFreeSwapSpaceSize and getTotalSwapSpaceSize
>     was also reverted to webrev.03 version that return host's values if the 
> metrics are unavailable or cannot be properly read.
>     I would like to mention that  currently the native implementation of 
> these methods de-facto may return -1 at some circumstances,
>     but I agree that the changes proposed in the previous version of the 
> webrev increase such probability.
>     I filed the follow-up issue [4] as Mandy suggested.
> 3.  The legacy methods were renamed as David suggested.
> 
> 
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>> !     static int initialized=1;
>> 
>> Am I reading this right that the code currently fails to actually do the
>> initialization because of this ???
> 
> Yes, currently the code fails to do the initialization but it was unnoticed 
> since method 
> get_cpuload_internal(...) was never called for a specific CPU, the first 
> parameter "which"
> was always -1.
> 
>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>> 
>> System.out.println(String.format(...)
>> 
>> Why not simply
>> 
>> System.out.printf(..)
> 
> As I tried explain it earlier it would make the tests unstable.
> System.out.printf(...) just delegates the call to System.out.format(...) that 
> doesn't emit the string atomically.
> Instead it parses the format string into a list of FormatString objects and 
> then iterates over the list.
> As a result, the other traces occasionally got printed between these 
> iterations  and break the pattern the test is expected to find
> in the output.
> 
> For example, here is the sample of a such output that has the trace message 
> printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
> and "1030762496".
> 
> <skipped>
> [0.304s][trace][os,container] Memory Usage is: 42983424
> OperatingSystemMXBean.getFreeMemorySize: 1030758400
> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is 
> /sys/fs/cgroup/memory/memory.usage_in_bytes
> [0.305s][trace][os,container] Memory Usage is: 42979328
> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is 
> /sys/fs/cgroup/memory/memory.usage_in_bytes
> OperatingSystemMXBean.getFreePhysicalMemorySize: 
> [0.306s][trace][os,container] Memory Usage is: 42975232
> 1030762496
> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
> 
> <skipped>
> java.lang.RuntimeException: 
> 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from 
> stdout/stderr 
> 
>       at 
> jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>       at 
> TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>       at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>       at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>       at 
> com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>       at java.base/java.lang.Thread.run(Thread.java:832)
> 
> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker 
> tests passed. Tier4-tier6 tests are still running.
> 
> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> [4] https://bugs.openjdk.java.net/browse/JDK-8235522 
> 
> Thank you,
> Daniil
> 
> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.ch...@oracle.com> wrote:
> 
> 
> 
>    On 12/6/19 5:59 AM, Bob Vandette wrote:
>>> On Dec 6, 2019, at 2:49 AM, David Holmes<david.hol...@oracle.com>  wrote:
>>> 
>>> 
>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>> 
>>> The changes to allow for a return of -1 are somewhat more extensive than we 
>>> have previously discussed. These methods previously were (per the spec) 
>>> guaranteed to return some (assumably) meaningful value but now they are 
>>> effectively allowed to fail by returning -1. No existing code is expecting 
>>> to have to handle a return of -1 so I see this as a significant 
>>> compatibility issue.
> 
>    I thought that the error case we are referring to is limit == 0 which 
>    indicates something unexpected goes wrong.  So the compatibility concern 
>    should be low.  This is very specific to Metrics implementation for 
>    cgroup v1 and let me know if I'm wrong.
> 
>>> Surely there must always be some information available from the operating 
>>> environment? I see from the impl file:
>>> 
>>>    // the host data, value 0 indicates that something went wrong while the 
>>> metric was read and
>>>   // in this case we return "information unavailable" code -1.
>>> 
>>> I don't agree with this. If the container metrics are messed up somehow we 
>>> should either fallback to the host value or else abort with some kind of 
>>> exception. Returning -1 is not an option here IMO.
>> I agree with David on the compatibility concern.  I originally thought that 
>> -1 was already a specified return for all of these methods.
>> Since the 0 return failure from the Metrics API should only occur if one of 
>> the cgroup subsystems is not enabled while others
>> are, I’d suggest we keep Daniil’s original logic to fall back to the host 
>> value since a disabled subsystem is equivalent to no
>> limits.
>> 
> 
>    It's important to consider carefully if the monitoring API indicates an 
>    error vs unavailable and an application should continue to run when the 
>    monitoring system fails to get the metrics.
> 
>    There are several choices to report "something goes wrong" scenarios 
>    (should unlikely happen???):
>    1. fall back to a random positive value  (e.g. host value)
>    2. return a negative value
>    3. throw an exception
> 
>    #3 is not an option as the application is not expecting this.  For #2, 
>    the application can filter bad values if desirable.
> 
>    I'm okay if you want to file a JBS issue to follow up and thoroughly 
>    look at the cases that the metrics are unavailable and the cases when 
>    fails to obtain.
> 
>>> ---
>>> 
>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>> 
>>> System.out.println(String.format(...)
>>> 
>>> Why not simply
>>> 
>>> System.out.printf(..)
>>> 
>>> ?
> 
>    or simply (as I commented [1])
>         System.out.format
> 
>    Mandy
>    [1] 
>    
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
> 
> 
> 
> 

Reply via email to