Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Bob, David, Mandy, and Serguei, Thank you for reviewing this change! Best regards, Daniil On 12/11/19, 9:21 AM, "Bob Vandette" wrote: Yes, I defer to Mandy on the best way to express the various Java exceptions. I’m ok with the changes. Thanks for getting this done for JDK14! Bob. > On Dec 11, 2019, at 12:12 PM, Daniil Titov wrote: > > Hi Serguei, > > Thank you for your comments. I will correct this nits before pushing the changes. > > Hi Bob and David, > >> [Mandy Chung] >>> I reviewed Metrics and Subsystem in this version. >>> I don't need to see a new webrev. > > As I understood Mandy finished reviewing this fix. Just wanted to confirm with you if you are okey with that version of the fix (webrev.06) ? > > Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed. > > Thank you, > Daniil > > > > On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" wrote: > >Hi Daniil, > >It is not a full review, just some minor comments. >In fact, I do not see real problems yet. > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM >runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without >specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on >how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): > 101 return limit - usage; > > 81 // If this happens just retry the loop for >a few iterations > > Dot is missed at the end of comment. > > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html > > 34 System.out.println(String.format("Runtime.availableProcessors: >%d", Runtime.getRuntime().availableProcessors())); > 35 > System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: >%d", osBean.getAvailableProcessors())); > 36 > System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: >%d", osBean.getTotalMemorySize())); > 37 > System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: >%d", osBean.getTotalPhysicalMemorySize())); > 38 > System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: >%d", osBean.getFreeMemorySize())); > 39 > System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: >%d", osBean.getFreePhysicalMemorySize())); > 40 > System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: >%d", osBean.getTotalSwapSpaceSize())); > 41 > System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: >%d", osBean.getFreeSwapSpaceSize())); > 42 >System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", >osBean.getCpuLoad())); > 43 > System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: >%f", osBean.getSystemCpuLoad())); > > > To make the above lines a little bit shorter I'd suggest to define a >log() method like this: > private static void log(String msg) ( System.out.println(msg(; } > > 34 log(String.format("Runtime.availableProcessors: %d", >Runtime.getRuntime().availableProcessors())); > 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: >%d",
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Daniil, Got it, thanks! Serguei On 12/11/19 15:33, Daniil Titov wrote: Hi Serguei, Thank you for reviewing this change. Just wanted to add that the only "volatile" metrics are "usage" ones ( memoryUsage and memoryAndSwapLimit). The "limit" metrics (memoryLimit and memoryAndSwapLimit) are set when the container starts and are not subjects to change. The only method that reads more than one "volatile" metric is getFreeSwapSpaceSize() and it has a code that retries if the calculated swapUsage is negative as a result of non-atomic reads. Thank you, Daniil On 12/11/19, 3:13 PM, "serguei.spit...@oracle.com" wrote: Hi Daniil, One my concerns was a non-atomic read of multiple metrics before comparison. It creates a potential to get a mismatch in result. However, the probability to get a negative value is pretty low, I think. The other concern (if incorrect metrics are returned) is covered by JDK-8235522. Revising all concerns in JDK-8235522 sounds good to me. Thanks, Serguei On 12/10/19 10:29, Daniil Titov wrote: > Hi Serguei, > >>Do we need to check if the (limit - memLimit) value is negative? >>The same question is for getFreeSwapSpaceSize(): >> memSwapLimit - memLimit - (memSwapUsage - memUsage) >> >>and getFreeMemorySize(): >> 101 return limit - usage; > I don't think we need such check here. If it happens in fact it means the serious system malfunction and a negative value this method > returns would indicate this (currently the native methods already returns -1 if something went wrong). But we could revise it in the follow > up issue I created for that [1]. > > [1] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" wrote: > > Hi Daniil, > > It is not a full review, just some minor comments. > In fact, I do not see real problems yet. > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM > runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without > specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on > how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): > 101 return limit - usage; > > 81 // If this happens just retry the loop for > a few iterations > > Dot is missed at the end of comment. > > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html > > 34 System.out.println(String.format("Runtime.availableProcessors: > %d", Runtime.getRuntime().availableProcessors())); > 35 > System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: > %d", osBean.getAvailableProcessors())); > 36 > System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: > %d", osBean.getTotalMemorySize())); > 37 > System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: > %d", osBean.getTotalPhysicalMemorySize())); > 38 > System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: > %d", osBean.getFreeMemorySize())); > 39 > System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: > %d",
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Serguei, Thank you for reviewing this change. Just wanted to add that the only "volatile" metrics are "usage" ones ( memoryUsage and memoryAndSwapLimit). The "limit" metrics (memoryLimit and memoryAndSwapLimit) are set when the container starts and are not subjects to change. The only method that reads more than one "volatile" metric is getFreeSwapSpaceSize() and it has a code that retries if the calculated swapUsage is negative as a result of non-atomic reads. Thank you, Daniil On 12/11/19, 3:13 PM, "serguei.spit...@oracle.com" wrote: Hi Daniil, One my concerns was a non-atomic read of multiple metrics before comparison. It creates a potential to get a mismatch in result. However, the probability to get a negative value is pretty low, I think. The other concern (if incorrect metrics are returned) is covered by JDK-8235522. Revising all concerns in JDK-8235522 sounds good to me. Thanks, Serguei On 12/10/19 10:29, Daniil Titov wrote: > Hi Serguei, > >>Do we need to check if the (limit - memLimit) value is negative? >>The same question is for getFreeSwapSpaceSize(): >> memSwapLimit - memLimit - (memSwapUsage - memUsage) >> >>and getFreeMemorySize(): >> 101 return limit - usage; > I don't think we need such check here. If it happens in fact it means the serious system malfunction and a negative value this method > returns would indicate this (currently the native methods already returns -1 if something went wrong). But we could revise it in the follow > up issue I created for that [1]. > > [1] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" wrote: > > Hi Daniil, > > It is not a full review, just some minor comments. > In fact, I do not see real problems yet. > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM > runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without > specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on > how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): > 101 return limit - usage; > > 81 // If this happens just retry the loop for > a few iterations > > Dot is missed at the end of comment. > > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html > > 34 System.out.println(String.format("Runtime.availableProcessors: > %d", Runtime.getRuntime().availableProcessors())); > 35 > System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: > %d", osBean.getAvailableProcessors())); > 36 > System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: > %d", osBean.getTotalMemorySize())); > 37 > System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: > %d", osBean.getTotalPhysicalMemorySize())); > 38 > System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: > %d", osBean.getFreeMemorySize())); > 39 > System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: > %d", osBean.getFreePhysicalMemorySize())); > 40 >
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Typo fixed... .. that the only "volatile" metrics are "usage" ones ( memoryUsage and *memoryAndSwapUsage*). Best regards, Daniil On 12/11/19, 3:33 PM, "Daniil Titov" wrote: Hi Serguei, Thank you for reviewing this change. Just wanted to add that the only "volatile" metrics are "usage" ones ( memoryUsage and memoryAndSwapLimit). The "limit" metrics (memoryLimit and memoryAndSwapLimit) are set when the container starts and are not subjects to change. The only method that reads more than one "volatile" metric is getFreeSwapSpaceSize() and it has a code that retries if the calculated swapUsage is negative as a result of non-atomic reads. Thank you, Daniil On 12/11/19, 3:13 PM, "serguei.spit...@oracle.com" wrote: Hi Daniil, One my concerns was a non-atomic read of multiple metrics before comparison. It creates a potential to get a mismatch in result. However, the probability to get a negative value is pretty low, I think. The other concern (if incorrect metrics are returned) is covered by JDK-8235522. Revising all concerns in JDK-8235522 sounds good to me. Thanks, Serguei On 12/10/19 10:29, Daniil Titov wrote: > Hi Serguei, > >>Do we need to check if the (limit - memLimit) value is negative? >>The same question is for getFreeSwapSpaceSize(): >> memSwapLimit - memLimit - (memSwapUsage - memUsage) >> >>and getFreeMemorySize(): >> 101 return limit - usage; > I don't think we need such check here. If it happens in fact it means the serious system malfunction and a negative value this method > returns would indicate this (currently the native methods already returns -1 if something went wrong). But we could revise it in the follow > up issue I created for that [1]. > > [1] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" wrote: > > Hi Daniil, > > It is not a full review, just some minor comments. > In fact, I do not see real problems yet. > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM > runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without > specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on > how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): > 101 return limit - usage; > > 81 // If this happens just retry the loop for > a few iterations > > Dot is missed at the end of comment. > > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html > > 34 System.out.println(String.format("Runtime.availableProcessors: > %d", Runtime.getRuntime().availableProcessors())); > 35 > System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: > %d", osBean.getAvailableProcessors())); > 36 > System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: > %d", osBean.getTotalMemorySize()));
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Daniil, One my concerns was a non-atomic read of multiple metrics before comparison. It creates a potential to get a mismatch in result. However, the probability to get a negative value is pretty low, I think. The other concern (if incorrect metrics are returned) is covered by JDK-8235522. Revising all concerns in JDK-8235522 sounds good to me. Thanks, Serguei On 12/10/19 10:29, Daniil Titov wrote: Hi Serguei, Do we need to check if the (limit - memLimit) value is negative? The same question is for getFreeSwapSpaceSize(): memSwapLimit - memLimit - (memSwapUsage - memUsage) and getFreeMemorySize(): 101 return limit - usage; I don't think we need such check here. If it happens in fact it means the serious system malfunction and a negative value this method returns would indicate this (currently the native methods already returns -1 if something went wrong). But we could revise it in the follow up issue I created for that [1]. [1] https://bugs.openjdk.java.net/browse/JDK-8235522 Thank you, Daniil On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" wrote: Hi Daniil, It is not a full review, just some minor comments. In fact, I do not see real problems yet. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html 55 public long getTotalSwapSpaceSize() { 56 if (containerMetrics != null) { 57 long limit = containerMetrics.getMemoryAndSwapLimit(); 58 // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) 59 // or if a docker container was started without specifying a memory limit ( without '--memory=' 60 // Docker option). In latter case there is no limit on how much memory the container can use and 61 // it can use as much memory as the host's OS allows. 62 long memLimit = containerMetrics.getMemoryLimit(); 63 if (limit >= 0 && memLimit >= 0) { 64 return limit - memLimit; 65 } 66 } 67 return getTotalSwapSpaceSize0(); 68 } Unneeded space after brackets '('. Do we need to check if the (limit - memLimit) value is negative? The same question is for getFreeSwapSpaceSize(): memSwapLimit - memLimit - (memSwapUsage - memUsage) and getFreeMemorySize(): 101 return limit - usage; 81 // If this happens just retry the loop for a few iterations Dot is missed at the end of comment. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html 34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize())); 38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize())); 39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize())); 40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize())); 41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize())); 42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad())); 43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad())); To make the above lines a little bit shorter I'd suggest to define a log() method like this: private static void log(String msg) ( System.out.println(msg(; } 34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
Re: 8226575: OperatingSystemMXBean should be made container aware
Yes, I defer to Mandy on the best way to express the various Java exceptions. I’m ok with the changes. Thanks for getting this done for JDK14! Bob. > On Dec 11, 2019, at 12:12 PM, Daniil Titov wrote: > > Hi Serguei, > > Thank you for your comments. I will correct this nits before pushing the > changes. > > Hi Bob and David, > >> [Mandy Chung] >>> I reviewed Metrics and Subsystem in this version. >>> I don't need to see a new webrev. > > As I understood Mandy finished reviewing this fix. Just wanted to confirm > with you if you are okey with that version of the fix (webrev.06) ? > > Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker > tests passed. > > Thank you, > Daniil > > > > On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" > wrote: > >Hi Daniil, > >It is not a full review, just some minor comments. >In fact, I do not see real problems yet. > > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM >runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without >specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on >how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): > 101 return limit - usage; > > 81 // If this happens just retry the loop for >a few iterations > > Dot is missed at the end of comment. > > > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html > > 34 System.out.println(String.format("Runtime.availableProcessors: >%d", Runtime.getRuntime().availableProcessors())); > 35 > > System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: > >%d", osBean.getAvailableProcessors())); > 36 > > System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: >%d", osBean.getTotalMemorySize())); > 37 > > System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: > >%d", osBean.getTotalPhysicalMemorySize())); > 38 >System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: >%d", osBean.getFreeMemorySize())); > 39 > > System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: > >%d", osBean.getFreePhysicalMemorySize())); > 40 > > System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: > >%d", osBean.getTotalSwapSpaceSize())); > 41 > > System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: >%d", osBean.getFreeSwapSpaceSize())); > 42 >System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", >osBean.getCpuLoad())); > 43 >System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: >%f", osBean.getSystemCpuLoad())); > > > To make the above lines a little bit shorter I'd suggest to define a >log() method like this: > private static void log(String msg) ( System.out.println(msg(; } > > 34 log(String.format("Runtime.availableProcessors: %d", >Runtime.getRuntime().availableProcessors())); > 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: >%d", osBean.getAvailableProcessors())); > 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", >osBean.getTotalMemorySize())); > 37 >log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: >%d", osBean.getTotalPhysicalMemorySize())); > 38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", >osBean.getFreeMemorySize())); > 39 >log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", >osBean.getFreePhysicalMemorySize())); > 40
Re: 8226575: OperatingSystemMXBean should be made container aware
Hi Serguei, Thank you for your comments. I will correct this nits before pushing the changes. Hi Bob and David, > [Mandy Chung] >> I reviewed Metrics and Subsystem in this version. >> I don't need to see a new webrev. As I understood Mandy finished reviewing this fix. Just wanted to confirm with you if you are okey with that version of the fix (webrev.06) ? Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed. Thank you, Daniil On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" wrote: Hi Daniil, It is not a full review, just some minor comments. In fact, I do not see real problems yet. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html 55 public long getTotalSwapSpaceSize() { 56 if (containerMetrics != null) { 57 long limit = containerMetrics.getMemoryAndSwapLimit(); 58 // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) 59 // or if a docker container was started without specifying a memory limit ( without '--memory=' 60 // Docker option). In latter case there is no limit on how much memory the container can use and 61 // it can use as much memory as the host's OS allows. 62 long memLimit = containerMetrics.getMemoryLimit(); 63 if (limit >= 0 && memLimit >= 0) { 64 return limit - memLimit; 65 } 66 } 67 return getTotalSwapSpaceSize0(); 68 } Unneeded space after brackets '('. Do we need to check if the (limit - memLimit) value is negative? The same question is for getFreeSwapSpaceSize(): memSwapLimit - memLimit - (memSwapUsage - memUsage) and getFreeMemorySize(): 101 return limit - usage; 81 // If this happens just retry the loop for a few iterations Dot is missed at the end of comment. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html 34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize())); 38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize())); 39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize())); 40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize())); 41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize())); 42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad())); 43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad())); To make the above lines a little bit shorter I'd suggest to define a log() method like this: private static void log(String msg) ( System.out.println(msg(; } 34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize())); 38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize())); 39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize())); 40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize())); 41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize())); 42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad())); 43
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Serguei, > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): >101 return limit - usage; I don't think we need such check here. If it happens in fact it means the serious system malfunction and a negative value this method returns would indicate this (currently the native methods already returns -1 if something went wrong). But we could revise it in the follow up issue I created for that [1]. [1] https://bugs.openjdk.java.net/browse/JDK-8235522 Thank you, Daniil On 12/9/19, 6:02 PM, "serguei.spit...@oracle.com" wrote: Hi Daniil, It is not a full review, just some minor comments. In fact, I do not see real problems yet. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html 55 public long getTotalSwapSpaceSize() { 56 if (containerMetrics != null) { 57 long limit = containerMetrics.getMemoryAndSwapLimit(); 58 // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) 59 // or if a docker container was started without specifying a memory limit ( without '--memory=' 60 // Docker option). In latter case there is no limit on how much memory the container can use and 61 // it can use as much memory as the host's OS allows. 62 long memLimit = containerMetrics.getMemoryLimit(); 63 if (limit >= 0 && memLimit >= 0) { 64 return limit - memLimit; 65 } 66 } 67 return getTotalSwapSpaceSize0(); 68 } Unneeded space after brackets '('. Do we need to check if the (limit - memLimit) value is negative? The same question is for getFreeSwapSpaceSize(): memSwapLimit - memLimit - (memSwapUsage - memUsage) and getFreeMemorySize(): 101 return limit - usage; 81 // If this happens just retry the loop for a few iterations Dot is missed at the end of comment. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html 34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize())); 38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize())); 39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize())); 40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize())); 41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize())); 42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad())); 43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad())); To make the above lines a little bit shorter I'd suggest to define a log() method like this: private static void log(String msg) ( System.out.println(msg(; } 34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize())); 38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize())); 39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize())); 40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize())); 41
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi David, > Please file a follow up RFE to look into this. I created an issue to follow this up [1] [1] https://bugs.openjdk.java.net/browse/JDK-8235681 Thank you, Daniil On 12/10/19, 2:11 AM, "David Holmes" wrote: On 10/12/2019 5:31 am, Daniil Titov wrote: > Hi David, > >> So we never try to access the uninitialized counters.cpus array which is >> good but we still return garbage for counters.jvmTicks and >>counters.cpuTicks - surely that should have been noticeable? > > It only affected the first time the CPU load was requested. Function get_cpuload_internal(...) > calls get_totalticks() and get_jvmticks() functions that update these counters. > But on the first call, yes, it compares the newly received counters with the garbage. > > > It has the code that seems to be written to somehow mitigate it and in worse case just return > 0 or 1.0. But it also could be that there is some other problem this code tries to solve so I'm not sure > we should remove these workarounds as a part of the current fix. Please file a follow up RFE to look into this. Thanks, David > 274 // seems like we sometimes end up with less kernel ticks when > 275 // reading /proc/self/stat a second time, timing issue between cpus? > 276 if (pticks->usedKernel < tmp.usedKernel) { > 277 kdiff = 0; > 278 } else { > 279 kdiff = pticks->usedKernel - tmp.usedKernel; > 280 } > 281 tdiff = pticks->total - tmp.total; > 282 udiff = pticks->used - tmp.used; > 283 > 284 if (tdiff == 0) { > 285 user_load = 0; > 286 } else { > 287 if (tdiff < (udiff + kdiff)) { > 288 tdiff = udiff + kdiff; > 289 } > 290 *pkernelLoad = (kdiff / (double)tdiff); > 291 // BUG9044876, normalize return values to sane values > 292 *pkernelLoad = MAX(*pkernelLoad, 0.0); > 293 *pkernelLoad = MIN(*pkernelLoad, 1.0); > 294 > 295 user_load = (udiff / (double)tdiff); > 296 user_load = MAX(user_load, 0.0); > 297 user_load = MIN(user_load, 1.0); > 298 } > 299 } > > Best regards, > Daniil > > On 12/8/19, 8:49 PM, "David Holmes" wrote: > > Hi Daniil, > > On 7/12/2019 11:41 am, Daniil Titov wrote: > > Hi David, Mandy, and Bob, > > > > Thank you for reviewing this fix. > > > > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) > > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. > > Okay. > > > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. > > Okay. > > > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > > but I agree that the changes proposed in the previous version of the webrev increase such probability. > > I filed the follow-up issue [4] as Mandy suggested. > > I added a comment to the bug. This is potentially a difficult problem to > resolve - it all depends on the likelihood of any errors and what they > really indicate. > > > 3. The legacy methods were renamed as David suggested. > > Thanks! > > > > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c > >> ! static int initialized=1; > >> > >> Am I reading this right that the code currently fails to actually do the > >> initialization because of this ??? > > > > Yes, currently the code fails to do the initialization but it was unnoticed since method > > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > > was always -1. > > So we never try to access the uninitialized counters.cpus array which is > good but we still return garbage for counters.jvmTicks and > counters.cpuTicks - surely that should have been noticeable? > > >>
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 10/12/2019 5:31 am, Daniil Titov wrote: Hi David, So we never try to access the uninitialized counters.cpus array which is good but we still return garbage for counters.jvmTicks and counters.cpuTicks - surely that should have been noticeable? It only affected the first time the CPU load was requested. Function get_cpuload_internal(...) calls get_totalticks() and get_jvmticks() functions that update these counters. But on the first call, yes, it compares the newly received counters with the garbage. It has the code that seems to be written to somehow mitigate it and in worse case just return 0 or 1.0. But it also could be that there is some other problem this code tries to solve so I'm not sure we should remove these workarounds as a part of the current fix. Please file a follow up RFE to look into this. Thanks, David 274 // seems like we sometimes end up with less kernel ticks when 275 // reading /proc/self/stat a second time, timing issue between cpus? 276 if (pticks->usedKernel < tmp.usedKernel) { 277 kdiff = 0; 278 } else { 279 kdiff = pticks->usedKernel - tmp.usedKernel; 280 } 281 tdiff = pticks->total - tmp.total; 282 udiff = pticks->used - tmp.used; 283 284 if (tdiff == 0) { 285 user_load = 0; 286 } else { 287 if (tdiff < (udiff + kdiff)) { 288 tdiff = udiff + kdiff; 289 } 290 *pkernelLoad = (kdiff / (double)tdiff); 291 // BUG9044876, normalize return values to sane values 292 *pkernelLoad = MAX(*pkernelLoad, 0.0); 293 *pkernelLoad = MIN(*pkernelLoad, 1.0); 294 295 user_load = (udiff / (double)tdiff); 296 user_load = MAX(user_load, 0.0); 297 user_load = MIN(user_load, 1.0); 298 } 299 } Best regards, Daniil On 12/8/19, 8:49 PM, "David Holmes" wrote: Hi Daniil, On 7/12/2019 11:41 am, Daniil Titov wrote: > Hi David, Mandy, and Bob, > > Thank you for reviewing this fix. > > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. Okay. > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. Okay. > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > but I agree that the changes proposed in the previous version of the webrev increase such probability. > I filed the follow-up issue [4] as Mandy suggested. I added a comment to the bug. This is potentially a difficult problem to resolve - it all depends on the likelihood of any errors and what they really indicate. > 3. The legacy methods were renamed as David suggested. Thanks! > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> ! static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the >> initialization because of this ??? > > Yes, currently the code fails to do the initialization but it was unnoticed since method > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > was always -1. So we never try to access the uninitialized counters.cpus array which is good but we still return garbage for counters.jvmTicks and counters.cpuTicks - surely that should have been noticeable? >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. Sorry I missed the earlier explanation. I find it somewhat surprising that format() works that way, but without unlimited buffering there will
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 12/9/19 3:47 PM, Daniil Titov wrote: Hi Mandy and Bob, Please review a new version of the webrev [1] that moves doPrivileged calls in jdk.internal.platform.cgroupv1.SubSystem to separate methods that throw IOException, as Mandy suggested. Mach5 tests are still running. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.06/ I reviewed Metrics and Subsystem in this version. I think it's simpler to have unwrapIOExceptionAndRethrow handling the InternalError case. + List lines = subsystem.readMatchingLines(param); + for (String line : lines) { if (line.startsWith(match)) { retval = conversion.apply(line); break; } }This can simply call Metrics::readFilePrivileged and process on the Stream. return Metrics::readFilePrivileged(Paths.get(subsystem.path(), param)) .filter(line -> line.startsWith(match)) .map(conversion::apply) .findFirst().orElseGet(() ->retval); I don't need to see a new webrev. Mandy
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Daniil, It is not a full review, just some minor comments. In fact, I do not see real problems yet. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html 55 public long getTotalSwapSpaceSize() { 56 if (containerMetrics != null) { 57 long limit = containerMetrics.getMemoryAndSwapLimit(); 58 // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) 59 // or if a docker container was started without specifying a memory limit ( without '--memory=' 60 // Docker option). In latter case there is no limit on how much memory the container can use and 61 // it can use as much memory as the host's OS allows. 62 long memLimit = containerMetrics.getMemoryLimit(); 63 if (limit >= 0 && memLimit >= 0) { 64 return limit - memLimit; 65 } 66 } 67 return getTotalSwapSpaceSize0(); 68 } Unneeded space after brackets '('. Do we need to check if the (limit - memLimit) value is negative? The same question is for getFreeSwapSpaceSize(): memSwapLimit - memLimit - (memSwapUsage - memUsage) and getFreeMemorySize(): 101 return limit - usage; 81 // If this happens just retry the loop for a few iterations Dot is missed at the end of comment. http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html 34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize())); 38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize())); 39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize())); 40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize())); 41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize())); 42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad())); 43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad())); To make the above lines a little bit shorter I'd suggest to define a log() method like this: private static void log(String msg) ( System.out.println(msg(; } 34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors())); 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors())); 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize())); 37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize())); 38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize())); 39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize())); 40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize())); 41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize())); 42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad())); 43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad())); Thanks, Serguei On 12/6/19 17:41, Daniil Titov wrote: Hi David, Mandy, and Bob, Thank you for reviewing this fix. Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, but I agree that the changes proposed in the previous version of the webrev increase such probability. I filed the follow-up issue [4] as
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Mandy and Bob, Please review a new version of the webrev [1] that moves doPrivileged calls in jdk.internal.platform.cgroupv1.SubSystem to separate methods that throw IOException, as Mandy suggested. Mach5 tests are still running. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.06/ [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 Thank you, Daniil On 12/9/19, 1:55 PM, "Mandy Chung" wrote: On 12/9/19 10:51 AM, Daniil Titov wrote: > Hi Mandy and Bob, > >> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException >> so it’s consistent with the other get functions? > In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put > the whole code block in doPrivileged. On the other side I don't believe that BufferedReader.readline() requires FilePermission checks ( and tests proved that) > so we could change this implementation to the following: > > public static String getStringValue(SubSystem subsystem, String parm) { > if (subsystem == null) return null; > > try (BufferedReader bufferedReader = > AccessController.doPrivileged((PrivilegedExceptionAction) () -> { > return Files.newBufferedReader(Paths.get(subsystem.path(), parm)); > })) { > return bufferedReader.readLine(); > } catch (PrivilegedActionException | IOException e) { > return null; > } > } > > Could you please advise are you OK with it or you would like to proceed with the approach Mandy suggested to unwrap > PrivilegedActionException exception and throw the cause instead? > I think it's simpler to read and understand if the doPrivileged call is moved out as a separate method that will throw IOException as the expected functionality as suggested above. For SubSystem::getStringValue, one suggestion would be: diff --git a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java --- a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java +++ b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java @@ -29,7 +29,11 @@ import java.io.IOException; import java.math.BigInteger; import java.nio.file.Files; +import java.nio.file.Path; import java.nio.file.Paths; +import java.security.AccessController; +import java.security.PrivilegedActionException; +import java.security.PrivilegedExceptionAction; import java.util.ArrayList; import java.util.List; import java.util.Optional; @@ -90,9 +94,8 @@ public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; -try(BufferedReader bufferedReader = Files.newBufferedReader(Paths.get(subsystem.path(), parm))) { -String line = bufferedReader.readLine(); -return line; +try { +return subsystem.readStringValue(parm); } catch (IOException e) { return null; @@ -100,6 +103,24 @@ } +private String readStringValue(String param) throws IOException { +PrivilegedExceptionAction pea = () -> Files.newBufferedReader(Paths.get(path(), param)); +try (BufferedReader bufferedReader = AccessController.doPrivileged(pea)) { +String line = bufferedReader.readLine(); +return line; +} catch (PrivilegedActionException e) { +Throwable x = e.getCause(); +if (x instanceof IOException) +throw (IOException)x; +if (x instanceof RuntimeException) +throw (RuntimeException)x; +if (x instanceof Error) +throw (Error)x; + +throw new InternalError(x); +} +} +
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 12/9/19 10:51 AM, Daniil Titov wrote: Hi Mandy and Bob, Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException so it’s consistent with the other get functions? In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put the whole code block in doPrivileged. On the other side I don't believe that BufferedReader.readline() requires FilePermission checks ( and tests proved that) so we could change this implementation to the following: public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; try (BufferedReader bufferedReader = AccessController.doPrivileged((PrivilegedExceptionAction) () -> { return Files.newBufferedReader(Paths.get(subsystem.path(), parm)); })) { return bufferedReader.readLine(); } catch (PrivilegedActionException | IOException e) { return null; } } Could you please advise are you OK with it or you would like to proceed with the approach Mandy suggested to unwrap PrivilegedActionException exception and throw the cause instead? I think it's simpler to read and understand if the doPrivileged call is moved out as a separate method that will throw IOException as the expected functionality as suggested above. For SubSystem::getStringValue, one suggestion would be: diff --git a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java --- a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java +++ b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java @@ -29,7 +29,11 @@ import java.io.IOException; import java.math.BigInteger; import java.nio.file.Files; +import java.nio.file.Path; import java.nio.file.Paths; +import java.security.AccessController; +import java.security.PrivilegedActionException; +import java.security.PrivilegedExceptionAction; import java.util.ArrayList; import java.util.List; import java.util.Optional; @@ -90,9 +94,8 @@ public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; - try(BufferedReader bufferedReader = Files.newBufferedReader(Paths.get(subsystem.path(), parm))) { - String line = bufferedReader.readLine(); - return line; + try { + return subsystem.readStringValue(parm); } catch (IOException e) { return null; @@ -100,6 +103,24 @@ } + private String readStringValue(String param) throws IOException { + PrivilegedExceptionAction pea = () -> Files.newBufferedReader(Paths.get(path(), param)); + try (BufferedReader bufferedReader = AccessController.doPrivileged(pea)) { + String line = bufferedReader.readLine(); + return line; + } catch (PrivilegedActionException e) { + Throwable x = e.getCause(); + if (x instanceof IOException) + throw (IOException)x; + if (x instanceof RuntimeException) + throw (RuntimeException)x; + if (x instanceof Error) + throw (Error)x; + + throw new InternalError(x); + } + } +
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi David, > So we never try to access the uninitialized counters.cpus array which is >good but we still return garbage for counters.jvmTicks and > counters.cpuTicks - surely that should have been noticeable? It only affected the first time the CPU load was requested. Function get_cpuload_internal(...) calls get_totalticks() and get_jvmticks() functions that update these counters. But on the first call, yes, it compares the newly received counters with the garbage. It has the code that seems to be written to somehow mitigate it and in worse case just return 0 or 1.0. But it also could be that there is some other problem this code tries to solve so I'm not sure we should remove these workarounds as a part of the current fix. 274 // seems like we sometimes end up with less kernel ticks when 275 // reading /proc/self/stat a second time, timing issue between cpus? 276 if (pticks->usedKernel < tmp.usedKernel) { 277 kdiff = 0; 278 } else { 279 kdiff = pticks->usedKernel - tmp.usedKernel; 280 } 281 tdiff = pticks->total - tmp.total; 282 udiff = pticks->used - tmp.used; 283 284 if (tdiff == 0) { 285 user_load = 0; 286 } else { 287 if (tdiff < (udiff + kdiff)) { 288 tdiff = udiff + kdiff; 289 } 290 *pkernelLoad = (kdiff / (double)tdiff); 291 // BUG9044876, normalize return values to sane values 292 *pkernelLoad = MAX(*pkernelLoad, 0.0); 293 *pkernelLoad = MIN(*pkernelLoad, 1.0); 294 295 user_load = (udiff / (double)tdiff); 296 user_load = MAX(user_load, 0.0); 297 user_load = MIN(user_load, 1.0); 298 } 299 } Best regards, Daniil On 12/8/19, 8:49 PM, "David Holmes" wrote: Hi Daniil, On 7/12/2019 11:41 am, Daniil Titov wrote: > Hi David, Mandy, and Bob, > > Thank you for reviewing this fix. > > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. Okay. > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. Okay. > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > but I agree that the changes proposed in the previous version of the webrev increase such probability. > I filed the follow-up issue [4] as Mandy suggested. I added a comment to the bug. This is potentially a difficult problem to resolve - it all depends on the likelihood of any errors and what they really indicate. > 3. The legacy methods were renamed as David suggested. Thanks! > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> ! static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the >> initialization because of this ??? > > Yes, currently the code fails to do the initialization but it was unnoticed since method > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > was always -1. So we never try to access the uninitialized counters.cpus array which is good but we still return garbage for counters.jvmTicks and counters.cpuTicks - surely that should have been noticeable? >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. Sorry I missed the earlier explanation. I find it somewhat surprising that format() works that way, but without unlimited buffering there will always be a need to flush the outputstream at some point. Thanks, David - > For example, here is the sample of a such output that has the trace message
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
A correction... We could even further simplify it as the following: public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; try (BufferedReader bufferedReader = AccessController.doPrivileged((PrivilegedExceptionAction) () -> Files.newBufferedReader(Paths.get(subsystem.path(), parm { return bufferedReader.readLine(); } catch (PrivilegedActionException | IOException e) { return null; } } Best regards, Daniil On 12/9/19, 10:51 AM, "Daniil Titov" wrote: Hi Mandy and Bob, > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException > so it’s consistent with the other get functions? In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put the whole code block in doPrivileged. On the other side I don't believe that BufferedReader.readline() requires FilePermission checks ( and tests proved that) so we could change this implementation to the following: public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; try (BufferedReader bufferedReader = AccessController.doPrivileged((PrivilegedExceptionAction) () -> { return Files.newBufferedReader(Paths.get(subsystem.path(), parm)); })) { return bufferedReader.readLine(); } catch (PrivilegedActionException | IOException e) { return null; } } Could you please advise are you OK with it or you would like to proceed with the approach Mandy suggested to unwrap PrivilegedActionException exception and throw the cause instead? Thank you, Daniil On 12/9/19, 9:48 AM, "Mandy Chung" wrote: Files:lines requires FilePermission check. So it needs to be wrapped with doPrivileged. The readFilePrivileged can unwrap and throw the cause instead like this: static Stream readFilePrivileged(Path path) throws IOException { try { return AccessController.doPrivileged((PrivilegedExceptionAction>) () -> Files.lines(path)); } catch (PrivilegedActionException e) { Throwable x = e.getCause(); if (x instanceof IOException) throw (IOException)x; if (x instanceof RuntimeException) throw (RuntimeException)x; if (x instanceof Error) throw (Error)x; throw new InternalError(x); } } On 12/9/19 7:17 AM, Bob Vandette wrote: > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException > so it’s consistent with the other get functions? > > Bob. > > >> On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote: >> >> Hi David, Mandy, and Bob, >> >> Thank you for reviewing this fix. >> >> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) >> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. >> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize >> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. >> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, >> but I agree that the changes proposed in the previous version of the webrev increase such probability. >> I filed the follow-up issue [4] as Mandy suggested. >> 3. The legacy methods were renamed as David suggested. >> >> >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>> ! static int initialized=1; >>> >>> Am I reading this right that the code currently fails to actually do the >>> initialization because of this ??? >> Yes, currently the code fails to do the initialization but it was unnoticed since method >> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" >> was always -1. >> >>>
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Mandy and Bob, > Why did you not change the exception caught in SubSystem.java:getStringValue > to PrivilegedActionException from IOException > so it’s consistent with the other get functions? In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put the whole code block in doPrivileged. On the other side I don't believe that BufferedReader.readline() requires FilePermission checks ( and tests proved that) so we could change this implementation to the following: public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; try (BufferedReader bufferedReader = AccessController.doPrivileged((PrivilegedExceptionAction) () -> { return Files.newBufferedReader(Paths.get(subsystem.path(), parm)); })) { return bufferedReader.readLine(); } catch (PrivilegedActionException | IOException e) { return null; } } Could you please advise are you OK with it or you would like to proceed with the approach Mandy suggested to unwrap PrivilegedActionException exception and throw the cause instead? Thank you, Daniil On 12/9/19, 9:48 AM, "Mandy Chung" wrote: Files:lines requires FilePermission check. So it needs to be wrapped with doPrivileged. The readFilePrivileged can unwrap and throw the cause instead like this: static Stream readFilePrivileged(Path path) throws IOException { try { return AccessController.doPrivileged((PrivilegedExceptionAction>) () -> Files.lines(path)); } catch (PrivilegedActionException e) { Throwable x = e.getCause(); if (x instanceof IOException) throw (IOException)x; if (x instanceof RuntimeException) throw (RuntimeException)x; if (x instanceof Error) throw (Error)x; throw new InternalError(x); } } On 12/9/19 7:17 AM, Bob Vandette wrote: > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException > so it’s consistent with the other get functions? > > Bob. > > >> On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote: >> >> Hi David, Mandy, and Bob, >> >> Thank you for reviewing this fix. >> >> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) >> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. >> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize >> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. >> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, >> but I agree that the changes proposed in the previous version of the webrev increase such probability. >> I filed the follow-up issue [4] as Mandy suggested. >> 3. The legacy methods were renamed as David suggested. >> >> >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>> ! static int initialized=1; >>> >>> Am I reading this right that the code currently fails to actually do the >>> initialization because of this ??? >> Yes, currently the code fails to do the initialization but it was unnoticed since method >> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" >> was always -1. >> >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>> >>> System.out.println(String.format(...) >>> >>> Why not simply >>> >>> System.out.printf(..) >> As I tried explain it earlier it would make the tests unstable. >> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. >> Instead it parses the format string into a list of FormatString objects and then iterates over the list. >> As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find >> in the output. >> >> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" >> and "1030762496". >> >> >> [0.304s][trace][os,container] Memory Usage is: 42983424 >> OperatingSystemMXBean.getFreeMemorySize: 1030758400 >>
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Files:lines requires FilePermission check. So it needs to be wrapped with doPrivileged. The readFilePrivileged can unwrap and throw the cause instead like this: static Stream readFilePrivileged(Path path) throws IOException { try { return AccessController.doPrivileged((PrivilegedExceptionAction>) () -> Files.lines(path)); } catch (PrivilegedActionException e) { Throwable x = e.getCause(); if (x instanceof IOException) throw (IOException)x; if (x instanceof RuntimeException) throw (RuntimeException)x; if (x instanceof Error) throw (Error)x; throw new InternalError(x); } } On 12/9/19 7:17 AM, Bob Vandette wrote: Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException so it’s consistent with the other get functions? Bob. On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote: Hi David, Mandy, and Bob, Thank you for reviewing this fix. Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, but I agree that the changes proposed in the previous version of the webrev increase such probability. I filed the follow-up issue [4] as Mandy suggested. 3. The legacy methods were renamed as David suggested. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c ! static int initialized=1; Am I reading this right that the code currently fails to actually do the initialization because of this ??? Yes, currently the code fails to do the initialization but it was unnoticed since method get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" was always -1. test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java System.out.println(String.format(...) Why not simply System.out.printf(..) As I tried explain it earlier it would make the tests unstable. System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. Instead it parses the format string into a list of FormatString objects and then iterates over the list. As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find in the output. For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" and "1030762496". [0.304s][trace][os,container] Memory Usage is: 42983424 OperatingSystemMXBean.getFreeMemorySize: 1030758400 [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes [0.305s][trace][os,container] Memory Usage is: 42979328 [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 1030762496 OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) at TestMemoryAwareness.main(TestMemoryAwareness.java:73) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) at java.base/java.lang.Thread.run(Thread.java:832) Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 [4] https://bugs.openjdk.java.net/browse/JDK-8235522 Thank you, Daniil On 12/6/19, 1:38 PM,
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException so it’s consistent with the other get functions? Bob. > On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote: > > Hi David, Mandy, and Bob, > > Thank you for reviewing this fix. > > Please review a new version of the fix [1] that includes the following > changes comparing to the previous version of the webrev ( webrev.04) > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to > CSR [3] were discarded. > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, > getFreeSwapSpaceSize and getTotalSwapSpaceSize > was also reverted to webrev.03 version that return host's values if the > metrics are unavailable or cannot be properly read. > I would like to mention that currently the native implementation of > these methods de-facto may return -1 at some circumstances, > but I agree that the changes proposed in the previous version of the > webrev increase such probability. > I filed the follow-up issue [4] as Mandy suggested. > 3. The legacy methods were renamed as David suggested. > > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> ! static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the >> initialization because of this ??? > > Yes, currently the code fails to do the initialization but it was unnoticed > since method > get_cpuload_internal(...) was never called for a specific CPU, the first > parameter "which" > was always -1. > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that > doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and > then iterates over the list. > As a result, the other traces occasionally got printed between these > iterations and break the pattern the test is expected to find > in the output. > > For example, here is the sample of a such output that has the trace message > printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > and "1030762496". > > > [0.304s][trace][os,container] Memory Usage is: 42983424 > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is > /sys/fs/cgroup/memory/memory.usage_in_bytes > [0.305s][trace][os,container] Memory Usage is: 42979328 > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is > /sys/fs/cgroup/memory/memory.usage_in_bytes > OperatingSystemMXBean.getFreePhysicalMemorySize: > [0.306s][trace][os,container] Memory Usage is: 42975232 > 1030762496 > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > java.lang.RuntimeException: > 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from > stdout/stderr > > at > jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > at > TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at > com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > at java.base/java.lang.Thread.run(Thread.java:832) > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker > tests passed. Tier4-tier6 tests are still running. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > >On 12/6/19 5:59 AM, Bob Vandette wrote: >>> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >>> >>> >>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >>> >>> The changes to allow for a return of -1 are somewhat more extensive than we >>> have previously discussed. These methods previously were (per the spec) >>> guaranteed to return some (assumably) meaningful value but now they are >>> effectively allowed to fail by returning -1. No existing code is expecting >>> to have to handle a return of -1 so I see this as a
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Daniil, On 7/12/2019 11:41 am, Daniil Titov wrote: Hi David, Mandy, and Bob, Thank you for reviewing this fix. Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. Okay. 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. Okay. I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, but I agree that the changes proposed in the previous version of the webrev increase such probability. I filed the follow-up issue [4] as Mandy suggested. I added a comment to the bug. This is potentially a difficult problem to resolve - it all depends on the likelihood of any errors and what they really indicate. 3. The legacy methods were renamed as David suggested. Thanks! src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c ! static int initialized=1; Am I reading this right that the code currently fails to actually do the initialization because of this ??? Yes, currently the code fails to do the initialization but it was unnoticed since method get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" was always -1. So we never try to access the uninitialized counters.cpus array which is good but we still return garbage for counters.jvmTicks and counters.cpuTicks - surely that should have been noticeable? test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java System.out.println(String.format(...) Why not simply System.out.printf(..) As I tried explain it earlier it would make the tests unstable. System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. Instead it parses the format string into a list of FormatString objects and then iterates over the list. As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find in the output. Sorry I missed the earlier explanation. I find it somewhat surprising that format() works that way, but without unlimited buffering there will always be a need to flush the outputstream at some point. Thanks, David - For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" and "1030762496". [0.304s][trace][os,container] Memory Usage is: 42983424 OperatingSystemMXBean.getFreeMemorySize: 1030758400 [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes [0.305s][trace][os,container] Memory Usage is: 42979328 [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 1030762496 OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) at TestMemoryAwareness.main(TestMemoryAwareness.java:73) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) at java.base/java.lang.Thread.run(Thread.java:832) Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 [4] https://bugs.openjdk.java.net/browse/JDK-8235522 Thank you, Daniil On 12/6/19, 1:38 PM, "Mandy Chung" wrote: On 12/6/19 5:59 AM, Bob Vandette wrote: >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >> >> >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >> >> The changes to allow for a return of -1 are somewhat more extensive than we have
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi David, Mandy, and Bob, Thank you for reviewing this fix. Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, but I agree that the changes proposed in the previous version of the webrev increase such probability. I filed the follow-up issue [4] as Mandy suggested. 3. The legacy methods were renamed as David suggested. > src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c > ! static int initialized=1; > > Am I reading this right that the code currently fails to actually do the > initialization because of this ??? Yes, currently the code fails to do the initialization but it was unnoticed since method get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" was always -1. > test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > > System.out.println(String.format(...) > > Why not simply > > System.out.printf(..) As I tried explain it earlier it would make the tests unstable. System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. Instead it parses the format string into a list of FormatString objects and then iterates over the list. As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find in the output. For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" and "1030762496". [0.304s][trace][os,container] Memory Usage is: 42983424 OperatingSystemMXBean.getFreeMemorySize: 1030758400 [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes [0.305s][trace][os,container] Memory Usage is: 42979328 [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 1030762496 OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) at TestMemoryAwareness.main(TestMemoryAwareness.java:73) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) at java.base/java.lang.Thread.run(Thread.java:832) Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 [4] https://bugs.openjdk.java.net/browse/JDK-8235522 Thank you, Daniil On 12/6/19, 1:38 PM, "Mandy Chung" wrote: On 12/6/19 5:59 AM, Bob Vandette wrote: >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >> >> >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >> >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. I thought that the error case we are referring to is limit == 0 which indicates something unexpected goes wrong. So the compatibility concern should be low. This is very specific to Metrics implementation for cgroup v1 and let me know if I'm wrong. >> Surely there must always be some information available from the operating environment?
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 12/6/19 5:59 AM, Bob Vandette wrote: On Dec 6, 2019, at 2:49 AM, David Holmes wrote: src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. I thought that the error case we are referring to is limit == 0 which indicates something unexpected goes wrong. So the compatibility concern should be low. This is very specific to Metrics implementation for cgroup v1 and let me know if I'm wrong. Surely there must always be some information available from the operating environment? I see from the impl file: // the host data, value 0 indicates that something went wrong while the metric was read and // in this case we return "information unavailable" code -1. I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others are, I’d suggest we keep Daniil’s original logic to fall back to the host value since a disabled subsystem is equivalent to no limits. It's important to consider carefully if the monitoring API indicates an error vs unavailable and an application should continue to run when the monitoring system fails to get the metrics. There are several choices to report "something goes wrong" scenarios (should unlikely happen???): 1. fall back to a random positive value (e.g. host value) 2. return a negative value 3. throw an exception #3 is not an option as the application is not expecting this. For #2, the application can filter bad values if desirable. I'm okay if you want to file a JBS issue to follow up and thoroughly look at the cases that the metrics are unavailable and the cases when fails to obtain. --- test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java System.out.println(String.format(...) Why not simply System.out.printf(..) ? or simply (as I commented [1]) System.out.format Mandy [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
+1 On 12/6/19 5:59 AM, Bob Vandette wrote: On Dec 6, 2019, at 2:49 AM, David Holmes wrote: Hi Daniil, I'm not familiar with all the details of the various API's involved here so just a few general comments in places. I do have one major issue flagged below. --- src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c ! static int initialized=1; Am I reading this right that the code currently fails to actually do the initialization because of this ??? Style nit: if(perfInit() space after "if" --- src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. Surely there must always be some information available from the operating environment? I see from the impl file: // the host data, value 0 indicates that something went wrong while the metric was read and // in this case we return "information unavailable" code -1. I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others are, I’d suggest we keep Daniil’s original logic to fall back to the host value since a disabled subsystem is equivalent to no limits. Bob. --- src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java Can you please rename the legacy methods so that, for example, getTotalMemorySize() calls getTotalMemorySize0() rather than getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the interface only. --- test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java System.out.println(String.format(...) Why not simply System.out.printf(..) ? --- Thanks, David - On 6/12/2019 11:03 am, Daniil Titov wrote: Hi Mandy and Bob, Thank you for your comments. Please review a new version of the fix [1] that makes OperatingSystemImpl methods return -1 if one of the metric has value 0. As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean indicating that methods could return -1 if the information is not available. There were no changes in CSR [3] yet, I plan to proceed with them after the fix is reviewed. In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html Shouldn’t you keep the IOException catch clauses in case the file is not found? There is no need in keeping IOException catch in these 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods). As I understand IOException catch was required only because File.lines() and File. readAllLines() can throw IOException. Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction) that wraps all checked exceptions in PrivilegedActionException that we are catching now instead of IOException. Here is the sampe of the stacktrace: java.security.PrivilegedActionException: java.io.FileNotFoundException at java.base/java.security.AccessController.doPrivileged(AccessController.java:558) at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113) at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390) at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109) at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36) Caused by: java.io.FileNotFoundException at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116) at java.base/java.security.AccessController.doPrivileged(AccessController.java:554) In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch IOException inside this code block or convert this block to PrivilegedExceptionAction and then put AccessController.doPrivileged call inside new try/catch Block to catch PrivilegedExceptionAction. The former approach looked more preferable. Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. [1]
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > > Hi Daniil, > > I'm not familiar with all the details of the various API's involved here so > just a few general comments in places. I do have one major issue flagged > below. > > --- > > src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c > > ! static int initialized=1; > > Am I reading this right that the code currently fails to actually do the > initialization because of this ??? > > Style nit: if(perfInit() > > space after "if" > > --- > > src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > > The changes to allow for a return of -1 are somewhat more extensive than we > have previously discussed. These methods previously were (per the spec) > guaranteed to return some (assumably) meaningful value but now they are > effectively allowed to fail by returning -1. No existing code is expecting to > have to handle a return of -1 so I see this as a significant compatibility > issue. Surely there must always be some information available from the > operating environment? I see from the impl file: > >// the host data, value 0 indicates that something went wrong while the > metric was read and > // in this case we return "information unavailable" code -1. > > I don't agree with this. If the container metrics are messed up somehow we > should either fallback to the host value or else abort with some kind of > exception. Returning -1 is not an option here IMO. I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others are, I’d suggest we keep Daniil’s original logic to fall back to the host value since a disabled subsystem is equivalent to no limits. Bob. > > --- > > src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java > src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java > > Can you please rename the legacy methods so that, for example, > getTotalMemorySize() calls getTotalMemorySize0() rather than > getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the > interface only. > > --- > > test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > > System.out.println(String.format(...) > > Why not simply > > System.out.printf(..) > > ? > > --- > > Thanks, > David > - > > On 6/12/2019 11:03 am, Daniil Titov wrote: >> Hi Mandy and Bob, >> Thank you for your comments. Please review a new version of the fix [1] that >> makes >> OperatingSystemImpl methods return -1 if one of the metric has value 0. >> As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean >> indicating that methods could return -1 if the information is not available. >> There were no changes in CSR [3] yet, I plan to proceed with them after the >> fix is >> reviewed. >>> In >>> http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html >>> Shouldn’t you keep the IOException catch clauses in case the file is not >>> found? >> There is no need in keeping IOException catch in these 2 places where it >> used to be (getLongValueMatchingLine and getLongEntry methods). >> As I understand IOException catch was required only because File.lines() and >> File. readAllLines() can throw IOException. >> Now these calls are performed inside >> AccessController.doPrivileged(PrivilegedExceptionAction) that wraps >> all checked exceptions in PrivilegedActionException that we are catching >> now instead of IOException. >> Here is the sampe of the stacktrace: >> java.security.PrivilegedActionException: java.io.FileNotFoundException >> at >> java.base/java.security.AccessController.doPrivileged(AccessController.java:558) >> at >> java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113) >> at >> java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390) >> at >> jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109) >> at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36) >> Caused by: java.io.FileNotFoundException >> at >> java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116) >> at >> java.base/java.security.AccessController.doPrivileged(AccessController.java:554) >> In getStringValue method the whole code block is now executed inside >> AccessController.doPrivileged() so we still need either catch >> IOException inside this code block or convert this block to >> PrivilegedExceptionAction and then put AccessController.doPrivileged >> call inside new try/catch Block to catch
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Daniil, I'm not familiar with all the details of the various API's involved here so just a few general comments in places. I do have one major issue flagged below. --- src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c ! static int initialized=1; Am I reading this right that the code currently fails to actually do the initialization because of this ??? Style nit: if(perfInit() space after "if" --- src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. Surely there must always be some information available from the operating environment? I see from the impl file: // the host data, value 0 indicates that something went wrong while the metric was read and // in this case we return "information unavailable" code -1. I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. --- src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java Can you please rename the legacy methods so that, for example, getTotalMemorySize() calls getTotalMemorySize0() rather than getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the interface only. --- test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java System.out.println(String.format(...) Why not simply System.out.printf(..) ? --- Thanks, David - On 6/12/2019 11:03 am, Daniil Titov wrote: Hi Mandy and Bob, Thank you for your comments. Please review a new version of the fix [1] that makes OperatingSystemImpl methods return -1 if one of the metric has value 0. As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean indicating that methods could return -1 if the information is not available. There were no changes in CSR [3] yet, I plan to proceed with them after the fix is reviewed. In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html Shouldn’t you keep the IOException catch clauses in case the file is not found? There is no need in keeping IOException catch in these 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods). As I understand IOException catch was required only because File.lines() and File. readAllLines() can throw IOException. Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction) that wraps all checked exceptions in PrivilegedActionException that we are catching now instead of IOException. Here is the sampe of the stacktrace: java.security.PrivilegedActionException: java.io.FileNotFoundException at java.base/java.security.AccessController.doPrivileged(AccessController.java:558) at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113) at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390) at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109) at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36) Caused by: java.io.FileNotFoundException at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116) at java.base/java.security.AccessController.doPrivileged(AccessController.java:554) In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch IOException inside this code block or convert this block to PrivilegedExceptionAction and then put AccessController.doPrivileged call inside new try/catch Block to catch PrivilegedExceptionAction. The former approach looked more preferable. Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/ [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 Thanks, Daniil On 12/5/19, 12:59 PM, "Mandy Chung" wrote: On 12/5/19 12:50 PM, Bob Vandette wrote: > It may worth considering adding Metrics::getSwapLimit and Metrics::getSwapUsage and move the computation to the implementation of Metrics.
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Mandy and Bob, Thank you for your comments. Please review a new version of the fix [1] that makes OperatingSystemImpl methods return -1 if one of the metric has value 0. As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean indicating that methods could return -1 if the information is not available. There were no changes in CSR [3] yet, I plan to proceed with them after the fix is reviewed. > In > http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html > Shouldn’t you keep the IOException catch clauses in case the file is not > found? There is no need in keeping IOException catch in these 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods). As I understand IOException catch was required only because File.lines() and File. readAllLines() can throw IOException. Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction) that wraps all checked exceptions in PrivilegedActionException that we are catching now instead of IOException. Here is the sampe of the stacktrace: java.security.PrivilegedActionException: java.io.FileNotFoundException at java.base/java.security.AccessController.doPrivileged(AccessController.java:558) at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113) at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390) at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109) at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36) Caused by: java.io.FileNotFoundException at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116) at java.base/java.security.AccessController.doPrivileged(AccessController.java:554) In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch IOException inside this code block or convert this block to PrivilegedExceptionAction and then put AccessController.doPrivileged call inside new try/catch Block to catch PrivilegedExceptionAction. The former approach looked more preferable. Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/ [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 Thanks, Daniil On 12/5/19, 12:59 PM, "Mandy Chung" wrote: On 12/5/19 12:50 PM, Bob Vandette wrote: > It may worth considering adding Metrics::getSwapLimit and Metrics::getSwapUsage and move the computation to the implementation of Metrics. Bob may have an opinion. >> There was no any new input regarding this so I decided to leave it unchanged. > Sorry, I didn’t respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries > due to the access of multiple changing values, I think it’s best to leave things as they are so the caller of > these methods understands the limitations of the API. OK with me. > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker > online documentation so it’s probably best to be consistent. > Also it seems correct for the memory related methods to check if (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). BTW what does it mean if limit == 0? >> Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes. >> And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible >> if something went wrong while retrieving this metric. > That is true but shouldn’t you return -1 in that case? > > I originally thought it was ok to fall back to the host data for 0 values but I think its better to return unavailable (-1) > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent. +1 The javadoc should be changed and returns -1 when it's unavailable and the CSR should also be updated to reflect this.I'm sure Joe can re-approve the CSR quickly when the fix is reviewed and approved. > You should only fall back to the original logic (host values) if container values are set to unlimited. > +1 Mandy
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 12/5/19 12:50 PM, Bob Vandette wrote: It may worth considering adding Metrics::getSwapLimit and Metrics::getSwapUsage and move the computation to the implementation of Metrics. Bob may have an opinion. There was no any new input regarding this so I decided to leave it unchanged. Sorry, I didn’t respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries due to the access of multiple changing values, I think it’s best to leave things as they are so the caller of these methods understands the limitations of the API. OK with me. Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker online documentation so it’s probably best to be consistent. Also it seems correct for the memory related methods to check if (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). BTW what does it mean if limit == 0? Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes. And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible if something went wrong while retrieving this metric. That is true but shouldn’t you return -1 in that case? I originally thought it was ok to fall back to the host data for 0 values but I think its better to return unavailable (-1) I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent. +1 The javadoc should be changed and returns -1 when it's unavailable and the CSR should also be updated to reflect this. I'm sure Joe can re-approve the CSR quickly when the fix is reviewed and approved. You should only fall back to the original logic (host values) if container values are set to unlimited. +1 Mandy
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html Shouldn’t you keep the IOException catch clauses in case the file is not found? > On Dec 5, 2019, at 2:31 PM, Daniil Titov wrote: > > Hi Mandy and Bob, > > Please review a new version of the webrev that addresses the most of these > issues: > > 1) The interface and spec [3] were updated to use default methods. CSR [3] > was re-approved. > > 2) Security-sensitive operations in j.i.p.cgroupv1.Metrics and in > j.i.p.cgroupv1. SubSystem > were wrapped with doPrivileged > > 3) getCpuLoad () method was optimized to fallback to getSystemCpuLoad0 if > the cpuset is identical to the host's one. > It uses sysconf(_SC_NPROCESSORS_CONF) to retrieve the number of CPUs > configured on the host . Testing with > different --cpuset-cpus settings inside a Docker container proved that it > always returns the same number of hosts configured > CPUs regardless of --cpuset-cpus settings while the same settings affect > getEffectiveCpuSetCpus and getCpuSetCpus() metrics. > >In addition, getCpuLoad () method now returns -1 in the cases when quotas > are active but cpu usage and cpu period metrics are not available and > in the case when for some reason it fails to retrieve a valid CPU load for > one of CPUs while iteration over them Shouldn't you do the same for getCpuLoad 149 int[] cpuSet = containerMetrics.getEffectiveCpuSetCpus(); 150 if (cpuSet != null && cpuSet.length > 0) { If cpuSet.length == 0? > >>> CheckOperatingSystemMXBean.java >>>System.out.println(String.format(...)) can simply be replaced with >>> System.out.format. > I had to leave it unchanged since replacing it with System.out.format > results in the tests instability as it makes the trace output > occasionally Intervene here (the trace message sometimes is printed inside > this message) and tests cannot find the expected > pattern in the output. > >>> It may worth considering adding Metrics::getSwapLimit and >>> Metrics::getSwapUsage and move the computation to the implementation of >>> Metrics. Bob may have an opinion. > > There was no any new input regarding this so I decided to leave it unchanged. Sorry, I didn’t respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries due to the access of multiple changing values, I think it’s best to leave things as they are so the caller of these methods understands the limitations of the API. Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker online documentation so it’s probably best to be consistent. > >>> Also it seems correct for the memory related methods to check if >>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). >>> BTW what does it mean if limit == 0? > > Per Docker docs the minimum allowed value for memory limit (--memory option) > is 4 megabytes. > And if memory limit is unset the return value is -1. Thus, in my > understanding the value 0 is only possible > if something went wrong while retrieving this metric. That is true but shouldn’t you return -1 in that case? I originally thought it was ok to fall back to the host data for 0 values but I think its better to return unavailable (-1) I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent. You should only fall back to the original logic (host values) if container values are set to unlimited. Bob. > > Testing: Mach5 tier1-tier6 tests (that include > open/test/hotspot/jtreg/containers/docker and : jdk_management tests) > passed. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/ > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thank you, > Daniil > > On 12/4/19, 4:09 PM, "Mandy Chung" wrote: > > > >On 12/3/19 9:40 PM, Daniil Titov wrote: >> Under what circumstance that limit or memLimit is < 0? >> The memory limit metrics is not available if JVM runs on Linux host ( not in >> a docker container) or if a docker container was started without >> specifying a memory limit ( without '--memory=' Docker option) . In latter >> there is no limit on how much memory the container can use and >> it can use as much memory as the host's OS allows. >> > >OK. Please add a comment to the code. > >It may worth considering adding Metrics::getSwapLimit and >Metrics::getSwapUsage and move the computation to the implementation of >Metrics. Bob may have an opinion. > >Also it seems correct for the memory related methods to check if >(containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). >BTW what does it mean if limit == 0? > Is it worth
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Mandy and Bob, Please review a new version of the webrev that addresses the most of these issues: 1) The interface and spec [3] were updated to use default methods. CSR [3] was re-approved. 2) Security-sensitive operations in j.i.p.cgroupv1.Metrics and in j.i.p.cgroupv1. SubSystem were wrapped with doPrivileged 3) getCpuLoad () method was optimized to fallback to getSystemCpuLoad0 if the cpuset is identical to the host's one. It uses sysconf(_SC_NPROCESSORS_CONF) to retrieve the number of CPUs configured on the host . Testing with different --cpuset-cpus settings inside a Docker container proved that it always returns the same number of hosts configured CPUs regardless of --cpuset-cpus settings while the same settings affect getEffectiveCpuSetCpus and getCpuSetCpus() metrics. In addition, getCpuLoad () method now returns -1 in the cases when quotas are active but cpu usage and cpu period metrics are not available and in the case when for some reason it fails to retrieve a valid CPU load for one of CPUs while iteration over them >> CheckOperatingSystemMXBean.java >> System.out.println(String.format(...)) can simply be replaced with >> System.out.format. I had to leave it unchanged since replacing it with System.out.format results in the tests instability as it makes the trace output occasionally Intervene here (the trace message sometimes is printed inside this message) and tests cannot find the expected pattern in the output. >> It may worth considering adding Metrics::getSwapLimit and >> Metrics::getSwapUsage and move the computation to the implementation of >> Metrics. Bob may have an opinion. There was no any new input regarding this so I decided to leave it unchanged. >>Also it seems correct for the memory related methods to check if >>(containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). >> BTW what does it mean if limit == 0? Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes. And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible if something went wrong while retrieving this metric. Testing: Mach5 tier1-tier6 tests (that include open/test/hotspot/jtreg/containers/docker and : jdk_management tests) passed. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/ [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 Thank you, Daniil On 12/4/19, 4:09 PM, "Mandy Chung" wrote: On 12/3/19 9:40 PM, Daniil Titov wrote: > >>> Under what circumstance that limit or memLimit is < 0? > The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without > specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and > it can use as much memory as the host's OS allows. > OK. Please add a comment to the code. It may worth considering adding Metrics::getSwapLimit and Metrics::getSwapUsage and move the computation to the implementation of Metrics. Bob may have an opinion. Also it seems correct for the memory related methods to check if (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). BTW what does it mean if limit == 0? >>> Is it worth specifying this case? > I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. > I was wondering if the javadoc should specify that. >>> It fallbacks to return the system's total swap space size - this is not really what it should report. > For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. > However, I am not sure how we could differentiate these 2 cases. As this is the case when the limit is not set in the container, it returns the system metrics which sounds appropriate. > >>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. > For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. > For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Will zero memory usage happen? > Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. > > For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 12/3/19 9:40 PM, Daniil Titov wrote: Under what circumstance that limit or memLimit is < 0? The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and it can use as much memory as the host's OS allows. OK. Please add a comment to the code. It may worth considering adding Metrics::getSwapLimit and Metrics::getSwapUsage and move the computation to the implementation of Metrics. Bob may have an opinion. Also it seems correct for the memory related methods to check if (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). BTW what does it mean if limit == 0? Is it worth specifying this case? I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. I was wondering if the javadoc should specify that. It fallbacks to return the system's total swap space size - this is not really what it should report. For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. However, I am not sure how we could differentiate these 2 cases. As this is the case when the limit is not set in the container, it returns the system metrics which sounds appropriate. Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Will zero memory usage happen? Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage are unavailable or if a valid CPU load for some CPU was not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather then falling back to getSystemCpuLoad0() returning -1 sounds right. src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java There is no strong need to make the deprecated methods as default methods. If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations. I could make these methods defaults if you feel it is a better approach here. It's not strictly needed but I can go either way. CheckOperatingSystemMXBean.java System.out.println(String.format(...)) can simply be replaced with System.out.format. I will include this change in the next webrev, thank you! thanks Mandy
Re: 8226575: OperatingSystemMXBean should be made container aware
On Wed, 2019-12-04 at 08:32 -0500, Bob Vandette wrote: > You can try to use containerMetrics.getPerCpuUsage() instead of > containerMetrics.getEffectiveCpuSetCpus(). > The length of the array returned is the number of host cpus. Maybe Severin > can confirm if this true in cgroupv2 as > well. If I'm not mistaken getPerCpuUsage() is not supported in cgroupv2. Thanks, Severin
Re: jmx-dev 8226575: OperatingSystemMXBean should be made container aware
> On Dec 4, 2019, at 8:32 AM, Bob Vandette wrote: > > >> On Dec 3, 2019, at 9:00 PM, Daniil Titov wrote: >> >> Hi Bob, >> It’s too bad getCpuLoad can’t detect that the cpuset is identical to the hosts in order to allow you to fallback to getSystemCpuLoad0. >> >> I think we can detect that the cpuset is identical to the host's one by >> comparing the length of the array containerMetrics.getEffectiveCpuSetCpus() >> returns >> to the number of the CPUs configured on the host and returned by >> sysconf(_SC_NPROCESSORS_CONF) . The latter could be retrieved by adding new >> native >> method to OperatingSystemImpl getConfiguredCpuCount0. If they match then we >> just fallback to getSystemCpuLoad0(). I did some testing on Linux host and >> inside Docker container with different '--cpuset-cpus' settings and it seems >> to work as expected. >> >> JNIEXPORT jint JNICALL >> Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0 >> (JNIEnv *env, jobject mbean) >> { >> if(perfInit() == 0) { >> return counters.nProcs; >> } else { >> return -1; >> } >> } >> >> >> If there is no objection I will include this change in the new webrev. > > I don’t think this approach will work. Both the array returned and the > sysconf(_SC_NPROCESSORS_CONF) > report the containers cpuset value so they will be equal causing you to > always fallback. > > You can try to use containerMetrics.getPerCpuUsage() instead of > containerMetrics.getEffectiveCpuSetCpus(). > The length of the array returned is the number of host cpus. Maybe Severin > can confirm if this true in cgroupv2 as > well. I just checked the webrev for the cgroupv2 implementation and getPerCpuUsage is not supported. I still think it’s worth implementing this optimization but it won’t be used on cgroupv2 since the array length (0) won’t be equal to _SC_NPROCESSORS_CONF. Here’s the cgroupv2 implementation of this method. 64 @Override 65 public long[] getPerCpuUsage() { 66 // Not supported 67 return new long[0]; 68 } Bob. > > Bob. > > >> >> Thank you, >> Daniil >> >> On 12/3/19, 1:30 PM, "Bob Vandette" wrote: >> >> Daniil, >> >> Looks good to me. >> >> If there are any management jtreg tests, I’d run these since your changes >> to OperatingSystemMXBean will >> alter the behavior of these methods even for Linux hosts since cgroups is >> typically enabled causing the >> container detection to report containerized. >> >> It’s too bad getCpuLoad can’t detect that the cpuset is identical to the >> hosts in order to allow you to fallback to >> getSystemCpuLoad0. >> >> >> Bob. >> >> >>> On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: >>> >>> Please review the change that makes OperatingSystemMXBean methods return >>> container specific information >>> rather than the host based data. >>> >>> The webrev [1] is based on the code Andrew and Severin initially provided >>> with some additional changes and combined >>> with the spec update David made [3]. >>> >>> The webrev corrects the implementation for the free/total swap methods as >>> Bob noted to subtract the total >>> and free memory from the returned values. >>> >>> It also corrects getCpuLoad() implementation, as Bob advised, to cover the >>> case when CPU quotas are not active. >>> >>> The webrev also takes into account the case when >>> java.security.AccessControlException exception is thrown >>> during the initialization of the container subsystem ( e.g. when >>> java.policy doesn’t grant "read" access to >>> "/proc/self/mountinfo" file). >>> >>> CSR for the spec changes [3] is approved. >>> >>> Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including >>> open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . >>> >>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ >>> [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 >>> [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 >>> >>> Thank you, >>> -Daniil >>> >>> >> >> >> >> >
Re: 8226575: OperatingSystemMXBean should be made container aware
> On Dec 3, 2019, at 9:00 PM, Daniil Titov wrote: > > Hi Bob, > >>> It’s too bad getCpuLoad can’t detect that the cpuset is identical to the >>> hosts in order to allow you to fallback to >>> getSystemCpuLoad0. > > I think we can detect that the cpuset is identical to the host's one by > comparing the length of the array containerMetrics.getEffectiveCpuSetCpus() > returns > to the number of the CPUs configured on the host and returned by > sysconf(_SC_NPROCESSORS_CONF) . The latter could be retrieved by adding new > native > method to OperatingSystemImpl getConfiguredCpuCount0. If they match then we > just fallback to getSystemCpuLoad0(). I did some testing on Linux host and > inside Docker container with different '--cpuset-cpus' settings and it seems > to work as expected. > > JNIEXPORT jint JNICALL > Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0 > (JNIEnv *env, jobject mbean) > { > if(perfInit() == 0) { >return counters.nProcs; > } else { >return -1; > } > } > > > If there is no objection I will include this change in the new webrev. I don’t think this approach will work. Both the array returned and the sysconf(_SC_NPROCESSORS_CONF) report the containers cpuset value so they will be equal causing you to always fallback. You can try to use containerMetrics.getPerCpuUsage() instead of containerMetrics.getEffectiveCpuSetCpus(). The length of the array returned is the number of host cpus. Maybe Severin can confirm if this true in cgroupv2 as well. Bob. > > Thank you, > Daniil > > On 12/3/19, 1:30 PM, "Bob Vandette" wrote: > >Daniil, > >Looks good to me. > >If there are any management jtreg tests, I’d run these since your changes > to OperatingSystemMXBean will >alter the behavior of these methods even for Linux hosts since cgroups is > typically enabled causing the >container detection to report containerized. > >It’s too bad getCpuLoad can’t detect that the cpuset is identical to the > hosts in order to allow you to fallback to >getSystemCpuLoad0. > > >Bob. > > >> On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: >> >> Please review the change that makes OperatingSystemMXBean methods return >> container specific information >> rather than the host based data. >> >> The webrev [1] is based on the code Andrew and Severin initially provided >> with some additional changes and combined >> with the spec update David made [3]. >> >> The webrev corrects the implementation for the free/total swap methods as >> Bob noted to subtract the total >> and free memory from the returned values. >> >> It also corrects getCpuLoad() implementation, as Bob advised, to cover the >> case when CPU quotas are not active. >> >> The webrev also takes into account the case when >> java.security.AccessControlException exception is thrown >> during the initialization of the container subsystem ( e.g. when >> java.policy doesn’t grant "read" access to >> "/proc/self/mountinfo" file). >> >> CSR for the spec changes [3] is approved. >> >> Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including >> open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . >> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ >> [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 >> [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 >> >> Thank you, >> -Daniil >> >> > > > >
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Mandy, I think in my previous reply I missed to answer one of the questions from your email. >> getFreeSwapSpaceSize retry for a few times. What special about this method >> but not others like getFreeMemorySize? The specific of method getFreeSwapSpaceSize is that MemoryAndSwapUsage and MemoryUsage metrics it reads are related ( MemoryAndSwapUsage includes MemoryUsage) and they are not constant. Since these metrics are not read atomically it could be that they change their values between these 2 reads. On the contrary, some other metrics, such as MemoryLimit, are constant. They are set when the container starts and are supposed to return the same value over the whole time the JVM runs. The other methods don't use more than one such nonconstant metric, so the only place where this potential issue with not atomic reads could happen is getFreeSwapSpaceSize method. Best regards, Daniil On 12/3/19, 7:34 PM, "Daniil Titov" wrote: Hi Mandy, Thank you for your comments, please find my answers below. >> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java >> this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. I will include this change in the next webrev, thank you. >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>Formatting nit: line 346-355: JDK native source uses 4-space identation convention. A space is missing between "if" and "(". I will correct this, thanks. >>Under what circumstance that limit or memLimit is < 0? The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and it can use as much memory as the host's OS allows. >> Is it worth specifying this case? I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. >> It fallbacks to return the system's total swap space size - this is not really what it should report. For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. However, I am not sure how we could differentiate these 2 cases. >> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage are unavailable or if a valid CPU load for some CPU was not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather then falling back to getSystemCpuLoad0() >>src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >>There is no strong need to make the deprecated methods as default methods. If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations. I could make these methods defaults if you feel it is a better approach here. >>CheckOperatingSystemMXBean.java >> System.out.println(String.format(...)) can simply be replaced with System.out.format. I will include this change in the next webrev, thank you! Best regards, Daniil From: Mandy Chung Date: Tuesday, December 3, 2019 at 4:10 PM To: Daniil Titov Cc: OpenJDK Serviceability , "jmx-...@openjdk.java.net" , Bob Vandette Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware On 12/3/19 11:42 AM, Daniil Titov wrote: Please review the change that makes OperatingSystemMXBean methods return container specific informationrather than the host based data. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem ( e.g. when java.policy doesn’t grant "read" access to "/proc/self/mountinfo" file). Instead of failing to access /proc/self/mountinfo,
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Resending with the corrected subject. "RFR" was somehow stripped from it and that breaks the sorting by subject... Hi Mandy, Thank you for your comments, please find my answers below. >> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java >> this should wrap the security-sensitive operations with doPrivileged. >> jdk.management is trusted and it has all permissions. I will include this change in the next webrev, thank you. >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>Formatting nit: line 346-355: JDK native source uses 4-space identation >> convention. A space is missing between "if" and "(". I will correct this, thanks. >>Under what circumstance that limit or memLimit is < 0? The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and it can use as much memory as the host's OS allows. >> Is it worth specifying this case? I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. >> It fallbacks to return the system's total swap space size - this is not >> really what it should report. For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. However, I am not sure how we could differentiate these 2 cases. >> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage are unavailable or if a valid CPU load for some CPU was not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather then falling back to getSystemCpuLoad0() >>src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >>There is no strong need to make the deprecated methods as default >> methods. If they were default methods, they only need to be implemented >> once as opposed to in all OS-specific implementations. I could make these methods defaults if you feel it is a better approach here. >>CheckOperatingSystemMXBean.java >> System.out.println(String.format(...)) can simply be replaced with >> System.out.format. I will include this change in the next webrev, thank you! Best regards, Daniil From: Mandy Chung Date: Tuesday, December 3, 2019 at 4:10 PM To: Daniil Titov Cc: OpenJDK Serviceability , "jmx-...@openjdk.java.net" , Bob Vandette Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware On 12/3/19 11:42 AM, Daniil Titov wrote: Please review the change that makes OperatingSystemMXBean methods return container specific informationrather than the host based data. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem ( e.g. when java.policy doesn’t grant "read" access to "/proc/self/mountinfo" file). Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy. The jdk default security policy should grant proper permission to do so. CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c Formatting nit: line 346-
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Resending with the corrected title, "RFR" was somehow stripped from it that breaks the sorting by subject... Hi Bob, >>It’s too bad getCpuLoad can’t detect that the cpuset is identical to the hosts in order to allow you to fallback to >> getSystemCpuLoad0. I think we can detect that the cpuset is identical to the host's one by comparing the length of the array containerMetrics.getEffectiveCpuSetCpus() returns to the number of the CPUs configured on the host and returned by sysconf(_SC_NPROCESSORS_CONF) . The latter could be retrieved by adding new native method to OperatingSystemImpl getConfiguredCpuCount0. If they match then we just fallback to getSystemCpuLoad0(). I did some testing on Linux host and inside Docker container with different '--cpuset-cpus' settings and it seems to work as expected. JNIEXPORT jint JNICALL Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0 (JNIEnv *env, jobject mbean) { if(perfInit() == 0) { return counters.nProcs; } else { return -1; } } If there is no objection I will include this change in the new webrev. Thank you, Daniil On 12/3/19, 1:30 PM, "Bob Vandette" wrote: Daniil, Looks good to me. If there are any management jtreg tests, I’d run these since your changes to OperatingSystemMXBean will alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the container detection to report containerized. It’s too bad getCpuLoad can’t detect that the cpuset is identical to the hosts in order to allow you to fallback to getSystemCpuLoad0. Bob. > On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: > > Please review the change that makes OperatingSystemMXBean methods return container specific information > rather than the host based data. > > The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined > with the spec update David made [3]. > > The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total > and free memory from the returned values. > > It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. > > The webrev also takes into account the case when java.security.AccessControlException exception is thrown > during the initialization of the container subsystem ( e.g. when java.policy doesn’t grant "read" access to > "/proc/self/mountinfo" file). > > CSR for the spec changes [3] is approved. > > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ > [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thank you, > -Daniil > >
Re: 8226575: OperatingSystemMXBean should be made container aware
Hi Mandy, Thank you for your comments, please find my answers below. >> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java >> this should wrap the security-sensitive operations with doPrivileged. >> jdk.management is trusted and it has all permissions. I will include this change in the next webrev, thank you. >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>Formatting nit: line 346-355: JDK native source uses 4-space identation >> convention. A space is missing between "if" and "(". I will correct this, thanks. >>Under what circumstance that limit or memLimit is < 0? The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and it can use as much memory as the host's OS allows. >> Is it worth specifying this case? I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. >> It fallbacks to return the system's total swap space size - this is not >> really what it should report. For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. However, I am not sure how we could differentiate these 2 cases. >> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage are unavailable or if a valid CPU load for some CPU was not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather then falling back to getSystemCpuLoad0() >>src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >>There is no strong need to make the deprecated methods as default >> methods. If they were default methods, they only need to be implemented >> once as opposed to in all OS-specific implementations. I could make these methods defaults if you feel it is a better approach here. >>CheckOperatingSystemMXBean.java >> System.out.println(String.format(...)) can simply be replaced with >> System.out.format. I will include this change in the next webrev, thank you! Best regards, Daniil From: Mandy Chung Date: Tuesday, December 3, 2019 at 4:10 PM To: Daniil Titov Cc: OpenJDK Serviceability , "jmx-...@openjdk.java.net" , Bob Vandette Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware On 12/3/19 11:42 AM, Daniil Titov wrote: Please review the change that makes OperatingSystemMXBean methods return container specific informationrather than the host based data. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem ( e.g. when java.policy doesn’t grant "read" access to "/proc/self/mountinfo" file). Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy. The jdk default security policy should grant proper permission to do so. CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c Formatting nit: line 346-355: JDK native source uses 4-space identation convention. A space is missing between "if" and "(". src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java 59 if (limit >= 0 && memLimit >= 0) { 60 return li
Re: 8226575: OperatingSystemMXBean should be made container aware
Hi Bob, >>It’s too bad getCpuLoad can’t detect that the cpuset is identical to the >> hosts in order to allow you to fallback to >> getSystemCpuLoad0. I think we can detect that the cpuset is identical to the host's one by comparing the length of the array containerMetrics.getEffectiveCpuSetCpus() returns to the number of the CPUs configured on the host and returned by sysconf(_SC_NPROCESSORS_CONF) . The latter could be retrieved by adding new native method to OperatingSystemImpl getConfiguredCpuCount0. If they match then we just fallback to getSystemCpuLoad0(). I did some testing on Linux host and inside Docker container with different '--cpuset-cpus' settings and it seems to work as expected. JNIEXPORT jint JNICALL Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0 (JNIEnv *env, jobject mbean) { if(perfInit() == 0) { return counters.nProcs; } else { return -1; } } If there is no objection I will include this change in the new webrev. Thank you, Daniil On 12/3/19, 1:30 PM, "Bob Vandette" wrote: Daniil, Looks good to me. If there are any management jtreg tests, I’d run these since your changes to OperatingSystemMXBean will alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the container detection to report containerized. It’s too bad getCpuLoad can’t detect that the cpuset is identical to the hosts in order to allow you to fallback to getSystemCpuLoad0. Bob. > On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: > > Please review the change that makes OperatingSystemMXBean methods return container specific information > rather than the host based data. > > The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined > with the spec update David made [3]. > > The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total > and free memory from the returned values. > > It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. > > The webrev also takes into account the case when java.security.AccessControlException exception is thrown > during the initialization of the container subsystem ( e.g. when java.policy doesn’t grant "read" access to > "/proc/self/mountinfo" file). > > CSR for the spec changes [3] is approved. > > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ > [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thank you, > -Daniil > >
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 12/3/19 11:42 AM, Daniil Titov wrote: Please review the change that makes OperatingSystemMXBean methods return container specific informationrather than the host based data. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem ( e.g. when java.policy doesn’t grant "read" access to "/proc/self/mountinfo" file). Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy. The jdk default security policy should grant proper permission to do so. CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c Formatting nit: line 346-355: JDK native source uses 4-space identation convention. A space is missing between "if" and "(". src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java 59 if (limit >= 0 && memLimit >= 0) { 60 return limit - memLimit; 61 } Under what circumstance that limit or memLimit is < 0? It fallbacks to return the system's total swap space size - this is not really what it should report. Is it worth specifying this case?Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. getFreeSwapSpaceSize retry for a few times. What special about this method but not others like getFreeMemorySize? src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java There is no strong need to make the deprecated methods as default methods. If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations. CheckOperatingSystemMXBean.java System.out.println(String.format(...)) can simply be replaced with System.out.format. Mandy
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Daniil, Looks good to me. If there are any management jtreg tests, I’d run these since your changes to OperatingSystemMXBean will alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the container detection to report containerized. It’s too bad getCpuLoad can’t detect that the cpuset is identical to the hosts in order to allow you to fallback to getSystemCpuLoad0. Bob. > On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: > > Please review the change that makes OperatingSystemMXBean methods return > container specific information > rather than the host based data. > > The webrev [1] is based on the code Andrew and Severin initially provided > with some additional changes and combined > with the spec update David made [3]. > > The webrev corrects the implementation for the free/total swap methods as Bob > noted to subtract the total > and free memory from the returned values. > > It also corrects getCpuLoad() implementation, as Bob advised, to cover the > case when CPU quotas are not active. > > The webrev also takes into account the case when > java.security.AccessControlException exception is thrown > during the initialization of the container subsystem ( e.g. when java.policy > doesn’t grant "read" access to > "/proc/self/mountinfo" file). > > CSR for the spec changes [3] is approved. > > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including > open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ > [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thank you, > -Daniil > >
RFR: 8226575: OperatingSystemMXBean should be made container aware
Please review the change that makes OperatingSystemMXBean methods return container specific information rather than the host based data. The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined with the spec update David made [3]. The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total and free memory from the returned values. It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem ( e.g. when java.policy doesn’t grant "read" access to "/proc/self/mountinfo" file). CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 Thank you, -Daniil
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Severin, Andrew, On 20/07/2019 2:07 am, Severin Gehwolf wrote: Hi All, I've filed a CSR and would like to solicit feedback at this point. There are some contradicting issues involved and IMHO there is no clear-cut answer to a couple of key points. https://bugs.openjdk.java.net/browse/JDK-8228428 This is still a Work-In-Progress :) I've added a lengthy comment to the CSR. I don't think we can meaningfully try to retrofit this onto the existing MXbean and attempting to do so will just open a can worms when it comes to compatibility. A new MXbean that exposes the desired information for what is available to the VM may be a simpler path forward. Cheers, David - Comments inline. On Fri, 2019-07-19 at 08:27 -0700, Mandy Chung wrote: On 7/18/19 3:49 PM, David Holmes wrote: Hi Andrew, Sorry - this whole discussion slipped past me while I was traveling in the US in late June (and have then been on vacation). As I just wrote in the bug report: There are two OperatingSystemMXBean interfaces: - java.lang.management.OperatingSystemMXBean - com.sun.management.OperatingSystemMXBean (a subtype of the former) What the methods of these two interface report depends on exactly how the method has been specified. If it refers to something "available to the JVM (or current process)" then it needs to be container aware. If it refers to the "whole system" or something "physical", then that should report information for the whole system. Arguably some of the whole-system/physical APIs are misguided and reflect a history where containers and virtualized environments were not envisaged. But changing them is a compatibility issue and so it may be better to add new methods that report more interesting/relevant information for the process's environment. Either way a CSR request will need to be filed and the changes examined from a compatibility perspective. Agree. When the APIs were introduced, it was defined for the whole system and the spec should be made clear. If the APIs should be made container-aware, the spec should be clarified of the new behavior. Monitoring the whole system still provides a useful view even on a container and virtualized environment. Right, but going to the extreme, when a JVM runs inside a full virtualized VM it reports it's view from *within* the VM, not the whole system (the host the vm runs under). While it might be useful, I'm not sure what the approach of "least surprises" would be. The JVM in a container with a memory limit won't use the hosts' full memory resources. One idea to consider would be to refactor com.sun.management.OperatingSystemMXBean and extract the container-aware methods in a new MXBean which OSMXBean extends and create one new singleton instance of the new MXBean to report container-specific metrics. It would be platform specific, though. My reading of JDK-8199944 suggests this is a dead end. Some of the APIs may not be "well formed" in the first place - for example getSystemLoadAverage(): "Returns the system load average for the last minute. The system load average is the sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time. " This refers to the "system" but specifies the calculation is based on "available processors" - which is a contradiction as "available" != "system". The spec should be updated to "system" in this case. You mean "available processors averaged" => "system processors averaged"? If so, should the method name of getAvailableProcessors() be changed too? So I think we need to take a step back at the moment and look at exactly which methods can be container-aware based on their current specification, and which should be container-aware but can't be because of their current specification, and so what new methods may be needed. Bob's suggestion [1] to make the total/free physical memory and swap space size container-aware sounds reasonable. +1 It makes a lot of sense given how the JDK works otherwise in those quota'ed environments. Thanks, Severin Refactoring c.s.m.OperatingSystemMXBean could be one option to explore. Mandy [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028710.html
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi All, I've filed a CSR and would like to solicit feedback at this point. There are some contradicting issues involved and IMHO there is no clear-cut answer to a couple of key points. https://bugs.openjdk.java.net/browse/JDK-8228428 This is still a Work-In-Progress :) Comments inline. On Fri, 2019-07-19 at 08:27 -0700, Mandy Chung wrote: > > On 7/18/19 3:49 PM, David Holmes wrote: > > Hi Andrew, > > > > Sorry - this whole discussion slipped past me while I was traveling in > > the US in late June (and have then been on vacation). As I just wrote > > in the bug report: > > > > There are two OperatingSystemMXBean interfaces: > > - java.lang.management.OperatingSystemMXBean > > - com.sun.management.OperatingSystemMXBean (a subtype of the former) > > > > What the methods of these two interface report depends on exactly how > > the method has been specified. If it refers to something "available to > > the JVM (or current process)" then it needs to be container aware. If > > it refers to the "whole system" or something "physical", then that > > should report information for the whole system. > > > > Arguably some of the whole-system/physical APIs are misguided and > > reflect a history where containers and virtualized environments were > > not envisaged. But changing them is a compatibility issue and so it > > may be better to add new methods that report more interesting/relevant > > information for the process's environment. Either way a CSR request > > will need to be filed and the changes examined from a compatibility > > perspective. > > > > Agree. > > When the APIs were introduced, it was defined for the whole system and > the spec should be made clear. If the APIs should be made > container-aware, the spec should be clarified of the new behavior. > > Monitoring the whole system still provides a useful view even on a > container and virtualized environment. Right, but going to the extreme, when a JVM runs inside a full virtualized VM it reports it's view from *within* the VM, not the whole system (the host the vm runs under). While it might be useful, I'm not sure what the approach of "least surprises" would be. The JVM in a container with a memory limit won't use the hosts' full memory resources. > One idea to consider would be > to refactor com.sun.management.OperatingSystemMXBean and extract the > container-aware methods in a new MXBean which OSMXBean extends and > create one new singleton instance of the new MXBean to report > container-specific metrics. It would be platform specific, though. My reading of JDK-8199944 suggests this is a dead end. > > Some of the APIs may not be "well formed" in the first place - for > > example getSystemLoadAverage(): > > > > "Returns the system load average for the last minute. The system load > > average is the sum of the number of runnable entities queued to the > > available processors and the number of runnable entities running on > > the available processors averaged over a period of time. " > > > > This refers to the "system" but specifies the calculation is based on > > "available processors" - which is a contradiction as "available" != > > "system". > > > > The spec should be updated to "system" in this case. You mean "available processors averaged" => "system processors averaged"? If so, should the method name of getAvailableProcessors() be changed too? > > So I think we need to take a step back at the moment and look at > > exactly which methods can be container-aware based on their current > > specification, and which should be container-aware but can't be > > because of their current specification, and so what new methods may be > > needed. > > > > Bob's suggestion [1] to make the total/free physical memory and swap > space size container-aware sounds reasonable. +1 It makes a lot of sense given how the JDK works otherwise in those quota'ed environments. Thanks, Severin > Refactoring c.s.m.OperatingSystemMXBean could be one option to explore. > > Mandy > [1] > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028710.html
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
On 7/18/19 3:49 PM, David Holmes wrote: Hi Andrew, Sorry - this whole discussion slipped past me while I was traveling in the US in late June (and have then been on vacation). As I just wrote in the bug report: There are two OperatingSystemMXBean interfaces: - java.lang.management.OperatingSystemMXBean - com.sun.management.OperatingSystemMXBean (a subtype of the former) What the methods of these two interface report depends on exactly how the method has been specified. If it refers to something "available to the JVM (or current process)" then it needs to be container aware. If it refers to the "whole system" or something "physical", then that should report information for the whole system. Arguably some of the whole-system/physical APIs are misguided and reflect a history where containers and virtualized environments were not envisaged. But changing them is a compatibility issue and so it may be better to add new methods that report more interesting/relevant information for the process's environment. Either way a CSR request will need to be filed and the changes examined from a compatibility perspective. Agree. When the APIs were introduced, it was defined for the whole system and the spec should be made clear. If the APIs should be made container-aware, the spec should be clarified of the new behavior. Monitoring the whole system still provides a useful view even on a container and virtualized environment. One idea to consider would be to refactor com.sun.management.OperatingSystemMXBean and extract the container-aware methods in a new MXBean which OSMXBean extends and create one new singleton instance of the new MXBean to report container-specific metrics. Some of the APIs may not be "well formed" in the first place - for example getSystemLoadAverage(): "Returns the system load average for the last minute. The system load average is the sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time. " This refers to the "system" but specifies the calculation is based on "available processors" - which is a contradiction as "available" != "system". The spec should be updated to "system" in this case. So I think we need to take a step back at the moment and look at exactly which methods can be container-aware based on their current specification, and which should be container-aware but can't be because of their current specification, and so what new methods may be needed. Bob's suggestion [1] to make the total/free physical memory and swap space size container-aware sounds reasonable. Refactoring c.s.m.OperatingSystemMXBean could be one option to explore. Mandy [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028710.html
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Andrew, Note that we should first get the CSR and - what the CSR should entail - in order before we can pursue this further. I'll help you with that. More below. On Thu, 2019-07-18 at 16:28 -0400, Andrew Azores wrote: > Hi Severin, > > On Thu, 2019-07-18 at 18:37 +0200, Severin Gehwolf wrote: > > Hi Andrew, > > > > src/jdk.management/share/classes/module-info.java > > > > I don't think you need to explicitly require java.base. Everything > > depends on java.base. Is there a specific reason why that was needed? > > > > Ah, you're right that that's not needed. I figured java.base should be > implicitly depended upon, but the first run through failed due to > module visibility. The actual fix needed was the other module-info > change in the patch, ie: > > +exports jdk.internal.platform to > +jdk.management; > > in the java.base module-info. I've gone ahead and removed the > unnecessary addition. OK, cool. > > I've noticed that this test fails with your patch: > > > > FAILED: > > com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java > > > > Fails with: > > java.lang.RuntimeException: getSystemCpuLoad() returns > > 173995.48501979 which is not in the [0.0,1.0] interval > > at GetSystemCpuLoad.main(GetSystemCpuLoad.java:43) > > What did you run to get this test to execute? I hadn't hit that failure > in running `make test TEST=tier1`, but I can verify the failure if I > run specifically that class, so I'm unsure what larger suite or tier > it's included in. I've run: $ jtreg -verbose:summary -jdk:./build/linux-x86_64-server-release/images/jdk test/jdk/com/sun/management/OperatingSystemMXBean test/jdk/com/sun/management/UnixOperatingSystemMXBean test/jdk/java/lang/management/OperatingSystemMXBean > I'll need to go back and rethink how this system load calculation can > be done. Sure. > > This makes me wonder what the effect of this patch is on Linux > > *outside* a container. Have you verified that Metric values > > correspond > > to what whas previously returned via native methods? Could you > > verify, > > please and tell us the before/after values? Thanks! > > Other than the system load, the other memory-related values appear to > be correct both on bare metal and in Docker. Manually verifying the > tests in com/sun/management/OperatingSystemMXBean with the "trace" > argument and the patch applied (system load change removed), the values > printed match my host machine's /proc/meminfo. Running > jtreg:test/hotspot/jtreg/containers/docker tests also shows that the > Docker tests succeed and the reported values within a container are as > expected as well. OK. > I can't see that there are any tests to verify the Metrics > implementation itself. Am I just missing them or have none been > written? There is MetricsTester.java in the test library which is beeing used by TestCgroupMetrics.java and TestSystemMetrics.java > > I wonder if we should split the OperatingSystemMXBean tests into it's > > own (and not piggy-back on > > TestCPUAwareness.java/TestMemoryAwareness.java). Have you considered > > that? > > > > Sure, I could do that for the next review round. Great, thanks! Cheers, Severin
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Andrew, Sorry - this whole discussion slipped past me while I was traveling in the US in late June (and have then been on vacation). As I just wrote in the bug report: There are two OperatingSystemMXBean interfaces: - java.lang.management.OperatingSystemMXBean - com.sun.management.OperatingSystemMXBean (a subtype of the former) What the methods of these two interface report depends on exactly how the method has been specified. If it refers to something "available to the JVM (or current process)" then it needs to be container aware. If it refers to the "whole system" or something "physical", then that should report information for the whole system. Arguably some of the whole-system/physical APIs are misguided and reflect a history where containers and virtualized environments were not envisaged. But changing them is a compatibility issue and so it may be better to add new methods that report more interesting/relevant information for the process's environment. Either way a CSR request will need to be filed and the changes examined from a compatibility perspective. Some of the APIs may not be "well formed" in the first place - for example getSystemLoadAverage(): "Returns the system load average for the last minute. The system load average is the sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time. " This refers to the "system" but specifies the calculation is based on "available processors" - which is a contradiction as "available" != "system". So I think we need to take a step back at the moment and look at exactly which methods can be container-aware based on their current specification, and which should be container-aware but can't be because of their current specification, and so what new methods may be needed. David - On 18/07/2019 4:45 am, Andrew Azores wrote: Hi all, Please review this change that addresses JDK-8226575 according to a previous discussion on this list [0][1]. The webrev has been kindly uploaded on my behalf by Severin Gehwolf, since I am not an author. The initial problem was that the com.sun.management.OperatingSystemMXBean was inconsistent about its awareness of applicable container limits. On the one hand, the (inherited) getAvailableProcessors() return value is consistent with the container-aware Runtime.availableProcessors(). But on the other hand, c.s.m.OperatingSystemMXBean's getTotalPhysicalMemorySize(), getFreePhysicalMemorySize(), getTotalSwapSpaceSize(), and getFreeSwapSpaceSize() returned values reflecting the physical host machine with no awareness of container limits. getSystemCpuLoad() has also been updated to use cgroup-accounted load calculation. The fix applied is to use the JDK internal Metrics API to determine container memory limits and CPU accounting and use these values instead, if available, otherwise falling back on the pre-existing native implementations. bug: https://bugs.openjdk.java.net/browse/JDK-8226575 webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/aazores/JDK-8226575/01/webrev/ [0] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028439.html [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028709.html Thanks,
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Severin, On Thu, 2019-07-18 at 18:37 +0200, Severin Gehwolf wrote: > Hi Andrew, > > src/jdk.management/share/classes/module-info.java > > I don't think you need to explicitly require java.base. Everything > depends on java.base. Is there a specific reason why that was needed? > Ah, you're right that that's not needed. I figured java.base should be implicitly depended upon, but the first run through failed due to module visibility. The actual fix needed was the other module-info change in the patch, ie: +exports jdk.internal.platform to +jdk.management; in the java.base module-info. I've gone ahead and removed the unnecessary addition. > I've noticed that this test fails with your patch: > > FAILED: > com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java > > Fails with: > java.lang.RuntimeException: getSystemCpuLoad() returns > 173995.48501979 which is not in the [0.0,1.0] interval > at GetSystemCpuLoad.main(GetSystemCpuLoad.java:43) What did you run to get this test to execute? I hadn't hit that failure in running `make test TEST=tier1`, but I can verify the failure if I run specifically that class, so I'm unsure what larger suite or tier it's included in. I'll need to go back and rethink how this system load calculation can be done. > > This makes me wonder what the effect of this patch is on Linux > *outside* a container. Have you verified that Metric values > correspond > to what whas previously returned via native methods? Could you > verify, > please and tell us the before/after values? Thanks! Other than the system load, the other memory-related values appear to be correct both on bare metal and in Docker. Manually verifying the tests in com/sun/management/OperatingSystemMXBean with the "trace" argument and the patch applied (system load change removed), the values printed match my host machine's /proc/meminfo. Running jtreg:test/hotspot/jtreg/containers/docker tests also shows that the Docker tests succeed and the reported values within a container are as expected as well. I can't see that there are any tests to verify the Metrics implementation itself. Am I just missing them or have none been written? > > I wonder if we should split the OperatingSystemMXBean tests into it's > own (and not piggy-back on > TestCPUAwareness.java/TestMemoryAwareness.java). Have you considered > that? > Sure, I could do that for the next review round. Thanks, -- Andrew Azores Software Engineer, OpenJDK Team Red Hat
Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi Andrew, On Wed, 2019-07-17 at 14:45 -0400, Andrew Azores wrote: > Hi all, > > Please review this change that addresses JDK-8226575 according to a > previous discussion on this list [0][1]. The webrev has been kindly > uploaded on my behalf by Severin Gehwolf, since I am not an author. > > The initial problem was that the > com.sun.management.OperatingSystemMXBean was inconsistent about its > awareness of applicable container limits. On the one hand, the > (inherited) getAvailableProcessors() return value is consistent with > the container-aware Runtime.availableProcessors(). But on the other > hand, c.s.m.OperatingSystemMXBean's getTotalPhysicalMemorySize(), > getFreePhysicalMemorySize(), getTotalSwapSpaceSize(), and > getFreeSwapSpaceSize() returned values reflecting the physical host > machine with no awareness of container limits. getSystemCpuLoad() has > also been updated to use cgroup-accounted load calculation. > > The fix applied is to use the JDK internal Metrics API to determine > container memory limits and CPU accounting and use these values > instead, if available, otherwise falling back on the pre-existing > native implementations. > > bug: > https://bugs.openjdk.java.net/browse/JDK-8226575 > > webrev: > http://cr.openjdk.java.net/~sgehwolf/webrevs/aazores/JDK-8226575/01/webrev/ src/jdk.management/share/classes/module-info.java I don't think you need to explicitly require java.base. Everything depends on java.base. Is there a specific reason why that was needed? I've noticed that this test fails with your patch: FAILED: com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java Fails with: java.lang.RuntimeException: getSystemCpuLoad() returns 173995.48501979 which is not in the [0.0,1.0] interval at GetSystemCpuLoad.main(GetSystemCpuLoad.java:43) This makes me wonder what the effect of this patch is on Linux *outside* a container. Have you verified that Metric values correspond to what whas previously returned via native methods? Could you verify, please and tell us the before/after values? Thanks! I wonder if we should split the OperatingSystemMXBean tests into it's own (and not piggy-back on TestCPUAwareness.java/TestMemoryAwareness.java). Have you considered that? Thanks, Severin > [0] > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028439.html > [1] > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028709.html > > Thanks, >
RFR: 8226575: OperatingSystemMXBean should be made container aware
Hi all, Please review this change that addresses JDK-8226575 according to a previous discussion on this list [0][1]. The webrev has been kindly uploaded on my behalf by Severin Gehwolf, since I am not an author. The initial problem was that the com.sun.management.OperatingSystemMXBean was inconsistent about its awareness of applicable container limits. On the one hand, the (inherited) getAvailableProcessors() return value is consistent with the container-aware Runtime.availableProcessors(). But on the other hand, c.s.m.OperatingSystemMXBean's getTotalPhysicalMemorySize(), getFreePhysicalMemorySize(), getTotalSwapSpaceSize(), and getFreeSwapSpaceSize() returned values reflecting the physical host machine with no awareness of container limits. getSystemCpuLoad() has also been updated to use cgroup-accounted load calculation. The fix applied is to use the JDK internal Metrics API to determine container memory limits and CPU accounting and use these values instead, if available, otherwise falling back on the pre-existing native implementations. bug: https://bugs.openjdk.java.net/browse/JDK-8226575 webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/aazores/JDK-8226575/01/webrev/ [0] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028439.html [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028709.html Thanks, -- Andrew Azores Software Engineer, OpenJDK Team Red Hat