Misha,
Looks good. A couple of nits.
1. You might want to remove hotspot_containers section of ProblemList.txt since
there are no bugs listed.
2. Can you make this timeout a constant just like TIME_TO_RUN_MAIN_PROCESS
73 mainContainer.waitForMainMethodStart(5*1000);
3. assertIsAlive() is not used except for the commented out tests. Do you
think you’ll ultimately use
this method or is this left over from previous attempts?
222 public void assertIsAlive() throws Exception {
Bob.
> On Aug 29, 2019, at 11:41 AM, [email protected] wrote:
>
> I believe I need a second reviewer for this change. Could someone, please,
> review this change version 2 ? (David already reviewed it).
>
> http://cr.openjdk.java.net/~mseledtsov/8228960.02/
>
>
> Thank you in advance,
>
> Misha
>
>
> On 8/26/19 12:32 PM, [email protected] wrote:
>> Hi David,
>>
>> Thank you for review.
>>
>> On 8/26/19 12:57 AM, David Holmes wrote:
>>> Hi Misha,
>>>
>>> On 24/08/2019 3:21 am, [email protected] wrote:
>>>> Finally got some time to work on this issue.
>>>> Since I have encountered problem using files for passing messages between
>>>> a container and a test driver (due to permissions), I looked for
>>>> alternative solutions. I am using the output of a container process to
>>>> signal when the main method has started, and it works. This simplifies
>>>> things quite a bit as well.
>>>>
>>>> Normally, we use OutputAnalyzer test utility to collect the whole output
>>>> once the process has completed, and then analyze the resulting output for
>>>> "contains some string", match, etc. However, testutils/ProcessTools
>>>> provides an API to consume the output as it is produced. I am using this
>>>> API to detect when the main() method of the container has started.
>>>
>>> That seems reasonable. Do we want to make the following change to minimise
>>> unneeded output processing:
>>>
>>> private Consumer<String> outputConsumer = s -> {
>>> ! if (!mainMethodStarted &&
>>> s.contains(EventGeneratorLoop.MAIN_METHOD_STARTED)) {
>>> System.out.println("MainContainer: setting
>>> mainMethodStarted");
>>> mainMethodStarted = true;
>>> }
>>> };
>> Thank you for the suggestion. I will update the code accordingly.
>>>
>>>> Updated webrev:
>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.02/
>>>
>>> Otherwise looks okay. Hopefully those other test cases will be enabled in
>>> the not too distant future.
>>
>> I hope so as well.
>>
>>
>> Thank you,
>>
>> Misha
>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>
>>>> Testing:
>>>>
>>>> Ran the test on Linux-x64, various multiple nodes in a test cluster 50
>>>> times - All PASS
>>>>
>>>>
>>>> Thank you,
>>>>
>>>> Misha
>>>>
>>>> On 8/13/19 2:05 PM, Bob Vandette wrote:
>>>>>
>>>>>> On Aug 13, 2019, at 3:28 PM, [email protected] wrote:
>>>>>>
>>>>>>
>>>>>> On 8/13/19 12:06 PM, Bob Vandette wrote:
>>>>>>>> On Aug 13, 2019, at 2:57 PM, [email protected] wrote:
>>>>>>>>
>>>>>>>> Hi Bob,
>>>>>>>>
>>>>>>>> The workdir (JTwork/scratch) is created with the "test user"
>>>>>>>> permissions. Let me try to place the "signal" file in /tmp instead,
>>>>>>>> since /tmp should normally have a 777 permission on Linux.
>>>>>>> Aren’t you creating a file inside a docker container and then checking
>>>>>>> for its existence outside of the container?
>>>>>> Correct
>>>>>>> Isn’t the root user running inside the container?
>>>>>> By default it is. But it still fails to create a file, for some reason.
>>>>>> Can be related to selinux settings (for instance, see this article:
>>>>>> https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443),
>>>>>> I can not change those.
>>>>> Is your JTWork/scratch on an NFS mounted file system? If this is the
>>>>> case then the problem is that root is equivalent to nobody on
>>>>> mounted file systems and can’t create files unless the directory has 777
>>>>> permissions. I just confirmed this. You’d have to either run
>>>>> the container test as test-user or change the scratch directory
>>>>> permission.
>>>>>
>>>>> Bob.
>>>>>
>>>>>> My hope is that /tmp is configured to be accessed by a container engine
>>>>>> as a general purpose directory, hence I was thinking to try it out.
>>>>>>
>>>>>>> Both processes don’t see the same /tmp right? So that shouldn’t help.
>>>>>> In my next experiment, I will map a /tmp from host to be a /host-tmp
>>>>>> inside the container (--volume /tmp:/host-tmp), then write a signal file
>>>>>> to /host-tmp.
>>>>>>> If scratch has 777 permissions, anyone can create a file.
>>>>>> scratch has "rwxr-xr-x"
>>>>>>> You have to be careful that you can clean up the
>>>>>>> file from outside the container. I’d make sure to create it with 777.
>>>>>> I do use deleteOnExit(), so it should work (unless the JVM crashes). I
>>>>>> guess I could add extra layer of safety here, and set the permissions to
>>>>>> 777. Thank you for advice.
>>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Misha
>>>>>>
>>>>>>> Bob.
>>>>>>>
>>>>>>>> If this works, I will have to add some unique number to the file name,
>>>>>>>> perhaps a PID of a child process.
>>>>>>>>
>>>>>>>> I will try this, and let you know how it works.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>>
>>>>>>>> Misha
>>>>>>>>
>>>>>>>> On 8/13/19 6:34 AM, Bob Vandette wrote:
>>>>>>>>> Sorry, I just looked at the webrev and you are trying the approach I
>>>>>>>>> suggested. I thought you
>>>>>>>>> were trying to use file change notification.
>>>>>>>>>
>>>>>>>>> Where does the workdir get created? Does it have 777 permissions?
>>>>>>>>>
>>>>>>>>> Bob.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> What if you just poll for the creation of the file waiting some
>>>>>>>>>> small amount of time between polling with a maximum timeout.
>>>>>>>>>>
>>>>>>>>>> Bob.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Aug 12, 2019, at 8:22 PM, [email protected] wrote:
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately, this approach does not seem to work on many of our
>>>>>>>>>>> test cluster machines. The creation of a "signal" file results in
>>>>>>>>>>> "PermissionDenied".
>>>>>>>>>>>
>>>>>>>>>>> The possible reason is the selinux configuration, or some other
>>>>>>>>>>> permission related stuff. The container tries to create a new file
>>>>>>>>>>> on a mounted volume on a host system, but host system denies it. I
>>>>>>>>>>> will look a bit deeper into this, but I think this type of issue
>>>>>>>>>>> can be encountered on any automated test system. Hence, we may have
>>>>>>>>>>> to abandon this approach.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Misha
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 8/12/19 3:59 PM, [email protected] wrote:
>>>>>>>>>>>> Here is an updated webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>>>>>>>>>>>>
>>>>>>>>>>>> I am using a simple file-based mechanism to communicate between
>>>>>>>>>>>> the processes. The "EventGeneratorLoop" process creates a specific
>>>>>>>>>>>> "signal" file on a shared mounted volume, while the main test
>>>>>>>>>>>> process waits for the file to exist before running the test cases.
>>>>>>>>>>>>
>>>>>>>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test
>>>>>>>>>>>> cluster is in progress.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>
>>>>>>>>>>>> Misha
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/7/19 5:11 PM, David Holmes wrote:
>>>>>>>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>>>>>>>>>>>>> Hi Severin, Bob,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for reviewing the code.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>>>>>>>>>>>>> Can’t you come up with a better way of synchronizing the test
>>>>>>>>>>>>>>> by possibly writing a
>>>>>>>>>>>>>>> file and waiting for it to exist with a timeout?
>>>>>>>>>>>>>> I will try out this approach.
>>>>>>>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing
>>>>>>>>>>>>> serviceability-dev.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But I'm pretty sure they recently addressed a similar issue with
>>>>>>>>>>>>> the premature sending of the attach signal?
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Misha
>>>>>>>>>>>>>>> Isn’t there a shared volume between the two
>>>>>>>>>>>>>>> processes?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We’ve been fighting test reliability for a while now. I can
>>>>>>>>>>>>>>> only hope we’re getting
>>>>>>>>>>>>>>> to the end.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Bob.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin
>>>>>>>>>>>>>>>> Gehwolf<[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Misha,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700,
>>>>>>>>>>>>>>>> [email protected] wrote:
>>>>>>>>>>>>>>>>> Please review this change that fixes a container test
>>>>>>>>>>>>>>>>> TestJcmdWithSideCar.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My investigation indicated that a root cause for this failure
>>>>>>>>>>>>>>>>> is:
>>>>>>>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class
>>>>>>>>>>>>>>>>> has not
>>>>>>>>>>>>>>>>> been loaded yet.
>>>>>>>>>>>>>>>>> The target test JVM has started, it is initializing, but has
>>>>>>>>>>>>>>>>> not loaded
>>>>>>>>>>>>>>>>> the main test class.
>>>>>>>>>>>>>>>> That's what I've found too.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with
>>>>>>>>>>>>>>>>> a short
>>>>>>>>>>>>>>>>> sleep in between.
>>>>>>>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an
>>>>>>>>>>>>>>>> alternative.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also I have commented out the testCase02() due to another bug:
>>>>>>>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>>>>>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>>>>>>>>>>>>>>> which is not a test bug. IMO, it is better to run the test
>>>>>>>>>>>>>>>>> and skip a
>>>>>>>>>>>>>>>>> sub-case than to skip the entire test.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>>>>>>>>>>>>>>> Looks OK to me.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Severin
>>>>>>>>>>>>>>>>