Re: [PATCH] tester: Make the SIS time limit user configurable
On 06/07/2022 23:55, Chris Johns wrote:
> On 6/7/2022 6:00 pm, Sebastian Huber wrote:
>> Yes, if tests go wrong the tester can kill a test execution after the
>> specified timeout.
>
> Killing should be taken as a sign something in the test equipment is broken.

A simulator which kills itself after an arbitrary amount of time is broken
test equipment.

>> Why do we need this arbitrary SIS -tlim of 400 s?
>
> There were a few values and I selected this one based on those. If you have
> a better value please say.

I started this discussion with the best value from my point of view: no time
limit for the simulator.

> I do not recommend removing the time limit option from testing.

Why don't you recommend removing the time limit option?

> If you want to operate that way create a user config with:
>
> [erc32-sis]
> sis_time_limit =
>
>> Which problem does this solve?
>
> Repeatable test results across wide ranging hosts and host operating
> systems.

I don't see how an arbitrary simulator timeout helps here. Killing the SIS is
very reliable. I have never seen a zombie SIS process after an rtems-test
exit.

>> Why can't we let the tests run in SIS without a limit just like we do it
>> for Qemu?
>
> Qemu is painful to make work in a consistent and reliable way.
>
>> Normally, if a test is done, it terminates the SIS execution.
>
> The timeouts are for the cases that end abnormally.

Yes, this timeout should be defined by the --timeout command line option. The
timeout depends on the tests you run. This is also selected through the
command line. Test executions stopped by the tester due to a timeout are
reported as a timed out test; however, test executions stopped by the
simulator due to a simulator-internal time limit are reported as "failed".

>> The new performance tests can be used to catch performance regressions.
>
> We need to consider the existing benchmarks.
>
>> In the long run I think we need a more modular approach. For example, one
>> component which runs tests and reports the test output. Another component
>> which analyses the test output. The test outputs can be archived.
>
> The rtems-test command is required to report regressions, and as simply as
> possible. A user wants to build, test, then know what they have matches
> what we released.

Yes, the rtems-test command should do this; however, the machinery it uses
could be more modular. If you want to catch performance regressions you need
history data.

-- 
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax: +49-89-18 94 741 - 08
Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/

___
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel
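The history data Sebastian mentions could feed a small analysis step along
these lines. This is purely an illustrative sketch: rtems-test has no such
component today, and the file format, function name, and tolerance factor are
invented for the example.

```python
# Hypothetical sketch: flag a performance regression by comparing a test's
# current runtime against archived history. Nothing here exists in
# rtems-test; names and the JSON history format are illustrative only.
import json
import statistics
from pathlib import Path

def is_regression(history_file, current_seconds, tolerance=1.2):
    """Return True if the current runtime exceeds the historical mean
    runtime by more than the given tolerance factor."""
    # The archive is assumed to be a JSON list of past runtimes in
    # seconds, e.g. [12.1, 11.9, 12.4].
    runs = json.loads(Path(history_file).read_text())
    baseline = statistics.mean(runs)
    return current_seconds > baseline * tolerance
```

Keeping the analysis out of the test runner itself matches the modular split
described above: one component archives outputs, another judges them.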
Re: [PATCH] tester: Make the SIS time limit user configurable
On 6/7/2022 6:00 pm, Sebastian Huber wrote:
> Yes, if tests go wrong the tester can kill a test execution after the
> specified timeout.

Killing should be taken as a sign something in the test equipment is broken.

> Why do we need this arbitrary SIS -tlim of 400 s?

There were a few values and I selected this one based on those. If you have a
better value please say.

I do not recommend removing the time limit option from testing. If you want
to operate that way create a user config with:

[erc32-sis]
sis_time_limit =

> Which problem does this solve?

Repeatable test results across wide ranging hosts and host operating systems.

> Why can't we let the tests run in SIS without a limit just like we do it
> for Qemu?

Qemu is painful to make work in a consistent and reliable way.

> Normally, if a test is done, it terminates the SIS execution.

The timeouts are for the cases that end abnormally.

> The new performance tests can be used to catch performance regressions.

We need to consider the existing benchmarks.

> In the long run I think we need a more modular approach. For example, one
> component which runs tests and reports the test output. Another component
> which analyses the test output. The test outputs can be archived.

The rtems-test command is required to report regressions, and as simply as
possible. A user wants to build, test, then know what they have matches what
we released.

Chris
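Spelled out, the user-config override Chris describes might look like the
fragment below. This is a sketch following the existing tester BSP ini
layout; per the patch's sis.cfg logic, defining sis_time_limit with an empty
value replaces the default "-tlim 400 s" so SIS runs with no time limit.

```ini
# Hypothetical user configuration fragment: an empty sis_time_limit
# overrides the default "-tlim 400 s" in sis.cfg, so the simulator
# runs without a time limit.
[erc32-sis]
sis_time_limit =
```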
Re: [PATCH] tester: Make the SIS time limit user configurable
On 06/07/2022 09:38, Chris Johns wrote:
> On 6/7/2022 4:26 pm, Sebastian Huber wrote:
>> On 06/07/2022 01:51, chr...@rtems.org wrote:
>>> +#
>>> +# Timeout option. This is the default timeout for the CPU realtime
>>> +# clock
>>> +
>>> +%ifn %{defined sis_time_limit}
>>> + %define sis_time_limit -tlim 400 s
>>> +%endif
>>
>> Making this configurable is good, but why do you impose a limit by
>> default? Why can't the simulator run forever in the standard
>> configuration?
>
> I think I understand your issue so I will explain my understanding and then
> we can see if we align. :)
>
> The SIS has the ability to set a time limit which I understand is a
> simulated CPU realtime clock period. A simulated realtime clock should not
> be affected by the host hardware specs, the host's current load or the
> number of parallel simulations it has running. The timeout fails a test
> that does not complete in the set period of time and that time is scaled
> against the host's realtime clock as the host schedules time to each
> simulator.

The SIS -tlim option specifies the maximum simulation time.

> The tester has a timeout that times out against the host's realtime clock
> and that clock is not adjusted based on the execution time given to a
> simulator. With hardware targets the host's realtime clock and the hardware
> target's realtime clock are aligned because Einstein says they are.
>
> We need timeouts to catch tests that fail and to catch test equipment that
> fails. In the case of simulators it could be a bug that locks the
> simulation up. The tester would like to be able to distinguish between a
> test failure and an equipment failure. In the case of hardware targets
> using TFTP there is considerable effort put into determining if a target
> has failed to start a test so it can retry starting the test versus a test
> starting and failing to end. TFTP and networks do fail, and in the case of
> the BeagleBoard it has a network device with no reset that can lock up
> after a software reset, so restarts include power cycling.
>
> This change has a default because the original logic had a default and I
> did not want to change what we had along with the config change. I am not
> sure the default I selected is right but it is now easier to change.

Yes, if tests go wrong the tester can kill a test execution after the
specified timeout. Why do we need this arbitrary SIS -tlim of 400 s? Which
problem does this solve? Why can't we let the tests run in SIS without a
limit just like we do it for Qemu? Normally, if a test is done, it terminates
the SIS execution.

> This change does not cater for tests that have varying and valid long test
> times and I suspect this is the issue you are facing with the validation
> tests. This will also be the case for tests that are not a single test per
> test executable. I see this as a valid but separate issue to the SIS time
> limit parameter config change.
>
> I have thought for a while tests should output a time limit as test
> metadata. The time limit can be a maximum test period specified relative to
> the target's realtime clock. The time limit could be arch or BSP specific
> to match the performance of the hardware being tested. A per test time
> limit means we do not stall the tester with really long timeouts that are
> needed on loaded simulation hosts. Hardware targets and simulators that
> have time limit options can be adjusted to a test's valid time limit. For
> qemu we could scale the per test timeout depending on the number of jobs
> plus a scale factor. The time limit metadata could also provide an error
> factor so the tester could start to flag performance regression.

The new performance tests can be used to catch performance regressions. In
the long run I think we need a more modular approach. For example, one
component which runs tests and reports the test output. Another component
which analyses the test output. The test outputs can be archived.
Re: [PATCH] tester: Make the SIS time limit user configurable
On 6/7/2022 4:26 pm, Sebastian Huber wrote:
> On 06/07/2022 01:51, chr...@rtems.org wrote:
>> +#
>> +# Timeout option. This is the default timeout for the CPU realtime
>> +# clock
>> +
>> +%ifn %{defined sis_time_limit}
>> + %define sis_time_limit -tlim 400 s
>> +%endif
>
> Making this configurable is good, but why do you impose a limit by default?
> Why can't the simulator run forever in the standard configuration?

I think I understand your issue so I will explain my understanding and then
we can see if we align. :)

The SIS has the ability to set a time limit which I understand is a simulated
CPU realtime clock period. A simulated realtime clock should not be affected
by the host hardware specs, the host's current load or the number of parallel
simulations it has running. The timeout fails a test that does not complete
in the set period of time and that time is scaled against the host's realtime
clock as the host schedules time to each simulator.

The tester has a timeout that times out against the host's realtime clock and
that clock is not adjusted based on the execution time given to a simulator.
With hardware targets the host's realtime clock and the hardware target's
realtime clock are aligned because Einstein says they are.

We need timeouts to catch tests that fail and to catch test equipment that
fails. In the case of simulators it could be a bug that locks the simulation
up. The tester would like to be able to distinguish between a test failure
and an equipment failure. In the case of hardware targets using TFTP there is
considerable effort put into determining if a target has failed to start a
test so it can retry starting the test versus a test starting and failing to
end. TFTP and networks do fail, and in the case of the BeagleBoard it has a
network device with no reset that can lock up after a software reset, so
restarts include power cycling.

This change has a default because the original logic had a default and I did
not want to change what we had along with the config change. I am not sure
the default I selected is right but it is now easier to change.

This change does not cater for tests that have varying and valid long test
times and I suspect this is the issue you are facing with the validation
tests. This will also be the case for tests that are not a single test per
test executable. I see this as a valid but separate issue to the SIS time
limit parameter config change.

I have thought for a while tests should output a time limit as test metadata.
The time limit can be a maximum test period specified relative to the
target's realtime clock. The time limit could be arch or BSP specific to
match the performance of the hardware being tested. A per test time limit
means we do not stall the tester with really long timeouts that are needed on
loaded simulation hosts. Hardware targets and simulators that have time limit
options can be adjusted to a test's valid time limit. For qemu we could scale
the per test timeout depending on the number of jobs plus a scale factor. The
time limit metadata could also provide an error factor so the tester could
start to flag performance regression.

Chris
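The per-test time limit idea above could be sketched roughly as follows. All
names, the linear scaling formula, and the error factor semantics are
assumptions made for illustration; nothing like this exists in rtems-test
today.

```python
# Illustrative sketch of the per-test time limit metadata idea: scale the
# host-side timeout by the number of parallel simulator jobs plus a scale
# factor, and use an error factor to flag likely performance regressions.
# All names and formulas here are invented for the example.

def effective_timeout(test_limit_seconds, jobs, scale=0.25):
    """Grow the host-side timeout as more simulators share the host.

    test_limit_seconds is the limit a test declares in its metadata,
    relative to the target's realtime clock."""
    return test_limit_seconds * (1 + jobs * scale)

def flag_regression(test_limit_seconds, measured_seconds, error_factor=0.9):
    """Warn when a passing test consumes most of its declared limit,
    suggesting a performance regression before it becomes a timeout."""
    return measured_seconds > test_limit_seconds * error_factor
```

With metadata like this, short tests would no longer inherit the long global
timeout needed by the slowest test on a loaded simulation host.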
Re: [PATCH] tester: Make the SIS time limit user configurable
On 06/07/2022 01:51, chr...@rtems.org wrote:
> +#
> +# Timeout option. This is the default timeout for the CPU realtime
> +# clock
> +
> +%ifn %{defined sis_time_limit}
> + %define sis_time_limit -tlim 400 s
> +%endif

Making this configurable is good, but why do you impose a limit by default?
Why can't the simulator run forever in the standard configuration?
Re: [PATCH] tester: Make the SIS time limit user configurable
Optimization may affect the time but host speed is more likely.

Ok to commit

On Tue, Jul 5, 2022, 6:51 PM wrote:
> From: Chris Johns
>
> Let the user set the test time limit in a config file to provide site
> specific overrides. Optimisation can affect the time a test may take
> to run.
> ---
>  tester/rtems/testing/bsps/erc32-sis.ini      | 5 +-
>  tester/rtems/testing/bsps/gr740-sis.ini      | 5 +-
>  tester/rtems/testing/bsps/griscv-sis-cov.ini | 5 +-
>  tester/rtems/testing/bsps/griscv-sis.ini     | 5 +-
>  tester/rtems/testing/bsps/leon2-sis.ini      | 5 +-
>  tester/rtems/testing/bsps/leon3-run.ini      | 3 +-
>  tester/rtems/testing/bsps/leon3-sis-cov.ini  | 5 +-
>  tester/rtems/testing/bsps/leon3-sis.ini      | 5 +-
>  tester/rtems/testing/bsps/sis-run.ini        | 5 +-
>  tester/rtems/testing/sis.cfg                 | 72
>  10 files changed, 89 insertions(+), 26 deletions(-)
>  create mode 100644 tester/rtems/testing/sis.cfg
>
> diff --git a/tester/rtems/testing/bsps/erc32-sis.ini b/tester/rtems/testing/bsps/erc32-sis.ini
> index fca2122..a025265 100644
> --- a/tester/rtems/testing/bsps/erc32-sis.ini
> +++ b/tester/rtems/testing/bsps/erc32-sis.ini
> @@ -34,6 +34,5 @@
>  [erc32-sis]
>  bsp = erc32
>  arch = sparc
> -tester = %{_rtscripts}/run.cfg
> -bsp_run_cmd = %{rtems_tools}/%{bsp_arch}-rtems%{rtems_version}-sis
> -bsp_run_opts = -nouartrx -r -tlim 600 s
> +tester = %{_rtscripts}/sis.cfg
> +bsp_run_opts =
> diff --git a/tester/rtems/testing/bsps/gr740-sis.ini b/tester/rtems/testing/bsps/gr740-sis.ini
> index b71048c..c42d716 100644
> --- a/tester/rtems/testing/bsps/gr740-sis.ini
> +++ b/tester/rtems/testing/bsps/gr740-sis.ini
> @@ -33,6 +33,5 @@
>  [gr740-sis]
>  bsp = gr740
>  arch = sparc
> -tester = %{_rtscripts}/run.cfg
> -bsp_run_cmd = %{rtems_tools}/%{bsp_arch}-rtems%{rtems_version}-sis
> -bsp_run_opts = -gr740 -nouartrx -r -tlim 200 s -m 4
> +tester = %{_rtscripts}/sis.cfg
> +bsp_run_opts = -gr740 -m 4
> diff --git a/tester/rtems/testing/bsps/griscv-sis-cov.ini b/tester/rtems/testing/bsps/griscv-sis-cov.ini
> index 9ab37a8..fa86b55 100644
> --- a/tester/rtems/testing/bsps/griscv-sis-cov.ini
> +++ b/tester/rtems/testing/bsps/griscv-sis-cov.ini
> @@ -34,7 +34,6 @@
>  [griscv-sis-cov]
>  bsp = griscv-sis
>  arch = riscv
> -tester = %{_rtscripts}/run.cfg
> -bsp_run_cmd = %{rtems_tools}/%{bsp_arch}-rtems%{rtems_version}-sis
> -bsp_run_opts = -nouartrx -r -tlim 300 s -m 4 -cov
> +tester = %{_rtscripts}/sis.cfg
> +bsp_run_opts = -m 4 -cov
>  bsp_covoar_cmd = -S %{bsp_symbol_path} -E %{cov_explanations} -f TSIM
> diff --git a/tester/rtems/testing/bsps/griscv-sis.ini b/tester/rtems/testing/bsps/griscv-sis.ini
> index b21cba1..bf32851 100644
> --- a/tester/rtems/testing/bsps/griscv-sis.ini
> +++ b/tester/rtems/testing/bsps/griscv-sis.ini
> @@ -34,6 +34,5 @@
>  [griscv-sis]
>  bsp = griscv
>  arch = riscv
> -tester = %{_rtscripts}/run.cfg
> -bsp_run_cmd = %{rtems_tools}/%{bsp_arch}-rtems%{rtems_version}-sis
> -bsp_run_opts = -nouartrx -r -tlim 300 s -m 4
> +tester = %{_rtscripts}/sis.cfg
> +bsp_run_opts = -m 4
> diff --git a/tester/rtems/testing/bsps/leon2-sis.ini b/tester/rtems/testing/bsps/leon2-sis.ini
> index 61205ad..810320c 100644
> --- a/tester/rtems/testing/bsps/leon2-sis.ini
> +++ b/tester/rtems/testing/bsps/leon2-sis.ini
> @@ -34,6 +34,5 @@
>  [leon2-sis]
>  bsp = leon2
>  arch = sparc
> -tester = %{_rtscripts}/run.cfg
> -bsp_run_cmd = %{rtems_tools}/%{bsp_arch}-rtems%{rtems_version}-sis
> -bsp_run_opts = -leon2 -nouartrx -r -tlim 200 s
> +tester = %{_rtscripts}/sis.cfg
> +bsp_run_opts = -leon2
> diff --git a/tester/rtems/testing/bsps/leon3-run.ini b/tester/rtems/testing/bsps/leon3-run.ini
> index a8c97a6..99c391b 100644
> --- a/tester/rtems/testing/bsps/leon3-run.ini
> +++ b/tester/rtems/testing/bsps/leon3-run.ini
> @@ -34,6 +34,5 @@
>  [leon3-run]
>  bsp = leon3
>  arch = sparc
> -tester = %{_rtscripts}/run.cfg
> -bsp_run_cmd = %{rtems_tools}/%{bsp_arch}-rtems%{rtems_version}-run
> +tester = %{_rtscripts}/sis.cfg
>  bsp_run_opts = -a -leon3
> diff --git a/tester/rtems/testing/bsps/leon3-sis-cov.ini b/tester/rtems/testing/bsps/leon3-sis-cov.ini
> index d8ffe28..7c6a279 100644
> --- a/tester/rtems/testing/bsps/leon3-sis-cov.ini
> +++ b/tester/rtems/testing/bsps/leon3-sis-cov.ini
> @@ -34,7 +34,6 @@
>  [leon3-sis-cov]
>  bsp = leon3-sis
>  arch = sparc
> -tester = %{_rtscripts}/run.cfg
> -bsp_run_cmd = %{rtems_tools}/%{bsp_arch}-rtems%{rtems_version}-sis
> -bsp_run_opts = -leon3 -nouartrx -r -tlim 200 s -cov
> +tester = %{_rtscripts}/sis.cfg
> +bsp_run_opts = -leon3 -cov
>  bsp_covoar_cmd = -S %{bsp_symbol_path} -E %{cov_explanations} -f TSIM
> diff --git a/tester/rtems/testing/bsps/leon3-sis.ini