On 25 May 2017 at 10:03, Neil Williams <[email protected]> wrote:
> On Wed, 24 May 2017 21:07:45 +0200
> Vincent Guittot <[email protected]> wrote:
>
>> Hi Neil,
>>
>> On 24 May 2017 7:42 PM, "Lisa Nguyen" <[email protected]> wrote:
>>
>> On 24 May 2017 at 17:02, Neil Williams <[email protected]> wrote:
>> > On Fri, 19 May 2017 17:02:14 +0100
>> > Neil Williams <[email protected]> wrote:
>> >
>> >> On Fri, 19 May 2017 16:48:11 +0100
>> >> Steve McIntyre <[email protected]> wrote:
>> >>
>> >> > Hi folks!
>> >> >
>> >> > On Wed, May 17, 2017 at 03:05:41PM +0100, Neil Williams wrote:
>> >> > >On Thu, 27 Apr 2017 08:19:19 +0100
>> >> > >Neil Williams <[email protected]> wrote:
>> >> > >
>> >> >
>> >> > I've just run a local test with an AEP inside lxc on my local
>> >> > machine. As far as I can see, there's nothing particularly magic
>> >> > going on here. The only problem I see is Lisa's config file
>> >> > pointing at the wrong device file. arm-probe needs a ttyACM-style
>> >> > device to talk to. Using:
>> >> >
>> >> > # lxc-device -n lxc-aep-test-174524 add /dev/ttyACM0
>> >> >
>> >> > I create that device in my container. I build libwebsockets and
>> >> > the arm-probe software in the container, then
>> >> > specify /dev/ttyACM0 in the AEP config file. I can run it just
>> >> > fine:
>> >> >
>> >> > root@lxc-aep-test-174524:/arm-probe# ./arm-probe/arm-probe -C panda-aep.cfg -l10 -x
>> >> > # configuration: panda-aep.cfg
>> >> > # config_name: pandaboard
>> >> > # trigger: 0.400000V (hyst 0.200000V) 0.000000W (hyst 0.200000W) 400us
>> >> > Configuration: pandaboard
>> >> > # date: Fri, 19 May 2017 16:29:50 +0100
>> >> > # host: lxc-aep-test-174524
>> >> > #
>> >> > + /dev/ttyACM0
>> >> > Starting...
>> >> > sending start to 0
>> >> > # VDD_ALL VDD ROOT #ff0000 SoC
>> >> > #
>> >> > #
>> >> > time      VDD(V)  VDD(A)  VDD(W)
>> >> > 0.000500  5.11    0.0474  0.24196
>> >> > 0.000600  5.11    0.0364  0.18572
>> >> > 0.000700  5.11    0.0314  0.16012
>> >> > 0.000800  5.10    0.0544  0.27734
>> >> > 0.000900  5.10    0.0234  0.11923
>> >> > 0.001000  5.11    0.0304  0.15505
>> >> > ...
>> >> >
>> >> > I don't have any problems running things and getting output here.
>> >> >
>> >> > I *have* seen two real bugs here while trying to get things
>> >> > running, though:
>> >> >
>> >> > 1. If the device specified in the config file doesn't exist, or
>> >> > is the wrong type of device, or (maybe) there is any other kind
>> >> > of problem with it, you get *no* useful feedback to say there's a
>> >> > problem. Running things under strace will show the background
>> >> > libarmep process attempt to use the device specified, but
>> >> > there's no error handling. :-(
>> >> >
>> >> > 2. The "-x" option says that the arm-probe program is meant to
>> >> > exit when you've done capturing, but it just sits there forever
>> >> > when I'm testing. I've wrapped it using the "timeout" command to
>> >> > work around that for now.
>> >> >
>> >> > If I knew where to file those bugs, I would, but it's really not
>> >> > obvious. They're really easy to reproduce, I hope...
>> >> >
>> >> > In terms of the /dev/ttyACM0 creation, the lxc-device man page
>> >> > says that it creates devices based on their existing entries on
>> >> > the host. Double-check that the host (dispatcher) has an
>> >> > appropriate /dev/ttyACM0 if you're still seeing problems?
>> >>
>> >> Steve was using staging-panda03 with the ARM Energy Probe which I'd
>> >> been using for the tests of the new code to ensure
>> >> that /dev/ttyACM0 can be attached to the LXC.
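
Since arm-probe reportedly gives no feedback when the configured device is missing or is the wrong type, a pre-flight check before launching it may help on both the dispatcher and inside the container. The following is only a minimal sketch in Python: the helper name check_serial_device is hypothetical and is not part of arm-probe or LAVA, and /dev/ttyACM0 is simply the node used in the test above.

```python
import os
import stat


def check_serial_device(path):
    """Return None if path looks usable, else a short description of the problem.

    Hypothetical helper, not part of arm-probe or LAVA.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return f"{path}: device node does not exist"
    except PermissionError:
        return f"{path}: permission denied"
    # ttyACM-style nodes are character devices; a regular file or
    # directory at this path means the config points at the wrong thing.
    if not stat.S_ISCHR(st.st_mode):
        return f"{path}: not a character device"
    if not os.access(path, os.R_OK | os.W_OK):
        return f"{path}: not readable and writable by this user"
    return None


if __name__ == "__main__":
    problem = check_serial_device("/dev/ttyACM0")
    print(problem or "/dev/ttyACM0 looks usable")
```

This only confirms that the node exists, is a character device and is accessible. It does not prove that an AEP is actually attached behind it.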
>> >>
>> >> That panda and AEP will shortly return to staging and then the
>> >> changes to LAVA and the required changes to the test definition
>> >> can be available for the 2017.6 release.
>> >
>> > OK. staging-panda03 is back and has been running tests. This is what
>> > we've learnt so far:
>> >
>> > 0: This does not appear to be an LXC issue. Running the commands
>> > manually on the worker with the same LXC on the same worker does
>> > return data from the probe.
>> >
>> > 1: Running the same commands in "headless" mode shows that the probe
>> > software starts successfully but something within the protocol
>> > parser or sampler fails to retrieve data.
>>
>>
>> What do you mean by headless mode?
>
> With no controlling terminal.
>
> LAVA runs as a daemon and forks processes to run the tests. This does
> not usually cause issues and is fundamental to automation. When I run
> the same commands in an LXC as a user logged into the machine, I get
> output. When I run the commands from a daemon, the output is not seen.
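
The "no controlling terminal" condition described above can probably be approximated outside LAVA, which might help narrow down where the output goes missing. The following is a rough sketch in Python, not how LAVA itself launches tests: the function name run_headless and the log file name are hypothetical. It starts the command in a new session (so the child has no controlling terminal), detaches stdin, sends stdout and stderr to a file, and kills the process after a timeout, similar to the "timeout" wrapper used as a workaround for the "-x" problem.

```python
import subprocess


def run_headless(cmd, log_path, timeout_s):
    """Run cmd with no controlling terminal and log its output to a file.

    Returns the exit status, or None if the process had to be killed
    after timeout_s seconds. Hypothetical helper for local experiments.
    """
    with open(log_path, "wb") as log:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # nothing to read from
            stdout=log,                 # capture output in the log file
            stderr=subprocess.STDOUT,   # keep errors in the same file
            start_new_session=True,     # setsid(): no controlling terminal
        )
        try:
            return proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # The command did not exit on its own; stop it and reap it.
            proc.kill()
            proc.wait()
            return None
```

With the command line from the earlier test, a call could look like run_headless(["./arm-probe/arm-probe", "-C", "panda-aep.cfg", "-l10", "-x"], "aep.log", 30). If aep.log contains samples when this is run from a login shell but not when the same call is made from a daemon, the difference is likely in the environment rather than in the terminal handling alone.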

Even when you redirect the output to a file? In workload automation,
arm_probe is called in a dedicated process with subprocess.Popen and we
are able to get data in the file. I just wonder what the difference
could be in the LAVA case.

>
>> >
>> > 2: The websockets dependency is completely unnecessary and has been
>> > disabled in the build I've been testing:
>> > https://git.linaro.org/lava-team/arm-probe.git/
>>
>>
>> Yes. I do the same. aepd is only useful for the web interface.
>>
>>
>> >
>> > 3: We've added a *lot* of debug to the arm-probe code
>> > (https://staging.validation.linaro.org/scheduler/job/174969 which
>> > was run using
>> > https://git.linaro.org/lava-team/arm-probe.git/commit/?id=9b2958e3045da77d7db25a7cfe48359211aa4cf1)
>> > but are not much closer to identifying the precise problem with the
>> > code. However, I am satisfied that this is a problem in the
>> > arm-probe software when being run in automation.
>>
>>
>> Can you give details about "this is a problem in arm probe software
>> when being run in automation"? Do you mean workload automation?
>
> No. Not workload automation - that is a specific test framework which
> can use LAVA. I'm talking about the process of running tests on behalf
> of users without the users being logged in or interacting with the
> shell.

OK. I just wanted to be sure about the context.

>
>> >
>> > 4: the arm-probe code is appallingly difficult to read and debug. It
>> > also seems unnecessarily complex.
>> >
>> > 5: I plan to remove a lot of the debug from the cloned arm-probe
>> > repository (which has also had a few fixes to compile with gcc6) but
>> > I'm running out of time to work on the arm-probe software myself.
>> >
>> > Someone needs to update the arm-probe software:
>> >
>> > a) to remove websockets as a compile-time option as this only bloats
>> > the build in automation where a web based UI is impossible anyway.
>> > I've done this by brute force in my cloned repo, I just patched out
>> > the dependency.
>> >
>> > b) improve the code to have comments and output about what is
>> > happening and why when verbose mode is used.
>> >
>> > c) Identify what is preventing the software from receiving data from
>> > the probe when run in automation.
>> >
>> > d) the config file still needs fixes to allow for changes in the
>> > device node name from one probe to another.
>> >
>> > --
>>
>> CC'ing Vincent, so he can read Neil's and Steve's comments above and
>> respond (if he has anything to say) while I'm on holiday until early
>> June.
>
> Steve & I are also on annual leave next week.
>
> --
>
>
> Neil Williams
> =============
> http://www.linux.codehelp.co.uk/
>
_______________________________________________
linaro-validation mailing list
[email protected]
https://lists.linaro.org/mailman/listinfo/linaro-validation
