On 11/03/2022 at 10:11, Roel Schroeven wrote:
On 10/03/2022 at 13:16, Loris Bennett wrote:
Hi,

I have a command which produces output like the
following:

   Job ID: 9431211
   Cluster: curta
   User/Group: build/staff
   State: COMPLETED (exit code 0)
   Nodes: 1
   Cores per node: 8
   CPU Utilized: 01:30:53
   CPU Efficiency: 83.63% of 01:48:40 core-walltime
   Job Wall-clock time: 00:13:35
   Memory Utilized: 6.45 GB
   Memory Efficiency: 80.68% of 8.00 GB

I want to parse this and am using subprocess.Popen and accessing the
contents via Popen.stdout.  However, for testing purposes I want to save
various possible outputs of the command as text files and use those as
inputs.

What format should I use to pass data to the actual parsing function?

Is this a command that you run, that produces that output, and then exits (as opposed to a long-running program that generates a bunch of output from time to time)? If so, I would use subprocess.run() with capture_output=True instead of subprocess.Popen(). subprocess.run() returns a CompletedProcess instance which has stdout and stderr attributes that contain the captured output as byte sequences or strings, depending on the parameters you passed (passing text=True or an encoding gives you strings).
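
To make the bytes-versus-strings point concrete, here is a minimal sketch. The command is only a stand-in (the running Python interpreter printing one line), not the real job-state command, and the variable names are made up for illustration:

```python
import subprocess
import sys

# Stand-in command (an assumption for this sketch): the current Python
# interpreter printing a single line, in place of the real command.
cmd = [sys.executable, "-c", "print('State: COMPLETED')"]

# Without text/encoding arguments, the captured output is bytes.
raw = subprocess.run(cmd, capture_output=True, check=True)

# With encoding= (or text=True), the captured output is str.
text = subprocess.run(cmd, capture_output=True, check=True, encoding="UTF-8")

print(type(raw.stdout))
print(type(text.stdout))
```

Since a file opened in text mode also yields str, asking subprocess.run() for strings keeps both input paths on the same type.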

So in your case I would simply read the content of each text file as a whole into a string, and use subprocess.run() to get the command's output also as a string. Then you can have a parse function that accepts such strings, and works exactly the same for the text files as for the command output. Your parse function can then use splitlines() to access the lines individually. The data size is very small so it's not a problem to have it all in memory at the same time (i.e. no need to worry about trying to stream it instead).

Very simple example:

    import subprocess
    from pprint import pprint

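    def parse(state_data):
        state_dict = {}
        for line in state_data.splitlines():
            # Split on the first ': ' only; skip blank lines and lines without it.
            key, sep, value = line.partition(': ')
            if sep:
                # Strip the indentation the command puts in front of each key.
                state_dict[key.strip()] = value.strip()
        return state_dict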

    def read_from_command():
        return subprocess.run(['./jobstate'], capture_output=True, check=True, encoding='UTF-8').stdout

    def read_from_file(fn):
        with open(fn, 'rt', encoding='UTF-8') as f:
            return f.read()

    pprint(parse(read_from_command()))
    pprint(parse(read_from_file('jobfile')))
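
Because the parse function only ever sees a string, a test does not need the command at all. The following is a sketch of such a test, not a definitive implementation: SAMPLE is a few lines copied from the output in the question, the parse variant here additionally strips indentation and skips lines without a ': ' separator, and the name test_parse_sample is made up for illustration.

```python
# A few lines copied from the sample output in the question, plus a blank line.
SAMPLE = """\
   Job ID: 9431211
   Cluster: curta
   State: COMPLETED (exit code 0)

   CPU Efficiency: 83.63% of 01:48:40 core-walltime
"""

def parse(state_data):
    """Turn 'Key: value' lines into a dict, ignoring lines without ': '."""
    state_dict = {}
    for line in state_data.splitlines():
        key, sep, value = line.partition(': ')
        if sep:
            state_dict[key.strip()] = value.strip()
    return state_dict

def test_parse_sample():
    # Hypothetical test; it can be run directly or collected by a test runner.
    result = parse(SAMPLE)
    assert result['Job ID'] == '9431211'
    assert result['Cluster'] == 'curta'
    assert result['State'] == 'COMPLETED (exit code 0)'
    assert result['CPU Efficiency'] == '83.63% of 01:48:40 core-walltime'
    assert len(result) == 4

test_parse_sample()
```

The same assertions work unchanged if SAMPLE is replaced by the contents of one of the saved text files.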

--
"Iceland is the place you go to remind yourself that planet Earth is a
machine... and that all organic life that has ever existed amounts to a greasy
film that has survived on the exterior of that machine thanks to furious
improvisation."
        -- Sam Hughes, Ra

--
https://mail.python.org/mailman/listinfo/python-list
