On 25/04/20 7:53 PM, Manfred Lotz wrote:
On Sat, 25 Apr 2020 18:41:37 +1200
DL Neil <pythonl...@danceswithmice.info> wrote:

On 25/04/20 5:16 PM, Manfred Lotz wrote:
On Fri, 24 Apr 2020 19:12:39 -0300
Cholo Lennon <chololen...@hotmail.com> wrote:
On 24/4/20 15:40, Manfred Lotz wrote:
I have a command like application which checks a directory tree
for certain things. If there are errors then messages will be
written to stdout.

> What I do here specifically is to check directory trees' file objects
> for user and group ownerships as well as permissions according to a
> given policy. There is a general policy and there could be exceptions
> which specify a specific policy for a certain file.

If I have understood correctly, the objective is to check a dir-tree to ensure that specific directory/file-permissions are in-effect/have not been changed. The specifications come from a .JSON file and may be over-ridden by command-line arguments. Correct?

There must be a whole 'genre' of programs which inspect a directory-tree and do 'something' to the files contained. I had a few, and needing another, decided to write a generic 'scanner' which would then call a tailored-function to perform the particular 'something' - come to think of it, I'm not sure that was ever quite finished. Sigh!


>>>>> One idea was for the error situations to write messages to files
>>>>> and then later when running the tests to compare the error
>>>>> messages output to the previously saved output.
>>>>>
>>>>> Is there anything better?

The next specification appears to be that you want a list of files and their stats, perhaps reporting only the exceptions which weren't there 'last time'.

In which case, an exception could be raised, or it might be simpler to call a reporting-function when a file should be 'reported'.
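
A sketch of that second idea - both function-names are mine, purely illustrative, and I've assumed the policy is a dict shaped like the "default" entry in your .JSON:

    import os, stat

    def report(path, problem):
        '''Print the complaint - or swap-in logging, or collect into a list.'''
        print(f"{path}: {problem}")

    def check_file(path, policy, report=report):
        '''Check one file's permission-bits against policy; report mismatches.'''
        mode = format(stat.S_IMODE(os.stat(path).st_mode), "03o")
        if mode != policy["match_perms"]:
            report(path, f"mode is {mode}, policy wants {policy['match_perms']}")

The advantage over raising an exception: the scan keeps going, and the caller decides what 'reporting' means.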

I'm still a little confused about the difference between printing/logging/reporting some sort of exception, and the need to compare with 'history'.


The problem with the "previously saved output" is the process of linking 'today's data' with that from the last run. I would use a database (but then, that's my background/bias) to store each file's stat.
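
A minimal sqlite3 sketch of that idea (table- and column-names only illustrative):

    import os, sqlite3

    db = sqlite3.connect("filestats.db")
    db.execute("""CREATE TABLE IF NOT EXISTS filestat
                  (path TEXT PRIMARY KEY, mode INTEGER,
                   uid INTEGER, gid INTEGER)""")

    def record(path):
        '''Store this run's stat; return last run's (None on the first run).'''
        st = os.stat(path)
        previous = db.execute("SELECT mode, uid, gid FROM filestat WHERE path=?",
                              (path,)).fetchone()
        db.execute("INSERT OR REPLACE INTO filestat VALUES (?,?,?,?)",
                   (path, st.st_mode, st.st_uid, st.st_gid))
        db.commit()
        return previous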

Alternately, if you only need to store exceptions, and that number is likely to be small, perhaps output only the exceptions from 'this run' to a .JSON/.yaml file - which would become input (and a dict for easy look-ups) next time?
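
Something like this (the file-name is my invention):

    import json

    def load_last_run(path="exceptions.json"):
        '''Last run's exceptions as a dict keyed by file-path.'''
        try:
            with open(path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}              # first run: nothing to compare against

    def save_this_run(exceptions, path="exceptions.json"):
        '''exceptions: {file-path: error-message}.'''
        with open(path, "w") as f:
            json.dump(exceptions, f, indent=4)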


>>>> Maybe I am wrong because I don't understand your scenario: If your
>>>> application is like a command, it has to return an error code to
>>>> the system, a distinct number for each error condition. The error
>>>> code is easier to test than the stdout/stderr.

Either way, you might decrease the amount of data to be stored by reducing the file's stat to a code.
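
eg pack it into one short string - a sketch (pwd/grp are Unix-only, but that appears to be your territory):

    import grp, os, pwd, stat

    def stat_code(path):
        '''Compact "644:manfred:manfred"-style summary of a file's stat.'''
        st = os.stat(path)
        return ":".join((format(stat.S_IMODE(st.st_mode), "03o"),
                         pwd.getpwuid(st.st_uid).pw_name,
                         grp.getgrgid(st.st_gid).gr_name))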


> How to test this in the best way?

The best way to prepare for unit-testing is to have 'units' of code (apologies!).

An effective guide is to break the code into functions and methods, so that each performs exactly one piece/unit of work - and only that one. A better guide: if you cannot name the procedure with a single description, and want to add an "and", an "or", or some other conjunction, perhaps there should be more than one procedure!

For example:

>    for cat in cats:
>          ...
>          for d in scantree(cat.dir):
>              # if `keep_fs` was specified then we must
>              # make sure the file is on the same device
>              if cat.keep_fs and devid != get_devid(d.path):
>                  continue
>
>              cat.check(d)

If this were a function, how would you name it? First of all we are processing every category, then we are scanning a dir-tree, and finally we are doing something to each directory found.

If we split these into separate routines, eg (sub-setting the above):

>          for d in scantree(cat.dir):
               do_something_with_directory( d )

and you devise a (much) more meaningful name than mine, it will actually help readers (others - but also you!) to understand the steps within the logic.
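
For example (names mine, and deliberately clumsy - yours will be better; I've also assumed devid is the device of the tree's own root, so adjust to suit):

    def check_category(cat):
        '''Scan one category's dir-tree; check each entry against its policy.'''
        devid = get_devid(cat.dir)
        for d in scantree(cat.dir):
            # `keep_fs`: do not cross onto another device/file-system
            if cat.keep_fs and devid != get_devid(d.path):
                continue
            cat.check(d)

    def check_all_categories(cats):
        for cat in cats:
            check_category(cat)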

Now if we have a function which checks a single fileNM for 'whatever', the code will start something like:

        def stat_file( fileNM ):
                '''Gather stat information for nominated file.'''
                etc
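
One minimal way the body might continue - returning only the permission-bits as an octal string; your 'whatever' will differ:

    import os, stat

    def stat_file(fileNM):
        '''Gather stat information for nominated file.'''
        return format(stat.S_IMODE(os.stat(fileNM).st_mode), "03o")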

So, we can now write a test-function because we don't need any "categories", we don't need all the dir-tree, and we don't need to have performed a scan - all we need is a (valid and previously inspected) file-path. Thus (using Pytest):

        def test_stat_file( ... ):
                '''Test file stat.'''
                assert stat_file( "...file-path" ) == ...its known-stat

        def test_stat_dir( ... ):
                '''Test dir stat.'''
                assert stat_file( "...dir-path" ) == ...its known-stat

There is no need to test for a non-existent file, if you are the only user!

In case you hadn't thought about it: make the test path part of your test directory - not part of 'the real world'!
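
Pytest's tmp_path fixture hands each test its own scratch directory, which is made for exactly that - a sketch, assuming the stat_file() above returns the octal permission-string:

    def test_stat_file(tmp_path):
        '''Stat a file created inside pytest's own scratch directory.'''
        victim = tmp_path / "victim.txt"
        victim.write_text("anything")
        victim.chmod(0o644)
        assert stat_file(victim) == "644"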


This is why TDD ("Test-Driven Development") theory says that one should devise tests 'first' (and the actual code, 'second'). You must design the logic first, and then start bottom-up, by writing tests to 'prove' the simplest functionality (and then the code to 'pass' those tests), gradually expanding the scope of the tests as single functions are combined in calls from larger/wider/controlling functions...

This takes the previous time-honored practice of writing the code first, and then testing it (or even, 'throwing it over the wall' to a test department), and turns it on its head. (no wall necessary)

I find it a very good way of making sure that, in a system comprising many 'moving parts', when I make "a little change" to some component 'over here', the test-regime will quickly reveal any unintended consequences (effects, artifacts) 'over there' - even though my (very) small brain hadn't realised the impact! ("it's just one, little, change!")


> The policy file is a JSON file and could have different categories.
> Each category defines a policy for a certain directory tree. Command
> line args could override directory names, as well as user and group.
> Or if, for example, a directory is not specified in the JSON file, I
> am required to specify it via the command line; otherwise no check
> can take place.
>
> "default":
>    {
>        "match_perms":  "644",
>        "match_permsd":  "755",
>        "match_permsx":  "755",
>        "owner":  "manfred",
>        "group":  "manfred"
>    }

Won't a directory-path be required, to tie 'policy' to 'tree'?
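
ie something like (the "dir" key, and its value, are my invention):

    "default":
    {
        "dir":          "/some/directory/tree",
        "match_perms":  "644",
        "match_permsd": "755",
        "match_permsx": "755",
        "owner":        "manfred",
        "group":        "manfred"
    }

...with the command-line able to over-ride it, per your description.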


Is that enough feedback to help you take a few more steps?
--
Regards =dn
