On 23Sep2020 13:24, pascal z <barp...@yahoo.com> wrote:
>Hello, I'm working on a script where I want to loop into folders somehow 
>recursively to get information but I want to limit the infos for the files on 
>a certain level of folders for example:
>
>/home/user/Documents/folder1
>/home/user/Documents/folder2
>/home/user/Documents/folder3/folder1/file1
>/home/user/Documents/folder4/file1
>/home/user/Documents/file1
>/home/user/Documents/file2
>/home/user/Documents/file3
>
>I only want file1, 2, 3 at the root of Documents to show (write to a csv) and 
>I'm using the script below

Like the -maxdepth option to the find command?

First I would note that if you only want the top level, you don't need 
to use os.walk, just look at os.listdir.
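
As a minimal sketch of the os.listdir route (the function name 
top_level_files is just an illustrative choice of mine, not something 
from your script), filtering for regular files so that subdirectories 
are skipped:

```python
import os

def top_level_files(top):
    """Return the sorted names of regular files directly inside `top`."""
    return sorted(
        name
        for name in os.listdir(top)
        # os.listdir returns bare names, so join them back onto `top`
        if os.path.isfile(os.path.join(top, name))
    )
```

With your example tree, top_level_files('/home/user/Documents') should 
give just file1, file2 and file3, ready to hand to a csv writer.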

But if you want to be more flexible (eg down to level 3, or just 
particular directories), then you need os.walk. So, to your script...

The first thing to note is that os.walk hands you the actual list of 
subdirectory names it is going to use as "dirs". If you modify that list 
(in place), os.walk uses the modified list. You could sort that list to 
do a lexically ordered tree walk, and you can delete things from the 
list to prevent os.walk walking into those items.

For the deletion thing, indeed for the sort thing, it is often easiest 
to construct a new list with the desired result, then update the 
original - you need to update the original because that is the object 
os.walk is using:

    # pruning "dirs" only works top-down (the default), so no topdown=False
    for root, dirs, files in os.walk(Lpath):
        # make a new list of "z*" names, sorted
        new_subdirs = sorted(
            dirname for dirname in dirs if dirname.startswith('z')
        )
        # update the original list in place
        dirs[:] = new_subdirs

For the general form of your problem (limiting the depth) you need to 
know the depth of "root", which os.walk does not tell you directly.

Two approaches come to mind:

The lexical approach is to look at the path from Lpath to root and 
measure its length. os.path.relpath will get you the relative path, and 
you could split that on os.sep to get path components. There's also the 
pathlib module.
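
A rough sketch of the lexical approach, assuming a depth of 0 means the 
top directory itself. The names depth_below, files_up_to_depth and 
max_depth are mine, purely for illustration:

```python
import os

def depth_below(top, root):
    """Count the path components from `top` down to `root` (0 for `top`)."""
    rel = os.path.relpath(root, top)
    if rel == os.curdir:
        # relpath returns "." when root and top are the same directory
        return 0
    return len(rel.split(os.sep))

def files_up_to_depth(top, max_depth):
    """Yield paths of files found no more than `max_depth` levels below `top`."""
    for root, dirs, files in os.walk(top):  # top-down, the default
        if depth_below(top, root) >= max_depth:
            # empty the list in place so os.walk goes no deeper
            dirs[:] = []
        for name in files:
            yield os.path.join(root, name)
```

Here files_up_to_depth(Lpath, 0) would yield only the files sitting 
directly in Lpath, and larger values let the walk go further down.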

The other approach is to compute the expected path depths. Keep a 
mapping (a dict) keyed on path with depth as the value. Prepare it with:

    depths = {Lpath: 0}

Inside the loop, get the current depth:

    depth = depths[root]

You can make decisions on that. For example, if the depth reaches your 
threshold you could simply empty the "dirs" list (remember to do that in 
place) to prevent os.walk going any deeper.

Before the end of the loop, compute and store the depth for each 
subdirectory:

    for subdir in dirs:
        subpath = os.path.join(root, subdir)
        depths[subpath] = depth + 1

so that they are ready for use when os.walk gives them to you on later 
iterations.
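
Putting those pieces together, an untested-against-your-tree sketch 
might look like the following. The generator name walk_limited and the 
max_depth parameter are my own inventions; it relies on os.walk running 
top-down (the default) and on os.walk building each subdirectory path 
with os.path.join, which is what makes the dict lookups line up:

```python
import os

def walk_limited(top, max_depth):
    """Yield (root, files, depth) for directories at most `max_depth` below `top`."""
    depths = {top: 0}
    for root, dirs, files in os.walk(top):  # must be top-down for pruning
        depth = depths[root]
        if depth >= max_depth:
            # at the threshold: empty "dirs" in place to stop the descent
            dirs[:] = []
        else:
            # optional: sort in place for a lexically ordered walk
            dirs.sort()
        yield root, files, depth
        # record the depth of each subdirectory os.walk will visit later
        for subdir in dirs:
            depths[os.path.join(root, subdir)] = depth + 1
```

For your case, looping over walk_limited(Lpath, 0) should visit only 
Lpath itself, so "files" would be just the top level file names.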

Cheers,
Cameron Simpson <c...@cskk.id.au>
-- 
https://mail.python.org/mailman/listinfo/python-list
