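if the @ is consistent with all the files, that makes it relatively easy.

find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c

(the sort is there because uniq -c only counts adjacent duplicate lines, and find does not return names in sorted order.)
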
-wes

On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes <[email protected]> wrote:

> On Mon, Aug 16, 2021 at 5:29 PM David Fleck <[email protected]> wrote:
>
> > As Wes said, an example or two would help greatly.
> >
> > --- David Fleck
> >
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> >
> > On Monday, August 16th, 2021 at 7:17 PM, wes <[email protected]> wrote:
> >
> > > are firstnames and lastnames always separated by the same character
> > > in each filename?
> > >
> > > are the names separated from the rest of the info in the filename
> > > the same way for each file?
> > >
> > > are you doing this once, or will this be a repeating task that would
> > > be handy to automate?
> > >
> > > would you be able to provide a few same filenames, perhaps with the
> > > personal info obfuscated?
> > >
> > > generally, the way I would approach this is to pare the filenames
> > > down to the people's names, and then run uniq against that list.
> > > uniq -c will provide a count of how many times a given string
> > > appears in the input. if I'm doing this once, I would generate a
> > > text file containing the list of filenames I will be working with,
> > > for example:
> > >
> > > find Processed -type f > processed-files.txt
> > >
> > > then use a text editor to pare down the entries as described above,
> > > using find and replace functions to remove the extra data, so only
> > > the people's names remain. then simply uniq -c that file and you're
> > > done. I personally use vi for this, but just about any editor will
> > > do. I like this approach for a number of reasons, not the least of
> > > which is that I can spot-check random samples after each editing
> > > step to try to spot unexpected results.
> > >
> > > if you want to automate this, it may be a little more complicated,
> > > and the answers to my initial questions become important. if you
> > > can provide a little more context, I will try to help further.
> > >
> > > -wes
> > >
> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes [email protected]
> > > wrote:
> > >
> > > > Here's a fun trivia task. For an activity I am involved in, I get
> > > > files from members to process. The filename starts with the
> > > > member's name and has other info to identify the file. After
> > > > processing, the file goes in the ./Processed folder. There are
> > > > thousands of files now in that folder. Right now, I'm looking for
> > > > a couple basic pieces of information. First, I want to know how
> > > > many unique names I have in the list. Second, I'd like a list of
> > > > names and how many files go with each name.
> > > >
> > > > I'm sure this is trivial, but my mind is blanking out on it. A
> > > > couple simple examples would be nice. Non-answers, like "easy to
> > > > do with 'xxx'" or references to man pages or George's Book, etc.
> > > > are not helpful right now.
> > > >
> > > > Thanks,
> > > >
> > > > Michael
>
> Actually, they are callsigns instead of names. A couple of examples:
>
> [email protected]
> [email protected]
> [email protected]
>
> I would like a simple count of the unique callsigns on a random basis and
> possibly an occasional report listing each callsign and how many files are
> in the folder for each.
>
> Michael
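
since this sounds like something you will run more than once, here is a rough sketch of both reports as small shell functions. the function names (list_callsigns, callsign_report, callsign_count) are made up for illustration, and it assumes every filename has the callsign before the first @ and that no filename contains a newline. it strips the directory part with sed instead of -printf, so it should not depend on GNU find.

```shell
# Hypothetical helper names; the directory defaults to ./Processed as in the thread.

# Print the callsign part (everything before the first "@") of each file name.
# Assumption: names without an "@" are passed through unchanged.
list_callsigns() {
    find "${1:-Processed}" -type f | sed -e 's|.*/||' -e 's/@.*//'
}

# Report: one line per callsign with its file count, highest count first.
# sort comes before uniq -c because uniq only collapses adjacent duplicates.
callsign_report() {
    list_callsigns "$1" | sort | uniq -c | sort -rn
}

# Single number: how many distinct callsigns appear in the folder.
callsign_count() {
    list_callsigns "$1" | sort -u | wc -l | tr -d ' '
}
```

paste these into your shell or a script, then run callsign_count for the quick number and callsign_report for the occasional full listing. both take an optional directory argument if the files live somewhere other than ./Processed.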
