To get the count of unique callsigns, you can just feed this same command into wc -l.
find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c | wc -l

-wes

On Mon, Aug 16, 2021 at 7:21 PM wes <[email protected]> wrote:

> if the @ is consistent with all the files, that makes it relatively easy.
>
> find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c
>
> -wes
>
> On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes <[email protected]>
> wrote:
>
>> On Mon, Aug 16, 2021 at 5:29 PM David Fleck <[email protected]>
>> wrote:
>>
>> > As Wes said, an example or two would help greatly.
>> >
>> > --- David Fleck
>> >
>> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> >
>> > On Monday, August 16th, 2021 at 7:17 PM, wes <[email protected]> wrote:
>> >
>> > > are firstnames and lastnames always separated by the same
>> > > character in each filename?
>> > >
>> > > are the names separated from the rest of the info in the filename
>> > > the same way for each file?
>> > >
>> > > are you doing this once, or will this be a repeating task that
>> > > would be handy to automate?
>> > >
>> > > would you be able to provide a few sample filenames, perhaps with
>> > > the personal info obfuscated?
>> > >
>> > > generally, the way I would approach this is to pare the filenames
>> > > down to the people's names, and then run uniq against that list.
>> > > uniq -c will provide a count of how many times a given string
>> > > appears in the input. if I'm doing this once, I would generate a
>> > > text file containing the list of filenames I will be working
>> > > with, for example:
>> > >
>> > > find Processed -type f > processed-files.txt
>> > >
>> > > then use a text editor to pare down the entries as described
>> > > above, using find and replace functions to remove the extra data,
>> > > so only the people's names remain. then simply sort and uniq -c
>> > > that file and you're done. I personally use vi for this, but just
>> > > about any editor will do. I like this approach for a number of
>> > > reasons, not the least of which is that I can spot-check random
>> > > samples after each editing step to try to spot unexpected
>> > > results.
>> > >
>> > > if you want to automate this, it may be a little more
>> > > complicated, and the answers to my initial questions become
>> > > important. if you can provide a little more context, I will try
>> > > to help further.
>> > >
>> > > -wes
>> > >
>> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes [email protected]
>> > > wrote:
>> > >
>> > > > Here's a fun trivia task. For an activity I am involved in, I
>> > > > get files from members to process. The filename starts with
>> > > > the member's name and has other info to identify the file.
>> > > > After processing, the file goes in the ./Processed folder.
>> > > > There are thousands of files now in that folder. Right now,
>> > > > I'm looking for a couple basic pieces of information. First, I
>> > > > want to know how many unique names I have in the list. Second,
>> > > > I'd like a list of names and how many files go with each name.
>> > > >
>> > > > I'm sure this is trivial, but my mind is blanking out on it. A
>> > > > couple simple examples would be nice. Non-answers, like "easy
>> > > > to do with 'xxx'" or references to man pages or George's Book,
>> > > > etc. are not helpful right now.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Michael
>>
>> Actually, they are callsigns instead of names. A couple of examples:
>>
>> [email protected]
>> [email protected]
>> [email protected]
>>
>> I would like a simple count of the unique callsigns on a random basis and
>> possibly an occasional report listing each callsign and how many files are
>> in the folder for each.
>>
>> Michael
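
Since both reports are wanted on a recurring basis, the one-liners discussed in this thread could be wrapped in a couple of shell functions. What follows is only a rough sketch, assuming every filename starts with the callsign followed by an @, and that find is the GNU version (which provides -printf). The function names callsigns, callsign_report and callsign_count are made up for illustration. The list is sorted before counting because uniq only collapses duplicate lines that are adjacent.

```shell
#!/bin/sh
# Sketch only. Assumptions: filenames look like CALLSIGN@rest-of-name,
# and find is GNU find (needed for -printf).
# A filename without an @ is passed through unchanged by sed.

# Print one callsign per file found under the directory given as $1.
callsigns() {
    find "$1" -type f -printf '%f\n' | sed 's/@.*//'
}

# Report: each callsign preceded by the number of files it has.
# sort runs first because uniq only merges adjacent duplicate lines.
callsign_report() {
    callsigns "$1" | sort | uniq -c
}

# Count: the number of distinct callsigns.
callsign_count() {
    callsigns "$1" | sort -u | wc -l
}
```

With these defined in the current shell, running callsign_report Processed would print the per-callsign file counts, and callsign_count Processed would print the number of unique callsigns.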
