if the @ is consistent across all the files, that makes it relatively easy:

find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c
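
(the sort is there because uniq only counts adjacent duplicate lines.)
if you also want the total number of unique callsigns, the same idea
piped into wc -l should do it, assuming every filename has exactly one
@ with nothing but the callsign in front of it:

find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort -u | wc -l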

-wes

On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes <barnmich...@gmail.com>
wrote:

> On Mon, Aug 16, 2021 at 5:29 PM David Fleck <dcfl...@protonmail.ch> wrote:
>
> > As Wes said, an example or two would help greatly.
> >
> > --- David Fleck
> >
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> >
> > On Monday, August 16th, 2021 at 7:17 PM, wes <p...@the-wes.com> wrote:
> >
> > > are firstnames and lastnames always separated by the same
> > > character in each filename?
> > >
> > > are the names separated from the rest of the info in the filename
> > > the same way for each file?
> > >
> > > are you doing this once, or will this be a repeating task that
> > > would be handy to automate?
> > >
> > > would you be able to provide a few sample filenames, perhaps with
> > > the personal info obfuscated?
> > >
> > > generally, the way I would approach this is to pare the filenames
> > > down to the people's names, and then run uniq against that list.
> > > uniq -c will provide a count of how many times a given string
> > > appears in the input. if I'm doing this once, I would generate a
> > > text file containing the list of filenames I will be working with,
> > > for example:
> > >
> > > find Processed -type f > processed-files.txt
> > >
> > > then use a text editor to pare down the entries as described
> > > above, using find and replace functions to remove the extra data,
> > > so only the people's names remain. then simply uniq -c that file
> > > and you're done. I personally use vi for this, but just about any
> > > editor will do. I like this approach for a number of reasons, not
> > > the least of which is that I can spot-check random samples after
> > > each editing step to try to spot unexpected results.
> > >
> > > if you want to automate this, it may be a little more complicated,
> > > and the answers to my initial questions become important. if you
> > > can provide a little more context, I will try to help further.
> > >
> > > -wes
> > >
> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes
> > > barnmich...@gmail.com wrote:
> > >
> > > > Here's a fun trivia task. For an activity I am involved in, I
> > > > get files from members to process. The filename starts with the
> > > > member's name and has other info to identify the file. After
> > > > processing, the file goes in the ./Processed folder. There are
> > > > thousands of files now in that folder. Right now, I'm looking
> > > > for a couple basic pieces of information. First, I want to know
> > > > how many unique names I have in the list. Second, I'd like a
> > > > list of names and how many files go with each name.
> > > >
> > > > I'm sure this is trivial, but my mind is blanking out on it. A
> > > > couple simple examples would be nice. Non-answers, like "easy to
> > > > do with 'xxx'" or references to man pages or George's Book, etc.
> > > > are not helpful right now.
> > > >
> > > > Thanks,
> > > >
> > > > Michael
> >
>
> Actually, they are callsigns instead of names. A couple of examples:
>
> w7...@k-0496-20210526.txt
> wa7...@k-0497-20210714.txt
> n8...@k-4386-20210725.txt
>
> I would like a simple count of the unique callsigns on a random basis and
> possibly an occasional report listing each callsign and how many files are
> in the folder for each.
>
> Michael
>
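
for the "occasional report" part, the same pipeline could live in a
small script so you don't have to retype it. rough sketch, untested,
and the script name is just a placeholder; it assumes you run it from
the directory that contains Processed and that every filename looks
like <callsign>@<rest>:

#!/bin/sh
# callsign-report.sh - hypothetical example, not an existing tool
# per-callsign file counts, busiest callsigns first
find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c | sort -rn
echo
# total number of unique callsigns
find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort -u | wc -l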
