
Re: [PLUG] Counting Files

2021-08-19 Thread Michael Barnes

On Mon, Aug 16, 2021 at 7:25 PM wes  wrote:

> To get the count of unique callsigns, you can just feed this same command
> into wc -l.
>
> find Processed -type f -printf '%f\n' | sed "s/@.*//" | uniq -c | wc -l
>
> -wes
>
>
> On Mon, Aug 16, 2021 at 7:21 PM wes  wrote:
>
> > if the @ is consistent with all the files, that makes it relatively easy.
> >
> > find Processed -type f -printf '%f\n' | sed "s/@.*//" | uniq -c
> >
> > -wes
> >
> > On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes 
> > wrote:
> >
> >> On Mon, Aug 16, 2021 at 5:29 PM David Fleck 
> >> wrote:
> >>
> >> > As Wes said, an example or two would help greatly.
> >> >
> >> > --- David Fleck
> >> >
> >> > ‐‐‐ Original Message ‐‐‐
> >> >
> >> > On Monday, August 16th, 2021 at 7:17 PM, wes 
> wrote:
> >> >
> >> > > are firstnames and lastnames always separated by the same character
> in
> >> > each
> >> > >
> >> > > filename?
> >> > >
> >> > > are the names separated from the rest of the info in the filename
> the
> >> > same
> >> > >
> >> > > way for each file?
> >> > >
> >> > > are you doing this once, or will this be a repeating task that would
> >> be
> >> > >
> >> > > handy to automate?
> >> > >
> >> > > would you be able to provide a few sample filenames, perhaps with the
> >> > >
> >> > > personal info obfuscated?
> >> > >
> >> > > generally, the way I would approach this is to pare the filenames
> >> down to
> >> > >
> >> > > the people's names, and then run uniq against that list. uniq -c
> will
> >> > >
> >> > > provide a count of how many times a given string appears in the
> >> input. if
> >> > >
> >> > > I'm doing this once, I would generate a text file containing the
> list
> >> of
> >> > >
> >> > > filenames I will be working with, for example:
> >> > >
> >> > > find Processed -type f > processed-files.txt
> >> > >
> >> > > then use a text editor to pare down the entries as described above,
> >> using
> >> > >
> >> > > find and replace functions to remove the extra data, so only the
> >> people's
> >> > >
> >> > > names remain. then simply uniq -c that file and you're done. I
> >> personally
> >> > >
> >> > > use vi for this, but just about any editor will do. I like this
> >> approach
> >> > >
> >> > > for a number of reasons, not the least of which is that I can
> >> spot-check
> >> > >
> >> > > random samples after each editing step to try to spot unexpected
> >> results.
> >> > >
> >> > > if you want to automate this, it may be a little more complicated,
> and
> >> > the
> >> > >
> >> > > answers to my initial questions become important. if you can
> provide a
> >> > >
> >> > > little more context, I will try to help further.
> >> > >
> >> > > -wes
> >> > >
> >> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes
> barnmich...@gmail.com
> >> > >
> >> > > wrote:
> >> > >
> >> > > > Here's a fun trivia task. For an activity I am involved in, I get
> >> files
> >> > > >
> >> > > > from members to process. The filename starts with the member's
> name
> >> > and has
> >> > > >
> >> > > > other info to identify the file. After processing, the file goes
> in
> >> the
> >> > > >
> >> > > > ./Processed folder. There are thousands of files now in that
> folder.
> >> > Right
> >> > > >
> >> > > > now, I'm looking for a couple basic pieces of information. First,
> I
> >> > want to
> >> > > >
> >> > > > know how many unique names I have in the list. Second, I'd like a
> >> list
> >> > of
> >> > > >
> >> > > > names and how many files go with each name.
> >> > > >
> >> > > > I'm sure this is trivial, but my mind is blanking out on it. A
> >> couple
> >> > > >
> >> > > > simple examples would be nice. Non-answers, like "easy to do
> >> > with'xxx'" or
> >> > > >
> >> > > > references to man pages or George's Book, etc. are not helpful
> right
> >> > now.
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Michael
> >> >
> >>
> >> Actually, they are callsigns instead of names. A couple of examples:
> >>
> >> w7...@k-0496-20210526.txt
> >> wa7...@k-0497-20210714.txt
> >> n8...@k-4386-20210725.txt
> >>
> >> I would like a simple count of the unique callsigns on a random basis
> and
> >> possibly an occasional report listing each callsign and how many files
> are
> >> in the folder for each.
> >>
> >> Michael
> >>
> >
>

Thanks Everybody,

This has been educational for me. It looks like there were several working
options. I started with Wes' option refined by Robert.
$ find  -type f | cut -d @ -f1 | sort | uniq -c

Since I was working from within the /Processed folder, I did not specify it
on the command line.
Then, I discovered some of the callsigns were not capitalized, so I added
the ignore case option.
$ find  -type f | cut -d @ -f1 | sort | uniq -i -c

That gave me usable output #1.

I added the count with
$ find  -type f | cut -d @ -f1 | sort | uniq -i -c | wc -l

Which gave me output #2.

Finally, I added another sort to give Output #3 for the frequency option.
$ find  -type f | cut -d @ -f1 | sort | uniq -i -c | sort -n
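
A hedged note on the ignore-case step: `uniq -i` still only merges adjacent lines, and whether `sort` places `W7ABC` next to `w7abc` depends on the locale. One way to sidestep that is to fold the case before sorting. The sketch below assumes made-up callsigns and file names (they are not from the real folder) and feeds them in with `printf` in place of `find -type f`; it also strips the leading `./` that `find` prints.

```shell
# Made-up file names standing in for the output of `find -type f`.
names=$(printf '%s\n' \
  './w7abc@k-0496-20210526.txt' \
  './W7ABC@k-0497-20210714.txt' \
  './n8xyz@k-4386-20210725.txt')

# Strip the directory part, keep the text before '@', fold to upper case,
# then sort so duplicates are adjacent before uniq counts them.
report=$(printf '%s\n' "$names" \
  | sed 's|.*/||' \
  | cut -d@ -f1 \
  | tr '[:lower:]' '[:upper:]' \
  | sort | uniq -c | sort -n)

# One line per callsign, so the line count is the number of unique callsigns.
unique=$(printf '%s\n' "$report" | wc -l | tr -d ' ')

printf '%s\n' "$report"
echo "unique callsigns: $unique"
```

With real data, replacing the `printf` stand-in with `find -type f` should give the same three outputs from one pipeline.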


I gave Wes'

Re: [PLUG] Counting Files

2021-08-17 Thread David Fleck

Ugh. Sorry, too little coffee.  Didn't notice this had already been covered.

--- David Fleck

‐‐‐ Original Message ‐‐‐

On Tuesday, August 17th, 2021 at 7:39 AM, David Fleck  
wrote:

> 'cut' might work well also.
>
> > ls | cut -f1 -d@ | sort | uniq -c
>
> to get a list in sort order, or
>
> > ls | cut -f1 -d@ |sort | uniq -c | sort -n
>
> to get a list ordered by frequency.
>
> --- David Fleck
>
> ‐‐‐ Original Message ‐‐‐
>
> On Tuesday, August 17th, 2021 at 1:46 AM, Russell Senior 
> russ...@personaltelco.net wrote:
>
> > From the uniq manpage:
> >
> > Note: 'uniq' does not detect repeated lines unless they are
> >
> > adjacent. You may want to sort the input first, or use 'sort -u'
> >
> > without 'uniq'. Also, comparisons honor the rules specified by
> >
> > 'LC_COLLATE'.
> >
> > On Mon, Aug 16, 2021 at 10:45 PM wes p...@the-wes.com wrote:
> >
> > > On Mon, Aug 16, 2021 at 8:45 PM Randy Bush ra...@psg.com wrote:
> > >
> > > > can you point me to where it is documented that `find` is guaranteed
> > > >
> > > > to produce an ordered list?
> > >
> > > I don't have any such documentation or belief. my belief is that uniq will
> > >
> > > count non-consecutive matches, that's what I'm relying on. however, 
> > > sorting
> > >
> > > first doesn't hurt anything, so have at it.
> > >
> > > yeah, awk is often a more appropriate tool for this type of job, it's just
> > >
> > > that I happened to learn sed first, so I default to that. it's the same
> > >
> > > reason I use vi instead of emacs, it's largely down to complete 
> > > coincidence.
> > >
> > > -wes

Re: [PLUG] Counting Files

2021-08-17 Thread David Fleck

'cut' might work well also.

> ls | cut -f1 -d@ | sort | uniq -c

to get a list in sort order, or

> ls | cut -f1 -d@ |sort | uniq -c | sort -n

to get a list ordered by frequency.

--- David Fleck

‐‐‐ Original Message ‐‐‐

On Tuesday, August 17th, 2021 at 1:46 AM, Russell Senior 
 wrote:

> From the uniq manpage:
>
> Note: 'uniq' does not detect repeated lines unless they are
>
> adjacent. You may want to sort the input first, or use 'sort -u'
>
> without 'uniq'. Also, comparisons honor the rules specified by
>
> 'LC_COLLATE'.
>
> On Mon, Aug 16, 2021 at 10:45 PM wes p...@the-wes.com wrote:
>
> > On Mon, Aug 16, 2021 at 8:45 PM Randy Bush ra...@psg.com wrote:
> >
> > > can you point me to where it is documented that `find` is guaranteed
> > >
> > > to produce an ordered list?
> >
> > I don't have any such documentation or belief. my belief is that uniq will
> >
> > count non-consecutive matches, that's what I'm relying on. however, sorting
> >
> > first doesn't hurt anything, so have at it.
> >
> > yeah, awk is often a more appropriate tool for this type of job, it's just
> >
> > that I happened to learn sed first, so I default to that. it's the same
> >
> > reason I use vi instead of emacs, it's largely down to complete coincidence.
> >
> > -wes

Re: [PLUG] Counting Files

2021-08-17 Thread Randy Bush

>> can you point me to where it is documented that `find` is guaranteed
>> to produce an ordered list?
> I don't have any such documentation or belief. my belief is that uniq
> will count non-consecutive matches

it won't

randy

Re: [PLUG] Counting Files

2021-08-17 Thread Russell Senior

From the uniq manpage:

  Note: 'uniq' does not detect repeated lines unless they are
adjacent.  You may want to sort the input first, or use 'sort -u'
without 'uniq'.  Also, comparisons honor the rules specified by
'LC_COLLATE'.
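
A small demonstration of the adjacency rule quoted above, offered as a sketch. The three strings are made-up callsigns; the repeated one is deliberately not adjacent in the input.

```shell
# Same three lines both times; only the second pipeline sorts first.
unsorted=$(printf '%s\n' w7abc n8xyz w7abc | uniq -c | wc -l | tr -d ' ')
sorted=$(printf '%s\n' w7abc n8xyz w7abc | sort | uniq -c | wc -l | tr -d ' ')

echo "without sort: $unsorted groups; with sort: $sorted groups"
```

Without the sort, `uniq` reports the repeated string as two separate groups.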

On Mon, Aug 16, 2021 at 10:45 PM wes  wrote:
>
> On Mon, Aug 16, 2021 at 8:45 PM Randy Bush  wrote:
>
> >
> > can you point me to where it is documented that `find` is guaranteed
> > to produce an ordered list?
> >
>
> I don't have any such documentation or belief. my belief is that uniq will
> count non-consecutive matches, that's what I'm relying on. however, sorting
> first doesn't hurt anything, so have at it.
>
> yeah, awk is often a more appropriate tool for this type of job, it's just
> that I happened to learn sed first, so I default to that. it's the same
> reason I use vi instead of emacs, it's largely down to complete coincidence.
>
> -wes

Re: [PLUG] Counting Files

2021-08-16 Thread wes

On Mon, Aug 16, 2021 at 8:45 PM Randy Bush  wrote:

>
> can you point me to where it is documented that `find` is guaranteed
> to produce an ordered list?
>

I don't have any such documentation or belief. my belief is that uniq will
count non-consecutive matches, that's what I'm relying on. however, sorting
first doesn't hurt anything, so have at it.

yeah, awk is often a more appropriate tool for this type of job, it's just
that I happened to learn sed first, so I default to that. it's the same
reason I use vi instead of emacs, it's largely down to complete coincidence.

-wes

Re: [PLUG] Counting Files

2021-08-16 Thread David Fleck

As Wes said, an example or two would help greatly.

--- David Fleck

‐‐‐ Original Message ‐‐‐

On Monday, August 16th, 2021 at 7:17 PM, wes  wrote:

> are firstnames and lastnames always separated by the same character in each
>
> filename?
>
> are the names separated from the rest of the info in the filename the same
>
> way for each file?
>
> are you doing this once, or will this be a repeating task that would be
>
> handy to automate?
>
> would you be able to provide a few sample filenames, perhaps with the
>
> personal info obfuscated?
>
> generally, the way I would approach this is to pare the filenames down to
>
> the people's names, and then run uniq against that list. uniq -c will
>
> provide a count of how many times a given string appears in the input. if
>
> I'm doing this once, I would generate a text file containing the list of
>
> filenames I will be working with, for example:
>
> find Processed -type f > processed-files.txt
>
> then use a text editor to pare down the entries as described above, using
>
> find and replace functions to remove the extra data, so only the people's
>
> names remain. then simply uniq -c that file and you're done. I personally
>
> use vi for this, but just about any editor will do. I like this approach
>
> for a number of reasons, not the least of which is that I can spot-check
>
> random samples after each editing step to try to spot unexpected results.
>
> if you want to automate this, it may be a little more complicated, and the
>
> answers to my initial questions become important. if you can provide a
>
> little more context, I will try to help further.
>
> -wes
>
> On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes barnmich...@gmail.com
>
> wrote:
>
> > Here's a fun trivia task. For an activity I am involved in, I get files
> >
> > from members to process. The filename starts with the member's name and has
> >
> > other info to identify the file. After processing, the file goes in the
> >
> > ./Processed folder. There are thousands of files now in that folder. Right
> >
> > now, I'm looking for a couple basic pieces of information. First, I want to
> >
> > know how many unique names I have in the list. Second, I'd like a list of
> >
> > names and how many files go with each name.
> >
> > I'm sure this is trivial, but my mind is blanking out on it. A couple
> >
> > simple examples would be nice. Non-answers, like "easy to do with'xxx'" or
> >
> > references to man pages or George's Book, etc. are not helpful right now.
> >
> > Thanks,
> >
> > Michael

Re: [PLUG] Counting Files

2021-08-16 Thread Randy Bush

> find Processed -type f -printf '%f\n' | sed "s/@.*//" | uniq -c

can you point me to where it is documented that `find` is guaranteed
to produce an ordered list?  yes, it seems to often do so, but i have
learned not to trust it.  so i would

... | sort | uniq -c

woulda been cool to use `sort -u` but it does not produce a count

but i like your `sed` hack.  i am more of an `awk` user, but i like
the terseness of your hack
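
For comparison, a minimal sketch of what an `awk` version might look like. It counts in an associative array, so the input does not need to be sorted first. The file names are made up, and the array name `n` is just an illustration.

```shell
# awk splits each name at '@' and tallies field 1; the final sort only
# makes the output order predictable.
counts=$(printf '%s\n' \
  'w7abc@k-0496-20210526.txt' \
  'n8xyz@k-4386-20210725.txt' \
  'w7abc@k-0497-20210714.txt' \
  | awk -F@ '{ n[$1]++ } END { for (c in n) print n[c], c }' \
  | sort -k2)

printf '%s\n' "$counts"
```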

randy

Re: [PLUG] Counting Files

2021-08-16 Thread wes

To get the count of unique callsigns, you can just feed this same command
into wc -l.

find Processed -type f -printf '%f\n' | sed "s/@.*//" | uniq -c | wc -l
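
A hedged variant: `uniq` only collapses adjacent lines, so the count above is right only if `find` happens to list each callsign's files together. If just the number of unique callsigns is wanted, `sort -u` can do the deduplication itself. The sketch uses made-up file names in place of the `find` output.

```shell
# sort -u drops duplicate lines wherever they occur; wc -l counts what is left.
unique=$(printf '%s\n' \
  'w7abc@k-0496-20210526.txt' \
  'n8xyz@k-4386-20210725.txt' \
  'w7abc@k-0497-20210714.txt' \
  | sed 's/@.*//' | sort -u | wc -l | tr -d ' ')

echo "$unique"
```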

-wes


On Mon, Aug 16, 2021 at 7:21 PM wes  wrote:

> if the @ is consistent with all the files, that makes it relatively easy.
>
> find Processed -type f -printf '%f\n' | sed "s/@.*//" | uniq -c
>
> -wes
>
> On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes 
> wrote:
>
>> On Mon, Aug 16, 2021 at 5:29 PM David Fleck 
>> wrote:
>>
>> > As Wes said, an example or two would help greatly.
>> >
>> > --- David Fleck
>> >
>> > ‐‐‐ Original Message ‐‐‐
>> >
>> > On Monday, August 16th, 2021 at 7:17 PM, wes  wrote:
>> >
>> > > are firstnames and lastnames always separated by the same character in
>> > each
>> > >
>> > > filename?
>> > >
>> > > are the names separated from the rest of the info in the filename the
>> > same
>> > >
>> > > way for each file?
>> > >
>> > > are you doing this once, or will this be a repeating task that would
>> be
>> > >
>> > > handy to automate?
>> > >
>> > > would you be able to provide a few sample filenames, perhaps with the
>> > >
>> > > personal info obfuscated?
>> > >
>> > > generally, the way I would approach this is to pare the filenames
>> down to
>> > >
>> > > the people's names, and then run uniq against that list. uniq -c will
>> > >
>> > > provide a count of how many times a given string appears in the
>> input. if
>> > >
>> > > I'm doing this once, I would generate a text file containing the list
>> of
>> > >
>> > > filenames I will be working with, for example:
>> > >
>> > > find Processed -type f > processed-files.txt
>> > >
>> > > then use a text editor to pare down the entries as described above,
>> using
>> > >
>> > > find and replace functions to remove the extra data, so only the
>> people's
>> > >
>> > > names remain. then simply uniq -c that file and you're done. I
>> personally
>> > >
>> > > use vi for this, but just about any editor will do. I like this
>> approach
>> > >
>> > > for a number of reasons, not the least of which is that I can
>> spot-check
>> > >
>> > > random samples after each editing step to try to spot unexpected
>> results.
>> > >
>> > > if you want to automate this, it may be a little more complicated, and
>> > the
>> > >
>> > > answers to my initial questions become important. if you can provide a
>> > >
>> > > little more context, I will try to help further.
>> > >
>> > > -wes
>> > >
>> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes barnmich...@gmail.com
>> > >
>> > > wrote:
>> > >
>> > > > Here's a fun trivia task. For an activity I am involved in, I get
>> files
>> > > >
>> > > > from members to process. The filename starts with the member's name
>> > and has
>> > > >
>> > > > other info to identify the file. After processing, the file goes in
>> the
>> > > >
>> > > > ./Processed folder. There are thousands of files now in that folder.
>> > Right
>> > > >
>> > > > now, I'm looking for a couple basic pieces of information. First, I
>> > want to
>> > > >
>> > > > know how many unique names I have in the list. Second, I'd like a
>> list
>> > of
>> > > >
>> > > > names and how many files go with each name.
>> > > >
>> > > > I'm sure this is trivial, but my mind is blanking out on it. A
>> couple
>> > > >
>> > > > simple examples would be nice. Non-answers, like "easy to do
>> > with'xxx'" or
>> > > >
>> > > > references to man pages or George's Book, etc. are not helpful right
>> > now.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Michael
>> >
>>
>> Actually, they are callsigns instead of names. A couple of examples:
>>
>> w7...@k-0496-20210526.txt
>> wa7...@k-0497-20210714.txt
>> n8...@k-4386-20210725.txt
>>
>> I would like a simple count of the unique callsigns on a random basis and
>> possibly an occasional report listing each callsign and how many files are
>> in the folder for each.
>>
>> Michael
>>
>

Re: [PLUG] Counting Files

2021-08-16 Thread wes

if the @ is consistent with all the files, that makes it relatively easy.

find Processed -type f -printf '%f\n' | sed "s/@.*//" | uniq -c

-wes

On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes 
wrote:

> On Mon, Aug 16, 2021 at 5:29 PM David Fleck  wrote:
>
> > As Wes said, an example or two would help greatly.
> >
> > --- David Fleck
> >
> > ‐‐‐ Original Message ‐‐‐
> >
> > On Monday, August 16th, 2021 at 7:17 PM, wes  wrote:
> >
> > > are firstnames and lastnames always separated by the same character in
> > each
> > >
> > > filename?
> > >
> > > are the names separated from the rest of the info in the filename the
> > same
> > >
> > > way for each file?
> > >
> > > are you doing this once, or will this be a repeating task that would be
> > >
> > > handy to automate?
> > >
> > > > would you be able to provide a few sample filenames, perhaps with the
> > >
> > > personal info obfuscated?
> > >
> > > generally, the way I would approach this is to pare the filenames down
> to
> > >
> > > the people's names, and then run uniq against that list. uniq -c will
> > >
> > > provide a count of how many times a given string appears in the input.
> if
> > >
> > > I'm doing this once, I would generate a text file containing the list
> of
> > >
> > > filenames I will be working with, for example:
> > >
> > > find Processed -type f > processed-files.txt
> > >
> > > then use a text editor to pare down the entries as described above,
> using
> > >
> > > find and replace functions to remove the extra data, so only the
> people's
> > >
> > > names remain. then simply uniq -c that file and you're done. I
> personally
> > >
> > > use vi for this, but just about any editor will do. I like this
> approach
> > >
> > > for a number of reasons, not the least of which is that I can
> spot-check
> > >
> > > random samples after each editing step to try to spot unexpected
> results.
> > >
> > > if you want to automate this, it may be a little more complicated, and
> > the
> > >
> > > answers to my initial questions become important. if you can provide a
> > >
> > > little more context, I will try to help further.
> > >
> > > -wes
> > >
> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes barnmich...@gmail.com
> > >
> > > wrote:
> > >
> > > > Here's a fun trivia task. For an activity I am involved in, I get
> files
> > > >
> > > > from members to process. The filename starts with the member's name
> > and has
> > > >
> > > > other info to identify the file. After processing, the file goes in
> the
> > > >
> > > > ./Processed folder. There are thousands of files now in that folder.
> > Right
> > > >
> > > > now, I'm looking for a couple basic pieces of information. First, I
> > want to
> > > >
> > > > know how many unique names I have in the list. Second, I'd like a
> list
> > of
> > > >
> > > > names and how many files go with each name.
> > > >
> > > > I'm sure this is trivial, but my mind is blanking out on it. A couple
> > > >
> > > > simple examples would be nice. Non-answers, like "easy to do
> > with'xxx'" or
> > > >
> > > > references to man pages or George's Book, etc. are not helpful right
> > now.
> > > >
> > > > Thanks,
> > > >
> > > > Michael
> >
>
> Actually, they are callsigns instead of names. A couple of examples:
>
> w7...@k-0496-20210526.txt
> wa7...@k-0497-20210714.txt
> n8...@k-4386-20210725.txt
>
> I would like a simple count of the unique callsigns on a random basis and
> possibly an occasional report listing each callsign and how many files are
> in the folder for each.
>
> Michael
>

Re: [PLUG] Counting Files

2021-08-16 Thread Robert Citek

On Mon, Aug 16, 2021 at 8:17 PM Michael Barnes 
wrote:

> Actually, they are callsigns instead of names. A couple of examples:
>
> w7...@k-0496-20210526.txt
> wa7...@k-0497-20210714.txt
> n8...@k-4386-20210725.txt
>
> I would like a simple count of the unique callsigns on a random basis and
> possibly an occasional report listing each callsign and how many files are
> in the folder for each.
>
> Michael
>

Here's a mock solution based on what Wes wrote:

$ mkdir Processed/

$ touch Processed/{foo@{01..10},bar@{05..20},dog@{10..30}}

$ ls Processed/
bar@05  bar@08  bar@11  bar@14  bar@17  bar@20  dog@12  dog@15  dog@18  dog@21  dog@24  dog@27  dog@30  foo@03  foo@06  foo@09
bar@06  bar@09  bar@12  bar@15  bar@18  dog@10  dog@13  dog@16  dog@19  dog@22  dog@25  dog@28  foo@01  foo@04  foo@07  foo@10
bar@07  bar@10  bar@13  bar@16  bar@19  dog@11  dog@14  dog@17  dog@20  dog@23  dog@26  dog@29  foo@02  foo@05  foo@08

$ find Processed/ -type f | cut -d @ -f1 | sort | uniq -c
 16 Processed/bar
 21 Processed/dog
 10 Processed/foo
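
The counts above keep the `Processed/` prefix because `find` prints the whole path. A possible tweak, sketched with a smaller mock tree in a temporary directory (the file names are hypothetical), is to strip everything up to the last `/` before cutting at the `@`.

```shell
# Build a small mock tree; a loop is used instead of brace expansion so the
# sketch also runs in plain sh.
tmp=$(mktemp -d)
mkdir "$tmp/Processed"
for f in foo@01 foo@02 bar@05 bar@06 bar@07 dog@10; do
  : > "$tmp/Processed/$f"
done

# sed removes the directory part; awk only tidies the padding uniq -c adds.
result=$(find "$tmp/Processed" -type f \
  | sed 's|.*/||' | cut -d@ -f1 | sort | uniq -c \
  | awk '{ print $1, $2 }')

printf '%s\n' "$result"
rm -rf "$tmp"
```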

Is that close to what you are looking for?

Regards,
- Robert

Re: [PLUG] Counting Files

2021-08-16 Thread Michael Barnes

On Mon, Aug 16, 2021 at 5:29 PM David Fleck  wrote:

> As Wes said, an example or two would help greatly.
>
> --- David Fleck
>
> ‐‐‐ Original Message ‐‐‐
>
> On Monday, August 16th, 2021 at 7:17 PM, wes  wrote:
>
> > are firstnames and lastnames always separated by the same character in
> each
> >
> > filename?
> >
> > are the names separated from the rest of the info in the filename the
> same
> >
> > way for each file?
> >
> > are you doing this once, or will this be a repeating task that would be
> >
> > handy to automate?
> >
> > > would you be able to provide a few sample filenames, perhaps with the
> >
> > personal info obfuscated?
> >
> > generally, the way I would approach this is to pare the filenames down to
> >
> > the people's names, and then run uniq against that list. uniq -c will
> >
> > provide a count of how many times a given string appears in the input. if
> >
> > I'm doing this once, I would generate a text file containing the list of
> >
> > filenames I will be working with, for example:
> >
> > find Processed -type f > processed-files.txt
> >
> > then use a text editor to pare down the entries as described above, using
> >
> > find and replace functions to remove the extra data, so only the people's
> >
> > names remain. then simply uniq -c that file and you're done. I personally
> >
> > use vi for this, but just about any editor will do. I like this approach
> >
> > for a number of reasons, not the least of which is that I can spot-check
> >
> > random samples after each editing step to try to spot unexpected results.
> >
> > if you want to automate this, it may be a little more complicated, and
> the
> >
> > answers to my initial questions become important. if you can provide a
> >
> > little more context, I will try to help further.
> >
> > -wes
> >
> > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes barnmich...@gmail.com
> >
> > wrote:
> >
> > > Here's a fun trivia task. For an activity I am involved in, I get files
> > >
> > > from members to process. The filename starts with the member's name
> and has
> > >
> > > other info to identify the file. After processing, the file goes in the
> > >
> > > ./Processed folder. There are thousands of files now in that folder.
> Right
> > >
> > > now, I'm looking for a couple basic pieces of information. First, I
> want to
> > >
> > > know how many unique names I have in the list. Second, I'd like a list
> of
> > >
> > > names and how many files go with each name.
> > >
> > > I'm sure this is trivial, but my mind is blanking out on it. A couple
> > >
> > > simple examples would be nice. Non-answers, like "easy to do
> with'xxx'" or
> > >
> > > references to man pages or George's Book, etc. are not helpful right
> now.
> > >
> > > Thanks,
> > >
> > > Michael
>

Actually, they are callsigns instead of names. A couple of examples:

w7...@k-0496-20210526.txt
wa7...@k-0497-20210714.txt
n8...@k-4386-20210725.txt

I would like a simple count of the unique callsigns on a random basis and
possibly an occasional report listing each callsign and how many files are
in the folder for each.

Michael

Re: [PLUG] Counting Files

2021-08-16 Thread wes

are firstnames and lastnames always separated by the same character in each
filename?

are the names separated from the rest of the info in the filename the same
way for each file?

are you doing this once, or will this be a repeating task that would be
handy to automate?

would you be able to provide a few sample filenames, perhaps with the
personal info obfuscated?

generally, the way I would approach this is to pare the filenames down to
the people's names, and then run uniq against that list. uniq -c will
provide a count of how many times a given string appears in the input. if
I'm doing this once, I would generate a text file containing the list of
filenames I will be working with, for example:

find Processed -type f > processed-files.txt

then use a text editor to pare down the entries as described above, using
find and replace functions to remove the extra data, so only the people's
names remain. then simply uniq -c that file and you're done. I personally
use vi for this, but just about any editor will do. I like this approach
for a number of reasons, not the least of which is that I can spot-check
random samples after each editing step to try to spot unexpected results.
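
The same one-off workflow can be sketched without an editor, keeping an intermediate file for each step so it can still be spot-checked. This is only an illustration: the `name@rest` layout, the file names, and `callsigns.txt` are assumptions, and a `printf` stands in for the `find` command. The sort is there because `uniq -c` only counts repeats that sit next to each other.

```shell
tmp=$(mktemp -d)

# Stand-in for: find Processed -type f > processed-files.txt
printf '%s\n' \
  'Processed/w7abc@k-0496-20210526.txt' \
  'Processed/n8xyz@k-4386-20210725.txt' \
  'Processed/w7abc@k-0497-20210714.txt' > "$tmp/processed-files.txt"

# Pare each entry down to the leading name; this file can be inspected by hand.
sed -e 's|.*/||' -e 's/@.*//' "$tmp/processed-files.txt" > "$tmp/callsigns.txt"

# Sort so duplicates are adjacent, then count them.
summary=$(sort "$tmp/callsigns.txt" | uniq -c | awk '{ print $1, $2 }')

printf '%s\n' "$summary"
rm -rf "$tmp"
```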

if you want to automate this, it may be a little more complicated, and the
answers to my initial questions become important. if you can provide a
little more context, I will try to help further.

-wes

On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes 
wrote:

> Here's a fun trivia task. For an activity I am involved in, I get files
> from members to process. The filename starts with the member's name and has
> other info to identify the file. After processing, the file goes in the
> ./Processed folder. There are thousands of files now in that folder. Right
> now, I'm looking for a couple basic pieces of information. First, I want to
> know how many unique names I have in the list. Second, I'd like a list of
> names and how many files go with each name.
>
> I'm sure this is trivial, but my mind is blanking out on it. A couple
> simple examples would be nice. Non-answers, like "easy to do with'xxx'" or
> references to man pages or George's Book, etc. are not helpful right now.
>
> Thanks,
> Michael
>
