Quoting Benny Lofgren <bl-li...@lofgren.biz>:

On 2015-08-02 08:23, Alessandro DE LAURENZIS wrote:
On Sat 01/08/2015 14:09, Vijay Sankar wrote:
alias nof="ls -l . | egrep -c '^-'"
I have always wondered if there is a better way of doing this.

In general, I would avoid using a pipe when a native command exists (and
particularly in this case, where grep string comparison is a slow
operation); this could probably be more appropriate:

There IS no native command doing what Vijay wants... you introduced a
pipe in your own example, too.

Don't be afraid of pipes!

There isn't necessarily a disadvantage in splitting a job across a pipe.
For example, each stage of a pipeline runs as its own process, so the
system can schedule the stages on different processors/cores, which may
or may not make a measurable difference.


In this case, your example is undoubtedly faster.

*But*, what you did was to speed-optimize a process involving a human
operator so that it runs half a second faster, in a rather contrived
scenario with over a hundred thousand files in one directory.

In practice, the difference is completely imperceptible for the operator:


----8<--------8<--------8<--------8<--------8<---- (cut)
bl@paddan:~$ cd /usr/share/man/man3      # [1]
bl@paddan:/usr/share/man/man3$ time ls -l . | egrep -c '^-'
4045
    0m0.05s real     0m0.02s user     0m0.03s system
bl@paddan:/usr/share/man/man3$ time find . -maxdepth 1 -type f | wc -l
    4045
    0m0.04s real     0m0.01s user     0m0.02s system
bl@paddan:/usr/share/man/man3$ _
----8<--------8<--------8<--------8<--------8<---- (cut)


This kind of optimization is really not that productive.

There is for sure a good lesson in showing how to do things in different
ways, to broaden one's horizons when it comes to thinking outside the box
(or pipe).


But, starting to talk about shaving fractions of a second off of an
interactive command in an edge case is just a red herring in my opinion.
It teaches the wrong message.


A much better optimization for this question, in my mind, is this:

Don't use an alias at all! Instead use a shell function, like this:

----8<--------8<--------8<--------8<--------8<---- (cut)
nof() {
        ls -l $1 | egrep -c '^-'
}
----8<--------8<--------8<--------8<--------8<---- (cut)

(In this case, substituting find is *not* immediately applicable.)


The advantage of this approach is that in the regular case "nof" works
just like in Vijay's original alias, but this has the added
functionality of being able to "nof" any directory with a command line
argument, like this:

----8<--------8<--------8<--------8<--------8<---- (cut)
bl@paddan:/usr/share/man/man3$ nof
4045
bl@paddan:/usr/share/man/man3$ nof /bin
42             <-- (Who knew Douglas Adams was an OpenBSD contributor!)
bl@paddan:/usr/share/man/man3$
----8<--------8<--------8<--------8<--------8<---- (cut)


You can't do the above (as easily) with the find approach, since it
doesn't work without a directory argument. (Yes, of course we can add
code to fix that, but that's not the point here.)
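
For the curious, the extra code is small. What follows is only a sketch,
assuming a POSIX-style shell (for the "${1:-.}" default expansion) and a
find that supports -maxdepth, as the OpenBSD and GNU versions do. The name
nof_find is made up here to keep it apart from the nof function above.

```shell
# nof_find [dir]: count regular files directly inside dir.
# Hypothetical name, not from the original thread.
nof_find() {
        # "${1:-.}" falls back to "." when no argument is given; the
        # quotes keep a directory name containing spaces in one piece.
        find "${1:-.}" -maxdepth 1 -type f | wc -l
}
```

One difference to be aware of: find also counts dotfiles, which a plain
"ls -l" skips, so the two versions can disagree in a directory that holds
hidden files. Both count lines, so a file name containing a newline would
throw either of them off.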


This isn't a SPEED optimization, it is a FUNCTIONALITY optimization.

It is a better way to do the same thing, just what Vijay asked for. :-)


Moral of my story: KISS. Keep It Simple, Stupid.

Put your efforts in the right place.


Regards,

/Benny



[1] I first did this to quickly find out which directory in my machine
was the biggest, to have somewhere to play:

bl@paddan:~$ sudo find / -type d -ls | cut -c48- | grep -v "^   "

The cut and grep business filters out all the smaller directories with
three- or four-digit sizes, giving me a quick overview of the biggest
directories.

This whole operation took me less than a minute, including a couple of
trial-and-error runs to find out the best position for the cut.

I am sure there are much better and more accurate ways of doing this,
still with simple shell commands and pipe chaining, but this is what I
thought of off the top of my head, and it did this one-shot job much
more quickly than if I had sat down to come up with a more accurate or
general solution.
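
One possible variant, offered as a sketch and not as "the" better way:
count the entries in each directory and sort on that count, instead of
relying on a fixed column position in the find -ls output. The function
name bigdirs and its two arguments are invented for illustration. It
assumes no directory name contains a newline, and it forks ls once per
directory, so expect it to be slower than the one-liner above.

```shell
# bigdirs [top] [n]: print the n directories under top that hold the
# most entries, smallest first. Hypothetical helper; defaults: "." and 10.
bigdirs() {
        find "${1:-.}" -type d | while IFS= read -r d; do
                # ls -A lists dotfiles too, but not "." and "..";
                # $(( )) strips the padding some wc versions print.
                n=$(( $(ls -A "$d" | wc -l) ))
                printf '%d %s\n' "$n" "$d"
        done | sort -n | tail -n "${2:-10}"
}
```

Run it as "bigdirs /usr/share/man 5", for example; directories the
current user cannot read show up with a count of zero and an error from ls.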

Optimizing your *work* doesn't have to include measuring CPU cycles!



alias nof='find ./ -type f -maxdepth 1 | wc -l'

See the difference in runtime in case of a huge file listing (not so
uncommon...):

just22@poseidon:[tmp]> time find ./ -type f -maxdepth 1 | wc -l
  113069

  real    0m1.732s
  user    0m0.100s
  sys     0m1.560s


just22@poseidon:[tmp]> time ls -l ./ | egrep -c "^-"
113069

real    0m2.238s
user    0m0.630s
sys     0m1.550s


All the best

Thanks very much Alessandro, Raul, and Benny. Really appreciate all your thoughtful comments. They were very educational for me.

Benny's shell function is more appropriate for what I am doing with nof. I am very embarrassed to admit this but unfortunately I never thought of using a shell function in .profile till I read this thread. Thanks again Benny Lofgren.

Just in case it is of any relevance, this is what I use "nof" for. I have a few ports-building systems running, and they run different versions of OpenBSD. As a result they have different distfiles: all the original stuff from CVS, plus patches and Makefiles I have mangled and screwed up, and so on. To keep a copy of all the distfiles, in case I ever have to build a package on an older version or recover from mistakes I may have made, I rsync all the official distfiles to another system. I like to verify that I have the same number of files on both systems, and nof comes in handy for that. However, because I was using an alias, I always had to cd to that directory first. Now with Benny's shell function idea, it works perfectly for me!!!

Thanks again,

Vijay



--
Vijay Sankar, M.Eng., P.Eng.
ForeTell Technologies Limited
vsan...@foretell.ca
