I have made some more experiments with less and am attaching two patches.

less346_bufsize32k.diff is like the previous one, except that the buffer
size is 32k, which seems to be enough and is a power of 2, so division,
multiplication and remainder may be computed more efficiently
(the compiler might convert those operations to shifts and masks).
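
To illustrate the power-of-2 point, here is a small standalone sketch (not
code taken from less). It computes a block number and an in-block offset for
a byte position in two ways: with division and remainder, and with a shift
and a mask. LBUF_SHIFT, LBUF_MASK and the four function names are made up for
this example; only the value 32768 comes from the patch.

```c
#include <assert.h>

#define LBUFSIZE   32768UL           /* 2^15, the value used in the 32k patch */
#define LBUF_SHIFT 15                /* hypothetical: log2(LBUFSIZE) */
#define LBUF_MASK  (LBUFSIZE - 1UL)  /* hypothetical: selects the low 15 bits */

/* Block number and offset computed with division and remainder. */
unsigned long block_by_div(unsigned long pos)
{
    return pos / LBUFSIZE;
}

unsigned long offset_by_mod(unsigned long pos)
{
    return pos % LBUFSIZE;
}

/* The same values via shift and mask; valid only when LBUFSIZE is a power of 2. */
unsigned long block_by_shift(unsigned long pos)
{
    assert((LBUFSIZE & LBUF_MASK) == 0);  /* power-of-2 sanity check */
    return pos >> LBUF_SHIFT;
}

unsigned long offset_by_mask(unsigned long pos)
{
    return pos & LBUF_MASK;
}
```

For unsigned values both pairs of functions return the same results, which is
why a compiler is free to emit the cheaper shift and mask when the divisor is
a compile-time power of 2.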

less346_bufsize_env.diff is an attempt to see what the impact of allowing a
user-defined value would be.

I am attaching some test results below.

On Fri, 14 Dec 2001, Thomas Dodd wrote:

> Date: Fri, 14 Dec 2001 16:48:58 -0600
> From: Thomas Dodd <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: making less faster on large amounts of stdin data
> 
> 
> 
> John Summerfield wrote:
> 
> > This is a silly, negative response. If the patch does what Wojtek says, 
> > then IMV it should be applied to the source.
> 
> 
> A patch to speed up a strange use of a program is what
> seems silly. less (and more) are interactive. Why use
> them in a non-interactive way? What's the purpose?
> 
This is addressed in a separate mail I sent to the list a few hours ago.

> 
> > It has no significant impact on its size. I can't tell whether there's 
> > an adverse impact on small machines - if so, then it needs to calculate 
> > a buffer size and use that, and that's more involved, but if Wojtek's 
> > right, worth doing.
> 
> 
> Since he said to run on a 128M+ system,
> I take it a low mem system would have trouble.
> 
> A 100 x increase in the buffer size is a lot if
> the current buffer is 1MB, but not much if it's only 1KB.
> 
> Would making it dynamic not slow it down and negate
> the improvement?
The slowdown is more noticeable than I expected, but not bad enough
to negate the improvement.


> 
> 
> > I suggest offering the patch to its author. See
> > http://www.greenwoodsoftware.com/less/
> 
> 
> That's reasonable.
> 
> 
>       -Thomas
> 
> 
Now about the additional tests I have done.
I have a file called test40M.lzo which decompresses to 40 million bytes
in 800000 lines, each 50 bytes long.

I have 3 patched less executables, compiled on Red Hat Linux with -O2
optimization (standard less 3.46 make):
32k_lless  is less with 32KB buffer size (patch attached)
128k_lless is less with 128KB buffer size
enlless is less patched with less346_bufsize_env.diff; buffer size can be
   specified by setting environment variable LESS_BUFSIZE
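
As a rough illustration of how a value such as "128k" or "1m" can be turned
into a byte count, here is a standalone sketch. It follows the same rules as
the attached less346_bufsize_env.diff (decimal number, optional k/K/m/M
suffix, default for missing or non-positive values, clamping to a min/max
range) but is not the patch itself: parse_bufsize and the BUFSIZE_* names are
hypothetical, and the overflow check is an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define BUFSIZE_MIN  512L
#define BUFSIZE_DFLT 131072L     /* 128 KB */
#define BUFSIZE_MAX  33554432L   /* 32 MB */

/* Hypothetical helper: convert a string like "32k" or "1m" to bytes. */
long parse_bufsize(const char *s)
{
    char *end = NULL;
    long val;
    long mult = 1;

    assert(BUFSIZE_MIN <= BUFSIZE_DFLT && BUFSIZE_DFLT <= BUFSIZE_MAX);

    if (s == NULL || *s == '\0')
        return BUFSIZE_DFLT;            /* variable not set or empty */

    errno = 0;
    val = strtol(s, &end, 10);
    if (end == s || errno == ERANGE || val <= 0)
        return BUFSIZE_DFLT;            /* no valid positive number */

    switch (*end) {
    case 'k': case 'K': mult = 1024L;         break;
    case 'm': case 'M': mult = 1024L * 1024L; break;
    default:                                  break;  /* other suffixes ignored */
    }

    if (val > BUFSIZE_MAX / mult)       /* clamp before multiplying: no overflow */
        return BUFSIZE_MAX;
    val *= mult;
    if (val < BUFSIZE_MIN)
        val = BUFSIZE_MIN;
    return val;
}
```

A caller would use it as parse_bufsize(getenv("LESS_BUFSIZE")), doing the
lookup once and caching the result, since the value is needed very often.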

To test less in interactive operation, immediately after starting the command
I pressed G (go to end of data) and then q (quit), so less would quit immediately
after processing the data (including line counting).

The hardware is the same as described in my posting from earlier today (PII/450 MHz,
128 MB RAM, i440BX/ZX chipset, 100 MHz FSB).


[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=128k time 128k_lless 
4.82user 0.32system 0:06.13elapsed 83%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (165major+10091minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=128k time 128k_lless 
4.78user 0.37system 0:06.05elapsed 85%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (165major+10091minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=128k time 128k_lless 
4.69user 0.42system 0:06.07elapsed 84%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (165major+10091minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=128k time 128k_lless
4.88user 0.28system 0:06.01elapsed 85%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (165major+10091minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=128k time 128k_lless
4.80user 0.30system 0:06.02elapsed 84%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (165major+10091minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=128k time enlless 
6.09user 0.41system 0:07.47elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (167major+10092minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=128k time enlless 
6.21user 0.39system 0:07.47elapsed 88%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (167major+10092minor)pagefaults 0swaps

[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=32k time 32k_lless 
5.14user 0.29system 0:06.36elapsed 85%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (165major+9796minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=32k time 32k_lless 
5.14user 0.33system 0:06.32elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (165major+9796minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=32k time enlless 
6.70user 0.31system 0:07.76elapsed 90%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (167major+9796minor)pagefaults 0swaps
[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=32k time enlless 
6.55user 0.28system 0:07.78elapsed 87%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (167major+9796minor)pagefaults 0swaps

[wp@wpst lfiles]$ lzop -dc test40M.lzo | env LESS_BUFSIZE=1m time enlless 
6.17user 0.29system 0:07.38elapsed 87%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (167major+9825minor)pagefaults 0swaps

As can be seen from the test timings, there is little gain (when processing
40 MB of data on a pipe) from increasing the buffer size above 32KB, and the price
for making the buffer size user-specified (using my suboptimal implementation)
is a slowdown of about 30% in user CPU time (about 23% in elapsed time). Please
keep in mind that the original less, using 1KB buffers, would need more than
500 seconds for the same operation.

I would appreciate it if someone with access to a 1GB RAM box could perform similar
tests on larger amounts of data (say 40 MB, 200 MB and 400 MB) for a few values
of buffer size (say 32KB, 128KB, 1MB) to see how these affect execution time.

For my needs 32KB is a perfect match (the data I need to see on a pipe is in most
cases less than 40MB).

My statement about execution time being proportional to the square of
(data size/buffer size) was an oversimplification. It does hold quite well
for amounts of data > 10MB and a buffer size of 1KB, but when the buffer size is
significantly increased, the quadratic part no longer dominates execution time.
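
A back-of-the-envelope check of this, as a standalone sketch: if the time
were purely quadratic in (data size/buffer size), going from 1KB to 32KB
buffers would divide it by 32*32 = 1024. The function names below are made up
for the illustration; the 510 second figure is my 1KB measurement for this
data set and about 6.3 seconds is the 32KB elapsed time from the runs above.

```c
#include <assert.h>

/* Hypothetical helper: number of buffers needed to hold data_bytes. */
unsigned long buffers_needed(unsigned long data_bytes, unsigned long buf_bytes)
{
    assert(buf_bytes > 0);
    return (data_bytes + buf_bytes - 1) / buf_bytes;   /* round up */
}

/*
 * Hypothetical helper: time predicted by a purely quadratic model
 * when the buffer size changes from old_buf to new_buf and the
 * amount of data stays the same.
 */
double quadratic_estimate(double t_old, unsigned long old_buf,
                          unsigned long new_buf)
{
    double ratio = (double)old_buf / (double)new_buf;
    return t_old * ratio * ratio;
}
```

For 40 million bytes this gives 39063 buffers of 1KB versus 1221 buffers of
32KB, and quadratic_estimate(510.0, 1024, 32768) is about 0.5 seconds. The
measured time is about 6.3 seconds, so with 32KB buffers most of the time is
spent in work that does not scale quadratically.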

To summarize, if one needs good performance using less on a pipe for
amounts of data in the range 0..50MB, I would recommend the following patch to less:


--- less-346/ch.c       Fri Nov  5 02:47:33 1999
+++ less-346.bld02/ch.c Thu Dec 20 11:51:43 2001
@@ -29,7 +29,7 @@
  * in order from most- to least-recently used.
  * The circular list is anchored by the file state "thisfile".
  */
-#define LBUFSIZE       1024
+#define LBUFSIZE       32768
 struct buf {
        struct buf *next, *prev;  /* Must be first to match struct filestate */
        long block;


This patch reduces the time needed for less to process 40MB of data on a pipe
about 80 times: on the PII/450 I tried, it went down from 510 seconds to a quite
acceptable 6-7 seconds.

I would recommend that Red Hat consider applying it to 'less'.


Best regards,

Wojtek

--- less-346/ch.c       Fri Nov  5 02:47:33 1999
+++ less-346.bld02/ch.c Thu Dec 20 12:53:58 2001
@@ -29,12 +29,62 @@
  * in order from most- to least-recently used.
  * The circular list is anchored by the file state "thisfile".
  */
-#define LBUFSIZE       1024
+
+static int lbufsize = 0;
+#if 0
+#define LBUFSIZE       get_lbufsize()
+#else
+#define LBUFSIZE       (lbufsize ? lbufsize : get_lbufsize())
+#endif
+
+#define LBUFSIZE_ENVVAR "LESS_BUFSIZE"
+#define LBUFSIZE_MIN    512
+/* FIXME: the max and default values below are good only for 32-bit systems */
+#define LBUFSIZE_DFLT   131072         /* 128 KB */
+#define LBUFSIZE_MAX    33554432       /* 32 MB */
+
+/*
+ * We allow the user to modify the LESS buffer size via an environment variable
+ * named LESS_BUFSIZE (see LBUFSIZE_ENVVAR #define).
+ * If the environment variable is defined, it should contain a decimal number
+ * optionally followed by 'm', 'M', 'k' or 'K' suffix for megabytes or kilobytes.
+ * If there is no valid number or the variable is not defined, LBUFSIZE_DFLT
+ * is assumed.
+ * If the value is outside of the LBUFSIZE_MIN, LBUFSIZE_MAX range it is coerced into it;
+ * non-positive values are replaced with LBUFSIZE_DFLT.
+ * When setting the value keep in mind that time needed by less to process data on stdin
+ * is roughly proportional to square of (data size/less buffer size)
+ */
+
+       static int
+get_lbufsize() {
+       if (!lbufsize) {
+               char *envvp = getenv(LBUFSIZE_ENVVAR);
+               int val = 0;
+               if (envvp) {
+                       char *endp=0;
+                       val = strtol(envvp, &endp, 10);   /* get value from environment variable */
+                       if (endp && *endp) {
+                               switch (*endp) {
+                                       case 'k': case 'K': val *= 1024;      break;
+                                       case 'm': case 'M': val *= 1024*1024; break;
+                                       /* we silently ignore other suffixes or illegal numbers */
+                               }
+                       }
+               }
+               if (val <= 0) val = LBUFSIZE_DFLT;  /* if none or illegal assume default */
+               if (val < LBUFSIZE_MIN) val = LBUFSIZE_MIN;  /* make sure val is >= minimum value */
+               if (val > LBUFSIZE_MAX) val = LBUFSIZE_MAX;  /* make sure val is <= maximum value */
+               lbufsize = val;
+       }
+       return lbufsize;
+}
+
 struct buf {
        struct buf *next, *prev;  /* Must be first to match struct filestate */
        long block;
        unsigned int datasize;
-       unsigned char data[LBUFSIZE];
+       unsigned char data[1 /* actually LBUFSIZE */ ];
 };
 
 /*
@@ -622,7 +672,7 @@
         * Allocate and initialize a new buffer and link it 
         * onto the tail of the buffer list.
         */
-       bp = (struct buf *) calloc(1, sizeof(struct buf));
+       bp = (struct buf *) calloc(1, sizeof(struct buf)+LBUFSIZE-1);  /* 1 is size of buf.data */
        if (bp == NULL)
                return (1);
        ch_nbufs++;
--- less-346/ch.c       Fri Nov  5 02:47:33 1999
+++ less-346.bld02/ch.c.32k     Thu Dec 20 11:51:43 2001
@@ -29,7 +29,7 @@
  * in order from most- to least-recently used.
  * The circular list is anchored by the file state "thisfile".
  */
-#define LBUFSIZE       1024
+#define LBUFSIZE       32768
 struct buf {
        struct buf *next, *prev;  /* Must be first to match struct filestate */
        long block;
