On 11/7/2012 11:21 AM, Taco Hoekwater wrote:
On 11/07/2012 11:07 AM, Hans Hagen wrote:
On 11/7/2012 10:33 AM, luigi scarso wrote:


On Wed, Nov 7, 2012 at 10:07 AM, Hans Hagen <[email protected]> wrote:

    On 11/7/2012 1:50 AM, Reinhard Kotucha wrote:

        What I don't grok is why it's 20 times slower to load a file at
        once than in little chunks.  But what I'm mostly concerned
        about is memory consumption.  In xosview I can see that the
        garbage collector reduces the consumed memory from time to
        time, but it doesn't seem to be overly hungry.  Even worse, I
        have the impression that memory consumption grows exponentially
        with time.  With a slightly larger test file my system (4GB
        RAM) would certainly run out of memory.
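
For what it's worth, growth like that is usually kept in check by
dropping each chunk reference and letting the incremental collector
run. A minimal sketch, where the file name, chunk size and the
explicit collectgarbage("step") call are assumptions rather than part
of Reinhard's test:

    local f = assert(io.open("testfile.dat","rb")) -- assumed name
    local total = 0
    while true do
        local chunk = f:read(2^20)    -- assumed 1MB chunks
        if not chunk then break end
        total = total + #chunk        -- stand-in for real per-chunk work
        chunk = nil                   -- drop the reference
        collectgarbage("step")        -- nudge the incremental collector
    end
    f:close()
    print(total,collectgarbage("count"))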


    I think 20 times is somewhat off at your end, because here I get this:

Out of memory here with a test file of 662M.
Linux 32-bit, 4GByte, PAE extension:

# time ./read_blocks.lua

real    0m2.458s
user    0m0.372s
sys    0m1.084s
# time ./read_whole_file.lua
not enough memory

real    0m17.125s
user    0m11.737s
sys    0m4.292s

# texlua -v
This is LuaTeX, Version beta-0.70.1-2012052416 (rev 4277)
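
Neither script was posted; presumably they amount to something like
this minimal sketch (the file name and block size are assumptions):

    -- read_blocks.lua: read the file in fixed-size blocks
    local f = assert(io.open("testfile.dat","rb"))
    while true do
        local block = f:read(2^20)    -- assumed 1MB blocks
        if not block then break end
    end
    f:close()

    -- read_whole_file.lua: slurp the file in one call
    local f = assert(io.open("testfile.dat","rb"))
    local d = f:read("*a")            -- the call that runs out of memory
    f:close()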

Indeed, not enough memory on my laptop for a 600M+ test.

Windows 8, 32 bit (columns: method, time in seconds, bytes read, Lua
memory delta in KB from collectgarbage("count")):

-- all      1.082   34176584    120272.328125
-- chunked  0.668   34176584    169908.59667969
-- once     1.065   34176584    111757.03710938

-- all      7.078   136706337   535063.34863281
-- chunked  3.441   136706337   787195.56933594
-- once     6.621   136706337   501559.83691406

The larger times for 'all' and 'once' still puzzle me.

Malloc time, perhaps. The 'fast' *a loader does a couple of fseek()/
ftell() calls to find the file size, then malloc()s the whole string
before feeding it to Lua, and then frees it again. There is a fairly
large copy in the feed process that I cannot avoid without using Lua
internals instead of the published API.

Btw, on my SSD disk, there is no noticeable difference between all
three cases for an 85MB file.
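
For reference, only the 'once' variant is shown below; the 'all' row
presumably comes from the same harness with the *a loader that Taco
describes, along these lines (a sketch, not the actual script):

    collectgarbage("collect")
    local m = collectgarbage("count")
    local t = os.clock()
    local f = io.open(name,'rb')
    local d = f:read("*a")            -- the *a loader described above
    f:close()
    print("all",os.clock()-t,#d,collectgarbage("count")-m)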

Here (also an SSD, but on relatively slow SATA, as it's a 6-year-old laptop):

-- all      2.015   85000000    291368
-- chunked  1.140   85000000    268040
-- once     1.997   85000000    291119

Can you explain the 'once' case:

    collectgarbage("collect")
    local m = collectgarbage("count")
    local t = os.clock()
    local f = io.open(name,'rb')
    local n = f:seek("end")
    f:seek("set",0)
    local d = f:read(n)
    f:close()
    print("once",os.clock()-t,#d,collectgarbage("count")-m)

Is seek slow? It doesn't seem so, as

    local n = lfs.attributes(name,"size")

gives the same timings. So maybe the chunker is also mallocing, just in smaller chunks.
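
That would also fit the memory numbers: a 'chunked' reader that keeps
its data allocates many small strings and then one big one at the end,
along these lines (a sketch of a possible chunked variant; the 8K
chunk size is an assumption):

    collectgarbage("collect")
    local m = collectgarbage("count")
    local t = os.clock()
    local f = io.open(name,'rb')
    local r, n = { }, 0
    while true do
        local chunk = f:read(2^13)    -- assumed 8K chunks
        if not chunk then break end
        n = n + 1
        r[n] = chunk
    end
    f:close()
    local d = table.concat(r)         -- table and result both live here
    print("chunked",os.clock()-t,#d,collectgarbage("count")-m)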

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
    tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------