On 2020-06-28 at 13:52:56 +0200, Hans Hagen wrote:

 > On 6/28/2020 3:26 AM, Reinhard Kotucha wrote:
 >
 > > > that adds passing parameters and checking them for each call
 > > > ... you can then as well use lua's 'read' function and
 > > > convert with string.byte/char which is then about equally
 > > > fast
 > >
 > > This is what I actually did.  It took 14 s to process a PNM file,
 > > way too much if I have to process hundreds or thousands of files.
 > > I ported the script to C and could process the file within 270 ms.
 > > I can't imagine that obeying a variable in C can slow down
 > > everything so much.
 >
 > how big a file ... also, i bet you do more than just reading, you
 > don't define what 'process' is (270 ms for 100K files is still not
 > fast I guess)

96MiB per file.  Processing means to apply a lookup table and a 3×3
color matrix, quite inexpensive operations.  What takes most of the
time is to extract single bytes with string.sub() and to convert them
to integers.  Finally I have to convert everything back to uint16.

In C I convert to host byte order with ntohs(3) and access the color
triplets by pushing a pointer around.

In both cases I read the file line by line (30024 bytes per line).

 > > I'm not very familiar with C programming.  You say that it's expensive
 > > to pass arguments to a function.  What I had in mind is that functions
 > > obey a global variable at runtime which denotes whether byte order
 > > conversion is necessary or not.
 >
 > passing variables in c is no issue (also because compilers are
 > smart enough to deal with it)
 >
 > a global variable would not work because one can read several files
 > at the same time interleaved with different properties
 >
 > i'm talking of picking up some optional argument passed by lua
 > (passed on stack, checking needed, etc)
 >
 > anyway, there's nothing wrong with writing and using a c program if
 > that is more suitable esp when you need to process that many files
 > ... opening closing in lua is slower than in c, as is storing all
 > your read bytes in lua variables (and i'm not even talking about
 > the fact that a file metatable has to be looked up and type being
 > checked for every read) plus some garbage collection every now and
 > then
 >
 > as you can compile c, you can also write a dedicated library and add
 > that to luatex (assuming you need to do this runtime from luatex)
 >
 > (you could consider using ffi)

Thanks for the info.  I wasn't aware that reading bytes into lua
variables is expensive too.  Maybe it's better indeed to stay with C.
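
For illustration, the C approach described above (ntohs(3), a pointer
stepping over the color triplets, a lookup table and a 3×3 matrix)
might look roughly like the sketch below.  It is not the actual
program: the names process_row and clamp_u16, the order of operations
(table first, then matrix), the floating-point matrix and the clamping
are all assumptions.  If a line holds three 16-bit samples per pixel,
30024 bytes would correspond to 5004 pixels.

```c
#include <arpa/inet.h>  /* ntohs, htons */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round and clamp a matrix result to uint16. */
static uint16_t clamp_u16(double v)
{
    if (v < 0.0) return 0;
    if (v > 65535.0) return 65535;
    return (uint16_t)(v + 0.5);
}

/* Hypothetical sketch: process one image row in place.
   row holds npixels RGB triplets of big-endian uint16 samples.
   Assumed order: lookup table per channel, then the 3x3 matrix. */
static void process_row(uint16_t *row, size_t npixels,
                        const uint16_t lut[65536],
                        const double m[3][3])
{
    uint16_t *p = row;  /* pointer pushed from triplet to triplet */

    for (size_t i = 0; i < npixels; i++, p += 3) {
        double in[3];

        /* network (big-endian) to host order, then table lookup */
        for (int c = 0; c < 3; c++)
            in[c] = lut[ntohs(p[c])];

        /* apply the color matrix, store back in big-endian order */
        for (int c = 0; c < 3; c++)
            p[c] = htons(clamp_u16(m[c][0] * in[0]
                                 + m[c][1] * in[1]
                                 + m[c][2] * in[2]));
    }
}
```

A caller would fread() one line into a buffer, pass it to
process_row(), and fwrite() it back out, so no per-byte function call
or string conversion is involved.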

 > I downloaded the 3.7 GB texlive iso and read integers from that one
 >
 > -- 360 sec : one  byte integers + counting
 > -- 224 sec : two  byte integers + counting
 > -- 166 sec : four byte integers + counting (160 no counting)
 >
 > But that's a lot of lua calls.

Reading 96MB as two byte integers would then take 6 seconds, much
more than I expected.

 > Then I downloaded the tug logo from the website
 >
 > -- string : .55 sec for 1000 times (including opening / loading)
 > -- file   : .67 sec for 1000 times (including opening / loading)
 >
 > So, that's milliseconds per file.
 >
 > Finally I processed the 3414 files in the 268M context distribution and
 > read 2 byte integers from those till end of file which took 15 seconds
 > for the lot. So, no complaints from my end.

This means that file opening is quite fast, since the throughput is
about the same for one big file and for many small ones:

  3700/224 = 16.518 MB/s
   268/15  = 17.867 MB/s

 > I think it's not the file handling that is your bottleneck.

Yes, thanks for the info.

Regards,
  Reinhard

-- 
------------------------------------------------------------------
Reinhard Kotucha                            Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                    mailto:[email protected]
------------------------------------------------------------------
