I have many large LaTeX files, each with many pages and a large number of equations generated by computer algebra systems. These files also contain many \includegraphics commands for SVG images.
I noticed that tex4ht becomes very slow as the number of pages increases. This got so bad that I ended up buying a new PC and installing Linux on it in the hope it would speed things up (I was using VirtualBox on Windows, then I tried Cygwin on Windows). For example, for one file, using VirtualBox, it took 14 hours for make4ht to compile the file to HTML. On Cygwin it took a little less than that, about 10 hours. This was on Windows 7, 64-bit, with 16 GB RAM and a fast Intel i7-3930K CPU. On the new PC (24 GB RAM, Intel i7-6700K, 64-bit) it took 5 hours. OK, much better, so TeX Live is evidently more optimized for native Linux than for Cygwin, and VirtualBox is expected to be slower since it is software emulation of a PC. Note also that the disk is solid state in all cases, so the disk is fast. This is all using TL 2015, on a PC with nothing else running on it.

But the issue is, pdflatex and lualatex take about 5 minutes to compile the same file to PDF! I can understand that converting to HTML will take more time, since each equation is converted to an SVG image, etc., but why is the difference so large? Is this really to be expected?

Here is what happens: tex4ht starts fast initially. I see (./report.4ct) [3] [4] [5] [6] [7] [8] [9]... printed on the terminal very quickly, then it starts to slow down as the numbers get higher (I assume these correspond to the page numbers tex4ht is processing). By the time it gets to [3596] [3597] [3598] [3599] [3600] [3601]... it takes a few seconds per update. The larger the numbers, the slower it gets. It also seems tex4ht makes more than one pass, since I see it generate this sequence of numbers more than once.

I can make a zip file with a typical large LaTeX file, all the images it uses, my .cfg and main.mk4 files, and the command I used to compile it, if anyone wants to confirm this problem. Would that be OK? Or should I first file a "performance bug" on this against tex4ht and put a link to the zip file there? Or is it better to discuss it here first?
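In case it helps anyone reproduce the timings, the comparison I am making is essentially the following (a sketch of my invocations; report.tex and main.mk4 are the names from my setup, and myconfig.cfg stands in for my actual .cfg file, so adjust names as needed):

```shell
# Time the PDF build vs. the HTML build on the same source file.
# The PDF build finishes in minutes; the HTML build takes hours.
time pdflatex -interaction=batchmode report.tex

# -c names the tex4ht config file, -e the make4ht build file,
# -u selects UTF-8 output.
time make4ht -u -c myconfig.cfg -e main.mk4 report.tex
```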
My guess is that the slowdown is in the I/O on the .dvi file, but that is just a guess. I chatted with Michal about this in the TeX StackExchange chat room as well, and I can provide more information, etc. I have many, many LaTeX files this large, and at this speed it now takes 20 days to compile one set of them to HTML. That is way too long, given that lualatex takes an hour or so.

Finally, is there a document that describes, at some high level, the passes/process tex4ht uses to compile to HTML? A block diagram or something similar? I have not been able to find such a design document.

--Nasser