To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=55666

Issue #:|55666
Summary:|helpex speed optimalizations?
Component:|l10n
Version:|OOo 2.0
Platform:|All
URL:|
OS/Version:|All
Status:|NEW
Status whiteboard:|
Keywords:|
Resolution:|
Issue type:|DEFECT
Priority:|P3
Subcomponent:|code
Assigned to:|ihi
Reported by:|pjanik
------- Additional comments from [EMAIL PROTECTED] Sat Oct 8 13:00:33 -0700 2005 -------
Hi Ivo,

I'm now digging deeper into the slowness of the helpcontent2 build. The first piece I investigated is helpex, which gets called once for every .xhp file. This is a collection of the issues and ideas I came across:

A. Libraries. helpex is linked with:

[EMAIL PROTECTED]:~/BuildDir/ooo_SRC680_m133_src> ldd `which helpex`|sed 's#=>.*##'
        libuno_sal.so.3
        libtl680li.so
        libvos3gcc3.so
        libdl.so.2
        libpthread.so.0
        libstlport_gcc.so
        libstdc++.so.6
        libm.so.6
        libgcc_s.so.1
        libc.so.6
        libcrypt.so.1
        libucbhelper3gcc3.so
        libuno_cppu.so.3
        libbasegfx680li.so
        /lib/ld-linux.so.2
        libuno_cppuhelpergcc3.so.3
        libuno_salhelpergcc3.so.3
[EMAIL PROTECTED]:~/BuildDir/ooo_SRC680_m133_src>

Do we need them all? Do we need e.g. basegfx?

Then I ran strace on this randomly chosen command:

helpex -QQ -r ../../../.. -i mm_newaddblo.xhp -x ../../../../unxlngi6.pro/misc -y text/swriter/01/mm_newaddblo.xhp -l all -lf en-US -m localize.sdf

There are several issues there:

C.

lstat64("mm_newaddblo.xhp", {st_mode=S_IFREG|0755, st_size=3991, ...}) = 0
open("mm_newaddblo.xhp", O_RDONLY) = 3
lseek(3, 0, SEEK_SET) = 0
lseek(3, 0, SEEK_CUR) = 0
read(3, "<?xml version=\"1.0\" encoding=\"UT"..., 1024) = 1024
close(3) = 0
lstat64("mm_newaddblo.xhp", {st_mode=S_IFREG|0755, st_size=3991, ...}) = 0
open("mm_newaddblo.xhp", O_RDONLY) = 3
lseek(3, 0, SEEK_SET) = 0
lseek(3, 0, SEEK_CUR) = 0
read(3, "<?xml version=\"1.0\" encoding=\"UT"..., 32768) = 3991
close(3)

-> the file is actually opened twice?? In the first open, 1024 bytes are read. In the second, it is read completely:

[EMAIL PROTECTED]:~/BuildDir/ooo_SRC680_m133_src/helpcontent2> ls -al ./source/text/swriter/01/mm_newaddblo.xhp
-rwxr-xr-x 1 oo users 3991 Oct 2 17:04 ./source/text/swriter/01/mm_newaddblo.xhp
[EMAIL PROTECTED]:~/BuildDir/ooo_SRC680_m133_src/helpcontent2>

Who is reading it in the first run? Why?

D.
Another issue here is:

lseek(3, 0, SEEK_SET) = 0
lseek(3, 0, SEEK_CUR) = 0

This means:

1. Set the file position to 0. Why? We just opened it!
2. Move 0 bytes from the current position (and return the current file position). Why? We just set the position to 0!

IIRC, Michael Meeks pointed out a very similar problem in one of his talks at OOoCon. This probably comes from the sal module. Who can debug this further? Michael, did you?

E. Another issue here is: for every small .xhp file (3991 bytes in this case), we read the complete localize.sdf file, which is:

-rw-r--r-- 1 oo users 19996196 Oct 2 17:04 localize.sdf

-> 20MB for every file.

F. And we read this file through a 1024-byte buffer (why only 1024?)... I think we waste approximately a million read system calls here across all .xhp files. I know the file is cached, etc., but this is really inefficient. Can we rethink this approach, moving from one process per file to one process per directory, and read localize.sdf only once, at the beginning?

G. For debugging purposes, I started the build of helpcontent with only the en-US language, and it is opening *many* files:

open("../../../../unxlngi6.pro/misc/es/text/swriter/01/mm_newaddblo.xhpRC4Admunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/et/text/swriter/01/mm_newaddblo.xhpttkZZcunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/fi/text/swriter/01/mm_newaddblo.xhphqNvn3unxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/fr/text/swriter/01/mm_newaddblo.xhpzCwUISunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/hr/text/swriter/01/mm_newaddblo.xhpdqfYeMunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/it/text/swriter/01/mm_newaddblo.xhpB412dJunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/ja/text/swriter/01/mm_newaddblo.xhp2TmrEEunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/km/text/swriter/01/mm_newaddblo.xhpBbKNRyunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/ko/text/swriter/01/mm_newaddblo.xhpDxqAiwunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/nl/text/swriter/01/mm_newaddblo.xhpPU3z9wunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
open("../../../../unxlngi6.pro/misc/pl/text/swriter/01/mm_newaddblo.xhpR9wwOyunxlngi6.pro", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4

Does this mean that helpex is also creating files that are not needed at all in this run?

---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
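
As a rough illustration of point F above, the number of read system calls needed to pull in localize.sdf can be estimated from the file size and the buffer size. This is only a back-of-the-envelope sketch: the function names are made up for illustration, the count ignores the final read() that returns 0 at end of file, and the number of helpex invocations is a hypothetical parameter, since the report does not say how many .xhp files there are.

```python
# Size of localize.sdf in bytes, taken from the ls output in point E.
SDF_SIZE = 19996196


def read_calls(file_size: int, buffer_size: int) -> int:
    """Data-returning read() calls needed to read file_size bytes
    sequentially with a fixed buffer (excludes the EOF read)."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    # Integer ceiling division, avoids floating point.
    return -(-file_size // buffer_size)


def total_read_calls(file_size: int, buffer_size: int, invocations: int) -> int:
    """Total read() calls if every invocation re-reads the whole file.
    'invocations' is a hypothetical count of helpex runs."""
    return read_calls(file_size, buffer_size) * invocations


if __name__ == "__main__":
    # 1024 is the buffer seen for localize.sdf; 32768 is the buffer
    # seen in the second read of the .xhp file in point C.
    for buf in (1024, 32768):
        print(buf, read_calls(SDF_SIZE, buf))
```

With the 1024-byte buffer this gives 19528 reads per helpex run, against 611 with a 32768-byte buffer. Reading localize.sdf once per directory instead of once per file would divide the total by the number of .xhp files in that directory.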