[libreoffice-users] Python UNO odt to pdf converter - running time issue
Hi there,

using python we prepared a tool to fill/replace some TAGs in odt files and
then to save such file to PDF. This is part of code from such tool:

    for key in values.keys():
        search = doc.createSearchDescriptor()
        search.SearchString = "{" + key + "}"
        search.SearchAll = True
        search.SearchWords = False
        search.SearchCaseSensitive = False
        selsfound = doc.findAll(search)
        for selIndex in range(selsfound.getCount()):
            selfound = selsfound.getByIndex(selIndex)
            selfound.setString(values[key])
    end = time.time()
    print("time replace tags in doc:", end - start)

The issue we faced is a bit strange as total time for single file
(structure of this file in 90% is a table in table) to convert differs a
lot when running on laptop and on server:

laptop:
real 0m9,700s
user 0m3,090s
sys 0m2,619s

server:
real 0m37.190s
user 0m9.194s
sys 0m7.791s

Libreoffice version is the latest from PPA stable Ubuntu. Server is running
Ubuntu 22.04 and laptop Ubuntu 22.10.

This is CPU info for both laptop and server:

laptop:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 140
model name : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
stepping : 1
microcode : 0xa4
cpu MHz : 982.290
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 27
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor
ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic
movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
3dnowprefetch cpuid_fault epb cat_l2 invpcid_single cdp_l2 ssbd ibrs ibpb
stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase
tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdt_a avx512f avx512dq rdseed
adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw
avx512vl xsaveopt xsavec xgetbv1 xsaves split_lock_detect dtherm ida arat
pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip
pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg
avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect md_clear
ibt flush_l1d arch_capabilities
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad
ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid
unrestricted_guest vapic_reg vid ple pml ept_mode_based_exec tsc_scaling
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs
bogomips : 5606.40
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

server (virtual machine)
processor : 17
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
stepping : 7
microcode : 0x5002f01
cpu MHz : 2100.000
cache size : 22528 KB
physical id : 34
siblings : 1
core id : 0
cpu cores : 1
apicid : 34
initial apicid : 34
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid
tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe
popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm
3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced
fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx
smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
xsaves arat pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit
mmio_stale_data retbleed
bogomips : 4200.00
clflush size : 64
cache_alignment : 64
address sizes : 45 bits physical, 48 bits virtual
power management:

Has anyone ever found such an issue?

Thx
Marcin

--
To unsubscribe e-mail to: users+unsubscr...@global.libreoffice.org
Problems? https://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: https://wiki.documentfoundation.org/Netiquette
List archive: https://listarchives.libreoffice.org/global/users/
Privacy Policy: https://www.documentfoundation.org/privacy
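
One thing that might be worth trying, offered as a sketch rather than a confirmed fix: the posted loop makes several UNO calls per match (getByIndex, setString), and if the script talks to soffice over a socket, each call is a round trip. Writer documents also implement XReplaceable, so createReplaceDescriptor() and replaceAll() can handle all occurrences of one tag in a single call. The helper names replace_tags and timed are hypothetical, `doc` and `values` are assumed to be the same objects as in the posted code, and whether this narrows the laptop/server gap is untested.

```python
import time


def replace_tags(doc, values):
    """Replace every {key} placeholder in doc; return the number of replacements."""
    total = 0
    for key, value in values.items():
        # createReplaceDescriptor/replaceAll belong to UNO's XReplaceable.
        desc = doc.createReplaceDescriptor()
        desc.SearchString = "{" + key + "}"
        desc.ReplaceString = value
        desc.SearchWords = False
        desc.SearchCaseSensitive = False
        # replaceAll returns how many occurrences were replaced.
        total += doc.replaceAll(desc)
    return total


def timed(label, func, *args):
    """Run func(*args), print wall-clock and CPU time, and return its result.

    process_time() only covers this Python process, not a separate soffice
    process, so a large wall/cpu gap hints at time spent waiting on soffice.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    print(f"{label}: wall {wall:.3f}s, cpu {cpu:.3f}s")
    return result
```

With a loaded document this could be called as `timed("replace tags", replace_tags, doc, values)`, and the PDF export step could be wrapped in `timed` the same way to see which phase accounts for the difference between the two machines.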
Re: [libreoffice-users] Python UNO odt to pdf converter - running time issue
1. You ask about Python timing but don't say whether it's Python 2,
Python 3, or a mix between the machines. That would be my first focus: the
Python execution, exactly which modules are in place, and whether they all
come from the same sources. (That seems to be the only place where you
might have something to change for an improvement.)

2. You're comparing a physical machine with 4 cores against a virtual
machine with 1 core. Those numbers seem a little long for the virtual
machine, but not by much. You can expect a factor of 2 for the
virtualization, and then another 50% for the lack of multiple cores (if I'm
remembering right that LO 7.x supports parallel processing across cores).
Even if LO 7 doesn't support multicore, a single-core machine becomes very
sensitive to everything else currently running, as all processes wait for
the same core.

On Fri, Nov 4, 2022 at 5:05 AM Marcin Giedz wrote:
> Hi there,
>
> using python we prepared a tool to fill/replace some TAGs in odt files and
> then to save such file to PDF.
> [...]
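
To put numbers on the estimate in point 2, here is the arithmetic worked through with the timings from the original post. The factors 2 and 1.5 are the rough guesses from this reply, not measured values. Treating real minus (user + sys) as waiting time is an assumption that only holds while the CPU time stays below the wall-clock time, as it does in both measurements, and it ignores any soffice process that was not a child of the timed command.

```python
# Timings from the original post, in seconds.
laptop = {"real": 9.700, "user": 3.090, "sys": 2.619}
server = {"real": 37.190, "user": 9.194, "sys": 7.791}


def cpu_and_wait(timing):
    """Return (CPU seconds, seconds not accounted for by CPU time)."""
    cpu = timing["user"] + timing["sys"]
    return cpu, timing["real"] - cpu


laptop_cpu, laptop_wait = cpu_and_wait(laptop)
server_cpu, server_wait = cpu_and_wait(server)

# Rough estimate from the reply: 2x for virtualization, 1.5x for one core.
estimated_ratio = 2 * 1.5
real_ratio = server["real"] / laptop["real"]
cpu_ratio = server_cpu / laptop_cpu

print(f"estimated slowdown: {estimated_ratio:.2f}x")
print(f"observed wall-clock slowdown: {real_ratio:.2f}x")
print(f"observed CPU-time slowdown: {cpu_ratio:.2f}x")
print(f"non-CPU time: laptop {laptop_wait:.3f}s, server {server_wait:.3f}s")
```

The CPU-time ratio comes out close to the estimated factor of 3, while the wall-clock ratio is higher, so part of the extra time on the server is spent outside user and sys time in the timed process.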
Re: [libreoffice-users] Python UNO odt to pdf converter - running time issue
Hi Michael,

1. Python 3, and as both machines are running Ubuntu 22, I think all
modules are almost the same.

2. This could be the answer... I'll change the number of cores per socket
in my VM and repeat the same test.

Thx for the input, will let you know the results.

Thx
Marcin

On Fri, 4 Nov 2022 at 13:47, Michael H wrote:
> [...]
> 2. You're comparing a physical machine with 4 cores against a virtual
> machine with 1 core
> [...]
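
When repeating the test after changing the VM's core count, it may help to record what each machine actually reports and to take several samples rather than one. The following is a minimal sketch using only the standard library; the function names environment and repeat are hypothetical, and the callable passed to repeat would be whatever wraps the fill-and-export step in the real tool.

```python
import os
import platform
import statistics
import time


def environment():
    """Collect the Python version and the CPU count this process can use."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: CPUs this process is actually allowed to run on.
        usable = len(os.sched_getaffinity(0))
    else:
        usable = os.cpu_count() or 1
    return {
        "python": platform.python_version(),
        "cpus_total": os.cpu_count() or 1,
        "cpus_usable": usable,
    }


def repeat(func, runs=5):
    """Call func() `runs` times; return (median, min, max) wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples), max(samples)


if __name__ == "__main__":
    print(environment())
    # Placeholder workload; substitute the real conversion call here.
    print(repeat(lambda: sum(range(100_000)), runs=3))
```

Comparing the output of environment() on the laptop and on the VM before and after the change would confirm that the new core count is visible to Python, and the spread between min and max gives a sense of how noisy a single-core VM is.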