[libreoffice-users] Python UNO odt to pdf converter - running time issue

2022-11-04 Thread Marcin Giedz
Hi there,

Using Python, we built a tool that fills in (replaces) tags in ODT files and
then saves the result as PDF. Here is part of the code:

for key in values.keys():
    search = doc.createSearchDescriptor()
    search.SearchString = "{" + key + "}"
    search.SearchAll = True
    search.SearchWords = False
    search.SearchCaseSensitive = False
    selsfound = doc.findAll(search)
    for selIndex in range(selsfound.getCount()):
        selfound = selsfound.getByIndex(selIndex)
        selfound.setString(values[key])
end = time.time()
print("time replace tags in doc:", end - start)
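
The snippet above times only the tag replacement. To see which step dominates on each machine, it may help to time every phase separately. The following is a minimal sketch, not part of the original tool: `PhaseTimer` and the phase labels are hypothetical names, and the `time.sleep` calls are placeholders for the real load, replace, and export steps.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def phase(self, label):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            # Add to any earlier measurement recorded under the same label.
            self.totals[label] = self.totals.get(label, 0.0) + elapsed

    def report(self):
        return "\n".join(
            f"{label}: {seconds:.3f}s" for label, seconds in self.totals.items()
        )


timer = PhaseTimer()

with timer.phase("load"):
    time.sleep(0.01)  # placeholder: open the ODT document here

with timer.phase("replace"):
    time.sleep(0.01)  # placeholder: the findAll/setString loop goes here

with timer.phase("export"):
    time.sleep(0.01)  # placeholder: write the PDF here

print(timer.report())
```

Running the same instrumented script on the laptop and on the server would show whether the gap comes from loading, replacing, or exporting.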

The strange part is that the total conversion time for a single file
(about 90% of its structure is a table within a table) differs a lot
between the laptop and the server:

laptop:
real    0m9,700s
user    0m3,090s
sys     0m2,619s

server:
real    0m37.190s
user    0m9.194s
sys     0m7.791s
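
One way to read these figures is to compare CPU time (user + sys) with wall-clock time (real). The sketch below does that arithmetic for the two runs above; `parse_time` and `cpu_share` are hypothetical helper names, and the parser accepts both the comma and the dot decimal separator seen in the two outputs.

```python
import re


def parse_time(value: str) -> float:
    """Convert a `time`-style duration such as '0m9,700s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


def cpu_share(real: str, user: str, sys_: str):
    """Return (cpu_seconds, non_cpu_seconds, cpu_fraction_of_real)."""
    real_s = parse_time(real)
    busy = parse_time(user) + parse_time(sys_)
    return busy, real_s - busy, busy / real_s


# Figures copied from the post.
runs = {
    "laptop": ("0m9,700s", "0m3,090s", "0m2,619s"),
    "server": ("0m37.190s", "0m9.194s", "0m7.791s"),
}

for name, (real, user, sys_) in runs.items():
    busy, waiting, share = cpu_share(real, user, sys_)
    print(f"{name}: CPU {busy:.3f}s, not on CPU {waiting:.3f}s, share {share:.0%}")
```

On both machines user + sys is well below real (roughly 5.7 s of 9.7 s on the laptop and 17.0 s of 37.2 s on the server), so a large part of the wall-clock time is not CPU time charged to the measured process. That remainder could be waiting, I/O, or work done in a process that `time` does not account for; the figures alone do not say which.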

LibreOffice is the latest version from the stable PPA for Ubuntu. The server
runs Ubuntu 22.04 and the laptop Ubuntu 22.10.


This is the CPU info for the laptop and the server:
laptop:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 140
model name : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
stepping : 1
microcode : 0xa4
cpu MHz : 982.290
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 27
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor
ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic
movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
3dnowprefetch cpuid_fault epb cat_l2 invpcid_single cdp_l2 ssbd ibrs ibpb
stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase
tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdt_a avx512f avx512dq rdseed
adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw
avx512vl xsaveopt xsavec xgetbv1 xsaves split_lock_detect dtherm ida arat
pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip
pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg
avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect md_clear
ibt flush_l1d arch_capabilities
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad
ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid
unrestricted_guest vapic_reg vid ple pml ept_mode_based_exec tsc_scaling
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs
bogomips : 5606.40
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:


server (virtual machine):
processor : 17
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
stepping : 7
microcode : 0x5002f01
cpu MHz : 2100.000
cache size : 22528 KB
physical id : 34
siblings : 1
core id : 0
cpu cores : 1
apicid : 34
initial apicid : 34
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid
tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe
popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm
3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced
fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx
smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
xsaves arat pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit
mmio_stale_data retbleed
bogomips : 4200.00
clflush size : 64
cache_alignment : 64
address sizes : 45 bits physical, 48 bits virtual
power management:
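
Each listing above shows a single `/proc/cpuinfo` entry, so the total number of CPUs visible to each system cannot be read from it directly. The following sketch summarises a complete `/proc/cpuinfo` instead. It relies only on the `processor`, `physical id`, and `cpu cores` fields that appear in the listings; `summarize_cpuinfo` is a hypothetical helper name.

```python
from pathlib import Path


def summarize_cpuinfo(text: str) -> dict:
    """Count logical CPUs, sockets, and physical cores in cpuinfo text."""
    logical = 0
    cores_per_socket = {}
    sockets = set()
    current_socket = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # wrapped flag lines and blank lines carry no key
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
            current_socket = None  # a new entry starts here
        elif key == "physical id":
            current_socket = value
            sockets.add(value)
        elif key == "cpu cores" and current_socket is not None:
            cores_per_socket[current_socket] = int(value)
    return {
        "logical_cpus": logical,
        "sockets": len(sockets),
        "physical_cores": sum(cores_per_socket.values()),
    }


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(summarize_cpuinfo(cpuinfo.read_text()))
```

Running this on both machines gives a like-for-like comparison of how many CPUs each one actually exposes.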

Has anyone run into an issue like this?

Thx
Marcin



Re: [libreoffice-users] Python UNO odt to pdf converter - running time issue

2022-11-04 Thread Michael H
1. You ask about Python timing but don't say whether it's Python 2, Python 3,
or a mix between the machines. That is where I would focus: the Python
execution, exactly which modules are installed, and whether they all come
from the same sources. (That seems to be the only place where you might
have something to change for an improvement.)

2. You're comparing a physical machine with 4 cores against a virtual
machine with 1 core. Those numbers seem a little long for the VM, but not
by much. You can expect a factor of 2 for the virtualization and another
50% for the lack of multiple cores (if I'm remembering right that LO 7.x
supports parallel processing across cores). Even if LO 7 doesn't support
multi-core, a single-core machine becomes very sensitive to everything
else that is running, since all processes wait for the same core.
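
The estimate in point 2 can be written out as arithmetic. This is only a sketch of that rule of thumb: the factor of 2 and the extra 50% are the figures quoted above, not measured values, and `expected_server_time` is a hypothetical helper name.

```python
def expected_server_time(
    laptop_seconds: float,
    virtualization_factor: float = 2.0,  # "a factor of 2 for the virtualization"
    single_core_penalty: float = 1.5,    # "another 50%" for the lack of multi-core
) -> float:
    """Scale the laptop's wall-clock time by the two rule-of-thumb factors."""
    return laptop_seconds * virtualization_factor * single_core_penalty


laptop_real = 9.700    # seconds, from the laptop `time` output
server_real = 37.190   # seconds, from the server `time` output

expected = expected_server_time(laptop_real)
print(f"expected: {expected:.1f}s")
print(f"observed: {server_real:.1f}s")
print(f"observed / expected: {server_real / expected:.2f}")
```

With these factors the estimate comes to about 29.1 s against the 37.2 s observed, which matches the remark that the server figure is a little long, but not by much.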


Re: [libreoffice-users] Python UNO odt to pdf converter - running time issue

2022-11-04 Thread Marcin Giedz
Hi Michael,

1. It's Python 3, and since both machines run Ubuntu 22.x, I think the modules
are almost the same.
2. This could be the answer. I'll change the number of cores per socket in my
VM and repeat the same test.
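
To check point 1 rather than assume it, the same short script can be run on both machines and the output compared. This is a sketch using only the standard library; `environment_report` is a hypothetical helper name, and `uno` is assumed to be the module name under which the Python-UNO bridge is installed.

```python
import importlib.util
import platform
import sys


def environment_report(modules=("uno",)) -> dict:
    """Collect interpreter details and the file each module would load from."""
    report = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "executable": sys.executable,
        "platform": platform.platform(),
    }
    for name in modules:
        try:
            # find_spec locates a top-level module without importing it.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        report[f"module:{name}"] = str(spec.origin) if spec is not None else "not found"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Any difference in the interpreter version or in the path of the `uno` module between the two outputs would be worth looking at before changing the VM configuration.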

Thanks for the input; I'll let you know the results.

Thx
Marcin

