https://bugs.documentfoundation.org/show_bug.cgi?id=165566
Bug ID: 165566
Summary: EDITING: New "Remove Duplicates" feature in Release
25.2.1 is very slow
Product: LibreOffice
Version: 25.2.1.2 release
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: enhancement
Priority: medium
Component: Calc
Assignee: [email protected]
Reporter: [email protected]
Description:
This refers to the new "Remove Duplicates" feature; see
https://bugs.documentfoundation.org/show_bug.cgi?id=85976
This is more an enhancement request for the new feature than a bug. However,
because the feature is far too slow to be practical, this report can also be
treated as a bug.
I had my own version of Remove Duplicates, based on a self-developed Python
script.
In an example with approx. 10,000 rows and 2 columns, my script needs 0.2
seconds.
The new built-in function needs 28 seconds for the same data. That is a
slowdown factor of about 140 (28 / 0.2).
A subsequent undo takes almost no time with my version (let's assume 0.05
seconds), but 86 seconds with the built-in version (slowdown factor: 1720).
It looks like the built-in function needs a major improvement to become
usable.
I can share my Python script if needed. It works by reading the data from the
spreadsheet into a Python table, doing all the work in that table, and then
writing the table back to the spreadsheet.
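A minimal sketch of the in-memory step described above. This is not the
reporter's script, which is not shown here; the function names
remove_duplicate_rows and pad_rows are hypothetical, and reading the rows
from the sheet and writing them back is left out.

```python
def remove_duplicate_rows(rows):
    """Return the rows with later duplicates dropped, keeping the first
    occurrence of each row in its original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # tuples are hashable, so they can go into a set
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def pad_rows(rows, n_rows, n_cols, filler=""):
    """Append blank rows so the result has the same shape as the original
    range (assumption: the data is written back to the same cell range)."""
    blank = (filler,) * n_cols
    return list(rows) + [blank] * (n_rows - len(rows))


table = [(1, "a"), (2, "b"), (1, "a"), (3, "c"), (2, "b")]
unique = remove_duplicate_rows(table)
result = pad_rows(unique, len(table), 2)
```

Each row is looked up in a set once, so the work grows roughly linearly with
the number of rows.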
It also has an option to sort the data automatically, which gives a further
large speedup. With 10,000 rows and a small number of columns this is not
needed. With 100,000 rows it should be used to avoid long waiting times.
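One possible way a sort option can help, sketched under assumptions: after
sorting, equal rows sit next to each other, so each row only needs to be
compared with the previous one. Whether the reporter's script works like this
is not stated; the name remove_duplicates_sorted is hypothetical, and the
sketch assumes all values within a column are mutually comparable (for
example, no mix of numbers and text).

```python
def remove_duplicates_sorted(rows):
    """Sort the rows, then keep a row only if it differs from the one
    kept just before it. The original row order is not preserved."""
    ordered = sorted(tuple(row) for row in rows)
    unique = []
    for row in ordered:
        if not unique or row != unique[-1]:
            unique.append(row)
    return unique


table = [(2, "b"), (1, "a"), (2, "b"), (3, "c"), (1, "a")]
deduplicated = remove_duplicates_sorted(table)
```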
The only issue with my script is a limitation of the "selection.getData"
function, which stops working with more than 262,144 rows (2^18).
This limitation should be filed as a separate bug report if needed.
Steps to Reproduce:
Use of "Remove Duplicates"
1. Create a spreadsheet with 10,000 rows and 2 columns
2. Run the new built-in "Remove Duplicates" function
Undo
1. Press Ctrl+Z directly after the Remove Duplicates step above
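For step 1, a test file of the stated size could be generated with a short
script like the one below. The report does not say what the data looked like,
so the value range, the number of distinct values, the seed, and the helper
names make_rows and write_csv are all assumptions made for illustration.

```python
import csv
import random


def make_rows(n_rows=10_000, n_cols=2, n_distinct=100, seed=0):
    """Build n_rows rows of n_cols small integers. A small n_distinct
    makes many repeated rows likely; a fixed seed makes it repeatable."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(n_distinct) for _ in range(n_cols))
        for _ in range(n_rows)
    ]


def write_csv(path, rows):
    """Write the rows to a CSV file that can be opened in Calc."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


rows = make_rows()
```

Calling write_csv("duplicates.csv", rows) then produces a file (the file name
is arbitrary) that can be opened in Calc for step 2.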
Actual Results:
Very slow: > 20 seconds for Remove Duplicates and > 80 seconds for undo
Expected Results:
Both steps should complete in less than one second
Reproducible: Always
User Profile Reset: No
Additional Info:
I think everything is already covered in the "Description".