https://bugs.documentfoundation.org/show_bug.cgi?id=165566

            Bug ID: 165566
           Summary: EDITING: New "Remove Duplicate" feature in Release
                    24.2.1 is very slow
           Product: LibreOffice
           Version: 25.2.1.2 release
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: medium
         Component: Calc
          Assignee: [email protected]
          Reporter: [email protected]

Description:
Referring to the new feature "Remove Duplicates" - see
https://bugs.documentfoundation.org/show_bug.cgi?id=85976

This is more a request to enhance the new feature than a bug report. However,
because the new feature is far too slow to be usable, it can also be treated
as a bug.

I had my own version of Remove Duplicates, based on a self-developed Python
script.

In an example with approx. 10,000 rows and 2 columns, my script needs 0.2
seconds.
The new built-in function needs 28 seconds for the same data. That is slower
by a factor of about 140.

A subsequent undo takes practically no time with my version (let's assume 0.05
seconds), but 86 seconds with the built-in version (factor: 1720).

It looks like the built-in function needs a big performance improvement to
become usable.

I can share my Python script if needed. It works by reading the data from the
spreadsheet into a Python table, doing all the work in that table, and then
writing the table back to the spreadsheet.
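
To illustrate the read / process / write-back approach described above, here is
a minimal sketch. It is not the actual script. It assumes the selected range
has been read in one call as a tuple of row tuples (for example via the UNO
XCellRangeData method getDataArray()) and is written back in one call with
setDataArray(). The function names remove_duplicate_rows and pad_to_height are
hypothetical.

```python
def remove_duplicate_rows(rows):
    """Return the rows with duplicates dropped, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)          # tuples are hashable, so set lookup is O(1)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def pad_to_height(rows, height, width, filler=""):
    """Pad with empty rows so the result has the same size as the source range.

    Assumption: the write-back call expects an array whose dimensions match
    the target range exactly, so removed rows are replaced by blank rows.
    """
    padded = list(rows)
    padded.extend((filler,) * width for _ in range(height - len(padded)))
    return tuple(padded)


# Assumed wiring inside a LibreOffice macro (not executed here):
#   data = selection.getDataArray()
#   unique = remove_duplicate_rows(data)
#   selection.setDataArray(pad_to_height(unique, len(data), len(data[0])))
```

Because the sheet is touched only twice (one read, one write), the per-row work
stays in plain Python data structures, which is presumably where most of the
time is saved.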

It also contains an option to sort the data automatically, which gives a further
big speed-up. With 10,000 rows and a low number of columns this is not needed.
With 100,000 rows it should be used to avoid long waiting times.
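
One possible reading of the sort option, sketched under assumptions: once the
rows are sorted, duplicates sit next to each other and can be dropped in a
single pass. This is an illustration only, not the script itself. The names
_sort_key and remove_duplicates_sorted are hypothetical, and note that this
variant returns the rows in sorted order rather than in their original order.

```python
from itertools import groupby


def _sort_key(row):
    # Tag each cell with its type name so that text and numbers in the same
    # column are never compared directly (which would raise a TypeError).
    return tuple((type(cell).__name__, cell) for cell in row)


def remove_duplicates_sorted(rows):
    """Sort the rows, then keep one row from each run of identical rows."""
    ordered = sorted((tuple(row) for row in rows), key=_sort_key)
    return [next(group) for _, group in groupby(ordered, key=_sort_key)]
```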

The only issue with my script is a limitation of the "selection.getData"
function, which stops working with more than 262,144 rows (2^18).

The limitation of this function should be filed in another bug report if
needed.


Steps to Reproduce:
Using "Remove Duplicates"
1. Create a spreadsheet with 10000 rows and 2 columns
2. Run the new built-in "Remove Duplicates" function
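
For step 1, a test file of the stated size could be generated with a short
script like the following sketch. The report does not say how many of the
10000 rows are duplicates, so the number of distinct rows (n_distinct), the
seed and the function name write_test_csv are assumptions made only for this
example. The resulting CSV can be opened in Calc.

```python
import csv
import random


def write_test_csv(path, n_rows=10_000, n_distinct=1_000, seed=0):
    """Write n_rows rows with 2 columns, drawn from n_distinct distinct rows."""
    rng = random.Random(seed)     # fixed seed so the file is reproducible
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for _ in range(n_rows):
            k = rng.randrange(n_distinct)
            writer.writerow([f"key{k}", k])


if __name__ == "__main__":
    write_test_csv("remove_duplicates_test.csv")
```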

Undo
1. Press Ctrl-Z directly after the above Remove Duplicates

Actual Results:
Very slow: > 20 seconds for Remove Duplicates and > 80 seconds for undo

Expected Results:
Both steps should complete in less than one second


Reproducible: Always


User Profile Reset: No

Additional Info:
I think everything is already covered in the "Description".
