Re: [GNC-dev] Performance regression loading account

2023-03-16 Thread Robert Fewell
Bug created, https://bugs.gnucash.org/show_bug.cgi?id=798788 with test file.
Regards,
Bob

On Thu, 16 Mar 2023 at 03:38, john wrote:

> Bob,
>
> Please open a bug report and attach your sample test file.
>
> Regards,
> John Ralls
>
> On Mar 15, 2023, at 4:02 AM, Robert Fewell <14ubo...@gmail.com> wrote:
>
> I wanted to get some numbers on this as my test file seemed OK.
> I used Calc to create a CSV transaction import file with 8402 rows and
> some description columns with 16, 64 and 128 character random strings.
> Used this to import several times to a new empty gnucash xml file and
> added some timing for the account open command in 4.903 and 4.13 with
> results below...
>
> With 4.903
> Description 16 Characters and 16804 unique transactions / descriptions,
> 0.93, second time 0.73
> Description 64 Characters and 16804 unique transactions / descriptions,
> 2.39, second time 1.67
> Description 128 Characters and 16804 unique transactions / descriptions,
> 4.08, second time 2.90
>
> With 4.13
> Description 16 Characters and 16804 unique transactions / descriptions,
> 0.49, second time 0.35
> Description 64 Characters and 16804 unique transactions / descriptions,
> 1.22, second time 0.61
> Description 128 Characters and 16804 unique transactions / descriptions,
> 1.91, second time 0.93
>
> Regards,
> Bob
>
> On Tue, 14 Mar 2023 at 18:54, Maarten Bosmans wrote:
>
>> On Mon, 13 Mar 2023 at 04:44, john wrote:
>> > My first guess is that it's from creating a cache of quickfill entries
>> > to populate a drop-down list of possible entries similar to the way the
>> > transfer account field has worked for a couple of years.
>>
>> Yes, I've isolated it to the commit "Change the Register description
>> layout cell type", Bob in CC.
>> That branch adds the combobox and quickfill to the description field
>> of the register. In my case those are fairly long (~100 chars) and all
>> unique strings, as they come from downloaded bank statements and
>> include a timestamp, account holder, actual description, etc. So for
>> my use case having a combo box to ease filling out new items is not
>> that useful anyway. Maybe we can think of a way to adapt the
>> behaviour to be useful in Bob's case (I suppose manual entry of a
>> short and often reused description text), but not slow down my case?
>>
>> > An obvious optimization is to get a collation key with
>> > g_utf8_collate_key for each string and use that for doing the actual
>> > sorting/ordered inserting. It's still a char-by-char comparison but it
>> > saves having to validate and normalize the strings on every compare.
>> I will have a look into storing the collated string in the QuickFill.
>> That probably doubles the memory usage, but should not be too bad.
>>
>> Maarten
>>
>
>
___
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel


Re: [GNC-dev] Performance regression loading account

2023-03-15 Thread john
Bob,

Please open a bug report and attach your sample test file. 

Regards,
John Ralls

> On Mar 15, 2023, at 4:02 AM, Robert Fewell <14ubo...@gmail.com> wrote:
> 
> I wanted to get some numbers on this as my test file seemed OK.
> I used Calc to create a CSV transaction import file with 8402 rows and some 
> description columns with 16, 64 and 128 character random strings.
> Used this to import several times to a new empty gnucash xml file and added 
> some timing for the account open command in 4.903 and 4.13 with results 
> below...
> 
> With 4.903
> Description 16 Characters and 16804 unique transactions / descriptions, 0.93, 
> second time 0.73
> Description 64 Characters and 16804 unique transactions / descriptions, 2.39, 
> second time 1.67
> Description 128 Characters and 16804 unique transactions / descriptions, 
> 4.08, second time 2.90
> 
> With 4.13
> Description 16 Characters and 16804 unique transactions / descriptions, 0.49, 
> second time 0.35
> Description 64 Characters and 16804 unique transactions / descriptions, 1.22,
> second time 0.61
> Description 128 Characters and 16804 unique transactions / descriptions, 1.91,
> second time 0.93
> 
> Regards,
> Bob
> 
> On Tue, 14 Mar 2023 at 18:54, Maarten Bosmans wrote:
>> On Mon, 13 Mar 2023 at 04:44, john wrote:
>> > My first guess is that it's from creating a cache of quickfill entries to 
>> > populate a drop-down list of possible entries similar to the way the 
>> > transfer account field has worked for a couple of years.
>> 
>> Yes, I've isolated it to the commit "Change the Register description
>> layout cell type", Bob in CC.
>> That branch adds the combobox and quickfill to the description field
>> of the register. In my case those are fairly long (~100 chars) and all
>> unique strings, as they come from downloaded bank statements and
>> include a timestamp, account holder, actual description, etc. So for
>> my use case having a combo box to ease filling out new items is not
>> that useful anyway. Maybe we can think of a way to adapt the
>> behaviour to be useful in Bob's case (I suppose manual entry of a
>> short and often reused description text), but not slow down my case?
>> 
>> > An obvious optimization is to get a collation key with g_utf8_collate_key 
>> > for each string and use that for doing the actual sorting/ordered 
>> > inserting. It's still a char-by-char comparison but it saves having to 
>> > validate and normalize the strings on every compare.
>> I will have a look into storing the collated string in the QuickFill.
>> That probably doubles the memory usage, but should not be too bad.
>> 
>> Maarten



Re: [GNC-dev] Performance regression loading account

2023-03-15 Thread Robert Fewell
I wanted to get some numbers on this as my test file seemed OK.
I used Calc to create a CSV transaction import file with 8402 rows and some
description columns with 16, 64 and 128 character random strings.
Used this to import several times to a new empty gnucash xml file and added
some timing for the account open command in 4.903 and 4.13 with results
below...

With 4.903
Description 16 Characters and 16804 unique transactions / descriptions,
0.93, second time 0.73
Description 64 Characters and 16804 unique transactions / descriptions,
2.39, second time 1.67
Description 128 Characters and 16804 unique transactions / descriptions,
4.08, second time 2.90

With 4.13
Description 16 Characters and 16804 unique transactions / descriptions,
0.49, second time 0.35
Description 64 Characters and 16804 unique transactions / descriptions,
1.22, second time 0.61
Description 128 Characters and 16804 unique transactions / descriptions,
1.91, second time 0.93
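
A test file like the one described above could also be generated with a script instead of Calc. The following is only a sketch in Python: the column names (`Date`, `Amount`, `Desc16`, `Desc64`, `Desc128`), the fixed date and the amount range are assumptions made for illustration, not the layout of the file attached to the bug report.

```python
import csv
import io
import random
import string


def make_rows(count, seed=0):
    """Build `count` rows with random 16, 64 and 128 character descriptions."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    alphabet = string.ascii_letters + string.digits

    def random_text(length):
        return "".join(rng.choices(alphabet, k=length))

    rows = []
    for _ in range(count):
        rows.append({
            "Date": "2023-03-15",                              # hypothetical fixed date
            "Amount": f"{rng.randint(1, 99999) / 100:.2f}",    # hypothetical amounts
            "Desc16": random_text(16),
            "Desc64": random_text(64),
            "Desc128": random_text(128),
        })
    return rows


def write_csv(rows, stream):
    """Write a header line followed by one CSV line per row."""
    writer = csv.DictWriter(stream, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


rows = make_rows(8402)   # same row count as the Calc-generated file
buffer = io.StringIO()   # swap for open("test.csv", "w", newline="") to save it
write_csv(rows, buffer)
```

Importing such a file once per description column gives one data set per string length.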

Regards,
Bob

On Tue, 14 Mar 2023 at 18:54, Maarten Bosmans wrote:

> On Mon, 13 Mar 2023 at 04:44, john wrote:
> > My first guess is that it's from creating a cache of quickfill entries
> > to populate a drop-down list of possible entries similar to the way the
> > transfer account field has worked for a couple of years.
>
> Yes, I've isolated it to the commit "Change the Register description
> layout cell type", Bob in CC.
> That branch adds the combobox and quickfill to the description field
> of the register. In my case those are fairly long (~100 chars) and all
> unique strings, as they come from downloaded bank statements and
> include a timestamp, account holder, actual description, etc. So for
> my use case having a combo box to ease filling out new items is not
> that useful anyway. Maybe we can think of a way to adapt the
> behaviour to be useful in Bob's case (I suppose manual entry of a
> short and often reused description text), but not slow down my case?
>
> > An obvious optimization is to get a collation key with
> > g_utf8_collate_key for each string and use that for doing the actual
> > sorting/ordered inserting. It's still a char-by-char comparison but it
> > saves having to validate and normalize the strings on every compare.
> I will have a look into storing the collated string in the QuickFill.
> That probably doubles the memory usage, but should not be too bad.
>
> Maarten
>


Re: [GNC-dev] Performance regression loading account

2023-03-14 Thread Maarten Bosmans
On Mon, 13 Mar 2023 at 04:44, john wrote:
> My first guess is that it's from creating a cache of quickfill entries to 
> populate a drop-down list of possible entries similar to the way the transfer 
> account field has worked for a couple of years.

Yes, I've isolated it to the commit "Change the Register description
layout cell type", Bob in CC.
That branch adds the combobox and quickfill to the description field
of the register. In my case those are fairly long (~100 chars) and all
unique strings, as they come from downloaded bank statements and
include a timestamp, account holder, actual description, etc. So for
my use case having a combo box to ease filling out new items is not
that useful anyway. Maybe we can think of a way to adapt the
behaviour to be useful in Bob's case (I suppose manual entry of a
short and often reused description text), but not slow down my case?

> An obvious optimization is to get a collation key with g_utf8_collate_key for 
> each string and use that for doing the actual sorting/ordered inserting. It's 
> still a char-by-char comparison but it saves having to validate and normalize 
> the strings on every compare.
I will have a look into storing the collated string in the QuickFill.
That probably doubles the memory usage, but should not be too bad.

Maarten


Re: [GNC-dev] Performance regression loading account

2023-03-13 Thread Thomas Baumgart
On Monday, 13 March 2023 17:08:22 CET Maarten Bosmans wrote:

> On Mon, 13 Mar 2023 at 12:27, Robert Fewell <14ubo...@gmail.com> wrote:
> > I am curious how you created that graph?
> 
> With Intel VTune. That's the software I know to use for $DAYJOB
> (mainly HPC related stuff). I think there are also several free
> software options available to produce flame-graphs based on profiles
> recorded by perf.

I can recommend Hotspot: https://www.kdab.com/hotspot-video/

Even though provided by a company it's open source.

-- 

Regards

Thomas Baumgart

-
Real backups of your NAS can be found with the NSA
-




Re: [GNC-dev] Performance regression loading account

2023-03-13 Thread Maarten Bosmans
On Mon, 13 Mar 2023 at 12:27, Robert Fewell <14ubo...@gmail.com> wrote:
> I am curious how you created that graph?

With Intel VTune. That's the software I know to use for $DAYJOB
(mainly HPC related stuff). I think there are also several free
software options available to produce flame-graphs based on profiles
recorded by perf.

> Also can you build 4.903?

I've been putting off building GnuCash myself for now, but I think I
will have to do that at some point in the near future, so I'll retry
it once I get around to doing that.

> I am wondering if it is down to the sorting of the list store, by default 
> gnc_item_list_new has sorting enabled so on every entry the list will be 
> sorted.
> I wonder if it was disabled and in gnc_split_register_load we enabled the 
> sorting at the end after all entries added it would improve the situation.
> I only have 719 transactions for an account in my test file so not sure I 
> will see the difference.

Ah, yes. If I remember correctly from the Gtk2 era, there was some
advice for disabling list sorting temporarily when inserting a large
number of items.
Although, I would have to check which gtk call in `gnc_item_list_new`
is the culprit here.

Maarten


Re: [GNC-dev] Performance regression loading account

2023-03-13 Thread Robert Fewell
Maarten,
I am curious how you created that graph?
Also can you build 4.903?

I am wondering if it is down to the sorting of the list store: by default
gnc_item_list_new has sorting enabled, so the list will be sorted on every
entry.
I wonder whether it would improve the situation if sorting were disabled
and then enabled in gnc_split_register_load at the end, after all entries
have been added.
I only have 719 transactions for an account in my test file, so I am not
sure I will see the difference.
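
The idea of sorting once at the end can be modelled with a plain list. This is a sketch in Python, not GnuCash or GTK code, and it assumes the worst case where the store is fully re-sorted after each insertion, which has not been verified against what `gnc_item_list_new` actually triggers. In GTK terms the deferred variant would presumably mean setting the store to `GTK_TREE_SORTABLE_UNSORTED_SORT_COLUMN_ID` with `gtk_tree_sortable_set_sort_column_id` during the load and restoring the sort column afterwards.

```python
import random
import string


def load_with_sorting_enabled(entries):
    """Model of the suspected behaviour: re-sort after every inserted entry."""
    store = []
    sort_passes = 0
    for entry in entries:
        store.append(entry)
        store.sort()          # one sort pass per insertion
        sort_passes += 1
    return store, sort_passes


def load_then_sort_once(entries):
    """Model of the proposal: add everything unsorted, then sort once."""
    store = list(entries)
    store.sort()              # single sort pass after the load
    return store, 1


rng = random.Random(42)
# 719 entries, matching the transaction count of the test account mentioned above
entries = ["".join(rng.choices(string.ascii_lowercase, k=16)) for _ in range(719)]

sorted_each_time, passes_each_time = load_with_sorting_enabled(entries)
sorted_once, passes_once = load_then_sort_once(entries)
```

Both variants end with the same list contents; only the number of sort passes differs (one per entry versus one in total).
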
Regards,
Bob



On Mon, 13 Mar 2023 at 03:44, john wrote:

>
>
> > On Mar 12, 2023, at 1:28 PM, Maarten Bosmans wrote:
> >
> > Hi all,
> >
> > When testing the 4.902 flatpak, I noticed that loading an account
> > takes several (~4.5) seconds. This account has about 24k transactions
> > and loads in less than half a second on the GnuCash 4.8 from my
> > distribution. Is this a regression because of a code change, or
> > perhaps simply the result of a debug build in the flatpak?
> >
> > From the attached flamegraph you can see that a lot of time is spent
> > in `g_utf8_collate` for the quickfill insert. That function does not
> > appear below `gnc_quickfill_insert` for the 4.8 run. I did a quick
> > search in the git history to see whether some caller was changed
> > to the `QUICKFILL_ALPHA` sort method, but could not easily find such a
> > thing.
> >
> > Another difference with 4.8 is that `gnc_table_realize_gui` (which
> > takes 0.85s) does not appear below
> > `gnc_plugin_page_register_create_widget` and that function only takes
> > 0.03s in total. This time seems to be spent in Gtk.
> >
> > I don't think this is a release blocking bug, but GnuCash feels quite
> > sluggish to me, so it would be nice to fix this papercut before 5.0.
> > Any pointers on how to proceed?
>
>
> My first guess is that it's from creating a cache of quickfill entries to
> populate a drop-down list of possible entries similar to the way the
> transfer account field has worked for a couple of years.
>
> An obvious optimization is to get a collation key with g_utf8_collate_key
> for each string and use that for doing the actual sorting/ordered
> inserting. It's still a char-by-char comparison but it saves having to
> validate and normalize the strings on every compare.
>
> Regards,
> John Ralls
>


Re: [GNC-dev] Performance regression loading account

2023-03-12 Thread john



> On Mar 12, 2023, at 1:28 PM, Maarten Bosmans wrote:
> 
> Hi all,
> 
> When testing the 4.902 flatpak, I noticed that loading an account
> takes several (~4.5) seconds. This account has about 24k transactions
> and loads in less than half a second on the GnuCash 4.8 from my
> distribution. Is this a regression because of a code change, or
> perhaps simply the result of a debug build in the flatpak?
> 
> From the attached flamegraph you can see that a lot of time is spent
> in `g_utf8_collate` for the quickfill insert. That function does not
> appear below `gnc_quickfill_insert` for the 4.8 run. I did a quick
> search in the git history to see whether some caller was changed
> to the `QUICKFILL_ALPHA` sort method, but could not easily find such a
> thing.
>
> Another difference with 4.8 is that `gnc_table_realize_gui` (which
> takes 0.85s) does not appear below
> `gnc_plugin_page_register_create_widget` and that function only takes
> 0.03s in total. This time seems to be spent in Gtk.
> 
> I don't think this is a release blocking bug, but GnuCash feels quite
> sluggish to me, so it would be nice to fix this papercut before 5.0.
> Any pointers on how to proceed?


My first guess is that it's from creating a cache of quickfill entries to 
populate a drop-down list of possible entries similar to the way the transfer 
account field has worked for a couple of years.

An obvious optimization is to get a collation key with g_utf8_collate_key for 
each string and use that for doing the actual sorting/ordered inserting. It's 
still a char-by-char comparison but it saves having to validate and normalize 
the strings on every compare.
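
A minimal sketch of the collation-key idea, written in Python rather than C. It uses the standard library's `locale.strcoll` and `locale.strxfrm` as stand-ins for `g_utf8_collate` and `g_utf8_collate_key`; that substitution, the sample strings and the use of a batch sort instead of QuickFill's ordered insertion are all assumptions made for illustration, not the actual QuickFill code.

```python
import functools
import locale
import random

# Use the "C" locale so the example behaves the same on every system;
# a real application would keep the user's locale.
locale.setlocale(locale.LC_COLLATE, "C")


def sort_by_compare(strings):
    """Run a locale-aware comparison for every pair the sort looks at."""
    calls = 0

    def compare(a, b):
        nonlocal calls
        calls += 1
        return locale.strcoll(a, b)

    return sorted(strings, key=functools.cmp_to_key(compare)), calls


def sort_by_key(strings):
    """Transform each string into a collation key once, then compare keys."""
    calls = 0

    def make_key(text):
        nonlocal calls
        calls += 1
        return locale.strxfrm(text)

    return sorted(strings, key=make_key), calls


# Hypothetical unique descriptions, shuffled with a fixed seed
descriptions = [f"payment {i:05d} to account holder" for i in range(2000)]
random.Random(0).shuffle(descriptions)

by_compare, compare_calls = sort_by_compare(descriptions)
by_key, key_calls = sort_by_key(descriptions)
```

Both sorts give the same order, but the key variant does the locale-aware work once per string, while the compare variant does it on every comparison, which grows faster than the number of strings.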

Regards,
John Ralls
