Re: optimizing reports
Andrew Sackville-West [EMAIL PROTECTED] writes:

> I suppose this raises the larger issue of what is going on long-term with the reports. Is the amount of work necessary to fix the performance issues sufficient to warrant looking at the long term goal with reporting with an eye to just doing that change now? I know Derek wants to work e-guile into the mix and implement some kind of templating. Others have suggested binding in a whole 'nother language for reporting. I don't know the answers and lack the knowledge, experience and position to answer them. Should the reporting structure be re-worked altogether? now or later? Should the current structure be cleaned up in parallel to that effort to improve performance in the interim? Or should it be left as it is, just improved?

(What follows are excerpts from some notes I've been working on on the subject.)

Reports have some problems:

- defined in scheme, hard to modify, weird errors, &c.
- emit HTML, no conventional templating
- bad html, no css
- name-based identity
- options are stored as scheme code
- strange to add new report types
- report options are inconsistent
- report options dialogs are not HIG-compliant

In particular, the storage of report options and saved/open reports as evaluated lisp expressions tightly couples us to a particular technology for the reports.

The rough form of a revised reporting infrastructure I'd like to see is:

- reports declared as data, rather than registered as code.
- separation of the report generation code from the report rendering code.

something like:

    book
      v    generator, report-provided
    report-model (dict)
      v    renderer (template application)
    report-output (html)
      v    HTML engine, GOG
    display
      v
    {screen,file}

Where the report generator specifically emits only a data structure (language neutral, dictionary-list-string-number-boolean, &c.) that is the input to the rendering phase. The separation supports separation of concerns, layering, independent evolution and development/testing.
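As a rough illustration of the generator/renderer split described above, here is a minimal sketch. It is written in Python only for brevity, and every name in it (`generate_report_model`, `render_html`, the toy `book` list, the model's `title`/`rows` keys) is hypothetical rather than anything that exists in GnuCash. The point is only that the generator returns plain dicts, lists, strings and numbers, and the renderer sees nothing else.

```python
from html import escape
from string import Template

def generate_report_model(book):
    """Generator phase: reduce the (toy) book to language-neutral data."""
    totals = {}
    for account, amount_cents in book:
        totals[account] = totals.get(account, 0) + amount_cents
    rows = [{"account": name, "total": cents / 100}
            for name, cents in sorted(totals.items())]
    return {"title": "Income by account", "rows": rows}

# Renderer phase: template application only, no access to the book.
ROW = Template("<tr><td>$account</td><td>$total</td></tr>")
PAGE = Template("<html><body><h1>$title</h1><table>$rows</table></body></html>")

def render_html(model):
    rows = "".join(
        ROW.substitute(account=escape(r["account"]), total=f"{r['total']:.2f}")
        for r in model["rows"]
    )
    return PAGE.substitute(title=escape(model["title"]), rows=rows)

# Hypothetical stand-in for a book: (account name, amount in cents) pairs.
book = [("Salary", 250000), ("Interest", 1250), ("Salary", 250000)]
model = generate_report_model(book)
html = render_html(model)
print(html)
```

Because the model is ordinary data, the generator can be tested without any HTML engine, and the template can be changed without touching the generator.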
I think a report might be well-suited to be a bundle ... some structured, self-contained collection of files. It is a ${format} archive with a known-location manifest file report.def, which contains the basic report definition. By example:

foo.tar:

- report.def
- report.script
- template.html
- local.css

report.def:

    [gnucash-report-v2]
    name = Foo Report
    desc = The ISO-1234-26b foo report.
    id = 0a1b2c3d4e5f6a7b8c9d0e1f
    parent = assets-expense
    report-type = scm
    load-files = report.asl, helper.asl
    options-entry-point = foo-options
    generator-entry-point = foo-report
    renderer-entry-point = template_apply
    template-file = template.thtml
    name.fr = Foo Réport
    desc.fr = Lé réport du generallies foo...

A goal would be to have something like ~/.gnucash/reports.d/, where a user could publish a new report (as that single archive), and other users could simply save it there and have it appear in their gnucash instance.

On the front-end, we should move to a more normal generation scripting language (perl, python, ruby) and template-based rendering solution. That should be consumed by gecko, not gtkhtml, as it doesn't suck.

I could see a time where we are in transition, and have both v1 (existing) and v2 (proposed here) reports co-implemented.

With respect to the Options, we should convert the options implementation from scheme + closures to C + GObject/GInterface + signals. The existing saved/open reports are all basically of the form:

    (let ((optionDb (report-default-options report-name)))
      (set optionDb 'optionA new-value)
      (set optionDb 'optionB new-value)
      (create-report report-name optionDb))

As such, we strongly require a guile interpreter to parse/eval all this. As the options are moved into C, they'll need bindings to support at least handling these existing files, but should then serialize back into a non-guile-specific format. I've started down this path on a private source tree.
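To make the manifest idea concrete, here is a small sketch of reading a report.def like the example above. It assumes the file is plain INI syntax, which the example suggests but the notes do not state, and it uses Python's standard `configparser`. The `localized` helper and the fallback rule (use `name.fr` when present, otherwise `name`) are my own assumptions, not part of the proposal.

```python
import configparser

# A trimmed copy of the example manifest from the notes above.
REPORT_DEF = """\
[gnucash-report-v2]
name = Foo Report
desc = The ISO-1234-26b foo report.
id = 0a1b2c3d4e5f6a7b8c9d0e1f
report-type = scm
load-files = report.asl, helper.asl
generator-entry-point = foo-report
template-file = template.thtml
name.fr = Foo Réport
"""

def load_manifest(text):
    """Parse the manifest text and return its single report section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser["gnucash-report-v2"]

def localized(section, key, lang):
    """Assumed convention: 'key.lang' overrides 'key' when it exists."""
    return section.get(f"{key}.{lang}", section[key])

manifest = load_manifest(REPORT_DEF)
load_files = [f.strip() for f in manifest["load-files"].split(",")]

print(localized(manifest, "name", "fr"))
print(localized(manifest, "name", "de"))
print(load_files)
```

Nothing in the manifest needs to be evaluated as code, so a tool other than gnucash could list or validate the reports in a directory.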
-- 
...jsled
http://asynchronous.org/ - a=jsled; b=asynchronous.org; echo [EMAIL PROTECTED]
___
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Re: optimizing reports
As a humble suggestion, it might pay large dividends to invest some time researching scheme profiling methods. This would cover Christian's ideas of checking what was really bogging down the report(s), and would help you or anyone else fix or speed up other reports in the future.

Perhaps folks here already know of something. Google popped this up in a brief perusal just now:

http://www.cygwin.com/ml/guile/2000-07/msg00206.html

BEGIN QUOTE

Is there any Scheme code profiler that works with Guile? It seems Guile's core (libguile/eval.c) has no such code in it. Is it a good idea to work on this? (I guess the debug evaluator may have such facilities...)

This is actually fairly easy. Even the patch below gives some useful information:

    % guile
    guile> (set! *profile-all* #t)
    guile> (use-modules (oop goops))
    guile> (load "profile.scm")

END QUOTE

This post included a short patch to guile source. This is just an example, I don't think it will really work (based on comments in follow-ups to that post), but hopefully things have improved since 2000 when that post was made.

Regards,
Dan W.

> okay, thanks for this history. I agree (as I think everyone does) that there are some significant performance issues. As far as the layout/html stuff goes, I really don't care, but the performance is a huge factor.

FWIW, Firefox is pretty nimble for me most of the time. Except when it comes to rendering large tables. Therefore, layout is intimately tied to performance in this one particular extreme case (which happens to be not-quite-just-a-corner-case in GnuCash reports).
Re: optimizing reports
Andrew Sackville-West [EMAIL PROTECTED] writes:

> I suppose this raises the larger issue of what is going on long-term with the reports. Is the amount of work necessary to fix the performance issues sufficient to warrant looking at the long term goal with reporting with an eye to just doing that change now? I know Derek wants to work e-guile into the mix and implement some kind of templating. Others have suggested binding in a whole 'nother language for reporting. I don't know the answers and lack the knowledge, experience and position to answer them. Should the reporting structure be re-worked altogether? now or later? Should the current structure be cleaned up in parallel to that effort to improve performance in the interim? Or should it be left as it is, just improved?

My personal opinion: I'd add in e-guile if I could find a free weekend to actually hack on GnuCash. I think it's a short term fix for the templating issue, not a long-term solution.

In the long-term I think we need to change both the reporting infrastructure and the display methodology. I.e., I think we want to swap out GtkHTML for Gecko, and probably, simultaneously, drop in a new reporting infrastructure. Granted, these are PROBABLY separable projects, but why not work to get it done at the same time?

I think the newer system should definitely be template-based, and we can choose whatever template wrapper language seems correct at the time.

> A

-derek

-- 
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
[EMAIL PROTECTED] PGP key available
Re: optimizing reports
On Sun, Oct 07, 2007 at 09:48:00AM -0400, Derek Atkins wrote:

> My personal opinion: I'd add in e-guile if I could find a free weekend to actually hack on GnuCash. I think it's a short term fix for the templating issue, not a long-term solution.

It's more complicated than I can handle at *this* point. Though I think I understand the gist of it, the implementation is beyond me -- I lack the knowledge of the existing codebase and some understanding of how it all flows. It's been percolating in my brain for a couple years now (and in fact I've got a tarball of e-guile sitting here), but I've not looked at gnucash code *at all* since then.

So, based on that, I'm going to forge ahead with the current project (fixing up the income statement so that it runs faster). That'll do two things: 1) make my accounting go much faster and 2) get me familiar with the code again so that maybe in the future I can help with the below.

A

> In the long-term I think we need to change both the reporting infrastructure and the display methodology. I.e., I think we want to swap out GtkHTML for Gecko, and probably, simultaneously, drop in a new reporting infrastructure. Granted, these are PROBABLY separable projects, but why not work to get it done at the same time?
>
> I think the newer system should definitely be template-based, and we can choose whatever template wrapper language seems correct at the time.
>
> > A
>
> -derek
Re: optimizing reports
On Saturday, 6 October 2007 05:28, Andrew Sackville-West wrote:

> I'm bent on improving the speed of some of these reports

That's great! I think I've mentioned that before, but I'm also very unhappy with the current speed aka slowness of the text reports. I'm especially unhappy about that because my initial implementation of those was significantly faster, and only the rewriting of those reports by the changes mainly in r10078 (David Montenegro, June/July 2004) made them much slower (see my gnucash-devel msgs on 2006-06-20 and 2006-03-19).

> My investigation into this points to the function gnc:html-acct-table-add-accounts! in reports/report-system/html-acct-table.scm. This function appears to be the workhorse of the income statement but is also used in balance sheet, budget and trial balance reports, so fixing it would likely help those guys as well.

Yes. But we would have to find out more specifically the sub-functions where the CPU time is actually spent.

> Currently the report recurses through the account tree gathering totals for each account and its sub accounts. It appears that it walks all the way out to the leaf nodes at each level, so that the sub account totals get calculated repeatedly making this a hugely inefficient function.

Yes. However, the previous implementation did exactly the same in terms of balance calculation. It is still available in html-utilities.scm's function gnc:html-build-acct-table and in particular add-group! there. For that reason I believe the balance calculation itself is not the main problem. The main problem must be somewhere else around this... for example, the newer code might run the balance calculation on many more accounts than the old code; or the large (append env ...) statement in html-acct-table.scm:746 ff might consume a lot of time; or yet something else.
> I want to clean that up and what I'm thinking is to recurse through the tree once totalling up each relevant account and returning those totals in some structure that contains the accounts and their totals. Then walk through the tree generating the output table based on the required depth. This means I'd still be walking a tree structure twice, but I'd only be doing the per-account math once.

Again, this might indeed help, but on the other hand this inefficiency was present in the earlier implementation and didn't seem to cause big problems there. It might still be reasonable to work on this part, but maybe it would pay off to examine a bit more whether this is really the trouble-causing part.

Christian
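One cheap way to check which sub-step dominates before rewriting anything is to wrap the candidate functions with a timer and compare the accumulated totals. The sketch below shows the idea in Python rather than Scheme. The three functions are hypothetical stand-ins (a balance calculation, a list-copying step loosely analogous to a repeated append, and the table builder that calls both), not the real report code, so the numbers it prints say nothing about GnuCash itself.

```python
import time
from collections import defaultdict
from functools import wraps

elapsed = defaultdict(float)  # seconds accumulated per function name
calls = defaultdict(int)      # number of calls per function name

def timed(fn):
    """Record inclusive wall-clock time and call count for fn."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed[fn.__name__] += time.perf_counter() - start
            calls[fn.__name__] += 1
    return wrapper

@timed
def compute_balance(values):
    return sum(values)

@timed
def build_row_env(env, extra):
    return env + extra  # copies the whole list on every call

@timed
def build_table(n_accounts):
    env = []
    for i in range(n_accounts):
        compute_balance(range(10))
        env = build_row_env(env, [i])
    return env

table = build_table(200)
for name in sorted(elapsed, key=elapsed.get, reverse=True):
    print(f"{name:16s} calls={calls[name]:4d} total={elapsed[name]:.6f}s")
```

The times are inclusive, so the outer function's total contains the time of everything it calls. Comparing the inner totals against each other is what indicates where to look first.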
Re: optimizing reports
On Sat, Oct 06, 2007 at 11:23:53AM +0200, Christian Stimming wrote:

> On Saturday, 6 October 2007 05:28, Andrew Sackville-West wrote:
> > I'm bent on improving the speed of some of these reports
>
> That's great! I think I've mentioned that before, but I'm also very unhappy with the current speed aka slowness of the text reports. I'm especially unhappy about that because my initial implementation of those was significantly faster, and only the rewriting of those reports by the changes mainly in r10078 (David Montenegro, June/July 2004) made them much slower (see my gnucash-devel msgs on 2006-06-20 and 2006-03-19).

I'll check those out. thanks.

> > My investigation into this points to the function gnc:html-acct-table-add-accounts! in
> ...
> Yes. But we would have to find out more specifically the sub-functions where the CPU time is actually spent.
> ...
> be the main problem. The main problem must be somewhere else around this... for example, the newer code might run the balance calculation on much more accounts than the old account; or the large (append env ...) statement in html-acct-table.scm:746 ff might consume a lot of time; or yet something else.

I'll spend some more time trying to narrow it down. There is one part that references (in the comments) recursing over accounts that aren't even used.

> > I want to clean that up and what I'm thinking is to recurse through the tree once totalling up each relevant account and returning those totals in some structure that contains the accounts and their totals. Then walk through the tree generating the output table based on the required depth. This means I'd still be walking a tree structure twice, but I'd only be doing the per-account math once.
>
> Again, this might indeed help, but on the other hand this inefficiency was present in the earlier implementation and didn't seem to cause big problems there.
> It might still be reasonable to work on this part, but maybe it would pay off to examine a bit more whether this is really the trouble-causing part.

In all honesty, and in due respect to whoever rewrote that part, it looks like it was written by someone who doesn't know scheme or functional languages in general. I'll admit that I have next to no experience except that my earlier years of hacking involved some use of functional languages and they just seem to work for my brain. So there are lots of things that are done in that function that appear to my eye to be done very inefficiently. I saw that and sort of assumed that the problem was the overall architecture of the function. Of course, I could be completely wrong and it could certainly be any number of other things called from within that function.

I'll dig in some more and see if I can narrow it down to more than just "the whole function is slow". Though I still believe that to be the case. :)

thanks Christian, I'll be in touch.

A
Re: optimizing reports
On Sat, Oct 06, 2007 at 09:11:57AM -0700, Andrew Sackville-West wrote:

...

> In all honesty, and in due respect to whoever rewrote that part, it looks like it was written by someone who doesn't know scheme or functional languages in general.

that came out way worse sounding than I meant or believe. I hope no one was offended. And I apologise.

> I'll admit that I have next to no experience except that my earlier years of hacking involved some use of functional languages and they just seem to work for my brain.

emphasis on the "next to no experience" and that's double-plus true.

humbly

A
Re: optimizing reports
On Sat, Oct 06, 2007 at 11:23:53AM +0200, Christian Stimming wrote:

> On Saturday, 6 October 2007 05:28, Andrew Sackville-West wrote:
> > I'm bent on improving the speed of some of these reports
>
> That's great! I think I've mentioned that before, but I'm also very unhappy with the current speed aka slowness of the text reports. I'm especially unhappy about that because my initial implementation of those was significantly faster, and only the rewriting of those reports by the changes mainly in r10078 (David Montenegro, June/July 2004) made them much slower (see my gnucash-devel msgs on 2006-06-20 and 2006-03-19).

okay, thanks for this history. I agree (as I think everyone does) that there are some significant performance issues. As far as the layout/html stuff goes, I really don't care, but the performance is a huge factor. I think that the formatting issues could be dealt with at a later time, especially in light of how long it would take to modify code/run test reports/look at results with really slow reports ;)

I suppose this raises the larger issue of what is going on long-term with the reports. Is the amount of work necessary to fix the performance issues sufficient to warrant looking at the long term goal with reporting with an eye to just doing that change now? I know Derek wants to work e-guile into the mix and implement some kind of templating. Others have suggested binding in a whole 'nother language for reporting. I don't know the answers and lack the knowledge, experience and position to answer them.

Should the reporting structure be re-worked altogether? now or later? Should the current structure be cleaned up in parallel to that effort to improve performance in the interim? Or should it be left as it is, just improved?

A
optimizing reports
Hi guys,

I'm bent on improving the speed of some of these reports and want to bounce some ideas off you folks. The particular report I'm interested in improving is the income statement. On my smaller file (133k) it takes about 20 seconds to run a bog standard income statement. I haven't timed it on my larger file (1.4M), but it takes way too long (I usually switch over to /. please help me!).

My investigation into this points to the function gnc:html-acct-table-add-accounts! in reports/report-system/html-acct-table.scm. This function appears to be the workhorse of the income statement but is also used in balance sheet, budget and trial balance reports, so fixing it would likely help those guys as well.

Currently the report recurses through the account tree gathering totals for each account and its sub accounts. It appears that it walks all the way out to the leaf nodes at each level, so that the sub account totals get calculated repeatedly, making this a hugely inefficient function. For example, given this:

    Toplvl--- A --- A1
                 |-A2
                 |-A3
                 |-A4---A4a
                      |-A4b

To calculate the balances of all these, it would calculate the whole tree for the balance of Toplvl; re-calculate all of A sub-tree to get A's balance; re-calculate A1; re-calc A2; re-calc A3; re-calc the whole A4 sub-tree; then re-calc A4a; re-calc A4b etc etc etc... bad.

I want to clean that up and what I'm thinking is to recurse through the tree once totalling up each relevant account and returning those totals in some structure that contains the accounts and their totals. Then walk through the tree generating the output table based on the required depth. This means I'd still be walking a tree structure twice, but I'd only be doing the per-account math once.

I imagine the first walk would end up returning a list of toplevel accounts, each member of which would be a cons of that account's balance and a list of its subaccounts, each member of which would be a cons... you get the idea.
So before I start hacking my fingers off, does this idea make sense? (it does to me...) or is there something blatantly obvious that I'm missing in this general idea? Also, if I'm mis-reading that code, please let me know, but I think I have the gist of it pretty well.

A
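For reference, the two-walk idea from this message can be sketched as follows. This is Python, not the Scheme of html-acct-table.scm, and the `Account` class, the balances and the function names are all made up for illustration. The tree is the Toplvl/A/A1..A4b example from the message, and a counter records how many account visits each approach needs.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Account:
    name: str
    own_balance: int = 0  # balance held directly by this account, in cents
    children: List["Account"] = field(default_factory=list)

visits = {"naive": 0, "single": 0}

def naive_total(acct):
    """Recompute a subtree total from scratch, as the message describes."""
    visits["naive"] += 1
    return acct.own_balance + sum(naive_total(c) for c in acct.children)

def naive_rows(acct, depth=0):
    """Build table rows, recomputing every subtree total at every level."""
    rows = [(depth, acct.name, naive_total(acct))]
    for child in acct.children:
        rows.extend(naive_rows(child, depth + 1))
    return rows

def total_once(acct):
    """First walk: return (name, total, [child results]), one visit per account."""
    visits["single"] += 1
    kids = [total_once(c) for c in acct.children]
    return (acct.name, acct.own_balance + sum(k[1] for k in kids), kids)

def rows_from_totals(node, max_depth, depth=0):
    """Second walk: emit rows down to max_depth with no further arithmetic."""
    name, total, kids = node
    rows = [(depth, name, total)]
    if depth < max_depth:
        for kid in kids:
            rows.extend(rows_from_totals(kid, max_depth, depth + 1))
    return rows

a4 = Account("A4", 0, [Account("A4a", 400), Account("A4b", 500)])
a = Account("A", 0, [Account("A1", 100), Account("A2", 200), Account("A3", 300), a4])
top = Account("Toplvl", 0, [a])

naive = naive_rows(top)
totals = total_once(top)
fast = rows_from_totals(totals, max_depth=3)

for depth, name, total in fast:
    print("  " * depth + f"{name}: {total / 100:.2f}")
print(visits)
```

On this eight-account tree the naive version makes 23 account visits against 8 for the single pass, and both produce identical rows. Whether that difference explains the slowness of the real report is a separate question that only measurement can answer.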