Re: Status fvwm-menu-desktop

2011-12-20 Thread Thomas Adam
On Mon, Dec 19, 2011 at 06:11:57PM -0500, Dan Espen wrote:
> I think TA mentioned it was slow.  I don't know if he measured it
> or not.

Yeah:  http://xteddy.org/~n6tadam/fvwm/nytprof/

Note interpret_menu() is the culprit -- the scan_AppDir() and
read_desktop_entry() subroutines don't help, but I suspect most of that is
down to having to stat() and read in a lot of small files, so I wouldn't
concentrate on those two subroutines just yet.

If anyone needs help interpreting the output from that, let me know.

-- Thomas Adam

-- 
Deep in my heart I wish I was wrong.  But deep in my heart I know I am
not. -- Morrissey (Girl Least Likely To -- off of Viva Hate.)



Re: Status fvwm-menu-desktop

2011-12-20 Thread Dan Espen
Thomas Adam tho...@fvwm.org writes:

> On Mon, Dec 19, 2011 at 06:11:57PM -0500, Dan Espen wrote:
> > I think TA mentioned it was slow.  I don't know if he measured it
> > or not.
>
> Yeah:  http://xteddy.org/~n6tadam/fvwm/nytprof/

Nice!

> Note interpret_menu() is the culprit -- the scan_AppDir() and
> read_desktop_entry() subroutines don't help, but I suspect most of that is
> down to having to stat() and read in a lot of small files, so I wouldn't
> concentrate on those two subroutines just yet.
>
> If anyone needs help interpreting the output from that, let me know.

Well, aren't we most concerned with elapsed time?
Doesn't that put read_desktop_entry right at the top
of our list?

Our script runs slowly the first time and then pretty fast
the second time, most certainly because all the files it reads
are cached.  Were you able to control for that in your timing tests?

I notice a number of calls to check_file which would be doing the
stat you mention.  I don't understand the point of calling check_file
without checking the return value.  It doesn't set any globals or
print any diagnostics.  I count five calls like that.
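
Hypothetically, the shape of the problem (I'm not quoting the real
check_file here):

    sub check_file {
        my ($file) = @_;
        return -f $file;    # costs one stat(2) per call
    }

    check_file($file);      # return value discarded: the stat() buys nothing
    push @entries, $file if check_file($file);   # a call that earns its stat()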


-- 
Dan Espen



Re: Status fvwm-menu-desktop

2011-12-20 Thread Dan Espen
Thomas Adam tho...@fvwm.org writes:

> On Tue, Dec 20, 2011 at 01:57:37PM -0500, Dan Espen wrote:
> > Thomas Adam tho...@fvwm.org writes:
> >
> > > On Mon, Dec 19, 2011 at 06:11:57PM -0500, Dan Espen wrote:
> > > > I think TA mentioned it was slow.  I don't know if he measured it
> > > > or not.
> > >
> > > Yeah:  http://xteddy.org/~n6tadam/fvwm/nytprof/
> >
> > Nice!
> >
> > > Note interpret_menu() is the culprit -- the scan_AppDir() and
> > > read_desktop_entry() subroutines don't help, but I suspect most of that is
> > > down to having to stat() and read in a lot of small files, so I wouldn't
> > > concentrate on those two subroutines just yet.
> > >
> > > If anyone needs help interpreting the output from that, let me know.
> >
> > Well, aren't we most concerned with elapsed time?
> > Doesn't that put read_desktop_entry right at the top
> > of our list?
>
> Running this multiple times with a hot cache (which is what the above tests
> are showing) still makes the total running time of fvwm-menu-desktop ~3
> seconds.  The kernel will cache most of this for us after the first run,
> but it's still slow.
>
> Part of the problem is down to us stat()ing files from check_file() without
> checking what it's returning (as you've rightly pointed out), as well as the
> recursive nature of some of these Scan_App*() subroutines.

I like that web page.  Here's the breakout of stat() calls:

# 874 times by main::read_desktop_entry at line 596    remove
# 764 times by main::scan_AppDir at line 469           remove
# 764 times by main::scan_AppDir at line 473
# 123 times by main::scan_DirectoryDir at line 495     remove
#  97 times by main::read_directory_entry at line 520  remove

Those are the top 5.  Out of the 5 big users, 4 of the stat calls
are unnecessary.  I marked them with "remove", but I'm sure you've
already found them.

> Ideally, File::Find or its ilk could be used here.  This has the advantage
> of not needing to recurse by hand.  Note also that to reduce some of the
> stat() calls in our current implementation, one should be using perl's magic
> placeholder _ in file tests, so the result of the original stat(2) call is
> reused rather than discarded, reducing the number of stat()s overall.
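
For reference, a minimal sketch of that _ trick -- the first file test
issues the stat(2), and later tests on the bare _ reuse its result
(handle() below is just a stand-in):

    if (-e $file && -f _ && -r _) {
        # only the -e test hit the filesystem; -f and -r read the
        # cached stat buffer instead of calling stat(2) again
        handle($file);
    }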

After removing the ones we don't need (above), there aren't many calls left.
We only deal with a few directories.  We could readdir and build a hash.
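
Something like this, say (a sketch -- @app_dirs and handle_entry are
made up):

    # Read each directory once and remember its entries; later
    # existence checks become hash lookups instead of stat() calls.
    my %seen;
    for my $dir (@app_dirs) {
        opendir(my $dh, $dir) or next;
        $seen{"$dir/$_"} = 1 for readdir($dh);
        closedir($dh);
    }

    # Later, instead of stat()ing via "-f $file":
    handle_entry($file) if $seen{$file};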

> The recursive nature of some of these seek-and-find subroutines can be
> further sped up with a very simple hash -- which I've gone ahead and done.
>
> See the patch attached to this email.  It's only a proof-of-concept, and I do
> not intend to merge this in as-is without some further work done to it.

Sorry, haven't looked at the patch yet, but I will.

> The temptation here is to use something like Memoize, but this won't work
> for us, alas.
>
> Timings with the attached patch applied are here:
>
> http://xteddy.org/~n6tadam/fvwm/nytprof-updated/
>
> interpret_menu() still has to be looked at though.

Regarding interpret_menu, it deals with trees built by the parser.
I'm wondering if we should try to use the 'sub' callback method
instead of 'tree' for the XML parse.
Putting the tree in memory and then scanning it
seems like extra work, especially because half the entries appear to
be something we don't want.
I don't think it would be an easy change though.
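
Roughly what the callback style would look like -- a sketch only; the
Handlers interface is XML::Parser's own, but the element handling and
$menu_file are invented:

    use XML::Parser;

    my @stack;
    my $p = XML::Parser->new(Handlers => {
        Start => sub {
            my ($expat, $elem, %attr) = @_;
            push @stack, { name => $elem };    # open a node
        },
        Char  => sub {
            my ($expat, $text) = @_;
            $stack[-1]{text} .= $text if @stack;
        },
        End   => sub {
            my ($expat, $elem) = @_;
            my $node = pop @stack;
            # decide right here whether to keep or drop the entry --
            # unwanted ones never accumulate in an in-memory tree
        },
    });
    $p->parsefile($menu_file);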


-- 
Dan Espen



Re: Status fvwm-menu-desktop

2011-12-20 Thread Thomas Adam
On Tue, Dec 20, 2011 at 06:14:54PM -0500, Dan Espen wrote:
> After removing the ones we don't need (above), there aren't many calls left.
> We only deal with a few directories.  We could readdir and build a hash.

Yes, but before the hashing I added in the patch, the recursiveness of those
subroutines meant they were continually stat()ing the same directories when
told to recurse.  Using File::Find would avoid this -- and since it's a core
module anyway, it would save us reinventing the wheel and maintaining it.
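
For what it's worth, the File::Find version would look something like
this (a sketch; @app_dirs stands in for the real XDG directory list):

    use File::Find;

    my @desktop_files;
    find(sub {
        # one traversal per tree; parent directories are never
        # re-stat()ed by hand-rolled recursion
        push @desktop_files, $File::Find::name
            if /\.desktop$/ && -f;
    }, @app_dirs);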

Refactoring to improve readability, BTW, even if the gain is only negligible
-- or where the techniques used are an improvement within the remit of the
language -- is something I tend to like doing.  In this case, using more
core perl modules is always an easy win.

> Regarding interpret_menu, it deals with trees built by the parser.
> I'm wondering if we should try to use the 'sub' callback method
> instead of 'tree' for the XML parse.

That's one possibility.  But my main gripe with this is not so much how the
XML is parsed but the way in which the information is stored (I agree with
you, we're storing redundant data) and then accessed.

We've got a multitude of hashes split off from larger hashes, bandied about
between different subroutines -- the way the data is stored does not make it
at all clear what the end result of parsing this data is for -- and
that's my critical problem; maintaining this becomes impossible.

Take a look at interpret_entry_node() -- considering that splice() is
already used elsewhere in the code, for instance, it should also have been
used here -- I shudder when I see this:

    elsif ($tree->[$i] eq '0')
    {
        $i++;
        $i++;
    }

It's just sloppy -- and shows there was little thought put into what the
data representation of this should be -- and arrays clearly weren't meant to
be it here -- not when you have to increment $i by 2 in some cases.
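
For context, this is the shape XML::Parser's Tree style hands back, and
where the double increment comes from:

    # Tree style for "<Menu><Name>Apps</Name></Menu>":
    my $tree = [ 'Menu', [ {}, 'Name', [ {}, '0', 'Apps' ] ] ];
    # After the leading attribute hash, the content list alternates
    # (tag, content) pairs, and character data uses the pseudo-tag '0'
    # followed by the string -- so skipping a text node means stepping
    # over two array slots.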

It's not as if this is useful -- you'd have to explicitly remember the
ordering of the items in the array; make some guarantee that they're always
returned in the same order, and perhaps even go to lengths to ensure that.
Not to mention we're looking at O(n) for the lookups, maybe a little less if
the array is splice()d, but it still sucks.  If there's more than one node
we're looking over (which we are in the case of this tree structure), then
that's O(n * m) -- just because the wrong structure was used.
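
A sketch of what I mean instead (children_by_tag is invented): fold a
node's children into a hash keyed by tag once, and each later lookup is
O(1):

    sub children_by_tag {
        my ($content) = @_;    # a Tree-style [ {attrs}, tag, ref, ... ] list
        my %by_tag;
        for (my $i = 1; $i < @$content; $i += 2) {
            push @{ $by_tag{ $content->[$i] } }, $content->[$i + 1];
        }
        return \%by_tag;
    }

    my $kids = children_by_tag($tree->[1]);
    my $name = $kids->{Name}[0][2];    # text of the first <Name> child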

And no, I am not really interested in big-oh; I'm just using it to prove a
point: even in something as simple as this, the stat() call problem is
masking another, more fundamental problem, which NYTProf is hinting at
for us.

But I know you are not responsible for this, Dan; it's just academically
interesting to note these observations.

> Putting the tree in memory and then scanning it
> seems like extra work, especially because half the entries appear to
> be something we don't want.
> I don't think it would be an easy change though.

You're right, it wouldn't be an easy change, which is why I am very hesitant
about Thomas Funk's improvements to this -- ripping out the logic for some of
these interpret_*() subroutines is one thing -- but that's pretty pointless
when the turd you're polishing is already leaving the smear stains behind it
as you do so.

So if we're going to do this, this is where I would start -- looking at the
structure of what XML::Parser is giving us, and what data we need from XDG --
when we can reliably get a sense of that, then we can look at changing
structures, etc.
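
To make that concrete, the sort of target shape I have in mind (field
names purely illustrative):

    my %menu = (
        name     => 'Internet',
        entries  => [
            # one flat, self-describing record per .desktop entry,
            # built once, read everywhere
            { name => 'Web Browser', exec => 'firefox %u', icon => 'firefox' },
        ],
        submenus => [],
    );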

This is something I tried to get across in my reply to the last thread on
fvwm-menu-desktop; perhaps this reply is clearer.  I do hope so.

-- Thomas Adam

-- 
Deep in my heart I wish I was wrong.  But deep in my heart I know I am
not. -- Morrissey (Girl Least Likely To -- off of Viva Hate.)



Re: Status fvwm-menu-desktop

2011-12-20 Thread Dan Espen
Thomas Adam tho...@fvwm.org writes:

> Timings with the attached patch applied are here:
>
> http://xteddy.org/~n6tadam/fvwm/nytprof-updated/

That's what I call impressive improvements!

I applied the patch, works well for me.

-- 
Dan Espen