Re: need help optimizing a function

2002-10-09 Thread Andrew J Bromage

G'day all.

On Wed, Oct 09, 2002 at 02:29:26PM -0400, David Roundy wrote:

> I get a speedup of about a factor of 6 for the test files I was using (of
> course, this would depend on file size), and find that now only 2% of my
> time is spent in that function.  I'm still something like 100 times slower
> than GNU diff, so I think (hope, certainly!) there's still room for more
> improvement (another day).  I'm sure I'll have more questions in a few
> days, once I've tracked down what the new bottlenecks are.

If you understand logic languages, you might want to look at the diff
implementation which I wrote for the Mercury distribution:

http://www.cs.mu.oz.au/research/mercury/

It's pretty close to GNU diff in speed.  In fact, it was
indistinguishable on every test case I could think of.

There are, two main differences to what I could gather from your 
implementation.

First thing was that I noticed that a lot of time was being spent
doing string comparisons.  I inserted a pre-pass which mapped strings
to (unboxed) integers with the constraint that the integers are equal
if and only if the strings are equal.  This also turned out to be
critical for implementing flags such as --ignore-space-change.

The other was I used a different algorithm than you did:

Eugene W. Myers. "An O(ND) difference algorithm and its
variations." Algorithmica, 1:251-266, 1986. 

I found it to be much faster than the O(n log n) algorithm, even
on cases where you would expect it to perform poorly (i.e. where
D is large), partly because the constant factors are really, really
low and because in the pre-pass you can optimise the case where you
have a number of consecutive lines none of which appear anywhere
else, which is typical for most uses of diff.

Cheers,
Andrew Bromage
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe



Re: need help optimizing a function

2002-10-09 Thread David Roundy

On Tue, Oct 08, 2002 at 07:52:52AM -0700, Hal Daume III wrote:
> I already saw one reply mentioning using state threaded arrays.  This is
> probably your best (in terms of performance) option.
> 
> Keep in mind that Haskell arrays are *not* your standard imperative
> arrays.  Like imperative arrays, they give you O(1) lookup, but only
> O(n) insert.  If you want to keep with a functional style, I'd suggest
> using a different data structure for the data.  A FiniteMap should work
> fine and should give you pretty good (at least much
> better) performance.  And you won't have to give up the FP style (for
> whatever that's worth).

Thanks everyone for the advice! I think I now understand pretty well why my
old code didn't work.  I ended up switching to a binary tree structure,
since I didn't really need O(1) lookup anyways (since I had to do a binary
search in the first place to decide which element if any to modify.  This
changes the algorithm as a whole (I believe) from being O(n^2) back to O(n
log n) as it should be.

I get a speedup of about a factor of 6 for the test files I was using (of
course, this would depend on file size), and find that now only 2% of my
time is spent in that function.  I'm still something like 100 times slower
than GNU diff, so I think (hope, certainly!) there's still room for more
improvement (another day).  I'm sure I'll have more questions in a few
days, once I've tracked down what the new bottlenecks are.
-- 
David Roundy
http://civet.berkeley.edu/droundy/
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe



Re: need help optimizing a function

2002-10-08 Thread Zdenek Dvorak

Hello,

> > p.s., I sense your next question is going to be something like "why
> > can't the compiler detect that the array can be updated in place
> > instead of copied" and the answer, from what i can tell, is simply
> > that it doesn't try.
>
>And one might argue that it should not try
[snip]

and the other answer is that the result would not be worth the effort
usually. The idea seems nice, but the laziness is where I see the problem;
see the following code:

do_something::Array Int Int->(Array Int Int, Result)
do_something a = (a'',res)
where
  a' = a // [(summon_index, summon_int)]
  w = a' ! 4
  res = do_some_work w
  a'' = a' // [(summon_other_index, summon_other_int)]

Seems like nice candidate for update-in-place? But the problem is, that
in general you cannot say when w will be evaluated -- so you simply cannot
destroy a' before then.

This is also reason why DiffArray (from hslibs) does not have to be very 
useful,
unless you have a good control on when things are evaluated -- it may (and 
it
also happened to me) have quadratic behavior in cases like this.

To solve this, '!' would have to produce a new array too; but then you must
pass it everywhere and effectively you are doing the work monads would do
for you.

So in general you must choose -- either your programs will be nice and 
functional,
but you will have to use structures with (some kind of) logarithmic slowdown
(FiniteMap, Trie), or some of their parts will be embedded in monads.

Zdenek Dvorak

_
MSN Photos is the easiest way to share and print your photos: 
http://photos.msn.com/support/worldwide.aspx

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe



Re: need help optimizing a function

2002-10-08 Thread Alastair Reid


Hal Daume <[EMAIL PROTECTED]> writes:

> p.s., I sense your next question is going to be something like "why
> can't the compiler detect that the array can be updated in place
> instead of copied" and the answer, from what i can tell, is simply
> that it doesn't try.

And one might argue that it should not try since the result would be
incredibly fragile with small, semantics preserving changes to the
program confusing the analysis and causing changes in the big-O
complexity of your program and, since the analysis would undoubtedly
be pretty darn cunning, it'd also be virtually impossible to
understand why it did or did not manage to detect the
single-threadedness of your code.

I think one of the reasons that approaches based on types (monads,
MADTs, linear types, etc.) are so popular is that they make the
programmers assertion (X is used in a single-threaded manner)
sufficiently explicit that the compiler can complain when it disagrees
and, since parts of the analysis show through in the typesystem, they
make the analysis less of a black box.

--
Alastair Reid [EMAIL PROTECTED]  
Reid Consulting (UK) Limited  http://www.reid-consulting-uk.ltd.uk/alastair/


ps The opening paragraphs of Paul Hudak's MADT paper
(http://www.haskell.org/papers/madt.ps) have a more coherent
explanation of why smarter compilers are not the answer.

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe



Re: need help optimizing a function

2002-10-08 Thread Hal Daume III

I already saw one reply mentioning using state threaded arrays.  This is
probably your best (in terms of performance) option.

Keep in mind that Haskell arrays are *not* your standard imperative
arrays.  Like imperative arrays, they give you O(1) lookup, but only
O(n) insert.  If you want to keep with a functional style, I'd suggest
using a different data structure for the data.  A FiniteMap should work
fine and should give you pretty good (at least much
better) performance.  And you won't have to give up the FP style (for
whatever that's worth).

 - Hal

p.s., I sense your next question is going to be something like "why can't
the compiler detect that the array can be updated in place instead of
copied" and the answer, from what i can tell, is simply that it doesn't
try.

--
Hal Daume III

 "Computer science is no more about computers| [EMAIL PROTECTED]
  than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume

On Tue, 8 Oct 2002, David Roundy wrote:

> Hello.  If you followed my previous post (which was about a bug in my lcs
> function), this is still the same code (only now bug-free, it seems).
> 
> It turns out that my lcs diff function is MUCH slower than GNU diff on some
> test files (much worse than 1000 time slower!).The test case where its
> speed was close was IO bound on the output to an xterm (silly me).
> 
> I've profiled my lcs code, and it is spending about 95% of its time in the
> "hunt_one_char" function.  It also claims to spend 0.0% of its time in
> my_bs, which does a binary search on the array, and is the only other
> function that gets called by hunt_one_char (not counting itself).
> 
> I've told the compiler (ghc) to inline hunt_one_char and my_bs to no avail.
> I guess maybe it can't or won't inline a recursive function, or maybe that
> just doesn't gain me anything.
> 
> My only guess is that for some reason it is making an entirely new copy of
> the array each time it recurses, but that doesn't make any sense, since the
> array is getting threaded through, so it should be able to just modify the
> array in place.  But then, I'm not entirely clear as to what the compiler
> can and cannot do as it optimizes.
> 
> >>data Threshold a = Thresh Int! [a]
> >> deriving (Show)
> >>
> >>hunt_one_char :: String -> [Int] -> Array Int (Threshold String) ->
> >> Array Int (Threshold String)
> >>hunt_one_char c [] th = th
> >>hunt_one_char c (j:js) th =
> >>case my_bs (Thresh j [c]) th of
> >>Nothing -> hunt_one_char c js th
> >>Just k ->
> >>case th!(k-1) of
> >>Thresh _ rest ->
> >>hunt_one_char c js th//[(k,Thresh j (c:rest))]
> 
> For what it's worth, the calling function is:
> 
> >>hunt_internal :: [String] -> [[Int]] -> 
> >> Array Int (Threshold String) ->
> >> Array Int (Threshold String)
> >>hunt_internal [] _ th = th
> >>hunt_internal _ [] th = th
> >>hunt_internal (c:cs) (m:ms) th =
> >>hunt_internal cs ms $ hunt_one_char c m th
> 
> Each string in this list of strings is a line from one of the files (and
> the [[Int]] is a list of line numbers in the other file that that line
> matches).  The array 'th' has size (1,n) where n is the number of lines in
> the file.
> 
> My simplest test case consists of a 20k line file, each line of which is
> unique and a second file which is simply a permutation of the first (moving
> the first line to the last), so hunt_one_char should be called exactly once
> for each line, and it is always called with its second argument having
> exactly one element.  In this test case, according to the ghc profiler,
> 89.7% of the time is spent in hunt_one_char:
>   individual inherited
> COST CENTRE  MODULE entries  %time %alloc   %time %alloc
>   hunt_internal  Main 200010.2   0.0 91.4  95.3
>hunt_one_char Main 4   89.7  95.0 91.1  95.3
> my_bsMain 20.2   0.0  1.4   0.3
>  my_helper_bsMain3072311.2   0.3  1.2   0.3
> 
> This has me at a loss.  It seems like such a simple function... of course,
> it is called 20k times, which could certainly add up, but while each call
> of my_bs should take O(log 20k) time, each call of hunt_one_char (apart
> from its one call to my_bs) should only take O(1), so most of the time
> should be spent in my_bs unless something is terribly wrong (which must be
> the case here).
> 
> If it would help, I'd be happy to send a listing of the complete code, but
> it's 6.5k so I figured it'd be better not to send it, since it seems
> unlikely that anyone would want to run it anyways.
> -- 
> David Roundy
> http://civet.berkeley.edu/droundy/
> ___
> Haskell-Cafe mailing list
> [EMAIL PROTECTED]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
> 

_

Re: need help optimizing a function

2002-10-08 Thread Pedro Vasconcelos

On Tue, 8 Oct 2002 09:14:26 -0400
David Roundy <[EMAIL PROTECTED]> wrote:

> 
> My only guess is that for some reason it is making an entirely new copy of
> the array each time it recurses, but that doesn't make any sense, 
...

> >>hunt_one_char :: String -> [Int] -> Array Int (Threshold String) ->
> >> Array Int (Threshold String)
> >>hunt_one_char c [] th = th
> >>hunt_one_char c (j:js) th =
> >>case my_bs (Thresh j [c]) th of
> >>Nothing -> hunt_one_char c js th
> >>Just k ->
> >>case th!(k-1) of
> >>Thresh _ rest ->
> >>hunt_one_char c js th//[(k,Thresh j (c:rest))]

Yes, I think you're actually copying the array at each recursive invocation when you 
do the update "th//[..]". The compiler can't guess that the array can be updated 
in-place.

You could use state-thread arrays rather than ordinary Haskell arrays. State-thread 
are an extension to H98 standard that is supported by ghc and hugs (at least). The 
basic idea is to use a monad to encapsulate the code that manipulates memory 
references/updates in place and guaranties that the references are actually used in a 
single-threaded way.

Look up the paper entitled "Lazy Functional State Threads" by John Lauchbury and Simon 
Peyton Jones (it's available for download). Section 3 contains examples with array 
updates in-place.

Hope this helps,

Pedro Vasconcelos.

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe



need help optimizing a function

2002-10-08 Thread David Roundy

Hello.  If you followed my previous post (which was about a bug in my lcs
function), this is still the same code (only now bug-free, it seems).

It turns out that my lcs diff function is MUCH slower than GNU diff on some
test files (much worse than 1000 time slower!).The test case where its
speed was close was IO bound on the output to an xterm (silly me).

I've profiled my lcs code, and it is spending about 95% of its time in the
"hunt_one_char" function.  It also claims to spend 0.0% of its time in
my_bs, which does a binary search on the array, and is the only other
function that gets called by hunt_one_char (not counting itself).

I've told the compiler (ghc) to inline hunt_one_char and my_bs to no avail.
I guess maybe it can't or won't inline a recursive function, or maybe that
just doesn't gain me anything.

My only guess is that for some reason it is making an entirely new copy of
the array each time it recurses, but that doesn't make any sense, since the
array is getting threaded through, so it should be able to just modify the
array in place.  But then, I'm not entirely clear as to what the compiler
can and cannot do as it optimizes.

>>data Threshold a = Thresh Int! [a]
>> deriving (Show)
>>
>>hunt_one_char :: String -> [Int] -> Array Int (Threshold String) ->
>> Array Int (Threshold String)
>>hunt_one_char c [] th = th
>>hunt_one_char c (j:js) th =
>>case my_bs (Thresh j [c]) th of
>>Nothing -> hunt_one_char c js th
>>Just k ->
>>case th!(k-1) of
>>Thresh _ rest ->
>>hunt_one_char c js th//[(k,Thresh j (c:rest))]

For what it's worth, the calling function is:

>>hunt_internal :: [String] -> [[Int]] -> 
>> Array Int (Threshold String) ->
>> Array Int (Threshold String)
>>hunt_internal [] _ th = th
>>hunt_internal _ [] th = th
>>hunt_internal (c:cs) (m:ms) th =
>>hunt_internal cs ms $ hunt_one_char c m th

Each string in this list of strings is a line from one of the files (and
the [[Int]] is a list of line numbers in the other file that that line
matches).  The array 'th' has size (1,n) where n is the number of lines in
the file.

My simplest test case consists of a 20k line file, each line of which is
unique and a second file which is simply a permutation of the first (moving
the first line to the last), so hunt_one_char should be called exactly once
for each line, and it is always called with its second argument having
exactly one element.  In this test case, according to the ghc profiler,
89.7% of the time is spent in hunt_one_char:
  individual inherited
COST CENTRE  MODULE entries  %time %alloc   %time %alloc
  hunt_internal  Main 200010.2   0.0 91.4  95.3
   hunt_one_char Main 4   89.7  95.0 91.1  95.3
my_bsMain 20.2   0.0  1.4   0.3
 my_helper_bsMain3072311.2   0.3  1.2   0.3

This has me at a loss.  It seems like such a simple function... of course,
it is called 20k times, which could certainly add up, but while each call
of my_bs should take O(log 20k) time, each call of hunt_one_char (apart
from its one call to my_bs) should only take O(1), so most of the time
should be spent in my_bs unless something is terribly wrong (which must be
the case here).

If it would help, I'd be happy to send a listing of the complete code, but
it's 6.5k so I figured it'd be better not to send it, since it seems
unlikely that anyone would want to run it anyways.
-- 
David Roundy
http://civet.berkeley.edu/droundy/
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe