(rand) is expensive -- removing the two (rand) calls knocks about 40 seconds
off the run, nearly 1/5 of the total time. I'll try replacing them with
lookups into a precalculated grid of random numbers -- long-range
correlations shouldn't matter here.
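
Roughly what I have in mind, as a minimal sketch rather than the real code:
a primitive double array filled once up front and indexed by a running
counter that wraps around. The names table-size, make-rand-table and
fast-rand are made up for illustration, and the power-of-two table size is
just an assumption so that bit-and can do the wrapping.

```clojure
;; Sketch only: all names here are hypothetical, not from the original code.

;; Power of two (assumption), so bit-and can wrap the index cheaply.
(def ^:const table-size 65536)

(defn make-rand-table
  "Fills a primitive double array with n uniform randoms in [0, 1)."
  ^doubles [^long n]
  (let [rng (java.util.Random.)
        arr (double-array n)]
    (dotimes [i n]
      (aset arr i (.nextDouble rng)))
    arr))

(defn fast-rand
  "Returns the table entry at index i, wrapped to the table size.
  Primitive long in, primitive double out, so the lookup itself
  should not box."
  ^double [^doubles table ^long i]
  (aget table (bit-and i (dec table-size))))

;; Example use inside a primitive loop: sum 1000 table entries.
(defn sum-some-randoms
  ^double [^doubles table]
  (loop [i 0 acc 0.0]
    (if (< i 1000)
      (recur (inc i) (+ acc (fast-rand table i)))
      acc)))
```

The inner loop would carry its own long counter and bump it by one per
lookup, so the sequence repeats every table-size draws. That repetition is
the long-range correlation I'm betting won't matter.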



On Thu, Nov 8, 2012 at 8:00 PM, Cedric Greevey <cgree...@gmail.com> wrote:

> On Thu, Nov 8, 2012 at 3:48 PM, Cedric Greevey <cgree...@gmail.com> wrote:
>
>> I have the following code to perform a complicated image convolution. It
>> takes 10-15 seconds with output dimensions 256x256 and samples 6. No
>> reflection warnings, and using unchecked math doesn't speed it up any. I've
>> tried to ensure it uses primitive math inside the loops, aside from
>> generating the outer loop's values. What cached-load-chunk does shouldn't
>> matter much, but in most cases it should boil down to a map lookup inside a
>> swap! and a couple of atom derefs and function calls. The bottleneck is
>> likely in the math somewhere, and likely something is being boxed, though
>> I've primitive-coerced every numerical let and loop value and avoided more
>> than two arguments per arithmetic op.
>>
>> Can anyone spot anything I haven't that could be causing boxed arithmetic
>> inside the loops?
>>
>
> I've now checked for Var lookups (none outside the caching function, and
> now none inside either) and checked the caching code itself (there's a .get
> on a closed-over ConcurrentHashMap, a null check, a .get on a
> SoftReference, and another null check, on each lookup, if there isn't a
> cache miss on that lookup; plus a couple more method calls for the IFn
> invokes and an ivar fetch to get the ConcurrentHashMap reference).
>
> In the absence of cache misses I'm still seeing ~3.5 *minutes* at 1280x720
> with 10 samples (= about 100 million iterations total of the inner loop).
> The arithmetic in there is 23 floating-point ops, five compares, a log, an
> atan, and three bitwise ANDs. Without the log and atan a hundred million of
> that inner loop should take a second on this box. I very much doubt the log
> and the atan are 209 times slower than the rest of it combined. So there are
> three likely culprits: boxing, two calls to (rand), and the BufferedImage
> .getRGB method call (on a 6000x2198 24bpp image, though its size should
> matter not), unless trig is really that slow.
>
>
