On Thu, Feb 10, 2011 at 3:08 PM, Robert Kern <robert.k...@gmail.com> wrote:
> On Thu, Feb 10, 2011 at 15:32, eat <e.antero.ta...@gmail.com> wrote:
> > Hi Robert,
> >
> > On Thu, Feb 10, 2011 at 10:58 PM, Robert Kern <robert.k...@gmail.com> wrote:
> >>
> >> On Thu, Feb 10, 2011 at 14:29, eat <e.antero.ta...@gmail.com> wrote:
> >> > Hi Robert,
> >> >
> >> > On Thu, Feb 10, 2011 at 8:16 PM, Robert Kern <robert.k...@gmail.com>
> >> > wrote:
> >> >>
> >> >> On Thu, Feb 10, 2011 at 11:53, eat <e.antero.ta...@gmail.com> wrote:
> >> >> > Thanks Chuck,
> >> >> >
> >> >> > for replying. But don't you still feel very odd that dot outperforms
> >> >> > sum in your machine? Just to get it simply; why sum can't outperform
> >> >> > dot? Whatever architecture (computer, cache) you have, it don't make
> >> >> > any sense at all that when performing significantly less
> >> >> > instructions, you'll reach to spend more time ;-).
> >> >>
> >> >> These days, the determining factor is less often instruction count
> >> >> than memory latency, and the optimized BLAS implementations of dot()
> >> >> heavily optimize the memory access patterns.
> >> >
> >> > Can't we have this as well with simple sum?
> >>
> >> It's technically feasible to accomplish, but as I mention later, it
> >> entails quite a large cost. Those optimized BLASes represent many
> >> man-years of effort
> >
> > Yes I acknowledge this. But didn't they then ignore them something simpler,
> > like sum (but which actually could benefit exactly similiar optimizations).
>
> Let's set aside the fact that the people who optimized the
> implementation of dot() (the authors of ATLAS or the MKL or whichever
> optimized BLAS library you linked to) are different from those who
> implemented sum() (the numpy devs). Let me repeat a reason why one
> would put a lot of effort into optimizing dot() but not sum():
>
> """
> >> However, they are frequently worth it
> >> because those operations are often bottlenecks in whole applications.
> >> sum(), even in its stupidest implementation, rarely is.
> """
>
> I don't know if I'm just not communicating very clearly, or if you
> just reply to individual statements before reading the whole email.
>
> >> and cause substantial headaches for people
> >> building and installing numpy.
> >
> > I appreciate this. No doubt at all.
> >>
> >> However, they are frequently worth it
> >> because those operations are often bottlenecks in whole applications.
> >> sum(), even in its stupidest implementation, rarely is. In the places
> >> where it is a significant bottleneck, an ad hoc implementation in C or
> >> Cython or even FORTRAN for just that application is pretty easy to
> >> write.
> >
> > But here I have to disagree; I'll think that at least I (if not even the
> > majority of numpy users) don't like (nor I'm be capable/ or have enough
> > time/ resources) go to dwell such details.
>
> And you think we have the time and resources to do it for you?
>
> > I'm sorry but I'll have to
> > restate that it's quite reasonable to expect that sum outperforms dot in
> > any case.
>
> You don't optimize a function just because you are capable of it. You
> optimize a function because it is taking up a significant portion of
> total runtime in your real application. Anything else is a waste of
> time.
>

Heh. Reminds me of a passage in General Bradley's *A Soldier's Story* where
he admonished one of his officers in North Africa for taking a hill and
suffering casualties, telling him that one didn't take a hill because one
could, but because doing so served a purpose in the larger campaign.

<snip>

Chuck
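
For anyone who wants to check the sum-versus-dot comparison on their own machine, here is a minimal sketch. The helper names (sum_via_dot, best_time, compare) and the array size are made up for illustration and do not come from this thread. Which call comes out faster depends on the NumPy build and on the BLAS library it is linked against, so no expected timings are given.

```python
import timeit

import numpy as np


def sum_via_dot(a):
    """Sum a 1-D array by taking its dot product with a vector of ones."""
    return np.dot(a, np.ones_like(a))


def best_time(func, repeat=5, number=20):
    """Best per-call wall time in seconds over several timeit runs."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


def compare(n=1_000_000, seed=0):
    """Time a.sum() against np.dot(a, ones) on the same random array."""
    rng = np.random.default_rng(seed)
    a = rng.random(n)
    ones = np.ones_like(a)  # built once so the allocation is not timed
    return {
        "sum": best_time(lambda: a.sum()),
        "dot": best_time(lambda: np.dot(a, ones)),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds * 1e6:.1f} us per call")
```

In line with the point quoted above, a microbenchmark like this only says which call is faster in isolation. Whether sum() is worth optimizing still depends on how much of a real application's runtime it accounts for, which a profiler run on that application would show.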
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion