Hi Toby,

I already mentioned this on IRC. GEMV uses very conservative profiles
with very few threads. Now that I have ported a simple version of GEMM
(for the case where only full matrices are used), I'll re-bind the
generator into pyviennacl and try to get auto-tuning up and running in
Python. Then I'll update the profiles to something better :)
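For context, the auto-tuning loop I have in mind is essentially a timed parameter sweep. Here is a minimal sketch in plain Python; the profile parameters (`local_size`, `num_groups`) are placeholders rather than the actual generator parameters, and a NumPy GEMV stands in for the generated OpenCL kernel:

```python
import itertools
import time

import numpy as np

def time_kernel(fn, repeats=10):
    """Return the best wall-clock time over several runs
    (best-of-N is more robust against OS noise than a single run)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical profile space -- placeholder names, not the real
# generator parameters.
profile_space = {
    "local_size": [64, 128, 256],
    "num_groups": [128, 256, 512],
}

# Single-precision 4096x4096 GEMV, matching the benchmark discussed below.
A = np.random.rand(4096, 4096).astype(np.float32)
x = np.random.rand(4096).astype(np.float32)

def run_gemv(profile):
    # A real tuner would regenerate and launch the OpenCL kernel
    # from `profile`; here the NumPy GEMV is just a stand-in.
    return A.dot(x)

best_profile, best_time = None, float("inf")
for values in itertools.product(*profile_space.values()):
    profile = dict(zip(profile_space.keys(), values))
    t = time_kernel(lambda: run_gemv(profile))
    if t < best_time:
        best_profile, best_time = profile, t

print("best profile:", best_profile, "time:", best_time)
```

An exhaustive sweep like this is fine for a handful of parameters; once the space grows, something smarter (random search, pruning) becomes necessary.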

Philippe


2014-07-06 13:22 GMT+02:00 Toby St Clere Smithe <m...@tsmithe.net>:

> Hey all,
>
> I'm getting on the plane in a couple of hours, so this might be the last
> you hear from me till the middle of the night, Europe time.
>
> Karl Rupp <r...@iue.tuwien.ac.at> writes:
> >> I suggest we start unifying in a couple of days indeed. I still have a
> >> couple of things to merge, essentially having GEMM dynamically generated
> >> for some cases and publishing the repo for auto-tuning using pyviennacl.
> >> These have to be done soon so that Toby can present some good benchmarks
> >> at the talk.
>
> I'm currently struggling to get decent performance out of viennacl-dev
> master, even when not doing GEMM. Consider single-precision dense GEMV
> using a square matrix and vector with 4096 rows/cols. On the GTX 470 on
> krupp2, one execution takes ~0.100s; on the C2050, ~0.106s. Execution
> overhead is about 0.0003s. But NumPy with MKL takes only ~0.004s; I know
> that krupp2 has an 8(?)-core i7, so (something like) 512 rows/cols per
> core, but I still didn't expect a gap of that size. It's strange,
> because my GeForce 610M takes ~0.090s, and my Intel Ivy Bridge M GT2 GPU
> takes ~0.001s (at last competitive with MKL, though I'm waiting to test
> correctness as I write this). And NumPy with OpenBLAS on my i5 takes
> ~0.009s. Any hints?
>
> > These are all valid points. What about cropping this 'offset' and use
> > something like the following:
> >
> > namespace viennacl { namespace linalg {
> >
> > void some_api_function() { ... }
> >
> > namespace detail
> > {
> >    void some_implementation_detail() { ... }
> > }
> >
> > }}
> >
> > This would preserve the benefit of a visual separation of public API and
> > private implementations, yet remove the 'global' indent offset  from the
> > source file.
>
> I like this; indeed, it's what I tend to do when indentation gets out
> of hand.
>
> Best,
>
> Toby
>
>
>
>
> --
> Toby St Clere Smithe
> http://tsmithe.net
>
>
>
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> ViennaCL-devel mailing list
> ViennaCL-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/viennacl-devel
>
