Re: [Help-gsl] gsl performance

onefire Sun, 06 Oct 2013 14:03:26 -0700

Rhys, I did try to use views. They do not help because the gsl routines
allocate vectors internally and there is not much that I can do about it...
except for maybe hacking gsl and changing gsl_vector_alloc myself.


"The main target of a library like GSL is an inherent consistency of its
objects and as it defines the gsl_vector and uses it everywhere in its
methods, consistency is reached. ."

That's what I thought too. That would be more consistent with my initial
question about gsl design not being about performance.
I heard about nlopt before and I plan to benchmark my results against them
as well. I started with gsl because I am more familiar with it.



On Sun, Oct 6, 2013 at 4:38 PM, Simon Zehnder
<[email protected]>wrote:

> I write between the lines.
>
> On Oct 6, 2013, at 10:04 PM, onefire <[email protected]> wrote:
>
> > I am aware of those. I tried the following:
> > 1) Call gsl_set_error_handler_off() to turn off the error handler
> > 2) Compile with -DHAVE_INLINE  and -DGSL_RANGE_CHECK_OFF
> > 3) Link to a different cblas (actually I tried openblas, mkl and
> gslcblas).
> >
> > However, my most interesting experiment was to modify my minimization
> algorithms to accept dynamic arrays. Originally, my function was using
> std::array to store the values of the independent variables when minimizing
> a multidimensional function. My experiment did mimic my use case: solve the
> same problem millions of times (to be precise, a million times). When I
> used dynamic arrays, my algorithm became much slower because of all the
> calls to malloc/new, and free/delete. So I did modify my algorithm to
> accept a gsl_vector. Guess what happened? My library became slower than
> gsl! I am not sure if this is because my Nelder-Mead implementation is
> different than gsl's (which I did not look at) or if its just the extra
> overhead of making the wrapper.
> >
> Here a profiling could give clearer insights - if its possible - about the
> gsl functions that take the most time.
> > My point is: The huge differences that I observed were because of
> gsl_vector which is not efficient for small arrays (stack memory is faster
> and you avoid the costs to malloc/free). So I think that the gsl
> minimization algorithms could have a version that uses stack arrays. I
> should not have to pay for the cost of dynamic allocation if I am
> minimizing a function of 5 variables (a typical version of the Nelder-Mead
> algorithm would require 6 points, including the function evaluations that
> would require an array of 36 elements, which is fine for the stack) . I
> think that gsl could have alternative implementations that use stack memory
> (with warnings to not use those if the problem is big).
> >
> Interesting! Indeed in most statistical problems I am involved in around 6
> parameters is also a good average. Physicists and Engineers may have larger
> parameter vectors. The main target of a library like GSL is an inherent
> consistency of its objects and as it defines the gsl_vector and uses it
> everywhere in its methods, consistency is reached. Performance is something
> that is not its main target I would guess. Providing a function though,
> that works with STL objects - commonly used by C++ developers is not a bad
> idea and - as you tell - it improves performance. Especially embedding GSL
> into self-developed code becomes easier with such a solution, as the step
> from own code to GSL-methods would be more 'fluently'. For another
> performance benchmark you could try the Nelder-Mead from nlopt. This works
> with std::vectors.
>
> > The other issue is: the implementation of gsl_vector just seems
> inefficient to me. Looking at the code, it seems like a single vector
> requires 3 calls to malloc and free (one for the data, one for the
> gsl_block, and one for the gsl_vector itself). The manual states that the
> block is there for "consistency", and I can see how memory management
> becomes easier with it. But it seems to be a case of generality at the
> expense of performance. Also, the stride and the owner flag are part of the
> gsl_vector object to make it work with gsl_views, but then people who never
> need views pay the performance price anyway.
> >
> Again, I would assume the main developing target was inherent consistency.
>
> > All of these issues are not likely to matter for big vectors, but they
> do make a large difference when you are dealing with small objects.
> >
> > Am I the only one who thinks like this?
> >
> > Gilberto
> >
> > PS: As a side note, I prefer Eigen over Armadillo. In my experience, the
> former is almost always faster.
> >
> Thanks for the suggestion, I heard about it but never tried it. I will
> give it a shot.
> >
> > On Sun, Oct 6, 2013 at 5:09 AM, Simon Zehnder <
> [email protected]> wrote:
> > Hi Gilbaerto,
> >
> > congrats on your performance results!
> >
> > A first guess of the worse performance of the gsl library would be
> exception throwing. The GSL is made for 'users' and this means, that a user
> has to be informed about the kind of exception he encounters. This can be
> left out if you have your own code and you know your own code. If something
> goes wrong, you know where to look in your code where the error occurred
> and what could be the reasons. A 'user' just using a library could be
> helpless when he encounters and error and does not know where it occurred.
> So checking and error catching could be a reason, that makes the gsl less
> performant but more appropriate for 'easy' usage.
> >
> > I use a lot the C++ linear algebra library Armadillo. This library has
> for example two different element accessors: One, '()' which makes
> checkings and the second 'at()' with no checkings. Performance increases
> sometimes tremendously when using the 'at()' method.
> >
> > Best
> >
> > Simon
> >
> >
> > On Oct 6, 2013, at 12:44 AM, onefire <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I am creating a personal library for my C++ projects and to evaluate
> its
> > > performance I decided to compare it to gsl and, to my surprise, my
> library
> > > is much faster than gsl in many cases. For example, my Nelder-Mead
> > > implementation can be 5 or 10 faster for some functions.
> > >
> > > This has been bothering me a lot because:
> > > 1) My tests are correct, and I checked the results against output from
> > > Scipy and Octave.
> > > 2) I am certainly not smarter than the many people who worked on gsl
> over
> > > the years.
> > > 3) I think I know how to use gsl "properly".
> > >
> > > Sorry I do not have an example, but my library is not polished enough
> for
> > > me to publish it yet.
> > >
> > > What I am really intrigued about  is that I googled around and it seems
> > > that many people noticed (before me) that many gsl implementations are
> > > inefficient. I also found some posts on the mailing lists with things
> like
> > > "gsl is much slower than Matlab" and responses like "gsl is not meant
> to be
> > > the fastest".
> > >
> > > These things lead me to believe that gsl is "slow" by design, If this
> is
> > > the case, can someone explain to me why this is so?
> > >
> > > Gilberto
> > >
> > > PS: Much of the gain with C++ comes from better inlining, which can be
> > > specially important with things like function optimization (it is
> hard, if
> > > not impossible, to inline when there is a funcyion pointer passed
> around).
> > > This is why std::sort can be much faster than lqsort. However, I am
> > > confident that it is possible to write faster implementations (in C)
> than
> > > some of the ones in gsl.
> >
> >
>
>

Re: [Help-gsl] gsl performance

Reply via email to