On Sat, Apr 11, 2020 at 3:55 PM David Kastrup <d...@gnu.org> wrote:
> >> You haven't understood the scheme.  A grob only contains the properties
> >> seminal to its type.  That's why there is a double indirection.  A full
> >> array of 416 entries is needed per grob _type_, not per grob.
> >
> > I still don't see why this requires an overall overhaul of the
> > internal calling convention. You can have a global registry of symbol
> > => uint16 (numbering all the symbols),
>
> It would be pointless to have a single registry for music properties,
> grob properties, stream event properties, and context properties, thus
> significantly blowing up the second-level indexing tables and making for
> a large number of cache misses.

there are 400 grob properties, and about 800 properties in total
(including music and context). 400 requires 9 bits, 800 requires 10
bits. It doesn't appreciably change the problem size.

> > memoize that uint16 like we memoize the symbol today (I think this is
> > what you suggested already).  Then use a uint16 => value hashtable in
> > the grob/prob/etc. A custom hash implementation can save space by
> > storing the uint16 key rather than the 8 byte SCM symbol.
> >
> > This would be a very conservative, localized change that doesn't
> > require setting up machinery to convert types to vectors,
>
> A vector is cheaper to index than a hashtable.  It makes no sense
> talking aobut "setting up machinery" with regard to a vector while
> taking hash tables for granted.
>
> > and it keeps working if grob types change which grob-interfaces they
> > support halfway through.
>
> I don't think there is any point in discussing this patch based on what
> you surmise about an ultimate solution.


Have a look at https://github.com/hanwen/lilypond/tree/grob-dict

It explores different proposals, all of them with neglible to negative
performance impact.

The last series introduces a 'symid' (a global, memoized uint16
identifier for a property), with the optimized hashtable.

I think you could implement what you want to do try by doing the symid
=> per grob type vector index lookup in Grob::internal_xx_property. I
still don't see what you'd do with the infromation you gain by
reconstructing the call sites. Even if you know the C++ type (Grob vs
Context) in the call site, you still won't know the actual grob type
(Stem vs. NoteHead) in the call sites, which you'd need to do further
memoization.

I looked a bit closer at your call-site reorganization. It introduces
'this' as parameter in many places, which I find ugly, and certainly
is nonstandard idiom in C++. I suggest we proceed only if you can
produce a measurable performance improvement. If that happens, I'd be
happy to help deal with any merge conflicts. Until that time, you can
work on the change by not rebasing it.

> >> Hashing outside of a closed set still requires storing the actual symbol
> >> for corroboration.  Hashing inside of a closed set amounts to the
> >> first-stage lookup in my scheme, except that a small number guaranteed
> >> to be unique comes out.  This lookup can be memoised and will work
> >> permanently as opposed to hashing in a growing set.
> >>
> >> > We could experiment with this without requiring a giant rewrite of the
> >> > source code.
> >>
> >> What you call a "giant rewrite" is basically this patch.  It's a purely
> >> syntactical patch only extending rather than reducing possible
> >> approaches and perfectly compatible with different implementations.  The
> >> work for that patch is already done, the impact is similar to what
> >> happened when changing around the syntax of unsmob a few times ian the
> >> past.  It will allow for prolonged work on possibly different
> >> implementations without having to constantly deal with merge conflicts.
> >>
> >> It does not depend on picking a particular implementation.  You
> >> expressed interest in what I was planning to do with the opportunity
> >> this patch opens, I answered.  I do not, however, see the point in
> >> discussing and dissing (partially) unwritten code based on an
> >> incomplete understanding in a handwaving discussion.
> >
> > I think it is not out of the ordinary to discuss plans invasive plans
> > before implementing them.
>
> But we are not discussing an invasive plan here.  This here is an
> extensive, not an invasive change.  There is no point in dissing
> unwritten code that take no project-wide resources.  The point of time
> to review code is after it is proposed.
>
> > I think this change is invasive because it changes the idiom across
> > the source code, and I disagree that this change is net neutral. C++
> > method calls are a more natural idiom plain function calls.
>
> We are not talking about C++ method calls here.  We are talking about
> macro calls.  Memoisation of any kind can only be implemented at the
> macro call level when diversifying on a string literal since those are
> not available as template arguments, otherwise one could transfer the
> job to template specialisation like it is done in a few other places.
> So the macro needs information to work with.  This change makes what
> actually happens more rather than less transparent and idiomatic.
>
> > If we can get a significant performance boost, that may be worth it,
> > but I think we should first understand if that is the case.
>
> This patch only increases the options for changing implementations.
>
> > As said (see above), I am skeptical.
> >
> >> It is more efficient to discuss objections of the "you haven't thought
> >> this through" kind on working code.
> >>
> >> Nothing in this patch here enforces picking a particular route.  It is
> >> just paving the ground for more work, and it does so in a manner that
> >> isn't so much of an eye sore that it can only be tolerated in
> >> expectation of immense future gains.
>
> --
> David Kastrup



-- 
Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen

Reply via email to