Yeah, a Solr interpreter is a bit more of a lift; this interpreter just
handles firing off parameterized benchmarks and dealing with the results.

It would be nice to have a Solr interpreter as well though, even for this
use case, so you can easily query the state of things after a benchmark run.

A decent Solr interpreter would likely be accepted into Zeppelin itself
when ready.
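
The shape of one is roughly this — a minimal sketch against Zeppelin's
Interpreter API and SolrJ; the class, property, and collection names here
are illustrative, not from SOLR-15080 or the benchmark module:

    import java.util.Properties;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.zeppelin.interpreter.Interpreter;
    import org.apache.zeppelin.interpreter.InterpreterContext;
    import org.apache.zeppelin.interpreter.InterpreterResult;

    public class SolrInterpreter extends Interpreter {

      private HttpSolrClient client;

      public SolrInterpreter(Properties properties) {
        super(properties);
      }

      @Override
      public void open() {
        // "solr.baseUrl" is a hypothetical interpreter property
        client =
            new HttpSolrClient.Builder(getProperty("solr.baseUrl")).build();
      }

      @Override
      public void close() {
        try {
          client.close();
        } catch (Exception e) {
          // best effort on shutdown
        }
      }

      @Override
      public InterpreterResult interpret(String st, InterpreterContext ctx) {
        try {
          // treat the paragraph text as a Solr query, e.g. *:*
          QueryResponse rsp = client.query("collection1", new SolrQuery(st));
          return new InterpreterResult(InterpreterResult.Code.SUCCESS,
              rsp.getResults().toString());
        } catch (Exception e) {
          return new InterpreterResult(InterpreterResult.Code.ERROR,
              e.getMessage());
        }
      }

      @Override
      public void cancel(InterpreterContext ctx) {}

      @Override
      public FormType getFormType() {
        return FormType.SIMPLE;
      }

      @Override
      public int getProgress(InterpreterContext ctx) {
        return 0;
      }
    }

Most of the real work would be in rendering results as Zeppelin tables
and wiring configuration through, not in the plumbing above.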

MRM

On Mon, Aug 9, 2021 at 12:37 PM Jason Gerlowski <gerlowsk...@gmail.com>
wrote:

> Thanks for the info!
>
> The Zeppelin stuff in particular piques my interest: I explored a
> Zeppelin/Solr integration a bit in SOLR-15080, but ultimately never
> committed it because of some lukewarm feedback from David S on the PR
> and some shifting personal priorities.  If others are using Zeppelin
> maybe the idea is worth revisiting though...
>
> Jason
>
> On Wed, Aug 4, 2021 at 10:42 PM Mark Miller <markrmil...@gmail.com> wrote:
> >
> > Yes, it’s a new Gradle module called benchmark.
> >
> > I’ll likely commit the base early tomorrow. It’s been working through
> > precommit checks.
> >
> > There are currently only two benchmarks to start, but I have more that
> I’ll be adding.
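> >
> > To give a sense of the shape, a benchmark is roughly a small annotated
> > class. This is a minimal sketch assuming a JMH-style harness (the
> > profiler options I mention below are JMH’s); the names are
> > illustrative, not the actual benchmarks in the module:
> >
> >     import java.util.concurrent.TimeUnit;
> >
> >     import org.openjdk.jmh.annotations.Benchmark;
> >     import org.openjdk.jmh.annotations.BenchmarkMode;
> >     import org.openjdk.jmh.annotations.Level;
> >     import org.openjdk.jmh.annotations.Mode;
> >     import org.openjdk.jmh.annotations.OutputTimeUnit;
> >     import org.openjdk.jmh.annotations.Param;
> >     import org.openjdk.jmh.annotations.Scope;
> >     import org.openjdk.jmh.annotations.Setup;
> >     import org.openjdk.jmh.annotations.State;
> >
> >     @BenchmarkMode(Mode.Throughput)
> >     @OutputTimeUnit(TimeUnit.SECONDS)
> >     @State(Scope.Benchmark)
> >     public class DocProcessingBenchmark {
> >
> >       // JMH runs the full benchmark once per value here, so runs
> >       // can sweep a parameter space without code changes
> >       @Param({"10000", "100000"})
> >       public int docCount;
> >
> >       private String[] docs;
> >
> >       @Setup(Level.Trial)
> >       public void setup() {
> >         // a real benchmark would stand up a cluster and index
> >         // docCount docs; this just builds synthetic data
> >         docs = new String[docCount];
> >         for (int i = 0; i < docCount; i++) {
> >           docs[i] = "doc-" + i;
> >         }
> >       }
> >
> >       @Benchmark
> >       public int process() {
> >         // placeholder work; returning the result keeps JMH from
> >         // dead-code-eliminating the loop
> >         int hits = 0;
> >         for (String doc : docs) {
> >           if (doc.endsWith("7")) hits++;
> >         }
> >         return hits;
> >       }
> >     }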
> >
> > Once I have a reasonable number in, I’ll run some comparisons with the
> 8x branch. Eventually, I’ll do some comparisons with the ref branch as
> well, but probably not that soon.
> >
> > I also have a subproject for the module that builds an Apache Zeppelin
> > interpreter. It allows creating a notebook of parameterized benchmarks
> > that can be versioned, and supports organizing run plans, charting, and
> > various other things.
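> >
> > In a notebook, a parameterized run could look something like the
> > following. The %benchmark binding is hypothetical, but ${name=default}
> > is Zeppelin’s standard dynamic-form syntax, and -p/-f are stock JMH
> > options:
> >
> >     %benchmark
> >     DocProcessingBenchmark -p docCount=${docCount=100000} -f ${forks=1}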
> >
> > You can turn on a variety of profilers via the command line; gc, jfr,
> > and the async profiler are the ones I’ve used most.
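> >
> > Assuming stock JMH conventions, that looks something like this (the
> > script name is hypothetical; -prof gc, -prof jfr, and -prof async are
> > JMH’s standard profiler flags):
> >
> >     ./benchmark.sh DocProcessingBenchmark -prof gc
> >     ./benchmark.sh DocProcessingBenchmark -prof jfr
> >     ./benchmark.sh DocProcessingBenchmark -prof async:output=flamegraph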
> >
> > I’ve been using it on both relatively small runs and runs against up
> > to 150 GB of index.
> >
> > I’ve also used it in Docker, and I’ll add more of the support I have
> > to make that a simple option. My main motivation there is being able to
> > control and vary hardware resources.
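> >
> > For example, pinning a run to fixed resources with Docker’s standard
> > limit flags (the image name is hypothetical):
> >
> >     docker run --cpus=4 --memory=8g solr-benchmark DocProcessingBenchmark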
> >
> > Mark
> >
> > On Sun, Aug 1, 2021 at 2:01 PM Jason Gerlowski <gerlowsk...@gmail.com>
> wrote:
> >>
> >> Just clarifying: the "Solr Benchmark Module" you're referring to
> >> here is your work from SOLR-15428? Or something else?
> >>
> >> Jason
> >>
> >> On Sun, Aug 1, 2021 at 12:16 AM Mark Miller <markrmil...@gmail.com>
> wrote:
> >> >
> >> > I’m about ready to commit the first iteration of the Solr benchmark
> module.
> >> >
> >> > It is meant to target both micro and macro benchmarks, though it is
> additive to, not a replacement for, Gatling and a full performance cluster.
> >> >
> >> > The inner workings of Solr and SolrCloud have always been something
> of a mystery to me. Benchmarking has been as well. Not that I ever spent
> any time thinking clearly about that.
> >> >
> >> > If I had, I wouldn’t have had a plan to rectify it anyway. And it
> >> > didn’t matter. It didn’t affect my getting work. It didn’t affect my
> >> > bonus from the boss.
> >> >
> >> > Over the past few years I did start to learn something about these
> mysteries though. Not with a genius plan of attack. Not with a strategy I
> can write down on the wiki and successfully share with you. I did it by
> attacking everything in sight. And then improving my sight.
> >> >
> >> > If some genius computer God once said “don’t do this”, I did it and
> >> > found out why not. If something looked like huge effort for an
> >> > unlikely return, I did it. And maybe scrapped it. If something took
> >> > literally 16 hours just to manually process the code changes, with
> >> > zero thought the whole time and repetitive pain and loud expletives
> >> > accompanying the final hours, I did it. And sometimes maybe scrapped
> >> > it later. If there was a rabbit hole, I went down it.
> >> >
> >> > I used the tests to chase features and code and surface area I’d
> never have touched or even known existed. I spent hundreds of hours or more
> building tools and hundreds more co-opting existing tools to expand my grasp
> and view. I went after other code bases with a similar attack and less
> depth to raise my vantage point.
> >> >
> >> > And I could go on, but that already illustrates my point and there
> >> > is little value in doing so.
> >> >
> >> > So I learned a couple things on that journey. And I found an answer
> or two. Formed an opinion or three. And I’ve had to think. Think about how
> I can turn that into some value for Apache Solr. I chose to do that work,
> but I was also paid during that time. Paid for work that is supposed to end
> up returning value. The basic employer / employee contract.
> >> >
> >> > I will never march down that path again. The destination was never
> really the point. No sane developer would or could join the full trip.
> >> >
> >> > I have to use that journey to plot a new one.
> >> >
> >> > Thought one: there was huge value in playing around with the system.
> Trying a wide range of things simply. Getting valuable and low-effort
> feedback and introspection easily.
> >> >
> >> > Thought two: I did not play around or explore much before, or see it
> >> > done, because it was high effort to explore even a small surface
> >> > area, and even more effort to properly vet it or ensure quality
> >> > results or information from it.
> >> >
> >> > Thought three: Continuing on thought two, setting up good experiments
> is very difficult. Collecting results and evaluating the quality of those
> results is very difficult. More difficult than many developers who would
> immediately agree with those statements even know. In the way that Elon
> Musk knew fully self-driving cars would be difficult. But he didn’t know it
> would be “that” difficult. Of course a smaller percentage of developers do
> know the extent of it.
> >> >
> >> > Thought four: When the above was even attempted, it was generally by
> >> > developers working in isolation, climbing on their own scaffolding
> >> > that was not peer reviewed and was either tossed out, abandoned,
> >> > reconstructed, or maybe eventually reused by someone.
> >> >
> >> > Thought five: Building something that allows for exploration and
> experimentation essentially always reduces to some kind of benchmark type
> framework. And benchmarks are notoriously and ridiculously difficult. See
> above. Any project that wants to truly benefit from them needs to work on
> them together. And retain them. And improve them. And retain and improve
> the knowledge behind them.
> >> >
> >> > And so we come to the Solr benchmark module. I’ve poured some of my
> knowledge and experience into standing up an initial framework. I will
> document it. I will share a video explaining some of the what and why and
> how. And I will make it so easy to join in on that the only reason a
> developer will not join the effort is that they have no interest in
> understanding or improving the system and their changes.
> >> >
> >> > So I will make a commit next week. And then I will continue to move
> it forward. I encourage you to take a look and evaluate what return for
> what effort you might get from joining in.
> >> >
> >> > MRM
> >> >
> >> >
> >> > --
> >> > - Mark
> >> >
> >> > http://about.me/markrmiller
> >>
> >>
> > --
> > - Mark
> >
> > http://about.me/markrmiller
>
-- 
- Mark

http://about.me/markrmiller
