So the initial JMH benchmark module is in. It’s fairly minimal to start, even compared with older forms of it I’ve already built. I wouldn’t go rushing to dive into it, although if you have a need, use case, or interest, I will lend a hand.
It’s a small but significant step on a path I’m working toward, likely lacking much external shine given that it’s a hearth I’ve crossed various times upon different stilts and contraptions. But FYI to the interested: there are still plenty of undocumented, unfinished, and unfleshed-out parts in the pipeline. I’ve been considering how to make a solid, reliable, performant, large, complex Java system since about 2016, and I personally feel a slow build-up for this module is the right strategy. I will personally use it to make some fairly significant improvements in the short term, to keep the pitchforks outside my door and food on my table, but that won’t be the solution.

Further doc and support coming from me won’t be the full solution either. Discussing and learning about performance and reliability won’t be a solution. Me addressing a couple thousand issues or tests won’t be an answer. Talking about coordinated omission and profiler differences and lapses won’t scratch my itch. Neither will the upcoming dramatic moves forward in new Java releases, nor talk of life cycle, TCP stats, thread management, and CPU utilization, nor refusing to equate high CPU utilization with the CPU being the bottleneck, nor most other facts and figures get me to my finish line.

Tapping the current wealth of knowledge and experience in some existing Solr engineers, bringing in other guest luminaries to spread enlightenment, demonstrating alternative benchmark-crushing code, removing code, killing overseers: all fine, but nothing there I’m really chasing either. But I’ve still been, and am, on the tail of things that will scratch my itch. Or the tail of ghosts, take your pick. Either way, with my own vision of what I’m after.

So as the benchmark module fills out, and as I demonstrate some of what can be done with it, always on the trail of higher return for less effort, do take a look, take advantage. Help build it out.
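(As an aside, since coordinated omission comes up above and trips up nearly every hand-rolled load test: below is a small, self-contained toy sketch, in no way part of the benchmark module, of how a closed-loop load generator hides queueing delay. The class and method names and the numbers are purely illustrative.)

```java
import java.util.Arrays;

// Toy illustration of coordinated omission: a closed-loop generator waits
// for each response before sending the next request, so a single stall
// hides the queueing delay that real, independently arriving clients
// would have experienced.
public class CoordinatedOmission {

    // Latencies as a coordinated (closed-loop) generator records them:
    // it simply reports each observed service time.
    static long[] coordinated(long[] serviceTimesMs) {
        return serviceTimesMs.clone();
    }

    // Corrected latencies: measure from the *intended* send time on a
    // fixed schedule (one request every intervalMs), so time spent stuck
    // behind a stalled request counts against later requests too.
    static long[] corrected(long[] serviceTimesMs, long intervalMs) {
        long[] out = new long[serviceTimesMs.length];
        long clock = 0; // when the generator is next free to send
        for (int i = 0; i < serviceTimesMs.length; i++) {
            long intended = i * intervalMs;         // scheduled send time
            long start = Math.max(clock, intended); // actual send time
            long end = start + serviceTimesMs[i];
            out[i] = end - intended;                // latency the client felt
            clock = end;
        }
        return out;
    }

    public static void main(String[] args) {
        // Nine fast responses and one 100 ms stall, on a 10 ms schedule.
        long[] service = {1, 1, 1, 1, 100, 1, 1, 1, 1, 1};
        // The coordinated view shows one bad sample; the corrected view
        // shows the stall bleeding into every request queued behind it.
        System.out.println("coordinated: " + Arrays.toString(coordinated(service)));
        System.out.println("corrected:   " + Arrays.toString(corrected(service, 10)));
    }
}
```

The corrected view reports high latency for the requests that would have piled up behind the stall, which is exactly the signal the closed-loop measurement throws away.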
It’s certainly a keystone in what I’d like to see, but it’s early; the results I’ll get as I start to pull from it are just required tolls. Over a bit longer term, I’m looking for paths to returns like those I’ve discovered through sheer lack of sleep or distraction, with little of the regular costs or individuals’ efforts over time.

MRM

On Wed, Aug 11, 2021 at 4:37 AM Mark Miller <markrmil...@gmail.com> wrote:

> Yeah, a Solr interpreter is a bit more of a lift; this interpreter just
> handles firing off parameterized benchmarks and deals with results.
>
> It would be nice to have a Solr interpreter as well even for this use
> case though, so you can easily query the state of things after a
> benchmark run.
>
> A decent Solr interpreter would likely be accepted into Zeppelin itself
> when ready.
>
> MRM
>
> On Mon, Aug 9, 2021 at 12:37 PM Jason Gerlowski <gerlowsk...@gmail.com>
> wrote:
>
>> Thanks for the info!
>>
>> The Zeppelin stuff in particular piques my interest: I explored a
>> Zeppelin/Solr integration a bit in SOLR-15080, but ultimately never
>> committed it because of some lukewarm feedback from David S on the PR
>> and some shifting personal priorities. If others are using Zeppelin,
>> maybe the idea is worth revisiting though...
>>
>> Jason
>>
>> On Wed, Aug 4, 2021 at 10:42 PM Mark Miller <markrmil...@gmail.com>
>> wrote:
>>
>> > Yes, it’s a new Gradle module called benchmark.
>> >
>> > I’ll likely commit the base early tomorrow. It’s been working
>> > through precommit checks.
>> >
>> > There are currently only two benchmarks to start, but I have more
>> > that I’ll be adding.
>> >
>> > Once I have a reasonable number in, I’ll run some comparisons with
>> > the 8x branch. Eventually, I’ll do some comparisons with the ref
>> > branch as well, but probably not that soon.
>> >
>> > I also have a subproject for the module that builds an Apache
>> > Zeppelin interpreter, which allows creating a notebook of
>> > parameterized benchmarks that can be versioned, and allows for
>> > organizing various run plans, charting, and various other things.
>> >
>> > You can turn on a variety of profilers via the command line; gc,
>> > JFR, and the async profiler are the most common I’ve used.
>> >
>> > I’ve been using it on both relatively small runs as well as runs
>> > against up to 150 GB of index.
>> >
>> > I’ve also used it in Docker, and I will add more of the support I
>> > have to make that option simple. My main motivation there is being
>> > able to control and vary hardware resources.
>> >
>> > Mark
>> >
>> > On Sun, Aug 1, 2021 at 2:01 PM Jason Gerlowski
>> > <gerlowsk...@gmail.com> wrote:
>> >
>> >> Just clarifying, but the "Solr Benchmark Module" you’re referring
>> >> to here is your work from SOLR-15428? Or something else?
>> >>
>> >> Jason
>> >>
>> >> On Sun, Aug 1, 2021 at 12:16 AM Mark Miller <markrmil...@gmail.com>
>> >> wrote:
>> >>
>> >> > I’m about ready to commit the first iteration of the Solr
>> >> > benchmark module.
>> >> >
>> >> > It is meant to target both micro and macro benchmarks, though it
>> >> > is additive to, not a replacement for, Gatling and a full
>> >> > performance cluster.
>> >> >
>> >> > The inner workings of Solr and SolrCloud have always been
>> >> > something of a mystery to me. Benchmarking has been as well. Not
>> >> > that I ever spent any time thinking clearly about that.
>> >> >
>> >> > If I had, I wouldn’t have had an alternative plan to rectify it.
>> >> > And it didn’t matter. It didn’t affect me getting work. It
>> >> > didn’t affect my bonus from the boss.
>> >> >
>> >> > Over the past few years I did start to learn something about
>> >> > these mysteries though. Not with a genius plan of attack. Not
>> >> > with a strategy I can write down on the wiki and successfully
>> >> > share with you. I did it by attacking everything in sight. And
>> >> > then improving my sight.
>> >> >
>> >> > If some genius computer God once said “don’t do this”, I did it
>> >> > and found out why not. If something looked like huge effort for
>> >> > an unlikely reasonable return, I did it. And maybe scrapped it.
>> >> > If something took literally 16 hours just to manually process
>> >> > the code changes, with zero thought the whole time and
>> >> > repetitive pain and loud expletives accompanying the final
>> >> > hours, I did it. And sometimes maybe scrapped it later. If there
>> >> > was a rabbit hole, I went down it.
>> >> >
>> >> > I used the tests to chase features and code and surface area I’d
>> >> > never have touched or even known existed. I spent hundreds of
>> >> > hours or more building tools, and hundreds more co-opting
>> >> > existing tools, to expand my grasp and view. I went after other
>> >> > code bases with a similar attack and less depth to raise my
>> >> > vantage point.
>> >> >
>> >> > And I could go on, except that illustrates my point and there is
>> >> > little value in doing so.
>> >> >
>> >> > So I learned a couple things on that journey. And I found an
>> >> > answer or two. Formed an opinion or three. And I’ve had to
>> >> > think. Think about how I can turn that into some value for
>> >> > Apache Solr. I chose to do that work, but I was also paid during
>> >> > that time. Paid for work that is supposed to end up returning
>> >> > value. The basic employer/employee contract.
>> >> >
>> >> > I will never march down that path again. The destination was
>> >> > never really the point. No sane developer would or could join
>> >> > the full trip.
>> >> >
>> >> > I have to use that journey to plot a new one.
>> >> >
>> >> > Thought one: there was huge value in playing around with the
>> >> > system. Trying a wide range of things simply. Getting valuable,
>> >> > low-effort feedback and introspection easily.
>> >> >
>> >> > Thought two: I did not play around or explore much before, or
>> >> > see it done, because it was high effort to explore even a small
>> >> > surface area. Even more effort to properly vet it, or to ensure
>> >> > getting quality results or information from it.
>> >> >
>> >> > Thought three: Continuing on thought two, setting up good
>> >> > experiments is very difficult. Collecting results and evaluating
>> >> > the quality of those results is very difficult. More difficult
>> >> > than many developers who would immediately agree with those
>> >> > statements even know. In the way that Elon Musk knew fully
>> >> > self-driving cars would be difficult, but he didn’t know it
>> >> > would be “that” difficult. Of course, a smaller percentage of
>> >> > developers do know the extent of it.
>> >> >
>> >> > Thought four: When the above was even attempted, it was
>> >> > generally by developers working in isolation, climbing on their
>> >> > own scaffolding that was not peer reviewed and was either tossed
>> >> > out, abandoned, reconstructed, or maybe eventually reused by one
>> >> > person.
>> >> >
>> >> > Thought five: Building something that allows for exploration and
>> >> > experimentation essentially always reduces to some kind of
>> >> > benchmark-type framework. And benchmarks are notoriously and
>> >> > ridiculously difficult. See above. Any project that wants to
>> >> > truly benefit from them needs to work on them together. And
>> >> > retain them. And improve them. And retain and improve the
>> >> > knowledge behind them.
>> >> >
>> >> > And so we come to the Solr benchmark module. I’ve poured some of
>> >> > my knowledge and experience into standing up an initial
>> >> > framework. I will document it. I will share a video explaining
>> >> > some of the what and why and how. And I will make it so easy to
>> >> > join in that the only reason a developer will not join the
>> >> > effort is because they have no interest in understanding or
>> >> > improving the system and their changes.
>> >> >
>> >> > So I will make a commit next week. And then I will continue to
>> >> > move it forward. I encourage you to take a look and evaluate
>> >> > what return for what effort you might get from joining in.
>> >> >
>> >> > MRM
>> >> >
>> >> > --
>> >> > - Mark
>> >> >
>> >> > http://about.me/markrmiller
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> For additional commands, e-mail: dev-h...@solr.apache.org
>>
>> > --
>> > - Mark
>> >
>> > http://about.me/markrmiller
>
> --
> - Mark
>
> http://about.me/markrmiller

--
- Mark

http://about.me/markrmiller