Thanks for the info! The Zeppelin stuff in particular piques my interest: I explored a Zeppelin/Solr integration a bit in SOLR-15080, but ultimately never committed it because of some lukewarm feedback from David S on the PR and some shifting personal priorities. If others are using Zeppelin maybe the idea is worth revisiting though...
Jason

On Wed, Aug 4, 2021 at 10:42 PM Mark Miller <markrmil...@gmail.com> wrote:
>
> Yes, it’s a new Gradle module called benchmark.
>
> I’ll likely commit the base early tomorrow. It’s been working through precommit checks.
>
> There are currently only two benchmarks to start, but I have more that I’ll be adding.
>
> Once I have a reasonable number in, I’ll run some comparisons with the 8x branch. Eventually, I’ll do some comparisons with the ref branch as well, but probably not that soon.
>
> I also have a subproject for the module that builds an Apache Zeppelin interpreter, which allows creating a notebook of parameterized benchmarks that can be versioned, and allows for organizing various run plans, charting, and various other things.
>
> You can turn on a variety of profilers via the command line, with gc, jfr, and the async profiler being the most common ones I’ve used.
>
> I’ve been using it on both relatively small runs as well as runs against up to 150 GB of index.
>
> I’ve also used it in Docker and will add more of the support I have to make that a simple option. My main motivation there is being able to control and vary hardware resources.
>
> Mark
>
> On Sun, Aug 1, 2021 at 2:01 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:
>>
>> Just clarifying, but the "Solr Benchmark Module" you're referring to here is your work from SOLR-15428? Or something else?
>>
>> Jason
>>
>> On Sun, Aug 1, 2021 at 12:16 AM Mark Miller <markrmil...@gmail.com> wrote:
>> >
>> > I’m about ready to commit the first iteration of the Solr benchmark module.
>> >
>> > It is meant to target both micro and macro benchmarks, though it is additive to, not a replacement for, Gatling and a full performance cluster.
>> >
>> > The inner workings of Solr and SolrCloud have always been something of a mystery to me. Benchmarking has been as well. Not that I ever spent any time thinking clearly about that.
>> >
>> > If I had, I wouldn’t have had an alternative plan to rectify it. And it didn’t matter. It didn’t affect me getting work. It didn’t affect my bonus from the boss.
>> >
>> > Over the past few years I did start to learn something about these mysteries though. Not with a genius plan of attack. Not with a strategy I can write down on the wiki and successfully share with you. I did it by attacking everything in sight. And then improving my sight.
>> >
>> > If some genius computer God once said “don’t do this”, I did it and found out why not. If something looked like huge effort for an unlikely reasonable return, I did it. And maybe scrapped it. If something took literally 16 hours just to manually process the code changes, with zero thought the whole time and repetitive pain and loud expletives accompanying the final hours, I did it. And sometimes maybe scrapped it later. If there was a rabbit hole, I went down it.
>> >
>> > I used the tests to chase features and code and surface area I’d never have touched or even known existed. I spent hundreds of hours or more building tools and hundreds more co-opting existing tools to expand my grasp and view. I went after other code bases with a similar attack and less depth to raise my vantage point.
>> >
>> > And I could go on, except that illustrates my point and there is little value in doing so.
>> >
>> > So I learned a couple of things on that journey. And I found an answer or two. Formed an opinion or three. And I’ve had to think. Think about how I can turn that into some value for Apache Solr. I chose to do that work, but I was also paid during that time. Paid for work that is supposed to end up returning value. The basic employer/employee contract.
>> >
>> > I will never march down that path again. The destination was never really the point. No sane developer would or could join the full trip.
>> >
>> > I have to use that journey to plot a new one.
>> >
>> > Thought one: there was huge value in playing around with the system. Trying a wide range of things simply. Getting valuable and low-effort feedback and introspection easily.
>> >
>> > Thought two: I did not play around or explore much before, or see it done, because it was high effort to explore even a small surface area. Even more effort to properly vet or ensure getting quality results or information from it.
>> >
>> > Thought three: Continuing on thought two, setting up good experiments is very difficult. Collecting results and evaluating the quality of those results is very difficult. More difficult than many developers who would immediately agree with those statements even know. In the way that Elon Musk knew fully self-driving cars would be difficult. But he didn’t know it would be “that” difficult. Of course a smaller percentage of developers do know the extent of it.
>> >
>> > Thought four: When the above was even attempted, it was generally by developers working in isolation. And climbing on their own scaffolding that was not peer reviewed and was either tossed out, abandoned, reconstructed, or maybe eventually reused by one person.
>> >
>> > Thought five: Building something that allows for exploration and experimentation essentially always reduces to some kind of benchmark-type framework. And benchmarks are notoriously and ridiculously difficult. See above. Any project that wants to truly benefit from them needs to work on them together. And retain them. And improve them. And retain and improve the knowledge behind them.
>> >
>> > And so we come to the Solr benchmark module. I’ve poured some of my knowledge and experience into standing up an initial framework. I will document it. I will share a video explaining some of the what and why and how.
>> > And I will make it so easy to join in that the only reason a developer will not join the effort is that they have no interest in understanding or improving the system and their changes.
>> >
>> > So I will make a commit next week. And then I will continue to move it forward. I encourage you to take a look and evaluate what return for what effort you might get from joining in.
>> >
>> > MRM
>> >
>> > --
>> > - Mark
>> >
>> > http://about.me/markrmiller
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> For additional commands, e-mail: dev-h...@solr.apache.org
>>
>
> --
> - Mark
>
> http://about.me/markrmiller
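For readers unfamiliar with what a benchmark in such a module does under the hood: the gc, jfr, and async profiler options Mark mentions are the profiler hooks JMH exposes (`-prof gc`, `-prof jfr`, `-prof async`), which suggests a JMH-style harness. The sketch below is not the module's actual API; every class and method name is invented for illustration. It only shows the warmup-then-measure cycle that JMH-style harnesses automate, and why a naive timing loop without warmup would mislead.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Illustrative only: a tiny warmup/measure loop. Real harnesses like JMH
// add forked JVMs, statistical iteration, blackholes, and profiler hooks.
public class MicroBench {

    /** Runs the workload repeatedly for roughly `millis`, returns ops completed. */
    static long runFor(LongSupplier workload, long millis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        long ops = 0;
        long sink = 0; // consume results so the JIT cannot dead-code the workload
        while (System.nanoTime() < deadline) {
            sink += workload.getAsLong();
            ops++;
        }
        if (sink == 42) System.out.print(""); // keep `sink` observably live
        return ops;
    }

    public static void main(String[] args) {
        LongSupplier workload = () -> {     // toy workload: a small hash loop
            long h = 0;
            for (int i = 0; i < 1000; i++) h = 31 * h + i;
            return h;
        };
        runFor(workload, 200);                 // warmup: let the JIT compile/optimize
        long measured = runFor(workload, 200); // measurement pass on warmed-up code
        System.out.println("ops in 200 ms: " + measured);
    }
}
```

Skipping the warmup pass typically measures interpreted or partially compiled code, which is one of the many pitfalls that make hand-rolled benchmarks untrustworthy and shared, peer-reviewed frameworks worthwhile.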