Thanks for the info! The Zeppelin stuff in particular piques my interest: I explored a Zeppelin/Solr integration a bit in SOLR-15080, but ultimately never committed it because of some lukewarm feedback from David S on the PR and some shifting personal priorities. If others are using Zeppelin maybe the idea is worth revisiting though...
Jason

On Wed, Aug 4, 2021 at 10:42 PM Mark Miller <markrmil...@gmail.com> wrote:
>
> Yes, it’s a new Gradle module called benchmark.
>
> I’ll likely commit the base early tomorrow. It’s been working through precommit checks.
>
> There are currently only two benchmarks to start, but I have more that I’ll be adding.
>
> Once I have a reasonable number in, I’ll run some comparisons with the 8x branch. Eventually, I’ll do some comparisons with the ref branch as well, but probably not that soon.
>
> I also have a subproject for the module that builds an Apache Zeppelin interpreter, which allows creating a notebook of parameterized benchmarks that can be versioned, and allows for organizing various run plans, charting, and various other things.
>
> You can turn on a variety of profilers via the command line, with gc, jfr, and the async profiler being the most common ones I’ve used.
>
> I’ve been using it on both relatively small runs as well as runs against up to 150 GB of index.
>
> I’ve also used it in Docker and will add more of the support I have to make that a simple option. My main motivation there is being able to control and vary hardware resources.
>
> Mark
>
> On Sun, Aug 1, 2021 at 2:01 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:
>>
>> Just clarifying, but the "Solr Benchmark Module" you're referring to here is your work from SOLR-15428? Or something else?
>>
>> Jason
>>
>> On Sun, Aug 1, 2021 at 12:16 AM Mark Miller <markrmil...@gmail.com> wrote:
>> >
>> > I’m about ready to commit the first iteration of the Solr benchmark module.
>> >
>> > It is meant to target both micro and macro benchmarks, though it is additive to, not a replacement for, Gatling and a full performance cluster.
>> >
>> > The inner workings of Solr and SolrCloud have always been something of a mystery to me. Benchmarking has been as well. Not that I ever spent any time thinking clearly about that.
>> >
>> > If I had, I wouldn’t have had an alternative plan to rectify it. And it didn’t matter. It didn’t affect me getting work. It didn’t affect my bonus from the boss.
>> >
>> > Over the past few years I did start to learn something about these mysteries though. Not with a genius plan of attack. Not with a strategy I can write down on the wiki and successfully share with you. I did it by attacking everything in sight. And then improving my sight.
>> >
>> > If some genius computer God once said “don’t do this”, I did it and found out why not. If something looked like huge effort for an unlikely reasonable return, I did it. And maybe scrapped it. If something took literally 16 hours just to manually process the code changes, with zero thought the whole time and repetitive pain and loud expletives accompanying the final hours, I did it. And sometimes maybe scrapped it later. If there was a rabbit hole, I went down it.
>> >
>> > I used the tests to chase features and code and surface area I’d never have touched or even known existed. I spent hundreds of hours or more building tools and hundreds more co-opting existing tools to expand my grasp and view. I went after other code bases with a similar attack and less depth to raise my vantage point.
>> >
>> > And I could go on, except that illustrates my point and there is little value in doing so.
>> >
>> > So I learned a couple of things on that journey. And I found an answer or two. Formed an opinion or three. And I’ve had to think. Think about how I can turn that into some value for Apache Solr. I chose to do that work, but I was also paid during that time. Paid for work that is supposed to end up returning value. The basic employer/employee contract.
>> >
>> > I will never march down that path again. The destination was never really the point. No sane developer would or could join the full trip.
>> >
>> > I have to use that journey to plot a new one.
>> >
>> > Thought one: there was huge value in playing around with the system. Trying a wide range of things simply. Getting valuable and low-effort feedback and introspection easily.
>> >
>> > Thought two: I did not play around or explore much before, or see it done, because it was high effort to explore even a small surface area. Even more effort to properly vet or ensure getting quality results or information from it.
>> >
>> > Thought three: Continuing on thought two, setting up good experiments is very difficult. Collecting results and evaluating the quality of those results is very difficult. More difficult than many developers who would immediately agree with those statements even know. In the way that Elon Musk knew fully self-driving cars would be difficult. But he didn’t know it would be “that” difficult. Of course a smaller percentage of developers do know the extent of it.
>> >
>> > Thought four: When the above was even attempted, it was generally by developers working in isolation. And climbing on their own scaffolding that was not peer reviewed and was either tossed out, abandoned, reconstructed, or maybe eventually reused by one person.
>> >
>> > Thought five: Building something that allows for exploration and experimentation essentially always reduces to some kind of benchmark-type framework. And benchmarks are notoriously and ridiculously difficult. See above. Any project that wants to truly benefit from them needs to work on them together. And retain them. And improve them. And retain and improve the knowledge behind them.
>> >
>> > And so we come to the Solr benchmark module. I’ve poured some of my knowledge and experience into standing up an initial framework. I will document it. I will share a video explaining some of the what and why and how.
>> > And I will make it so easy to join in that the only reason a developer will not join the effort is that they have no interest in understanding or improving the system and their changes.
>> >
>> > So I will make a commit next week. And then I will continue to move it forward. I encourage you to take a look and evaluate what return for what effort you might get from joining in.
>> >
>> > MRM
>> >
>> > --
>> > - Mark
>> >
>> > http://about.me/markrmiller
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> For additional commands, e-mail: dev-h...@solr.apache.org
>>
>
> --
> - Mark
>
> http://about.me/markrmiller
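For readers unfamiliar with what a benchmark in such a module does under the hood: the gc, jfr, and async profiler options Mark mentions are the profiler hooks JMH exposes (`-prof gc`, `-prof jfr`, `-prof async`), which suggests a JMH-style harness. The sketch below is not the module's actual API; every class and method name is invented for illustration. It only shows the warmup-then-measure cycle that JMH-style harnesses automate, and why a naive timing loop without warmup would mislead.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Illustrative only: a tiny warmup/measure loop. Real harnesses like JMH
// add forked JVMs, statistical iteration, blackholes, and profiler hooks.
public class MicroBench {

    /** Runs the workload repeatedly for roughly `millis`, returns ops completed. */
    static long runFor(LongSupplier workload, long millis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        long ops = 0;
        long sink = 0; // consume results so the JIT cannot dead-code the workload
        while (System.nanoTime() < deadline) {
            sink += workload.getAsLong();
            ops++;
        }
        if (sink == 42) System.out.print(""); // keep `sink` observably live
        return ops;
    }

    public static void main(String[] args) {
        LongSupplier workload = () -> {     // toy workload: a small hash loop
            long h = 0;
            for (int i = 0; i < 1000; i++) h = 31 * h + i;
            return h;
        };
        runFor(workload, 200);                 // warmup: let the JIT compile/optimize
        long measured = runFor(workload, 200); // measurement pass on warmed-up code
        System.out.println("ops in 200 ms: " + measured);
    }
}
```

Skipping the warmup pass typically measures interpreted or partially compiled code, which is one of the many pitfalls that make hand-rolled benchmarks untrustworthy and shared, peer-reviewed frameworks worthwhile.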