Personally, I believe the latter so strongly that if I can’t convince the others in the raft with me, I’m jumping in and swimming to another raft after my entire adult life here.
Mark

On Sun, Nov 3, 2019 at 7:30 AM Mark Miller <markrmil...@gmail.com> wrote:

> In fact this will be a fundamental difference some of us are about to
> split between.
>
> Those that think they can ever fix the tests or the system or the 1000s of
> bugs we have and keep adding due to our current world view of making tests
> fit the system not the system fit the tests and the fact that everything
> is so slow and retry and workaround that stupid shit works all over. It's
> all deep. It's ingrained. It has grown over for a decade. It's a project of
> 60 modules.
>
> Soon we will split between those that think they are making progress
> across the ocean and those that think we are sitting in shark infested
> waters waiting to die, actually starting to float backwards sometimes now.
>
> - Mark
>
> On Sun, Nov 3, 2019 at 7:23 AM Mark Miller <markrmil...@gmail.com> wrote:
>
>> bq. They also would allow us to do it in an iterative manner without
>> changing everything at once.
>>
>> Sadly, you can't fix this piece by piece :) I dare anyone to try. I
>> encourage, I applaud the effort.
>>
>> The world is your oyster from a good spot - take your pick of how to do
>> things.
>>
>> But from this spot, if anyone thinks we are getting out design change by
>> design change, JIRA by JIRA, I'm so sorry. Let's commiserate in a couple
>> years over a beer when you give up on that.
>>
>> - Mark
>>
>> On Sun, Nov 3, 2019 at 4:01 AM Jörn Franke <jornfra...@gmail.com> wrote:
>>
>>> I cannot say anything about the statements, but maybe it could help to
>>> introduce Solr Improvement Proposals (SIP) similar to Kafka Improvement
>>> Proposals (KIP) or Flink Improvement Proposals (FLIP).
>>>
>>> I think they are helpful to facilitate design decisions and
>>> refactoring / redesign decisions. They also would allow us to do it in an
>>> iterative manner without changing everything at once.
>>> The final version could be put in the Git repository of Solr in Markdown,
>>> including figures presenting parts of the design.
>>>
>>> However, for developing them I propose a more inclusive approach where
>>> many people (not only core developers) can easily comment and support,
>>> e.g. Google Docs or similar.
>>>
>>> > On 03.11.2019 at 06:39, Noble Paul <noble.p...@gmail.com> wrote:
>>> >
>>> > Solr has to do more than Lucene. A Lucene user is mostly a developer
>>> > who reads javadocs. A Solr user's touch points are
>>> >
>>> > * Public API
>>> > * Ref guide
>>> > * publicly visible files (in ZK as well as file system)
>>> > * What to see/look for in the log files to debug issues
>>> >
>>> > Then we have more nuanced touch points such as the knowledge base of
>>> > what happens internally in the system when 'X' API is invoked or when
>>> > 'Y' behavior is observed in ZK data.
>>> >
>>> > The problem with delaying the review process till code completion is
>>> > that any changes based on review comments will require a massive amount
>>> > of work.
>>> >
>>> > I don't have an answer to how we achieve it. But I clearly see this
>>> > as a major gap in our development process today.
>>> >
>>> > This discussion may not be relevant in this thread, maybe because no
>>> > behavior is changed at all. We don't know yet.
>>> >
>>> > What I want to believe is Mark is doing the right thing & it's gonna
>>> > help us all in dealing with our operational issues. I don't want to
>>> > interrupt his work with more discussions.
>>> >
>>> > Thank you
>>> >
>>> >> On Sun, Nov 3, 2019 at 3:32 PM David Smiley <david.w.smi...@gmail.com> wrote:
>>> >>
>>> >> Yeah we do a bad job of the things you listed, Noble. :-( My
>>> >> colleagues want pointers to internal docs but the sad reality is there
>>> >> aren't any. You may notice I'm a stickler in my code reviews for
>>> >> requiring javadocs on all top level classes.
>>> >> I think more javadocs and code comments would be very helpful --
>>> >> especially for the major classes. This might help us all and others a
>>> >> lot more. For example I think Lucene does a rather fine job of this for
>>> >> its major classes -- IndexWriter being a good example.
>>> >>
>>> >> ~ David Smiley
>>> >> Apache Lucene/Solr Search Developer
>>> >> http://www.linkedin.com/in/davidwsmiley
>>> >>
>>> >>> On Sat, Nov 2, 2019 at 7:32 PM Noble Paul <noble.p...@gmail.com> wrote:
>>> >>>
>>> >>> Hi,
>>> >>>
>>> >>> I believe there is a consensus on what is wrong with the way we have
>>> >>> built the cluster state and overseer. We need to focus a bit more on
>>> >>> the design aspect. Design, according to me, has the following elements:
>>> >>>
>>> >>> * How does it work?
>>> >>>
>>> >>> * What are the performance characteristics? Can it be done more
>>> >>> efficiently?
>>> >>>
>>> >>> * What are the public touch points?
>>> >>>
>>> >>> ** Which are the files we store in ZK? Are they expected to be
>>> >>> watched always?
>>> >>>
>>> >>> ** Or are they read on demand?
>>> >>>
>>> >>> ** The public APIs. Do they make sense to the user? Can they be
>>> >>> further simplified? How do they compare to the other APIs in the system?
>>> >>>
>>> >>> We, as a community, do a bad job in dealing with these. While we
>>> >>> focus on internal things, these are not discussed before it is too
>>> >>> late. We usually do coding, tests, code review (sometimes) and commit.
>>> >>> This leads to huge technical debt.
>>> >>>
>>> >>> This is not to put blame on one person or a group of people. (I
>>> >>> occasionally see people discussing design issues upfront; I just hope
>>> >>> that becomes the norm.)
>>> >>>
>>> >>> Now, why am I discussing this in this thread?
>>> >>>
>>> >>> While we agree there are problems, we are trying to solve the
>>> >>> problem using the same process we used to create these problems. Again,
>>> >>> I'm not questioning the intent or competence of anyone.
>>> >>> Unless we set the process right, we are doomed to make the same
>>> >>> mistakes again.
>>> >>>
>>> >>> I wholeheartedly endorse any effort to improve SolrCloud/overseer.
>>> >>> At the same time I fail to see us leveraging the collective experience
>>> >>> of our community through meaningful discussion.
>>> >>>
>>> >>> I hope we don't resort to personal attacks and use this as an
>>> >>> opportunity to improve our processes.
>>> >>>
>>> >>> Thanks
>>> >>>
>>> >>> On Sun, Nov 3, 2019, 9:52 AM Scott Blum <dragonsi...@gmail.com> wrote:
>>> >>>>
>>> >>>> Very much agreed. I've been trying to figure out for a long time
>>> >>>> what the point is in having a replica DOWN state that has to be toggled
>>> >>>> (DOWN and then UP!) every time a node restarts, considering that we
>>> >>>> could just combine ACTIVE and `live_nodes` to understand whether a
>>> >>>> replica is available. It's not even foolproof, since kill -9 on a Solr
>>> >>>> node won't mark all the replicas DOWN -- that doesn't happen until the
>>> >>>> node comes back up (perversely).
>>> >>>>
>>> >>>> What would it take to get to a state where restarting a node would
>>> >>>> require a minimal amount of ZK work in most cases?
>>> >>>>
>>> >>>> On Sat, Nov 2, 2019 at 5:44 PM Mark Miller <markrmil...@gmail.com> wrote:
>>> >>>>>
>>> >>>>> Give me a short bit to follow up and I will lay out my case and
>>> >>>>> proposal.
>>> >>>>>
>>> >>>>> Everyone is then free to decide that we need to do something
>>> >>>>> drastic or that I'm wrong and we should just continue down the same
>>> >>>>> road. If that's the case, a lot of your work will get a lot easier and
>>> >>>>> less impeded by me and we will still all be happier. Win win.
>>> >>>>>
>>> >>>>> If we can just not make drastic changes for just a brief week-or-so
>>> >>>>> window, I'll say what I have to say, you guys can judge and do
>>> >>>>> whatever you'd please.
>>> >>>>>
>>> >>>>> - mark
>>> >>>>>
>>> >>>>> On Fri, Nov 1, 2019 at 7:46 PM Mark Miller <markrmil...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Hey All Solr Devs,
>>> >>>>>>
>>> >>>>>> SolrCloud is sick right now. The way low level ZooKeeper is
>>> >>>>>> handled, the Overseer, is a mix and mess of proper exception handling
>>> >>>>>> and super slow startup and shutdown, adding new things all the time
>>> >>>>>> with no concern for performance or proper ordering (which is harder to
>>> >>>>>> tell than you think).
>>> >>>>>>
>>> >>>>>> Our class dependency graph doesn't even work - we just force it.
>>> >>>>>> Sort of. If the whole system doesn't block and choke its way to a
>>> >>>>>> start slow enough, lots of things fail.
>>> >>>>>>
>>> >>>>>> This thing coughs up, you toss stuff into the storm, a good chunk
>>> >>>>>> of the time, what you want eventually comes back without causing too
>>> >>>>>> much damage.
>>> >>>>>>
>>> >>>>>> There are so many things that are off or just plain wrong and the
>>> >>>>>> list is growing and growing. No one is following this, or if you are,
>>> >>>>>> please back me up. This thing will collapse under its own weight.
>>> >>>>>>
>>> >>>>>> So if you want to add yet another state format cluster state or
>>> >>>>>> some other optimization on this junk heap, you can expect me to push
>>> >>>>>> back.
>>> >>>>>>
>>> >>>>>> We should all be embarrassed by the state of things.
>>> >>>>>>
>>> >>>>>> I've got some ideas for addressing them that I'll share soon, but
>>> >>>>>> god, don't keep optimizing a turd in non backcompat Overseer loving
>>> >>>>>> ways. That Overseer is an atrocity.
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> - Mark
>>> >>>>>>
>>> >>>>>> http://about.me/markrmiller
>>> >>>>>
>>> >>>>> --
>>> >>>>> - Mark
>>> >>>>>
>>> >>>>> http://about.me/markrmiller
>>> >
>>> > --
>>> > -----------------------------------------------------
>>> > Noble Paul
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>> --
>> - Mark
>>
>> http://about.me/markrmiller
>
> --
> - Mark
>
> http://about.me/markrmiller

--
- Mark
http://about.me/markrmiller
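
On Scott Blum's point in the thread about combining ACTIVE and `live_nodes` instead of toggling a DOWN state: a minimal sketch of what such a check might look like is below. This is an illustration only, not Solr code. The `Replica` class, its fields, the `is_available` function, and the node-name strings are all hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical stand-in for a replica entry in the cluster state.
    name: str
    node_name: str  # node hosting this replica
    state: str      # last published state, e.g. "ACTIVE" or "RECOVERING"


def is_available(replica: Replica, live_nodes: set) -> bool:
    """Treat a replica as available only if its published state is ACTIVE
    and its hosting node is currently listed in live_nodes."""
    return replica.state == "ACTIVE" and replica.node_name in live_nodes


# Assumed scenario: node2 was killed abruptly, so its replica still says
# ACTIVE in the stored state, but the node has dropped out of live_nodes.
live_nodes = {"node1:8983_solr"}
replicas = [
    Replica("core_node1", "node1:8983_solr", "ACTIVE"),
    Replica("core_node2", "node2:8983_solr", "ACTIVE"),
    Replica("core_node3", "node1:8983_solr", "RECOVERING"),
]

for r in replicas:
    print(r.name, "available" if is_available(r, live_nodes) else "unavailable")
```

Under this assumption, a node that dies without publishing anything is still treated as unavailable, because the check never relies on a DOWN write having happened.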