keith-turner commented on PR #2665: URL: https://github.com/apache/accumulo/pull/2665#issuecomment-1140181288
> I think the design of this change is great so far but it is a major new feature, with a drastic increase in complexity, touching major parts of Accumulo (scans, API, metadata, configuration, scripts, and introduces another new server). I don't think this should get merged into 2.1. The complexity of this change on top of all the changes already in 2.1 will only further delay the release of 2.1. Main already has many major new features (ZK Prop Store, Overhaul of Compactions code, External Compactions, AMPLE, Master Rename, New Tracing, New Metrics, New SPI, Root Table change) not to mention the 1,130+ tickets marked done for 2.1.

I have been working really hard on testing this new feature over the past few weeks. Tonight I finally got to a point where I was seeing scan servers work really well on a small cluster (12 scan servers, 3 tablet servers, 3 datanodes). I was running 600-ish random concurrent greps over random large ranges in a single tablet that scanned lots of data and returned little data. To get to that point, #2700, #2744, and #2745 needed to be debugged and fixed. Each of these took a lot of time to find and fix. The interesting thing is that these problems were not all specific to scan servers; they were a result of all of the complex changes you mentioned and just happened to be found during intense scan server testing.

I think that, with or without scan servers, 2.1 needs a lot of testing to shake out more latent problems. I think it would be nice to stop adding new features and release a 2.1.0-beta-1 and use it to do that testing and refining. If we did that, I would like to see scan servers in 2.1.0 as the last big feature.

The feature has gotten a good bit of review. While working on this, @dlmarion and I reviewed each other's work in addition to this review. If anyone is interested, @dlmarion and I could give a talk about the concepts on Slack sometime in order to help guide anyone reviewing.
In addition to reviews, the feature does need more testing. Next week I hope to scale up the scan server testing to larger clusters now that I am getting small clusters to work well. If anyone wants to help, I would be happy to show you how to run scan servers and some new tests in Kubernetes. The source code for the testing I have been doing is [here](https://github.com/keith-turner/accumulo-testing/tree/scan-server-testing/sstest). It's not documented, but I would be happy to offer guidance on how to run it (it needs Kubernetes+Accumulo+ZooKeeper+DFS). It supports multiple different test scenarios and I am running through those.

After that, for 2.1.0 in general, I would like to set up a test scenario w/ continuous bulk import+external compactions+scan servers+heavy query load and compare that to continuous bulk import+tservers only+heavy query load. This testing could possibly be done w/ a 2.1.0-beta-1 release. If anyone is interested in collaborating on that, let me know; I could definitely use help.

I do agree that this new feature could break existing functionality unrelated to scan servers. I am optimistic that we can mitigate this with good testing though.
