keith-turner commented on PR #2665:
URL: https://github.com/apache/accumulo/pull/2665#issuecomment-1140181288

   > I think the design of this change is great so far but it is a major new 
feature, with a drastic increase in complexity, touching major parts of 
Accumulo (scans, API, metadata, configuration, scripts, and introduces another 
new server). I don't think this should get merged into 2.1. The complexity of 
this change on top of all the changes already in 2.1 will only further delay 
the release of 2.1. Main already has many major new features (ZK Prop Store, 
Overhaul of Compactions code, External Compactions, AMPLE, Master Rename, New 
Tracing, New Metrics, New SPI, Root Table change) not to mention the 1,130+ 
tickets marked done for 2.1.
   
   I have been working really hard on testing this new feature over the past 
few weeks.  Tonight I finally got to a point where I was seeing scan servers 
work really well on a small cluster (12 scan servers, 3 tablet servers, 3 
datanodes).  I was running about 600 random concurrent greps over random large 
ranges in a single tablet that scanned lots of data and returned little data.   
To get to that point #2700, #2744, and #2745 needed to be debugged and fixed.  
Each of these took a lot of time to find and fix. The interesting thing is that 
these problems were not all specific to scan servers; they were a result of all 
of the complex changes you mentioned and just happened to be found during 
intense scan server testing.   I think with or without scan servers that 2.1 
needs a lot of testing to shake out more latent problems.  I think it would be 
nice to stop adding new features and release a 2.1.0-beta-1 and use it to do 
that testing and refining.  If we did that I would like to see scan servers 
in 2.1.0 as the last big feature.   The feature has gotten a good 
bit of review.  While working on this @dlmarion  and I reviewed each other's 
work in addition to this review.  If anyone is interested, @dlmarion  and I 
could give a talk about the concepts on Slack sometime in order to help guide 
anyone reviewing.
   
   In addition to reviews, the feature does need more testing.   Next week I 
hope to scale up the scan server testing to larger clusters now that I am 
getting small clusters to work well.  If anyone wants to help I would be happy 
to show you how to run scan servers and some new tests in Kubernetes.  The source 
code for the testing I have been doing is 
[here](https://github.com/keith-turner/accumulo-testing/tree/scan-server-testing/sstest),
 it's not documented but I would be happy to offer guidance on how to run it 
(needs Kubernetes+Accumulo+ZooKeeper+DFS).  It supports multiple different test 
scenarios and I am running through those. 
   
   After that, for 2.1.0 in general I would like to set up a test scenario w/ 
continuous bulk import+external compactions+scan servers+heavy query load and 
compare that to continuous bulk import+tservers only+heavy query load.  This 
testing could possibly be done w/ a 2.1.0-beta-1 release.  If anyone is 
interested in collaborating on that let me know; I could definitely use the help.
   
   I do agree that this new feature could break existing functionality 
unrelated to scan servers.  I am optimistic that we can mitigate this with good 
testing though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
