Dear Community, I have waited a day to reply to the sudden wave of feedback regarding the rebuild task list and plan. In this way I hope to ensure that my reply is constructive and useful. I urge others to adopt a similar approach.
It would be nice to be able to claim that it was gratifying to see such a sudden surge of interest in a topic for which it has, until now, been difficult to drum up much enthusiasm. Those who have participated in the process of getting us to the point where we have a plan and an emerging toolset deserve our thanks, and they have mine. As for those who have chosen to snipe, often in non-specific terms, at this plan, imperfect though it may be: I think they should consider how things get done around here. Clearly they would have done a better job, and it is unfortunate that they did not step forward in a timely fashion and do so.

All this being said, allow me to address those criticisms that have been made in specific enough terms to allow it. There is a risk I will leave out something important, but something tells me I'll hear about that soon enough. I will politely request that follow-ups be made to rebuild@openstreetmap.org, a list that is open to all interested parties and that exists for precisely this kind of discussion. I personally will assume that any follow-up not sent to rebuild@ is unproductive punditry that need not be addressed in actual planning.

"The plan should be postponed until after April 1st"

To this I will simply state that deadlines are a Good Thing when you are trying to get something done. Until we have completed this task it is good that we work to some deadlines, even if they have to evolve in the light of circumstances. If a safe rebuild, or a portion of it, really has to slip beyond 1st April then that will have to happen. There is, however, no virtue in ensuring that we slip by a token few days just to prove that the world will not end. But be assured that the plan is a living document that will not ignore emerging realities.

"There should be _many_ more test runs and validation of the edits made"

The more testing the better; this is clear.
I hope that those calling for improvements here have read, understood, and fed back weaknesses found in the test suite: https://github.com/zerebubuth/openstreetmap-license-change (all files test*). Unless you prefer to systematically verify every object in the planet file, this suite provides the single greatest chance of successful data migration. We also need spot checking of data changes made to a real API database, and this is planned. It will need manpower, of course, something that is still lacking in this process.

Let me recap the planned nature of these tests. As can be seen from the plan, this weekend is to see a test run on a subset or subsets of the data set on the dev server, these subsets being chosen as representative of many of the important test cases (and probably with regard to the locations where volunteer data checkers have the local knowledge to most easily spot unexpected behaviour). As this is a fast-moving process, the plan does not yet reflect the fact that we also hope to commission the new database server and install a full API database. The redaction process will then also be commenced on this box (we have a choice whether to test the offline or online redaction), which will give us the fairest benchmark (and the most random distribution of test cases) possible. Even during the running of this full-planet test it will be possible to view and validate the decisions being made.

Until we run these tests we don't know how we will have to react to what we find. If we discover that data is vanishing all over the place and wrong redactions are happening, that will oblige much greater caution than if everything behaves well. The benchmarking will also be revealing.
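To give a rough feel for what "validating the decisions" means, here is a deliberately simplified sketch. The real decision logic lives in the openstreetmap-license-change suite linked above; the names here (`Version`, `decide`, the one-rule policy that an object version is clean only while every contributor has agreed) are invented for illustration and are much cruder than the actual rules.

```python
# Illustrative only: a toy version of the kind of per-object decision
# the rebuild bot makes. The real rules are in the test suite at
# github.com/zerebubuth/openstreetmap-license-change.
from dataclasses import dataclass

@dataclass
class Version:
    version: int
    user: str

def decide(history, agreed_users):
    """Return ("keep", n) for the newest version touched only by agreeing
    users, or ("redact_all", None) if no version is clean. `history` is
    ordered oldest to newest."""
    clean = None
    tainted = False
    for v in history:
        if v.user in agreed_users and not tainted:
            clean = v
        else:
            tainted = True  # later versions build on a non-agreed edit
    if clean is None:
        return ("redact_all", None)
    return ("keep", clean.version)

# A node edited first by an agreeing mapper, then by a non-agreeing one:
history = [Version(1, "alice"), Version(2, "bob")]
print(decide(history, agreed_users={"alice"}))  # -> ('keep', 1)
```

A spot check of a real run is then just this in reverse: pick an object, look at its history and the agreement status of its contributors, and confirm the bot reached the decision you would have reached by hand.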
If we discover that live redaction on a non-loaded API seems to suggest (random figure with no basis, used for effect) a whole month of database churning, that might indicate that an offline redaction is much smarter (consider the scope for conflicts, or just plain degradation of API performance). But we have to perform the tests first; after that, if we can see that our projections are flawed, we will need to address this.

"This can be done without downtime and should be"

Two points need to be made about this, and both are hinted at above. Firstly, _if_ we wish to use the opportunity of the licence change to migrate to the new server (and database version), something Matt is keen to do, this will require at least some downtime. Secondly, a separate discussion must be had about the principle of live redaction versus offline redaction (the latter is assumed to be quicker and avoids certain theoretical issues such as a performance hit and redactions conflicting with real edits). We still lack the benchmarks to make a truly informed decision between the live and offline options. The plan, as many of you have mentioned, assumes that the offline approach is the safest path to a swift completion. Maybe we will learn more this weekend.

"Downtime should have more notice"

Yes, it should. Maybe we will manage to shorten it and/or move it to a more acceptable time. There are not many of us and we are under pressure.

"The pace of the plan fails to heed the scope for error"

The less one knows about the rebuild process, the scarier this aspect will seem. Put briefly, the entire process is reversible. No version history is being deleted from the database; only the current version records will be altered or marked deleted. In the event that we make an abject mess of the rebuild, we can simply roll forward the historical versions of each object to recover the state we were in at the start of the process.
We can do this selectively, per object, or across the entire data set if an unrecoverable snag is exposed. Clearly it would be distressing, annoying, and personally very embarrassing to have to do so, and if we take three days of downtime all in the name of getting back to where we started, nobody will claim that is a good thing. But in assessing the scope for error it is important to acknowledge that what we are risking is disruption, not data loss.

"1st April has been held out to mappers as an agree-by date. But now we are starting early"

It's a somewhat fair point. Many of us will have little sympathy with drama queens who have left a lot of their peers guessing, and let them take the trouble to remap their stuff, only to theatrically agree at the last minute. Guys, if you're reading: just effing agree and be done with it, or refuse if you like; whatever point you were making has been made. But this is also an issue for the lost-mappers campaign and all the excellent email chasing that a lot of you have heroically been doing.

We have, as a community, tended to see April 1st as a step change: throw a switch at midnight and it's done. This has allowed some people (and I've even done this myself) to suggest to mappers that they have until 1st April to agree, whereas others may have looked forward to switching the attribution on their tile server on the morning of the 1st. Can we get a win-win scenario here? Again, I really want to see the benchmarks from this weekend. I also really want to avoid forced procrastination for no reward. As per the plan document, there is some scope for reprocessing of objects based on "new information". See the stern warnings, though: it is very much a measure of last resort. If we go for an offline rebuild there may be slightly more scope to handle some level of (deserving) late agreement before we switch back to read-write, though as a very secondary coder in this effort I am not in a position to promise this.
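To make the reversibility point concrete, here is a minimal sketch of the recovery path. The table shapes and names (`history_table`, `current_table`, `roll_forward`, `recover`) are invented for this illustration and do not match the real API database schema; the only thing the sketch is meant to show is that, because history rows are never deleted, the pre-rebuild current record of any object can always be reconstituted from them.

```python
# Minimal sketch of the rollback guarantee: the redaction run only
# touches current-version records, so the history rows (untouched)
# are enough to restore the starting state. All names are illustrative.

def roll_forward(history_rows):
    """Given every historical row for one object (dicts with at least a
    'version' key), return the row that should become current again:
    simply the highest-numbered version."""
    return max(history_rows, key=lambda row: row["version"])

def recover(current_table, history_table):
    """Restore each object's current record from its history, undoing
    whatever the rebuild altered or marked deleted. Works per object,
    so selective recovery is just a smaller history_table."""
    for obj_id, rows in history_table.items():
        current_table[obj_id] = roll_forward(rows)
    return current_table

# After a bad rebuild, node 42's current record lost its tags:
history = {42: [{"version": 1, "tags": {}},
                {"version": 2, "tags": {"amenity": "pub"}}]}
current = {42: {"version": 2, "tags": {}}}   # damaged by the rebuild
print(recover(current, history)[42])         # version 2, tags restored
```

Running this across every object is what "roll forward the historical versions" means above; running it on a hand-picked set of ids is the selective, per-object variant.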
This mail is long already, so apologies if I have missed something important. I look forward to seeing a lot of you on rebuild@, where we can identify any gaps together.

Dermot

--
Igaühel on siin oma laul ja ma oma ei leiagi üles
("Everyone here has a song of their own, and I just can't find mine")

_______________________________________________
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk