Thanks, Karl.

Mark
On Mon, Jul 9, 2018 at 10:11 PM, Karl Williamson <pub...@khwilliamson.com> wrote:

> On 07/08/2018 03:21 AM, Mark Davis ☕️ wrote:
>
>> I'm surprised that the tests for 11.0 passed for a 10.0 implementation,
>> because the following should have triggered a difference for WB. Can you
>> check on this particular case?
>>
>> ÷ 0020 × 0020 ÷ # ÷ [0.2] SPACE (WSegSpace) × [3.4] SPACE (WSegSpace) ÷ [0.3]
>>
>
> I'm one of the people who advocated for this change, and I had already
> tailored our implementation of 10.0 to not break between horizontal white
> space, so it's actually not surprising that this test didn't fail.
>
>>
>>
>> About the testing:
>>
>> The tests are generated so that they go through all the combinations of
>> pairs, and some combinations of triples. The generated test cases use a
>> sample from each partition of characters, to cut down on the file size to a
>> reasonable level. That also means that some changes in the rules don't
>> cause changes in the test results. Because it is not possible to test every
>> combination, there is also provision for additional test cases, such as
>> those at the end of the files, eg:
>>
>> https://unicode.org/Public/11.0.0/ucd/auxiliary/WordBreakTest.html
>> https://unicode.org/Public/10.0.0/ucd/auxiliary/WordBreakTest.html
>>
>> We should extend those each time to make sure we cover combinations that
>> aren't covered by pairs. There were some additions to that end; if they
>> didn't cover enough cases, then we can look at your experience to add more.
>>
>> I can suggest several strategies for further testing:
>>
>> 1. To do a full test, for each row check every combination obtained by
>> replacing each sample character by every other character in its
>> partition. Eg for the above line that would mean testing every <WSegSpace,
>> WSegSpace> sequence.
>>
>> 2. Use a monkey test against ICU. That is, generate random combinations
>> of characters from different partitions and check that ICU and your
>> implementation are in sync.
>>
>> 3. During the beta period, test your previous version with the new test
>> files. If there are no failures, yet there are changes in the rules, then
>> raise that issue during the beta period so we can add tests.
>>
>
> I actually did this, and as I recall, did find some test failures. In
> retrospect, I must have screwed up somehow back then. I was under tight
> deadline pressure, and as a result, did more cursory beta testing than
> normal.
>
>>
>> 4. If possible, during the beta period upgrade your implementation and
>> test against the new and old test files.
>>
>
>
>> Anyone else have other suggestions for testing?
>>
>> Mark
>>
>>
> As an aside, a release or two ago, I implemented SB, and someone
> immediately found a bug, and accused me of releasing software that had not
> been tested at all. He had looked through the test suite and not found
> anything that looked like it was testing that. But he failed to find the
> test file which bundled up all your tests, in a manner he was not
> accustomed to, so it was easy for him to overlook. The bug only manifested
> itself in longer runs of characters than your pairs and triples tested. I
> looked at it, and your SB tests still seemed reasonable, and I should not
> expect a more complete series than you furnished.
>
>>
>>
>> Mark
>> //////
>>
>> On Sun, Jul 8, 2018 at 6:52 AM, Karl Williamson via Unicode <
>> unicode@unicode.org> wrote:
>>
>>     I am working on upgrading from Unicode 10 to Unicode 11.
>>
>>     I used all the new files.
>>
>>     The algorithms for some of the boundaries, like GCB and WB, have
>>     changed so that some of the property values no longer have code
>>     points associated with them.
>>
>>     I ran the tests furnished in 11.0 for these boundaries, without
>>     having changed the algorithms from earlier releases. All passed 100%.
>>
>>     Unless I'm missing something, that indicates that the tests
>>     furnished in 11.0 do not contain instances that exercise these
>>     changes. My guess is that the 10.0 tests were also deficient.
>>
>>     I have been relying on the UCD to furnish tests that have enough
>>     coverage to sufficiently exercise the algorithms that are specified
>>     in UAX 29, but that appears to have been naive on my part.
>>
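
For anyone who wants to try strategy 1 from the thread, here is a rough sketch in Python, not anything taken from ICU or from the UCD tooling. It parses one row in the break-test format (÷ for a boundary, × for no boundary), then expands the row by swapping each sample character for every member of its partition. The names parse_test_line, expand_row, and toy_segment are made up for illustration, and the PARTITION_OF and MEMBERS tables are a small hand-picked subset; a real run would build them from WordBreakProperty.txt and call the implementation under test in place of toy_segment.

```python
from itertools import product

BREAK, NO_BREAK = "\u00f7", "\u00d7"  # the ÷ and × markers used in the test files


def parse_test_line(line):
    """Parse one break-test row into (code_points, breaks).

    breaks[i] is True when a boundary is expected before code_points[i];
    the final entry covers the end of the text. Returns None for rows
    that are blank or contain only a comment.
    """
    data = line.split("#", 1)[0].split()
    if not data:
        return None
    marks, hexes = data[0::2], data[1::2]
    if len(marks) != len(hexes) + 1 or any(m not in (BREAK, NO_BREAK) for m in marks):
        raise ValueError(f"malformed test line: {line!r}")
    return [int(h, 16) for h in hexes], [m == BREAK for m in marks]


def expand_row(code_points, partition_of, members):
    """Yield every sequence obtained by replacing each sample character
    with each member of its partition (strategy 1 in the thread)."""
    pools = [members[partition_of[cp]] for cp in code_points]
    for combo in product(*pools):
        yield list(combo)


# Assumption: an illustrative, partial table. Real data would come from
# WordBreakProperty.txt for the Unicode version being tested.
MEMBERS = {"WSegSpace": [0x0020, 0x1680, 0x3000]}
PARTITION_OF = {0x0020: "WSegSpace"}


def toy_segment(text):
    """Hypothetical stand-in for the implementation under test.

    It breaks everywhere except between two WSegSpace characters, which
    is only enough to exercise the single row quoted in the thread.
    """
    ws = set(MEMBERS["WSegSpace"])
    inner = [not (ord(a) in ws and ord(b) in ws) for a, b in zip(text, text[1:])]
    return [True] + inner + [True]


row = "÷ 0020 × 0020 ÷ # ÷ [0.2] SPACE (WSegSpace) × [3.4] SPACE (WSegSpace) ÷ [0.3]"
code_points, expected = parse_test_line(row)
failures = [
    seq
    for seq in expand_row(code_points, PARTITION_OF, MEMBERS)
    if toy_segment("".join(map(chr, seq))) != expected
]
print(f"{len(failures)} failing expansions")
```

The expected break pattern is reused unchanged for every expansion, which is valid only as long as each replacement character really is in the same partition as the sample it replaces. For rows with several characters from large partitions the product grows quickly, so sampling from it may be more practical than enumerating it in full.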