Even though I understand very little, in the interest of generating discussion I'll jump in with mostly questions. (And jump out if Monty or Peter Creath or someone who understands this better elucidates.)

The first and obvious question is: what do cdparanoia 10.2, cued (https://gna.org/news/?group=cued) or EAC do with this? I had a vague and second-hand understanding that cdparanoia had a way to figure out from the drive when it is better to turn paranoia off entirely. It is possible that cued or EAC compares rips with and without paranoia.

Is this a fair paraphrase of the situation? Because two noncontiguous final blocks are zero (which probably means there is silence), cdparanoia considers the second a jitter of the first, or vice versa. And the other two blocks that were confused are also silence?

Do I have it correct that runs that are uniformly close to 0000 or ffff are both silence, because what is looked for is lots of *changes* in amplitude? (If this is correct, would any repeated 16-bit value also be silence? For example 5555 5555, or even 1234 1234?)

Talk is cheap, but suppose one measured the amount of entropy in the block. One way this could be done is by running a compressor like bz2 and seeing whether the data compresses greatly. (Or perhaps a compression library has the ability to give you the entropy without compressing.) So is the suggestion to discourage jitter matching when the entropy is low?

On Tue, Nov 1, 2011 at 10:45 AM, Blake Jones <[email protected]> wrote:

> Hi all,
>
> I've been digitizing some of my CD collection recently, and have tried
> to figure out when I should be satisfied with a given rip. The
> documentation about the "!"-marks in the status bar suggests that "too
> many" of them might be a sign that errors are sneaking through
> undetected, but I didn't have a good sense of how many "too many" was.
>
> So I looked at the problem a bit more closely.
> As a concrete example, I
> used a recording of Bela Bartok's 6 String Quartets (DG 423 657-2, disc
> 1 of 2), the paranoia library/command from libcdio 0.83, a Samsung
> CD/DVD drive (reported as "TSSTcorp CDDVDW SE-S204N TS00"), and
> OpenSolaris build 134.
>
> I got various complaints around the end of track 6 or the beginning of
> track 7 when doing a full-disc rip, so I zoomed in on the region
> '6:[2:00]-7:[2:00]'; this corresponds to sectors 182485 through 203197.
> When ripping that region, cd-paranoia reliably reported the same five
> problems in the "-e" trace:
>
>     ##: 2 [jitter] @ 193211.1110
>     ##: 2 [jitter] @ 193211.1146
>     ##: 3 [correction] @ 193212.0047
>     ##: 10 [dropped] @ 193212.0105
>     ##: 10 [dropped] @ 193212.0223
>
> These samples are between 13 and 14 seconds from the end of track 6, which
> is mostly fairly silent.
>
> (Note: I've taken the raw sample offsets, such as "227217246", which are
> a count of the number of 16-bit samples since the start of the disc,
> and turned them into sector-and-sample offsets, such as "193211.1110",
> where the first number is the number of 1176-sample sectors and the
> second number is the offset into that sector measured in 16-bit samples.)
>
> I recompiled with -DTRACE_PARANOIA, and found the following match
> operations which correspond (roughly) to the above errors:
>
>     [...]
>         Matched [193188.0031-193195.1144] against [193188.0031-193195.1144]
>         Matched [193196.0031-193203.1144] against [193196.0031-193203.1144]
>         Matched [193204.0031-193211.1144] against [193204.0031-193211.1144]
>     (!) Matched [193211.1018-193211.1095] against [193211.1146-193212.0047]
>         Matched [193212.0031-193219.1144] against [193212.0031-193219.1144]
>     (!) Matched [193219.1085-193219.1150] against [193219.1165-193220.0054]
>         Matched [193220.0031-193228.0000] against [193220.0031-193228.0000]
>     [...]
>
> I wrote another program to extract the raw data from the CD using the
> underlying ioctl. Looking at that data, the two "(!)"-marked matches
> correspond to the following sets of data:
>
>     193211.1146: fffe ffff fffe ffff fffe ffff
>     193211.1152: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1160: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1168: fffe ffff fffe ffff fffe ffff fffe ffff
>     193212.0000: fffe ffff fffe ffff fffe ffff fffe ffff
>     193212.0008: fffe ffff fffe ffff fffe ffff fffe ffff
>     193212.0016: fffe ffff fffe ffff fffe ffff fffe ffff
>     193212.0024: fffe ffff fffe ffff fffe ffff fffe ffff
>     193212.0032: fffe ffff fffe ffff fffe ffff fffe ffff
>     193212.0040: fffe ffff fffe ffff fffe ffff fffe
>
>     193211.1018: fffe ffff fffe ffff fffe ffff
>     193211.1024: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1032: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1040: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1048: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1056: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1064: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1072: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1080: fffe ffff fffe ffff fffe ffff fffe ffff
>     193211.1088: fffe ffff fffe ffff fffe ffff fffe
>
> and:
>
>     193219.1085: 0000 0000 0000
>     193219.1088: 0000 0000 0000 0000 0000 0000 0000 0000
>     193219.1096: 0000 0000 0000 0000 0000 0000 0000 0000
>     193219.1104: 0000 0000 0000 0000 0000 0000 0000 0000
>     193219.1112: 0000 0000 0000 0000 0000 0000 0000 0000
>     193219.1120: 0000 0000 0000 0000 0000 0000 0000 0000
>     193219.1128: 0000 0000 0000 0000 0000 0000 0000 0000
>     193219.1136: 0000 0000 0000 0000 0000 0000 0000 0000
>     193219.1144: 0000 0000 0000 0000 0000 0000
>
>     193219.1165: 0000 0000 0000
>     193219.1168: 0000 0000 0000 0000 0000 0000 0000 0000
>     193220.0000: 0000 0000 0000 0000 0000 0000 0000 0000
>     193220.0008: 0000 0000 0000 0000 0000 0000 0000 0000
>     193220.0016: 0000 0000 0000 0000 0000 0000 0000 0000
>     193220.0024: 0000 0000 0000 0000 0000 0000 0000 0000
>     193220.0032: 0000 0000 0000 0000 0000 0000 0000 0000
>     193220.0040: 0000 0000 0000 0000 0000 0000 0000 0000
>     193220.0048: 0000 0000 0000 0000 0000 0000
>
> These do, of course, match -- but the data being matched is hardly all
> that unique.
>
> So it looks to me as if paranoia is being too confident in declaring a
> match when it's analyzing heavily repetitive data. Given modern CD
> drives and fairly clean CDs, it seems like this behavior might introduce
> more problems than it fixes.
>
> Any thoughts or advice would be appreciated.
>
> [I sent this note to [email protected] yesterday and
> [email protected] on Sunday; neither message has shown up on their
> online mailing list archive yet, so I'm guessing those lists are dead.]
>
> Blake
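
P.S. To make the entropy question a bit more concrete, here is a minimal Python sketch of the two measurements I had in mind. To be clear, this is not anything cdparanoia or libcdio does as far as I know; the function names and the 1.5-bit threshold are made up purely for illustration, and I am assuming the samples are unsigned 16-bit values like the ones in Blake's dump.

```python
import bz2
import math
import struct
from collections import Counter


def shannon_entropy_bits(samples):
    """Empirical Shannon entropy of the sample values, in bits per sample."""
    total = len(samples)
    counts = Counter(samples)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def bz2_ratio(samples):
    """Compressed size over raw size for unsigned 16-bit little-endian samples."""
    raw = struct.pack("<%dH" % len(samples), *samples)
    return len(bz2.compress(raw)) / len(raw)


def looks_repetitive(samples, max_bits=1.5):
    """Hypothetical test: True if the block carries at most max_bits per sample.

    The threshold is an arbitrary illustration, not a tuned value.
    """
    return shannon_entropy_bits(samples) <= max_bits


if __name__ == "__main__":
    silence = [0x0000] * 64              # like the all-zero block in the dump
    near_silence = [0xfffe, 0xffff] * 32  # like the fffe/ffff block
    constant = [0x1234] * 64             # any repeated value behaves the same

    print(shannon_entropy_bits(silence))       # 0.0
    print(shannon_entropy_bits(near_silence))  # 1.0
    print(shannon_entropy_bits(constant))      # 0.0
    print(bz2_ratio(near_silence * 18))        # small for repetitive data
```

Two caveats. Per-sample entropy ignores ordering, so it answers my own question above: a run of 1234 1234 scores exactly like digital silence, which is probably what one wants here, since the issue is how unique the data is and not how loud it is. The bz2 ratio does see ordering, but on a block of a few dozen samples the compressor's fixed overhead dominates, so it would only be meaningful over something closer to a whole sector.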

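
P.P.S. For anyone who wants to line up their own "-e" trace with Blake's numbers, the offset conversion he describes in his note is just a division by 1176 (a 2352-byte audio sector holds 1176 16-bit samples). A small sketch, with helper names of my own invention:

```python
SAMPLES_PER_SECTOR = 1176  # 2352-byte audio sector / 2 bytes per 16-bit sample


def to_sector_offset(raw_sample_offset):
    """Turn a raw 16-bit sample count into 'sector.offset' notation."""
    sector, offset = divmod(raw_sample_offset, SAMPLES_PER_SECTOR)
    return "%d.%04d" % (sector, offset)


def from_sector_offset(text):
    """Inverse of to_sector_offset: 'sector.offset' back to a raw sample count."""
    sector, offset = text.split(".")
    return int(sector) * SAMPLES_PER_SECTOR + int(offset)


if __name__ == "__main__":
    # The example from Blake's note.
    print(to_sector_offset(227217246))  # 193211.1110
```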