On Wed, Jun 05, 2013 at 09:27:00AM -0700, Russ Allbery wrote:
> Ryan Kavanagh <r...@debian.org> writes:
> > Lintian is overly eager with spelling-error-in-binary. In one
> > package, the string
> >     I9\$ teH
>
> I wonder if we should require the "word" be a minimum length to trigger
> that tag. Maybe four or five characters?

That would work, although it would miss legitimate misspellings of short
words like "the" if applied across the board. A more complicated
approach, which may or may not be worth the additional effort, would be
to check a word of three characters if and only if it appears in a
string containing at least two recognised words of length >= 4. For
example, neither of

    I9\$ teH
    I9\$ teH I7%53 teH %753192

would get matched because the context is gibberish, but

    I am going to hte fair

would, since the context is English text (detected by the words "going"
and "fair"). This might be overkill though, and your solution of just
ignoring words of length less than 4 would probably be sufficient.

Best wishes,
Ryan

--
|_)|_/  Ryan Kavanagh      | Debian Developer
| \| \  http://ryanak.ca/  | GPG Key 4A11C97A
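
For illustration, the context heuristic described above could be sketched
roughly as follows. This is Python rather than Lintian's Perl, and it is
only a sketch under stated assumptions: KNOWN_WORDS, CORRECTIONS,
MIN_LENGTH, MIN_CONTEXT_WORDS and find_misspellings are hypothetical names
and placeholder data, not Lintian's actual word lists or API.

```python
import re

# Hypothetical placeholder data; a real checker would load a full
# dictionary and its own table of known misspellings.
KNOWN_WORDS = {"going", "fair", "package", "string", "text"}
CORRECTIONS = {"teh": "the", "hte": "the", "recieve": "receive"}

MIN_LENGTH = 4         # words shorter than this need English context
MIN_CONTEXT_WORDS = 2  # recognised long words required as context


def find_misspellings(text):
    """Return (misspelling, correction) pairs found in text.

    Short candidates are reported only when the string also contains
    at least MIN_CONTEXT_WORDS recognised words of length >= MIN_LENGTH.
    """
    # Treat runs of ASCII letters as words; match case-insensitively.
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]

    # Count recognised long words (each occurrence counts once).
    context = sum(
        1 for w in words if len(w) >= MIN_LENGTH and w in KNOWN_WORDS
    )
    looks_like_english = context >= MIN_CONTEXT_WORDS

    hits = []
    for w in words:
        if w not in CORRECTIONS:
            continue
        # Long misspellings are always reported; short ones only in context.
        if len(w) >= MIN_LENGTH or looks_like_english:
            hits.append((w, CORRECTIONS[w]))
    return hits


if __name__ == "__main__":
    for sample in (r"I9\$ teH", "I am going to hte fair"):
        print(repr(sample), find_misspellings(sample))
```

With these placeholder lists, the gibberish strings produce no hits, while
"I am going to hte fair" reports "hte". Setting MIN_CONTEXT_WORDS to a
value that can never be reached approximates the simpler rule of ignoring
every word shorter than four characters.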