Sorav Bansal <[EMAIL PROTECTED]> writes: > I have now ported look.c to the latest CVS repository too. In doing > so, I noticed that the new keycompare() function no longer calls > 'trim_trailing_blanks()'. Its not clear to me, why these calls are not > necessary anymore. Is that because if the keys match till the last > non-blank character, then you allow the number of trailing blanks to > be considered as a tie-breaker?
Yes. POSIX requires this and other 'sort' implementations do it that way, so we thought it better to be compatible. Please see the 2004-04-25 ChangeLog entry. > These calls are needed for 'look' though. For now, I am adding the > calls and commenting with XXX in compare.h. I guess they'll need to be conditional on whether it's 'look' rather than 'sort'. >>I don't see the need for the -B option, or the -l option. Can't >>we omit them? >> > I thought '-l' could be useful for looking into unsorted files. '-B' > can be used for setting the buffer size which affects performance, > especially when doing linear search. But doesn't 'grep' do the same job that look -l would? Perhaps I'm missing something here. > However, I agree that they are not particularly necessary. OK, then (unless I'm missing something) let's please omit them for now. We can always put them in later if there's a need. Speaking of which, the most important thing needed for this change is documentation. We need a section in doc/coreutils.texi. I'd put it just after the 'tsort invocation' section. Also, a couple of lines in NEWS. >>Isn't fseeko kind of a loser, performance-wise? Wouldn't it >>be faster if you used lseek, and took block boundaries into >>account? I suppose this is more of a tuning thing tho. >> > How do you propose using block boundaries into account? I am not sure > what you mean, so I am leaving the fseeko() calls in place right now. For POSIX-conformance reasons, fseeko is required to do an underlying lseek to the exact same position; in effect, buffering is turned off. Thus you'll often get better performance by seeking to buffer boundaries, even if you use fseeko. Often it's easier just to use lseek. This is just a performance issue, not a correctness issue, so we shouldn't let it slow us down too much. Better to get it right (first) and then speed it up. (But I thought I'd mention it anyway. :-) > This time, I have tried my best to match the style in > coreutils. Please point me to specific areas where I am not conforming > to the coding style and I would be glad to correct them. These are all fairly minor, but they'll have to be done anyway so we appreciate your help. Comments should look like this: /* Use English sentences. Break long lines like this. Put two spaces after each period. */ Prefer "const *" to "*". Don't use "register". Don't put a space before ";". Don't parenthesize the expression X in "return X;". Use x2realloc or x2nrealloc to double a buffer's size, instead of checking for size overflow by hand. Prefer 'bool' to 'int' where either will do. Thanks again for helping with this! _______________________________________________ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils