Update of bug #46815 (project findutils):

                  Status:                 Invalid => Need Info
             Open/Closed:                  Closed => Open
    _______________________________________________________

Follow-up Comment #4:

Yes, both Sebastian and Dale are correct.  The 4.1.7 version of the
manual page is much less clear, though consistent with the current
behaviour:

       -size n[bckw]
              File uses n units of space.  The units are 512-byte
              blocks by default or if `b' follows n, bytes if `c'
              follows n, kilobytes if `k' follows n, or 2-byte words
              if `w' follows n.  The size does not count indirect
              blocks, but it does count blocks in sparse files that
              are not actually allocated.

The current and previous code behave similarly.  Here is the 4.1.7
code for illustration:

boolean
pred_size (pathname, stat_buf, pred_ptr)
     char *pathname;
     struct stat *stat_buf;
     struct predicate *pred_ptr;
{
  unsigned long f_val;

  f_val = (stat_buf->st_size + pred_ptr->args.size.blocksize - 1)
    / pred_ptr->args.size.blocksize;
  switch (pred_ptr->args.size.kind)
    {
    case COMP_GT:
      if (f_val > pred_ptr->args.size.size)
        return (true);
      break;
    case COMP_LT:
      if (f_val < pred_ptr->args.size.size)
        return (true);
      break;
    case COMP_EQ:
      if (f_val == pred_ptr->args.size.size)
        return (true);
      break;
    }
  return (false);
}

As you can see, this does round up.

Let's take a quick look at the POSIX requirements for the -size test
at http://pubs.opengroup.org/onlinepubs/009695399/utilities/find.html :-

-size n[c]
    The primary shall evaluate as true if the file size in bytes,
    divided by 512 and rounded up to the next integer, is n. If n is
    followed by the character 'c', the size shall be in bytes.

So POSIX requires that -size -1 be false for a 500-byte file, and it
also introduces a suffix c for bytes.

As is quite common for GNU tools, every chance to make a
potentially-useful extension is eventually taken.  In this case, that
meant introducing alternative suffixes beyond the mandatory "c".  The
"k" suffix denotes units of 1024 bytes, for example.
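To make the consequence of that rounding concrete, here is a minimal
sketch of the same arithmetic outside of find.  The function names are
hypothetical (they are not findutils identifiers); only the formula is
taken from the 4.1.7 pred_size quoted above.

```c
#include <assert.h>

/* Number of blocksize-byte units needed to hold st_size bytes, rounded
   up.  Same formula as the f_val computation in the 4.1.7 pred_size.  */
static unsigned long
units_rounded_up (unsigned long st_size, unsigned long blocksize)
{
  assert (blocksize != 0);      /* guard the division */
  return (st_size + blocksize - 1) / blocksize;
}

/* Hypothetical helper: would "-size -N" (the COMP_LT case) match a file
   of st_size bytes when the unit is blocksize bytes?  */
static int
size_lt_matches (unsigned long st_size, unsigned long blocksize,
                 unsigned long n)
{
  return units_rounded_up (st_size, blocksize) < n;
}
```

For a 500-byte file, units_rounded_up gives 1 whether the unit is 512,
1024 or 1048576 bytes, so size_lt_matches (500, blocksize, 1) is false
in every case.  Only an empty file rounds to 0 units and so satisfies
"less than 1".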

It's not surprising that somebody thought that the behavior for
dealing in k should be quite similar to that for dealing in units of
512 bytes (especially if they were using one of the several systems
where the system block size for things like ls -s is in fact 1024
bytes).  But this is clearly surprising for unit suffixes like m, for
which it is pretty clear that the user is not thinking in terms of how
many blocks the file occupies on the storage layer.  So the existing
behavior is kind of understandable, but it is obviously confusing for
most users.

The canonical version of the bug report on this, one might say, is
https://savannah.gnu.org/bugs/?12162.  There are others (see
https://savannah.gnu.org/bugs/?group=findutils&func=browse&set=custom&msort=0&status_id[]=3&resolution_id[]=0&submitted_by[]=0&assigned_to[]=0&category_id[]=0&bug_group_id[]=0&severity[]=0&summary[]=-size&details[]=&advsrch=0&msort=0&chunksz=50&spamscore=5&report_id=101&sumORdet=0&morder=severity%3C&sumOrdet=0&order=date#results).

I think that particular discussion got side-tracked, at the end, by
the introduction of time tests.  The problems of rounding with those
tests have largely been obviated by the introduction of tests like
-newermt, where the timestamp is specified directly in absolute rather
than relative terms (and no rounding occurs).

I've been in favour of providing a more sensible test for a long time;
the problem has always been how to spell the new usage and describe
its semantics.  The use of > and < prefixes is attractive, but doomed
by the use of those characters by the shell.  Yes, the user could
quote them to avoid redirection, but this would clearly be a source of
confusion for less experienced users.

The alternatives that seem attractive to me are Nigel McNie's proposal
to use a new test, '-filesize', or something along the lines of Martin
Steigerwald's three-word variant (i.e. -size lt 20M being the sane,
no-rounding, replacement for -size -20M).
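
As a rough illustration of what "no-rounding" could mean under either
spelling, the sketch below compares the byte count directly against N
units.  Everything here is an assumption made for illustration: the
enum, the function name and the treatment of the "eq" case are
hypothetical, and none of it is existing findutils code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical comparison kinds for a "-size lt 20M" style test.  */
enum exact_cmp { EXACT_LT, EXACT_EQ, EXACT_GT };

/* Compare the size in bytes against n units of unit_bytes each, with
   no rounding of the file size.  */
static int
exact_size_matches (uint64_t st_size, enum exact_cmp kind,
                    uint64_t n, uint64_t unit_bytes)
{
  uint64_t limit;

  /* Refuse arguments whose product would overflow.  */
  assert (unit_bytes == 0 || n <= UINT64_MAX / unit_bytes);
  limit = n * unit_bytes;

  switch (kind)
    {
    case EXACT_LT:
      return st_size < limit;
    case EXACT_EQ:
      return st_size == limit;
    case EXACT_GT:
      return st_size > limit;
    }
  return 0;
}
```

Under these assumed semantics a 500-byte file does satisfy "lt 1M",
and a file of exactly 20 * 1048576 bytes does not satisfy "lt 20M",
which is what most users seem to expect.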
Both of those options have the nice property that they're likely
POSIX-compliant in the sense that POSIX provides no required meaning
for those constructs (though -filesize is I suppose more obviously a
GNU extension).

Let's re-open the discussion about what to call the "sane" alternative
to -size, and implement it this time.

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?46815>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/