Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Eric Blake

On 10/21/2010 10:32 AM, Jim Meyering wrote:

A possible disadvantage of using this approach is that the naive
implementation would perform truncation.  I.e., using a format of
"%X.%3.3:X", a time of N.10099 would be truncated to N.100


I would argue that you WANT truncation by default.  POSIX is explicit 
that if a file system cannot represent the full precision of utimensat, 
that it truncate rather than round.  The same should apply in any other 
context - rounding time into the future is always a potential for 
confusing bugs, and truncation is the best course of action when 
reducing precision.




I'm inclined to say:
If you care, don't truncate in the first place, or use
another tool, e.g., printf, to format the floating point number.


Agreed.

--
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Jim Meyering
Jim Meyering wrote:

> Jim Meyering wrote:
>
>> Eric Blake wrote:
>>
>>> On 10/21/2010 08:09 AM, Andreas Schwab wrote:
 Eric Blake  writes:

> On 10/21/2010 03:22 AM, Andreas Schwab wrote:
>>
>>
>> Jim Meyering   writes:
>>
>>> And besides, with coreutils-8.6 already released, reverting the
>>> change is no longer an option.
>>
>> Why?  I'm pretty sure more breakage will pop up over time.
>
> How would you propose 'fixing' it?

 Add a flag character, eg. %.X.
>>>
>>> %.X is no good, since we already support %.1X (that is, all printf()
>>> flags should keep their existing printf() meaning). But Jim's proposal
>>> of %:X for extended stat info, especially given how date already
>>> supports %:z for extended timezone formatting, makes sense to me.  In
>>> fact, %.3:X might be a nice way to request that the result be
>>> displayed in milliseconds, although Jim's proof of concept patch
>>> didn't cover that aspect.
>>
>> An alternative would be to leave %X, %Y, etc. as the only
>> operators that print seconds-since-epoch, and let the new %:X
>> print the nanoseconds part of the associated time.
>>
>> Then, to get full seconds.nanoseconds, you'd use a format like %X.%:X
>> and if you want only milliseconds, you'd use %X.%3.3:X
>
> Oops.
> I forgot to zero-pad.  Otherwise, with < 100,000,000ns in the first
> case or < 100 in the 2nd, and the above would print invalid numbers.
>
> This is the right way:
>
>   To get full seconds.nanoseconds, you'd use a format like %X.%0:X
>   and if you want only milliseconds, you'd use %X.%03.3:X

A possible disadvantage of using this approach is that the naive
implementation would perform truncation.  I.e., using a format of
"%X.%3.3:X", a time of N.10099 would be truncated to N.100

I'm inclined to say:
If you care, don't truncate in the first place, or use
another tool, e.g., printf, to format the floating point number.



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Jim Meyering
Eric Blake wrote:
> On 10/21/2010 09:33 AM, Jim Meyering wrote:
>>> Then, to get full seconds.nanoseconds, you'd use a format like %X.%:X
>>> and if you want only milliseconds, you'd use %X.%3.3:X
>>
>> Oops.
>> I forgot to zero-pad.  Otherwise, with<  100,000,000ns in the first
>> case or<  100 in the 2nd, and the above would print invalid numbers.
>>
>> This is the right way:
>>
>>To get full seconds.nanoseconds, you'd use a format like %X.%0:X
>>and if you want only milliseconds, you'd use %X.%03.3:X
>
> Do we really want to require that users must manually 0-pad, or can we
> explicitly state that %:X has a minimum width of min(precision,9), and
> 0 padding is only needed if you want a width greater than 9.

That would be better, since I suspect there are very
few uses of %:X that would *not* want leading zeros.
Otherwise, we'd be setting up users of this feature
for an insidious bug.



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Eric Blake

On 10/21/2010 09:33 AM, Jim Meyering wrote:

Then, to get full seconds.nanoseconds, you'd use a format like %X.%:X
and if you want only milliseconds, you'd use %X.%3.3:X


Oops.
I forgot to zero-pad.  Otherwise, with<  100,000,000ns in the first
case or<  100 in the 2nd, and the above would print invalid numbers.

This is the right way:

   To get full seconds.nanoseconds, you'd use a format like %X.%0:X
   and if you want only milliseconds, you'd use %X.%03.3:X


Do we really want to require that users must manually 0-pad, or can we 
explicitly state that %:X has a minimum width of min(precision,9), and 0 
padding is only needed if you want a width greater than 9.


--
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Jim Meyering
Jim Meyering wrote:

> Eric Blake wrote:
>
>> On 10/21/2010 08:09 AM, Andreas Schwab wrote:
>>> Eric Blake  writes:
>>>
 On 10/21/2010 03:22 AM, Andreas Schwab wrote:
>
>
> Jim Meyering   writes:
>
>> And besides, with coreutils-8.6 already released, reverting the
>> change is no longer an option.
>
> Why?  I'm pretty sure more breakage will pop up over time.

 How would you propose 'fixing' it?
>>>
>>> Add a flag character, eg. %.X.
>>
>> %.X is no good, since we already support %.1X (that is, all printf()
>> flags should keep their existing printf() meaning). But Jim's proposal
>> of %:X for extended stat info, especially given how date already
>> supports %:z for extended timezone formatting, makes sense to me.  In
>> fact, %.3:X might be a nice way to request that the result be
>> displayed in milliseconds, although Jim's proof of concept patch
>> didn't cover that aspect.
>
> An alternative would be to leave %X, %Y, etc. as the only
> operators that print seconds-since-epoch, and let the new %:X
> print the nanoseconds part of the associated time.
>
> Then, to get full seconds.nanoseconds, you'd use a format like %X.%:X
> and if you want only milliseconds, you'd use %X.%3.3:X

Oops.
I forgot to zero-pad.  Otherwise, with < 100,000,000ns in the first
case or < 100 in the 2nd, and the above would print invalid numbers.

This is the right way:

  To get full seconds.nanoseconds, you'd use a format like %X.%0:X
  and if you want only milliseconds, you'd use %X.%03.3:X



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Pádraig Brady
On 21/10/10 15:49, Eric Blake wrote:
> On 10/21/2010 08:42 AM, Pádraig Brady wrote:
>>> Or use '.' rather than ':' which also has the
>>> advantage of being backward compat with older stats,
>>
>> To clarify, coreutils<= 8.5 output these timestamps
>> using an int format internally, and so ignored any specified precision.
> 
> Not quite:
> 
> $ stat -c%0.20X .
> 001287615247

Fair enough, but inconsequential to the special case (%.X) we're talking about.

> 
>> coreutils 8.6 treats these timestamps as strings and
>> therefore %.X will not output anything which is a pity,
>> but if we're considering making 8.6 "special" in it's
>> handling of %[WXYZ], then perhaps this is OK.
> 
> I'm still wary of special-casing precision like this; should it behave
> more like printf()s %.d or %.f?

> What you are arguing for is that %X has
> no . or subsecond digits, %.X has nine subsecond digits

Right

> but what about %.*X?

Well that's a separate but related issue.
Currently we're treating like %.s which is a little confusing
as one might guess first that %.f was being used. Using %s
doesn't allow getting millisecond resolution for example.
Also this is another backwards compat issue as we previously used %.j

> At this point, I'm thinking that %:X is nicer than %.X, to avoid
> these types of confusion, and given that date(1) already supports %:z.

Yep, that avoids the issue, but means one can use "%.X" to mean:-
get the best resolution timestamp available, and have it work
on all versions of coreutils (except 8.6 which may be deemed "special").

Jim's suggestion of splitting the nanoseconds away from %[WXYZ]
altogether, elsewhere in this thread, is the most flexible and compatible,
albeit not as discoverable.

cheers,
Pádraig.



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Jim Meyering
Eric Blake wrote:

> On 10/21/2010 08:09 AM, Andreas Schwab wrote:
>> Eric Blake  writes:
>>
>>> On 10/21/2010 03:22 AM, Andreas Schwab wrote:


 Jim Meyering   writes:

> And besides, with coreutils-8.6 already released, reverting the
> change is no longer an option.

 Why?  I'm pretty sure more breakage will pop up over time.
>>>
>>> How would you propose 'fixing' it?
>>
>> Add a flag character, eg. %.X.
>
> %.X is no good, since we already support %.1X (that is, all printf()
> flags should keep their existing printf() meaning). But Jim's proposal
> of %:X for extended stat info, especially given how date already
> supports %:z for extended timezone formatting, makes sense to me.  In
> fact, %.3:X might be a nice way to request that the result be
> displayed in milliseconds, although Jim's proof of concept patch
> didn't cover that aspect.

An alternative would be to leave %X, %Y, etc. as the only
operators that print seconds-since-epoch, and let the new %:X
print the nanoseconds part of the associated time.

Then, to get full seconds.nanoseconds, you'd use a format like %X.%:X
and if you want only milliseconds, you'd use %X.%3.3:X



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Eric Blake

On 10/21/2010 08:42 AM, Pádraig Brady wrote:

Or use '.' rather than ':' which also has the
advantage of being backward compat with older stats,


To clarify, coreutils<= 8.5 output these timestamps
using an int format internally, and so ignored any specified precision.


Not quite:

$ stat -c%0.20X .
001287615247


coreutils 8.6 treats these timestamps as strings and
therefore %.X will not output anything which is a pity,
but if we're considering making 8.6 "special" in it's
handling of %[WXYZ], then perhaps this is OK.


I'm still wary of special-casing precision like this; should it behave 
more like printf()s %.d or %.f?  What you are arguing for is that %X has 
no . or subsecond digits, %.X has nine subsecond digits, but what about 
%.*X?  At this point, I'm thinking that %:X is nicer than %.X, to avoid 
these types of confusion, and given that date(1) already supports %:z.


--
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Pádraig Brady
On 21/10/10 15:18, Pádraig Brady wrote:
> On 21/10/10 15:01, Jim Meyering wrote:
>> Andreas Schwab wrote:
>>> Jim Meyering  writes:
>>>
 And besides, with coreutils-8.6 already released, reverting the
 change is no longer an option.
>>>
>>> Why?  I'm pretty sure more breakage will pop up over time.
>>
>> Hmm... I see what you mean.
>> Anyone using a distribution with coreutils-8.5 or older will
>> think it's fine to treat $(stat -c %X) as an integer with no
>> decimal or trailing fraction.
>>
>> Is it worthwhile to create a new format, say %...:X (and same for Y and Z)
>> that expands to seconds.nanoseconds and to revert %...X to the old
>> seconds-only semantics?
>>
>> That would make it so people could use stat's %X %Y %Z portably
>> while new scripts can still get nanosecond accuracy when needed.
>>
>> Here's PoC code:
>>
>>   $ for i in %X %:X %Y %:Y %Z %:Z; do src/stat -c $i .; done
>>   1256665918
>>   1256665918.441784225
> 
> Or use '.' rather than ':' which also has the
> advantage of being backward compat with older stats,

To clarify, coreutils <= 8.5 output these timestamps
using an int format internally, and so ignored any specified precision.

coreutils 8.6 treats these timestamps as strings and
therefore %.X will not output anything which is a pity,
but if we're considering making 8.6 "special" in it's
handling of %[WXYZ], then perhaps this is OK.

cheers,
Pádraig.



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Pádraig Brady
On 21/10/10 15:01, Jim Meyering wrote:
> Andreas Schwab wrote:
>> Jim Meyering  writes:
>>
>>> And besides, with coreutils-8.6 already released, reverting the
>>> change is no longer an option.
>>
>> Why?  I'm pretty sure more breakage will pop up over time.
> 
> Hmm... I see what you mean.
> Anyone using a distribution with coreutils-8.5 or older will
> think it's fine to treat $(stat -c %X) as an integer with no
> decimal or trailing fraction.
> 
> Is it worthwhile to create a new format, say %...:X (and same for Y and Z)
> that expands to seconds.nanoseconds and to revert %...X to the old
> seconds-only semantics?
> 
> That would make it so people could use stat's %X %Y %Z portably
> while new scripts can still get nanosecond accuracy when needed.
> 
> Here's PoC code:
> 
>   $ for i in %X %:X %Y %:Y %Z %:Z; do src/stat -c $i .; done
>   1256665918
>   1256665918.441784225

Or use '.' rather than ':' which also has the
advantage of being backward compat with older stats,



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Andreas Schwab
Eric Blake  writes:

> On 10/21/2010 03:22 AM, Andreas Schwab wrote:
>>
>>
>> Jim Meyering  writes:
>>
>>> And besides, with coreutils-8.6 already released, reverting the
>>> change is no longer an option.
>>
>> Why?  I'm pretty sure more breakage will pop up over time.
>
> How would you propose 'fixing' it?

Add a flag character, eg. %.X.

> And why give the user less information than what is available?

Because it breaks things.  This is bad in general, especially if it can
easily be avoided.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Eric Blake

On 10/21/2010 08:09 AM, Andreas Schwab wrote:

Eric Blake  writes:


On 10/21/2010 03:22 AM, Andreas Schwab wrote:



Jim Meyering   writes:


And besides, with coreutils-8.6 already released, reverting the
change is no longer an option.


Why?  I'm pretty sure more breakage will pop up over time.


How would you propose 'fixing' it?


Add a flag character, eg. %.X.


%.X is no good, since we already support %.1X (that is, all printf() 
flags should keep their existing printf() meaning). But Jim's proposal 
of %:X for extended stat info, especially given how date already 
supports %:z for extended timezone formatting, makes sense to me.  In 
fact, %.3:X might be a nice way to request that the result be displayed 
in milliseconds, although Jim's proof of concept patch didn't cover that 
aspect.


--
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Jim Meyering
Andreas Schwab wrote:
> Jim Meyering  writes:
>
>> And besides, with coreutils-8.6 already released, reverting the
>> change is no longer an option.
>
> Why?  I'm pretty sure more breakage will pop up over time.

Hmm... I see what you mean.
Anyone using a distribution with coreutils-8.5 or older will
think it's fine to treat $(stat -c %X) as an integer with no
decimal or trailing fraction.

Is it worthwhile to create a new format, say %...:X (and same for Y and Z)
that expands to seconds.nanoseconds and to revert %...X to the old
seconds-only semantics?

That would make it so people could use stat's %X %Y %Z portably
while new scripts can still get nanosecond accuracy when needed.

Here's PoC code:

  $ for i in %X %:X %Y %:Y %Z %:Z; do src/stat -c $i .; done
  1256665918
  1256665918.441784225
  1287476685
  1287476685.450022280
  1287476685
  1287476685.450022280

In retrospect, %W should get the same treatment.

diff --git a/src/stat.c b/src/stat.c
index fabbc17..dfb4dcb 100644
--- a/src/stat.c
+++ b/src/stat.c
@@ -463,7 +463,7 @@ human_time (struct timespec t)
 }

 static char * ATTRIBUTE_WARN_UNUSED_RESULT
-epoch_time (struct timespec t)
+epoch_time (bool print_ns, struct timespec t)
 {
   static char str[INT_STRLEN_BOUND (time_t) + sizeof ".N"];
   /* Note that time_t can technically be a floating point value, such
@@ -475,10 +475,20 @@ epoch_time (struct timespec t)
  converting to struct tm just to use nstrftime (str, len, "%s.%N",
  tm, 0, t.tv_nsec) is pointless, since nstrftime would have to
  convert back to seconds as time_t.  */
-  if (TYPE_SIGNED (time_t))
-sprintf (str, "%" PRIdMAX ".%09ld", (intmax_t) t.tv_sec, t.tv_nsec);
+  if (print_ns)
+{
+  if (TYPE_SIGNED (time_t))
+sprintf (str, "%" PRIdMAX ".%09ld", (intmax_t) t.tv_sec, t.tv_nsec);
+  else
+sprintf (str, "%" PRIuMAX ".%09ld", (uintmax_t) t.tv_sec, t.tv_nsec);
+}
   else
-sprintf (str, "%" PRIuMAX ".%09ld", (uintmax_t) t.tv_sec, t.tv_nsec);
+{
+  if (TYPE_SIGNED (time_t))
+sprintf (str, "%" PRIdMAX, (intmax_t) t.tv_sec);
+  else
+sprintf (str, "%" PRIuMAX, (uintmax_t) t.tv_sec);
+}
   return str;
 }

@@ -539,7 +549,8 @@ out_file_context (char *pformat, size_t prefix_len, char 
const *filename)

 /* Print statfs info.  Return zero upon success, nonzero upon failure.  */
 static bool ATTRIBUTE_WARN_UNUSED_RESULT
-print_statfs (char *pformat, size_t prefix_len, char m, char const *filename,
+print_statfs (char *pformat, size_t prefix_len, unsigned int m,
+  char const *filename,
   void const *data)
 {
   STRUCT_STATVFS const *statfsbuf = data;
@@ -713,7 +724,7 @@ print_mount_point:

 /* Print stat info.  Return zero upon success, nonzero upon failure.  */
 static bool
-print_stat (char *pformat, size_t prefix_len, char m,
+print_stat (char *pformat, size_t prefix_len, unsigned int m,
 char const *filename, void const *data)
 {
   struct stat *statbuf = (struct stat *) data;
@@ -820,26 +831,35 @@ print_stat (char *pformat, size_t prefix_len, char m,
 if (t.tv_nsec < 0)
   out_string (pformat, prefix_len, "-");
 else
-  out_string (pformat, prefix_len, epoch_time (t));
+  out_string (pformat, prefix_len, epoch_time (0, t));
   }
   break;
 case 'x':
   out_string (pformat, prefix_len, human_time (get_stat_atime (statbuf)));
   break;
 case 'X':
-  out_string (pformat, prefix_len, epoch_time (get_stat_atime (statbuf)));
+  out_string (pformat, prefix_len, epoch_time (0, get_stat_atime 
(statbuf)));
+  break;
+case 'X' + 256:
+  out_string (pformat, prefix_len, epoch_time (1, get_stat_atime 
(statbuf)));
   break;
 case 'y':
   out_string (pformat, prefix_len, human_time (get_stat_mtime (statbuf)));
   break;
 case 'Y':
-  out_string (pformat, prefix_len, epoch_time (get_stat_mtime (statbuf)));
+  out_string (pformat, prefix_len, epoch_time (0, get_stat_mtime 
(statbuf)));
+  break;
+case 'Y' + 256:
+  out_string (pformat, prefix_len, epoch_time (1, get_stat_mtime 
(statbuf)));
   break;
 case 'z':
   out_string (pformat, prefix_len, human_time (get_stat_ctime (statbuf)));
   break;
 case 'Z':
-  out_string (pformat, prefix_len, epoch_time (get_stat_ctime (statbuf)));
+  out_string (pformat, prefix_len, epoch_time (0, get_stat_ctime 
(statbuf)));
+  break;
+case 'Z' + 256:
+  out_string (pformat, prefix_len, epoch_time (1, get_stat_ctime 
(statbuf)));
   break;
 case 'C':
   fail |= out_file_context (pformat, prefix_len, filename);
@@ -897,7 +917,8 @@ print_esc_char (char c)
Return zero upon success, nonzero upon failure.  */
 static bool ATTRIBUTE_WARN_UNUSED_RESULT
 print_it (char const *format, char const *filename,
-  bool (*print_func) (char *, size_t, char, char const *, void const 
*),
+  bool (*print_func) (char 

Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Eric Blake

On 10/21/2010 03:22 AM, Andreas Schwab wrote:



Jim Meyering  writes:


And besides, with coreutils-8.6 already released, reverting the
change is no longer an option.


Why?  I'm pretty sure more breakage will pop up over time.


How would you propose 'fixing' it? By burning three more % specifiers, 
which are already coming close to a scarce resource?  (Note that %W is 
completely new with coreutils 8.6, so no one has ever had a short %W to 
deal with, so it is just %[XYZ]).  And why give the user less 
information than what is available?  %X was first implemented prior to 
POSIX 2008, back when subsecond resolution was not mandatory; but now 
that POSIX requires nanosecond information (even if the file system 
itself has less granularity), I see no reason to hide that fact, even if 
a few scripts need to be adjusted to be more robust to potential 
nanosecond information.


--
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Andreas Schwab
Jim Meyering  writes:

> And besides, with coreutils-8.6 already released, reverting the
> change is no longer an option.

Why?  I'm pretty sure more breakage will pop up over time.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Jim Meyering
Andreas Schwab wrote:
> Jim Meyering  writes:
>
>> Can you point to any others?
>
> Tramp, mkinitrd on openSUSE 11.3.

I found one, you mention two more.  IMHO, that's not too bad.
And besides, with coreutils-8.6 already released, reverting the
change is no longer an option.



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Andreas Schwab
Jim Meyering  writes:

> Can you point to any others?

Tramp, mkinitrd on openSUSE 11.3.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: [coreutils] Re: [PATCH 2/2] stat: print timestamps to full resolution

2010-10-21 Thread Jim Meyering
Andreas Schwab wrote:
> Eric Blake  writes:
>
>> * src/stat.c (epoch_time): New function.
>> (print_stat): Use it for %[WXYZ].
>> * NEWS: Document this.
>> * tests/touch/60-seconds: Adjust test to match.
>> ---
>>
>> It bugs me that %x has more information than %X in 'stat --format',
>> especially, since we don't support any format modifiers for getting
>> at the additional information.  We're already incompatible with
>> BSD stat(1) format modifiers, and there is no standard for stat(1),
>> so I wasn't too worried about changing the meaning of existing
>> modifiers rather than burning new letters just for the nanosecond
>> portions.  And now that POSIX 2008 requires nanonsecond resolution
>> in stat(2), you could argue that we should always be displaying it.
>
> There are a lot of scripts that assume whole integer output from
> %[WXYZ].  It would have been better to make this optional and
> non-default.

Hi Andreas,

I was concerned about that risk, but searched for uses of stat
that would be affected and found so few that it seemed ok:

http://thread.gmane.org/gmane.comp.gnu.coreutils.general/161/focus=270

Can you point to any others?