Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-13 Thread Junio C Hamano
Carlo Marcelo Arenas Belón   writes:

> There are still some more possible improvements around this code but
> they are orthogonal to this change :
>
> * migrate to approxidate_careful or parse_expiry_date
> * maybe make sure only approxidate are used for expiration
>
> Changes in v2:
> * improved commit message as suggested by Eric
> * failsafe against time_t truncation as suggested by Junio
>
> -- >8 --
> Subject: [PATCH v2 2/2] read-cache: use time specific types instead of
>  unsigned long
>
> b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
> introduced get_shared_index_expire_date using unsigned long to track
> the modification times of a shared index.
>
> dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
> shows why that might be problematic so move to timestamp_t/time_t.
>
> if time_t can't represent a valid time keep the indexes for failsafe

I am not sure if this "failsafe" is a good idea in the first place.
FWIW, I had a total opposite in mind when I suggested it.

The user gave you a timestamp_t value, telling you that anything
before that time is too old to matter and can be discarded.

And date_overflows() helper tells you that that the given expiry
date is more in the future than _any_ possible timestamp expressed
in time_t.  

So the result of running stat() on the shared index file _will_ have
a timestamp that is older than the specified expiry.

Which means that the user essentially is telling you that any shared
index that is not referenced can immediately be expired, similar to
saying --expire=now, no?

I kind of sort of would understand if we (1) read the configured
expiry, (2) checked it with date_overflows(), and (3) told the user
that it is an impossible condition, i.e. an error, to use such a
large timestamp far in the future.  Then the user has a chance to
update the configuration to a more reasonable time---even the most
aggressive one does not need to specify anything more in the future
than "now", and that will not "date_overflows()" for 20 years or so.

> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  read-cache.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index 7b1354d759..7d322f11c8 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
>  
>  static const char *shared_index_expire = "2.weeks.ago";
>  
> -static unsigned long get_shared_index_expire_date(void)
> +static timestamp_t get_shared_index_expire_date(void)
>  {
> - static unsigned long shared_index_expire_date;
> + static timestamp_t shared_index_expire_date;
>   static int shared_index_expire_date_prepared;
>  
>   if (!shared_index_expire_date_prepared) {
> @@ -2643,12 +2643,12 @@ static unsigned long 
> get_shared_index_expire_date(void)
>  static int should_delete_shared_index(const char *shared_index_path)
>  {
>   struct stat st;
> - unsigned long expiration;
> + time_t expiration;
> + timestamp_t t = get_shared_index_expire_date();
>  
> - /* Check timestamp */
> - expiration = get_shared_index_expire_date();
> - if (!expiration)
> + if (!t || date_overflows(t))
>   return 0;
> + expiration = t;
>   if (stat(shared_index_path, ))
>   return error_errno(_("could not stat '%s'"), shared_index_path);
>   if (st.st_mtime > expiration)


Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-13 Thread Carlo Arenas
On Tue, Nov 13, 2018 at 12:49 AM Johannes Schindelin
 wrote:
> On Mon, 12 Nov 2018, Carlo Marcelo Arenas Belón wrote:
>>
> > if time_t can't represent a valid time keep the indexes for failsafe
>
> Is this sentence incomplete? What are those "indexes"?

the indexes that are created when core.splitIndex = true

the code that was modified looks for a configuration parameter (or the
default) that says something like "2 weeks ago" and that then converts
that into a timestamp that can be compared with the modification time
for the files that compose that index and then decides to delete
anyone that is older than the "expiration date".

since time_t is used for the stat.mtime there is a chance that the
timestamp might overflow its size, so if an overflow is detected we
assume the value to be equivalent to "never" and keep the files.

note this scenario will only matter around 2038 and it should be fixed
by then as part of the Year 2038 problem if we still care about 32-bit
UNIX by then, 32-bit Windows has until next century.

Carlo


Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-13 Thread Johannes Schindelin
Hi,

On Mon, 12 Nov 2018, Carlo Marcelo Arenas Belón wrote:

> There are still some more possible improvements around this code but
> they are orthogonal to this change :
> 
> * migrate to approxidate_careful or parse_expiry_date
> * maybe make sure only approxidate are used for expiration
> 
> Changes in v2:
> * improved commit message as suggested by Eric
> * failsafe against time_t truncation as suggested by Junio
> 
> -- >8 --
> Subject: [PATCH v2 2/2] read-cache: use time specific types instead of
>  unsigned long
> 
> b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
> introduced get_shared_index_expire_date using unsigned long to track
> the modification times of a shared index.
> 
> dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
> shows why that might be problematic so move to timestamp_t/time_t.
> 
> if time_t can't represent a valid time keep the indexes for failsafe

Is this sentence incomplete? What are those "indexes"?

Thanks,
Johannes

> 
> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  read-cache.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index 7b1354d759..7d322f11c8 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
>  
>  static const char *shared_index_expire = "2.weeks.ago";
>  
> -static unsigned long get_shared_index_expire_date(void)
> +static timestamp_t get_shared_index_expire_date(void)
>  {
> - static unsigned long shared_index_expire_date;
> + static timestamp_t shared_index_expire_date;
>   static int shared_index_expire_date_prepared;
>  
>   if (!shared_index_expire_date_prepared) {
> @@ -2643,12 +2643,12 @@ static unsigned long 
> get_shared_index_expire_date(void)
>  static int should_delete_shared_index(const char *shared_index_path)
>  {
>   struct stat st;
> - unsigned long expiration;
> + time_t expiration;
> + timestamp_t t = get_shared_index_expire_date();
>  
> - /* Check timestamp */
> - expiration = get_shared_index_expire_date();
> - if (!expiration)
> + if (!t || date_overflows(t))
>   return 0;
> + expiration = t;
>   if (stat(shared_index_path, ))
>   return error_errno(_("could not stat '%s'"), shared_index_path);
>   if (st.st_mtime > expiration)
> -- 
> 2.19.1.856.g8858448bb
> 
> 

[PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Carlo Marcelo Arenas Belón
There are still some more possible improvements around this code but
they are orthogonal to this change :

* migrate to approxidate_careful or parse_expiry_date
* maybe make sure only approxidate are used for expiration

Changes in v2:
* improved commit message as suggested by Eric
* failsafe against time_t truncation as suggested by Junio

-- >8 --
Subject: [PATCH v2 2/2] read-cache: use time specific types instead of
 unsigned long

b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
introduced get_shared_index_expire_date using unsigned long to track
the modification times of a shared index.

dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
shows why that might be problematic so move to timestamp_t/time_t.

if time_t can't represent a valid time keep the indexes for failsafe

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 read-cache.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7b1354d759..7d322f11c8 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
 
 static const char *shared_index_expire = "2.weeks.ago";
 
-static unsigned long get_shared_index_expire_date(void)
+static timestamp_t get_shared_index_expire_date(void)
 {
-   static unsigned long shared_index_expire_date;
+   static timestamp_t shared_index_expire_date;
static int shared_index_expire_date_prepared;
 
if (!shared_index_expire_date_prepared) {
@@ -2643,12 +2643,12 @@ static unsigned long get_shared_index_expire_date(void)
 static int should_delete_shared_index(const char *shared_index_path)
 {
struct stat st;
-   unsigned long expiration;
+   time_t expiration;
+   timestamp_t t = get_shared_index_expire_date();
 
-   /* Check timestamp */
-   expiration = get_shared_index_expire_date();
-   if (!expiration)
+   if (!t || date_overflows(t))
return 0;
+   expiration = t;
if (stat(shared_index_path, ))
return error_errno(_("could not stat '%s'"), shared_index_path);
if (st.st_mtime > expiration)
-- 
2.19.1.856.g8858448bb



Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Junio C Hamano
Johannes Schindelin  writes:

>> Given that the function returns the value obtained from
>> approxidate(), which is approxidate_careful() in disguise, time_t is
>> not as appropriate as timestamp_t, no?
>> 
>> IOW, what if time_t were narrower than timestamp_t?
>
> Rght. From the patch, I had assumed that the return type of
> `approxidate()` is `time_t`, but it is `timestamp_t`.

Yes, but if we dig a bit deeper, it turns out that the return value
of this function is used at only one place, to be compared with the
.st_mtime field.

So for this change to truly be consisent, not just the function
needs to return timestamp_t, but also its sole caller needs to check
if its return value exceeds the maximum span that is expressible
with the platform's time_t (and if so, treat the expiration to be
"infinity- never expire"), or something like that.


Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Johannes Schindelin
Hi,

On Mon, 12 Nov 2018, Junio C Hamano wrote:

> Carlo Marcelo Arenas Belón   writes:
> 
> > b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
> > introduced get_shared_index_expire_date using unsigned long to track
> > the modification times of a shared index.
> >
> > dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
> > shows why that might problematic so move to time_t instead.
> >
> > Signed-off-by: Carlo Marcelo Arenas Belón 
> > ---
> >  read-cache.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/read-cache.c b/read-cache.c
> > index 7b1354d759..5525d8e679 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state 
> > *istate,
> >  
> >  static const char *shared_index_expire = "2.weeks.ago";
> >  
> > -static unsigned long get_shared_index_expire_date(void)
> > +static time_t get_shared_index_expire_date(void)
> >  {
> > -   static unsigned long shared_index_expire_date;
> > +   static time_t shared_index_expire_date;
> > static int shared_index_expire_date_prepared;
> >  
> > if (!shared_index_expire_date_prepared) {
> 
> After this line, the post-context reads like this:
> 
>   git_config_get_expiry("splitindex.sharedindexexpire",
> _index_expire);
>   shared_index_expire_date = approxidate(shared_index_expire);
>   shared_index_expire_date_prepared = 1;
>   }
> 
>   return shared_index_expire_date;
> 
> Given that the function returns the value obtained from
> approxidate(), which is approxidate_careful() in disguise, time_t is
> not as appropriate as timestamp_t, no?
> 
> IOW, what if time_t were narrower than timestamp_t?

Rght. From the patch, I had assumed that the return type of
`approxidate()` is `time_t`, but it is `timestamp_t`.

Ciao,
Johannes

> 
> 
> > @@ -2643,7 +2643,7 @@ static unsigned long 
> > get_shared_index_expire_date(void)
> >  static int should_delete_shared_index(const char *shared_index_path)
> >  {
> > struct stat st;
> > -   unsigned long expiration;
> > +   time_t expiration;
> >  
> > /* Check timestamp */
> > expiration = get_shared_index_expire_date();
> 

Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Junio C Hamano
Carlo Marcelo Arenas Belón   writes:

> b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
> introduced get_shared_index_expire_date using unsigned long to track
> the modification times of a shared index.
>
> dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
> shows why that might problematic so move to time_t instead.
>
> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  read-cache.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index 7b1354d759..5525d8e679 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
>  
>  static const char *shared_index_expire = "2.weeks.ago";
>  
> -static unsigned long get_shared_index_expire_date(void)
> +static time_t get_shared_index_expire_date(void)
>  {
> - static unsigned long shared_index_expire_date;
> + static time_t shared_index_expire_date;
>   static int shared_index_expire_date_prepared;
>  
>   if (!shared_index_expire_date_prepared) {

After this line, the post-context reads like this:

git_config_get_expiry("splitindex.sharedindexexpire",
  _index_expire);
shared_index_expire_date = approxidate(shared_index_expire);
shared_index_expire_date_prepared = 1;
}

return shared_index_expire_date;

Given that the function returns the value obtained from
approxidate(), which is approxidate_careful() in disguise, time_t is
not as appropriate as timestamp_t, no?

IOW, what if time_t were narrower than timestamp_t?


> @@ -2643,7 +2643,7 @@ static unsigned long get_shared_index_expire_date(void)
>  static int should_delete_shared_index(const char *shared_index_path)
>  {
>   struct stat st;
> - unsigned long expiration;
> + time_t expiration;
>  
>   /* Check timestamp */
>   expiration = get_shared_index_expire_date();


Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Eric Sunshine
On Mon, Nov 12, 2018 at 3:41 AM Carlo Marcelo Arenas Belón
 wrote:
> b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
> introduced get_shared_index_expire_date using unsigned long to track
> the modification times of a shared index.
>
> dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
> shows why that might problematic so move to time_t instead.

s/might/& be/

> Signed-off-by: Carlo Marcelo Arenas Belón 


[PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Carlo Marcelo Arenas Belón
b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
introduced get_shared_index_expire_date using unsigned long to track
the modification times of a shared index.

dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
shows why that might problematic so move to time_t instead.

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 read-cache.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7b1354d759..5525d8e679 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
 
 static const char *shared_index_expire = "2.weeks.ago";
 
-static unsigned long get_shared_index_expire_date(void)
+static time_t get_shared_index_expire_date(void)
 {
-   static unsigned long shared_index_expire_date;
+   static time_t shared_index_expire_date;
static int shared_index_expire_date_prepared;
 
if (!shared_index_expire_date_prepared) {
@@ -2643,7 +2643,7 @@ static unsigned long get_shared_index_expire_date(void)
 static int should_delete_shared_index(const char *shared_index_path)
 {
struct stat st;
-   unsigned long expiration;
+   time_t expiration;
 
/* Check timestamp */
expiration = get_shared_index_expire_date();
-- 
2.19.1.856.g8858448bb