Re: RFC uuid for log-format

Willy Tarreau Thu, 05 Sep 2019 20:18:36 -0700

On Thu, Sep 05, 2019 at 05:13:11PM +0000, Schimweg, Luca wrote:
> Hey,
> 
> just for you to get an impression, a quick draft of what a change could look 
> like:
> 
> https://github.com/lucaschimweg/haproxy/commit/5306395c062ac20b6d030d48c48290d5f08db5cc


( pasted here for discussion)

> char smp_fetch_uuid_res[37];

Note, this is not thread-safe, but you can use the trash from within the
function instead, then you can simply use chunk_printf(&trash, ...) to
replace the sprintf().

> static int
> smp_fetch_uuid(const struct arg *args, struct sample *smp, const char *kw, 
> void *private)
> {
>     sprintf(smp_fetch_uuid_res, "%8.8x-%4.4x-4%3.3x-%4.4x-%7.7x%5.5x",
>             (unsigned int) ((random() + 2*random()) % 4294967296), // 
> random() only gives 31 bits of randomness, using it twice to get 32 bits

Just being nit-picking, but here you're creating a bias by adding two randoms,
as 25% of the values will be represented twice as frequently as the other ones
(those both even and in the first half). As a general rule one must never add
randoms with overlapping bits.

>             (unsigned int) (random() % 65536), // getting 16 bits
>             (unsigned int) (random() % 4096),  // getting 12 bits
>             (unsigned int) (random() % 16384) + 32768,  // getting 14 bits, 
> setting the first two bits to 1 and 0...
>             (unsigned int) (random() % 268435456), // getting 28 bits
>             (unsigned int) (random() % 1048576) // getting 20 bits
>     );

> 
>     smp->data.type = SMP_T_STR;
>     smp->flags = SMP_F_MAY_CHANGE;
>     smp->data.type = SMP_T_STR;
>     smp->data.u.str.area = smp_fetch_uuid_res;
>     smp->data.u.str.data = strlen(smp_fetch_uuid_res);

Here since you know the length in advance, better not measure it again
(it's returned by sprintf(), or will be automatically calculated by
chunk_printf).

> I didn't check with the contributing guidelines, etc... yet, but this commit
> would make the feature work. 
> In the end, Willy has to decide whether we want a feature like this.

The only problem I'm having with this is that we're using a very generic
name "uuid" for a single very specific implementation that will definitely
not suit everyone. Even RFC4122 suggests parts involving timestamps and
host-based identification. What could *possibly* constitute a good first
step would be to mandate an argument which indicates the type of UUID we
want to produce. This function would for now support only one type, this
one, but be easily extended to produce others, which would just require
one extra line in the doc indicating type and description. Just my two
cents.

Thanks,
Willy

Re: RFC uuid for log-format

Reply via email to