Re: RFC uuid for log-format

Schimweg, Luca Fri, 06 Sep 2019 00:57:50 -0700

Hey Willy,

thanks for the suggestions, I'll integrate them later or on monday.


UUIDs generally have different versions. I implemented version 4 right now, but 
if you want to we can implement different UUID types. I personally think one is 
enough, because most people won't care if they get an UUID v1, v2, ..., but we 
could definitely do that, for example as uuid(1), uuid(2), .... 
In my opinion, implementing version 4 is the simplest one, which fulfills all 
requirements.

I think providing UUIDs as simple fetch is a good idea, because they do are a 
standardized format that is used quite much. PostgreSQL even has a Datatype 
dedicated to uuids.

Would you generally be interesting in adding such a feature into HAProxy? Then 
I would continue working on that topic.

Best regards,
Luca


On 06.09.19, 05:18, "Willy Tarreau" <w...@1wt.eu> wrote:

    On Thu, Sep 05, 2019 at 05:13:11PM +0000, Schimweg, Luca wrote:
    > Hey,
    > 
    > just for you to get an impression, a quick draft of what a change could 
look like:
    > 
    > 
https://github.com/lucaschimweg/haproxy/commit/5306395c062ac20b6d030d48c48290d5f08db5cc
    
    ( pasted here for discussion)
    
    > char smp_fetch_uuid_res[37];
    
    Note, this is not thread-safe, but you can use the trash from within the
    function instead, then you can simply use chunk_printf(&trash, ...) to
    replace the sprintf().
    
    > static int
    > smp_fetch_uuid(const struct arg *args, struct sample *smp, const char 
*kw, void *private)
    > {
    >     sprintf(smp_fetch_uuid_res, "%8.8x-%4.4x-4%3.3x-%4.4x-%7.7x%5.5x",
    >             (unsigned int) ((random() + 2*random()) % 4294967296), // 
random() only gives 31 bits of randomness, using it twice to get 32 bits
    
    Just being nit-picking, but here you're creating a bias by adding two 
randoms,
    as 25% of the values will be represented twice as frequently as the other 
ones
    (those both even and in the first half). As a general rule one must never 
add
    randoms with overlapping bits.
    
    >             (unsigned int) (random() % 65536), // getting 16 bits
    >             (unsigned int) (random() % 4096),  // getting 12 bits
    >             (unsigned int) (random() % 16384) + 32768,  // getting 14 
bits, setting the first two bits to 1 and 0...
    >             (unsigned int) (random() % 268435456), // getting 28 bits
    >             (unsigned int) (random() % 1048576) // getting 20 bits
    >     );
    
    > 
    >     smp->data.type = SMP_T_STR;
    >     smp->flags = SMP_F_MAY_CHANGE;
    >     smp->data.type = SMP_T_STR;
    >     smp->data.u.str.area = smp_fetch_uuid_res;
    >     smp->data.u.str.data = strlen(smp_fetch_uuid_res);
    
    Here since you know the length in advance, better not measure it again
    (it's returned by sprintf(), or will be automatically calculated by
    chunk_printf).
    
    > I didn't check with the contributing guidelines, etc... yet, but this 
commit
    > would make the feature work. 
    > In the end, Willy has to decide whether we want a feature like this.
    
    The only problem I'm having with this is that we're using a very generic
    name "uuid" for a single very specific implementation that will definitely
    not suit everyone. Even RFC4122 suggests parts involving timestamps and
    host-based identification. What could *possibly* constitute a good first
    step would be to mandate an argument which indicates the type of UUID we
    want to produce. This function would for now support only one type, this
    one, but be easily extended to produce others, which would just require
    one extra line in the doc indicating type and description. Just my two
    cents.
    
    Thanks,
    Willy

Re: RFC uuid for log-format

Reply via email to