Re: "Better" mod_unique_id

Ian Holsman Tue, 29 Apr 2008 02:54:03 -0700

Hi Konstantin.

I'm about to look at the same issue for my employer.

For my version I was planning on using apr_uuid_get, which uses the uuid_create / uuid_generate function to generate a unique value.


Have you looked at this function?

regards
Ian

Konstantin Chuguev wrote:

Hi,
I'm developing a solution generating unique IDs for the requests to websites that are not only clustered but also geographically dispersed. This implies the following:
- the website's virtual host section on each Apache server has the same ServerName, which is mapped by DNS to different IP addresses using various methods: geo-proximity, round-robin, etc.;
- the virtual host's IP address is normally, but not necessarily, *;
- the actual IP address the Apache listens to for this virtual host is normally, but not necessarily, an intranet address (behind a load balancer).
After analysing the format of the ID generated by mod_unique_id, and reading the module's source code, I have a feeling that this module has serious flaws if used in my situation. No offence to the authors, I'm sure the module serves its purpose just right for the majority of its users. But as it seems that it doesn't do this in my case, I thought I'd better ask if someone knows why.
I understand that the module is relatively old and likely has been ported from a pre-2.0 version, when no APR library existed, and this might explain its design. I'd be glad if someone could either confirm this or explain why it has been done like that.
Now to the point of my question. The unique_id_rec structure that contains the binary representation of the unique ID consists of the following fields:
    unsigned int stamp;
    unsigned int in_addr;
    unsigned int pid;
    unsigned short counter;
    unsigned int thread_index;
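To make the field list above concrete, here is a minimal sketch that declares it as a C struct and compares the sum of the field sizes with what the compiler reports for the whole struct. Only the field names and types come from the message; the helper names unique_id_packed_size and unique_id_padding are hypothetical. On common ABIs with a 4-byte unsigned int, the fields add up to 18 bytes while sizeof is 20 because of alignment padding after counter, but that is platform-dependent.

```c
#include <assert.h>
#include <stddef.h>

// Field list as quoted in the message.
typedef struct {
    unsigned int stamp;
    unsigned int in_addr;
    unsigned int pid;
    unsigned short counter;
    unsigned int thread_index;
} unique_id_rec;

// Hypothetical helper: bytes needed if the fields are stored back to back.
size_t unique_id_packed_size(void)
{
    return sizeof(((unique_id_rec *)0)->stamp)
         + sizeof(((unique_id_rec *)0)->in_addr)
         + sizeof(((unique_id_rec *)0)->pid)
         + sizeof(((unique_id_rec *)0)->counter)
         + sizeof(((unique_id_rec *)0)->thread_index);
}

// Hypothetical helper: bytes the compiler adds for alignment.
size_t unique_id_padding(void)
{
    size_t packed = unique_id_packed_size();
    assert(sizeof(unique_id_rec) >= packed);
    return sizeof(unique_id_rec) - packed;
}
```

Any encoder that wants a compact ID has to copy the fields one by one rather than the struct as a whole, otherwise the padding bytes end up in the output.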
1. Why use unsigned int timestamp when there exists apr_time_t, which is 64 bit and seems to be at least 1 microsecond accurate? Surely there is unsigned short counter, which helps if there is more than one request coming to the same IP address / PID / thread per second, but still I can hardly see this as a better design.
2. Why use unsigned int pid plus unsigned int thread_index if there exists long r->connection->id? thread_index is in fact produced by doing htonl((unsigned int)r->connection->id), but MPMs seem to ensure the child_id is included there already! While it is just 4 bytes long compared to the 8-byte pid/thread_index combination, still it is guaranteed to be unique among all worker threads of the Apache server in the system. And I don't think this particular field needs converting to the network byte order.
3. Using unsigned int in_addr with the server-side IPv4 address works well in the single cluster in the IPv4 network only. What if only IPv6 is being used in the intranet? What if multiple dispersed clusters with exactly the same intranet IP addressing schemes serve the same website? Please correct me if I'm wrong but I think the following structure would represent the unique website more correctly:
- union {struct in_addr, struct in6_addr} local_ip_addr: the IP address of the local side of the HTTP connection;
- union {struct in_addr, struct in6_addr} dns_ip_addr: one (any?) of the IP addresses that are mapped to the website's domain name in DNS.
The latter can be omitted if the former IP address is public.
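On point 1: htonl only handles 32-bit values, so a 64-bit timestamp needs its own conversion to network byte order. The sketch below shows one portable way to do that. It uses uint64_t as a stand-in for apr_time_t (a 64-bit microsecond count) so that it compiles without APR; put_be64 and get_be64 are hypothetical helper names, not APR functions.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: store a 64-bit value most significant byte first
// (network byte order), independent of the host's endianness.
void put_be64(unsigned char out[8], uint64_t value)
{
    assert(out != 0);
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(value & 0xFF);
        value >>= 8;
    }
}

// Hypothetical helper: read the value back from network byte order.
uint64_t get_be64(const unsigned char in[8])
{
    assert(in != 0);
    uint64_t value = 0;
    for (int i = 0; i < 8; i++) {
        value = (value << 8) | in[i];
    }
    return value;
}
```

Storing the stamp big-endian keeps byte-wise comparison of IDs consistent with time order, which is one reason to convert it even where uniqueness alone would not require it.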
Does anyone see any flaws in the design where the following structure is used?
    apr_time_t stamp;    // 8 bytes, converted to network byte order
    long connection_id; // size depends on architecture: normally 4 or 8 bytes, doesn't need htonl
    union {struct in_addr, struct in6_addr} local_ip_addr; // 4 to 16 bytes
    [union {struct in_addr, struct in6_addr} dns_ip_addr;] // 0 to 16 bytes
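The structure above is written as shorthand; a compilable version needs named union members and some way to tell which address family is stored. The following is only a sketch under those assumptions: the family field, the v4/v6 member names, PROPOSED_ID_MAX and proposed_id_pack are all hypothetical, int64_t stands in for apr_time_t, and the optional dns_ip_addr is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

// Hypothetical compilable form of the proposed record.
typedef struct {
    int64_t stamp;          // stand-in for apr_time_t (microseconds)
    long connection_id;     // 4 or 8 bytes depending on architecture
    int family;             // assumption: AF_INET or AF_INET6
    union {
        struct in_addr v4;  // 4 bytes, already in network byte order
        struct in6_addr v6; // 16 bytes
    } local_ip_addr;
} proposed_id_rec;

// Largest packed size: stamp + connection_id + IPv6 address.
#define PROPOSED_ID_MAX (8 + sizeof(long) + sizeof(struct in6_addr))

// Hypothetical packer: writes the fields back to back into out, which
// must hold at least PROPOSED_ID_MAX bytes. Returns the bytes written.
size_t proposed_id_pack(const proposed_id_rec *rec, unsigned char *out)
{
    size_t n = 0;
    uint64_t t = (uint64_t)rec->stamp;

    // Timestamp in network byte order, most significant byte first.
    for (int shift = 56; shift >= 0; shift -= 8) {
        out[n++] = (unsigned char)(t >> shift);
    }

    // Connection id copied in host byte order, as the message suggests.
    memcpy(out + n, &rec->connection_id, sizeof rec->connection_id);
    n += sizeof rec->connection_id;

    if (rec->family == AF_INET6) {
        memcpy(out + n, &rec->local_ip_addr.v6, sizeof(struct in6_addr));
        n += sizeof(struct in6_addr);
    } else {
        assert(rec->family == AF_INET);
        memcpy(out + n, &rec->local_ip_addr.v4, sizeof(struct in_addr));
        n += sizeof(struct in_addr);
    }
    return n;
}
```

One consequence worth noting: the packed length varies with the address family and with sizeof(long), so IDs produced on different architectures or address families will not all have the same size.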
Comments and suggestions are appreciated.

Konstantin Chuguev
Software Developer
Clickstream Technologies PLC, 58 Davies Street, London, W1K 5JF, Registered in England No. 3774129
