Re: Cheap and unique

2002-05-07 Thread jjore

I would have sent both to the client. The sequence would be *the* id and 
is guaranteed to be unique by the database (or whatever else is around 
that does this reliably). The idea is that by combining the random secret 
with the ID and sending the digest along with it, the ID number can't just 
be incremented or fooled with. The digest isn't unique but it would keep 
the unique bit from being fiddled with.
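
A minimal sketch of that scheme, assuming Digest::MD5; the get_secret() 
helper, standing in for wherever the server keeps its current random bits, 
is hypothetical:

use Digest::MD5 qw(md5_hex);

# Issue: pair the database-generated sequence number with a
# tamper-evident digest of secret + id.
sub issue_token {
    my ($id) = @_;                # $id comes from the database sequence
    my $secret = get_secret();    # hypothetical: the server-side random bits
    return ($id, md5_hex($secret . $id));
}

# Verify: recompute the digest; without the secret, a client can't
# forge a matching digest for an incremented id.
sub verify_token {
    my ($id, $digest) = @_;
    return md5_hex(get_secret() . $id) eq $digest;
}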

That said, I'm just a paranoid person regarding security (especially for 
my outside-of-work work at http://www.greentechnologist.org) and I 
wouldn't want to keep the random bits around for too long, to prevent them 
from being brute-forced. I'm imagining that someone with a fast computer, 
the ID number, and knowledge of how that combines with the randomness for 
the digest source might be able to locate the bits just by trying a lot of 
them. I would expire them after a while just to prevent that: if there is 
a 15-minute session, new random bits are generated every five minutes. New 
sessions would be tied to the most recent random data. The random data 
might be expired at the session timeout. This assumes that I'm tracking 
which random bits are associated with the session to verify that the 
digest was OK. All that means is that the randomness is valid as long as 
the session is still active and normally expires after a time period 
otherwise. Perhaps other people would get by just keeping a static secret 
on the server. That may be overkill for many people, but it might not be 
for the apps I'm working with.
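
One way that rotation could look, as a sketch (the window sizes and the 
in-process %secrets store are illustrative; a real deployment would need 
the store shared across Apache children, and better entropy than rand()):

use Digest::MD5 qw(md5_hex);

my %secrets;                  # window number => random bits
my $WINDOW   = 5 * 60;        # fresh random bits every five minutes
my $LIFETIME = 15 * 60;       # keep old bits for the session timeout

sub current_secret {
    my $now = int(time() / $WINDOW);
    # drop windows older than the session timeout
    delete $secrets{$_} for grep { $_ <= $now - $LIFETIME / $WINDOW } keys %secrets;
    # generate bits for the current window on first use
    $secrets{$now} ||= join '', map { chr(int(rand(256))) } 1 .. 16;
    return $secrets{$now};
}

# A digest is accepted if it matches under any secret still alive, so
# existing sessions keep working while the bits beneath them rotate.
sub digest_ok {
    my ($id, $digest) = @_;
    return scalar grep { md5_hex($secrets{$_} . $id) eq $digest } keys %secrets;
}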

Joshua b. Jore
Domino Developer by day, political by night




James G Smith [EMAIL PROTECTED]
05/06/2002 01:45 PM
Please respond to JGSmith

 
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Re: Cheap and unique


[EMAIL PROTECTED] wrote:
I've been following this conversation and I'd like to clarify whether my 
idea (since I and others want to do this as well) would be to use an 
incrementing counter for uniqueness. Then also store a bit of secret 
randomness, concatenate both values together and create a digest hash. 
That hash would be sent along with the sequence as well. This would allow 
uniqueness and prevent guessing since the digest would have to match as 
well. Depending on my paranoia I could either get fresh random bits each 
time (and have a good hardware source for this then) or keep it around for 
a bit and throw it away after a period.

I think I understand you correctly, but I'm not sure.

You mention the sequence being incremented for uniqueness and the
digest.  I think you propose to send the sequence along with the
digest (the digest containing that bit of randomness along with the
sequence), but you also mention keeping the random bits around for
only a short time, which would indicate they aren't being used to
verify the sequence, but produce the sequence via the hash.

A digest is not unique, especially with the random bit of data thrown
in.  For example, MD5 produces 128 bits, but can hash a string of any
length.  There are far more than 2^128 possible input strings and only
2^128 possible outputs, so some inputs must collide.  Therefore, MD5
does not produce a unique value, though it is a reproducible value (the
same input string will always produce the same output string).  You can
replace MD5 with MHX (my hash X) and the number of bits with some other
length and the results are still the same -- in other words, no hash
will give unique results.

The secret string concatenated with the unique number and then hashed
can be used to guarantee that the number has not been tampered with,
but the secret string would need to be constant to be able to catch
tampering.  Otherwise, how can you tell if the hash is correct?
-- 
James Smith [EMAIL PROTECTED], 979-862-3725
Texas A&M CIS Operating Systems Group, Unix






Re: Cheap and unique

2002-05-07 Thread James G Smith

[EMAIL PROTECTED] wrote:
I would have sent both to the client. The sequence would be *the* id and 
is guaranteed to be unique by the database (or whatever else is around 
that does this reliably). The idea is that by combining the random secret 
with the ID and sending the digest along with it, the ID number can't just 
be incremented or fooled with. The digest isn't unique but it would keep 
the unique bit from being fiddled with.

That said, I'm just a paranoid person regarding security (especially for 
my outside-of-work work at http://www.greentechnologist.org) and I 
wouldn't want to keep the random bits around for too long, to prevent them 
from being brute-forced. I'm imagining that someone with a fast computer, 
the ID number, and knowledge of how that combines with the randomness for 
the digest source might be able to locate the bits just by trying a lot of 
them. I would expire them after a while just to prevent that: if there is 
a 15-minute session, new random bits are generated every five minutes. New 
sessions would be tied to the most recent random data. The random data 
might be expired at the session timeout. This assumes that I'm tracking 
which random bits are associated with the session to verify that the 
digest was OK. All that means is that the randomness is valid as long as 
the session is still active and normally expires after a time period 
otherwise. Perhaps other people would get by just keeping a static secret 
on the server. That may be overkill for many people, but it might not be 
for the apps I'm working with.

Thanks for the clarification -- makes a lot more sense.  At first
glance, I think that would work.
-- 
James Smith [EMAIL PROTECTED], 979-862-3725
Texas A&M CIS Operating Systems Group, Unix



Re: Cheap and unique

2002-05-07 Thread Simon Oliver

 [EMAIL PROTECTED] wrote:
 digest source might be able to locate the bits just by trying a lot of
 them. I would expire them after a while just to prevent that: if there
 is a 15-minute session, new random bits are generated every five minutes.

I missed the start of this thread, but how about generating a new id (or
random bits) on every visit: on first connect the client is assigned a
session id; on subsequent connects, the previous id is verified and a new
id is generated and returned.  This makes it even harder to crack.
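
A sketch of that hand-over (the %live table and the id generator are 
illustrative, and in a multi-process server the table would have to live 
in shared storage):

use Digest::MD5 qw(md5_hex);

my %live;          # currently valid id => session record
my $counter = 0;

sub rotate_id {
    my ($old_id) = @_;
    # verify the previous id; it is consumed the moment it is used
    my $session = delete $live{$old_id} or return;
    my $new_id  = md5_hex(time() . $$ . $counter++ . rand());
    $live{$new_id} = $session;
    return $new_id;    # hand this back to the client for the next visit
}

An intercepted id goes stale as soon as the real client makes its next 
request, which is what makes this harder to crack.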

--
  Simon Oliver



Re: Cheap and unique

2002-05-07 Thread jjore

(Anyone else, is there a module that already does this?)

That misses two things: random data is not unique and random data is 
scarce.

The thread started where someone else wanted a cheap way to generate 
difficult-to-guess and unique session ids. It went on about how using a 
random function doesn't provide uniqueness and eventually ended up where 
we're at now (verified sequential IDs). Part of the issue is that the 
output from a digest (SHA1, MD5) or random data is not unique and cannot 
ever be expected to be. So if you want uniqueness, you should use 
something that produces values without looping - like simple iteration. 
You could use some other number series but that's just pointless since you 
don't need to keep your session IDs secret and it will just confuse the 
next person to look at the code. You also run out of numbers faster if you 
{in|de}crement by more than 1.
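
For the simple-iteration part, a per-host sketch using a counter file 
under flock (the path is illustrative; a database sequence does the same 
job more robustly, and is what you'd want across multiple hosts):

use Fcntl qw(:flock O_RDWR O_CREAT);

sub next_seq {
    my $file = '/var/tmp/session.seq';    # illustrative location
    sysopen my $fh, $file, O_RDWR | O_CREAT or die "open $file: $!";
    flock $fh, LOCK_EX or die "flock $file: $!";
    my $n = <$fh> || 0;    # last value handed out; 0 on first run
    seek $fh, 0, 0;
    truncate $fh, 0;
    print $fh $n + 1;
    close $fh;             # closing the handle releases the lock
    return $n + 1;
}

Every caller on the host gets a distinct, monotonically increasing 
number, which is all the uniqueness requirement actually needs.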

A lot of other smarter people can tell you why random data is scarce. Just 
accept it. /dev/urandom is not an infinite font of quality entropy. If you 
use too much then you fall back to simpler algorithms that will enter into 
loops which are highly non-random.

So what I said was keep some random data secret for a bit, use it for the 
hashes and after a while get new random data. A malicious attacker can 
attempt to brute-force your secret during the period that the secret is 
still valid. Once the secret is invalidated then the attacker has to start 
the key-space over again. This is like asking distributed.net to start 
over every few minutes. While distributed.net would eventually make its 
way through vast keyspaces for a single secret, it can't keep up with 
volatile secrets.

Josh




Simon Oliver [EMAIL PROTECTED]
05/07/2002 10:53 AM

 
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Cheap and unique


 [EMAIL PROTECTED] wrote:
 digest source might be able to locate the bits just by trying a lot of
 them. I would expire them after a while just to prevent that: if there
 is a 15-minute session, new random bits are generated every five minutes.

I missed the start of this thread, but how about generating a new id (or
random bits) on every visit: on first connect the client is assigned a
session id; on subsequent connects, the previous id is verified and a new
id is generated and returned.  This makes it even harder to crack.

--
  Simon Oliver






Re: Cheap and unique

2002-05-06 Thread Perrin Harkins

Ken Williams wrote:
 If you have the additional requirement that the unique values shouldn't 
 be easily *guessable*, that becomes a very hard problem, precisely 
 because "random" and "unique" are such poor friends.  Usually people 
 just cheat by generating a large random ID such that the probability of 
 it being already-used is low, and then they check all the previous IDs 
 to make sure.

The requirement to prevent guessing is usually aimed at security and 
preventing session hijacking and similar attacks (and believe me, this 
kind of attack is very common).  Another way to do this is to use a MAC 
built from a digest like MD5 or SHA1, as described in the Eagle book and 
O'Reilly's CGI book.  This makes it very difficult for an attacker to 
generate a valid 
ID, even if the sequence of IDs is predictable.
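
In outline the technique amounts to this (a sketch, not the books' code; 
Digest::HMAC_MD5 from CPAN is one way to get a proper keyed MAC, and 
$SECRET is an assumed server-only value):

use Digest::HMAC_MD5 qw(hmac_md5_hex);

my $SECRET = 'long random string that never leaves the server';

# The client gets "id:mac"; the id itself may be a plain predictable
# sequence number.
sub make_cookie { my ($id) = @_; return join ':', $id, hmac_md5_hex($id, $SECRET) }

sub check_cookie {
    my ($cookie) = @_;
    my ($id, $mac) = split /:/, $cookie, 2;
    return unless defined $mac;
    return hmac_md5_hex($id, $SECRET) eq $mac ? $id : undef;
}

Guessing an id is useless without the matching MAC, and the MAC can't be 
produced without the server's secret.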

- Perrin




Re: Cheap and unique

2002-05-06 Thread jjore

I've been following this conversation and I'd like to clarify whether my 
idea (since I and others want to do this as well) would be to use an 
incrementing counter for uniqueness. Then also store a bit of secret 
randomness, concatenate both values together and create a digest hash. 
That hash would be sent along with the sequence as well. This would allow 
uniqueness and prevent guessing since the digest would have to match as 
well. Depending on my paranoia I could either get fresh random bits each 
time (and have a good hardware source for this then) or keep it around for 
a bit and throw it away after a period.

Does that sound right?

Josh




Perrin Harkins [EMAIL PROTECTED]
05/06/2002 01:15 PM

 
To: Ken Williams [EMAIL PROTECTED]
cc: OCNS Consulting [EMAIL PROTECTED], [EMAIL PROTECTED], 
David Jacobs [EMAIL PROTECTED]
Subject: Re: Cheap and unique


Ken Williams wrote:
 If you have the additional requirement that the unique values shouldn't 
 be easily *guessable*, that becomes a very hard problem, precisely 
 because "random" and "unique" are such poor friends.  Usually people 
 just cheat by generating a large random ID such that the probability of 
 it being already-used is low, and then they check all the previous IDs 
 to make sure.

The requirement to prevent guessing is usually aimed at security and 
preventing session hijacking and similar attacks (and believe me, this 
kind of attack is very common).  Another way to do this is to use a MAC 
built from a digest like MD5 or SHA1, as described in the Eagle book and 
O'Reilly's CGI book.  This makes it very difficult for an attacker to 
generate a valid 
ID, even if the sequence of IDs is predictable.

- Perrin







Re: Cheap and unique

2002-05-06 Thread Perrin Harkins

[EMAIL PROTECTED] wrote:
 I've been following this conversation and I'd like to clarify whether my 
 idea (since I and others want to do this as well) would be to use an 
 incrementing counter for uniqueness. Then also store a bit of secret 
 randomness, concatenate both values together and create a digest hash. 
 That hash would be sent along with the sequence as well. This would allow 
 uniqueness and prevent guessing since the digest would have to match as 
 well. Depending on my paranoia I could either get fresh random bits each 
 time (and have a good hardware source for this then) or keep it around for 
 a bit and throw it away after a period.

 Does that sound right?

Yes, except for the random part.  There is no randomness involved here. 
You should use a secret key stored on your server.  There's an example 
of this technique here: 
http://www.oreilly.com/catalog/cgi2/chapter/ch08.html

- Perrin




Re: Cheap and unique

2002-05-06 Thread Peter Bi

Does the first email mean to use the incrementing numbers as seeds and then
generate cool random numbers from the partly ordered seeds, which will
make them more difficult to guess?


Peter Bi

- Original Message -
From: James G Smith [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, May 06, 2002 11:45 AM
Subject: Re: Cheap and unique


 [EMAIL PROTECTED] wrote:
 I've been following this conversation and I'd like to clarify whether my
 idea (since I and others want to do this as well) would be to use an
 incrementing counter for uniqueness. Then also store a bit of secret
 randomness, concatenate both values together and create a digest hash.
 That hash would be sent along with the sequence as well. This would allow
 uniqueness and prevent guessing since the digest would have to match as
 well. Depending on my paranoia I could either get fresh random bits each
 time (and have a good hardware source for this then) or keep it around for
 a bit and throw it away after a period.

 I think I understand you correctly, but I'm not sure.

 You mention the sequence being incremented for uniqueness and the
 digest.  I think you propose to send the sequence along with the
 digest (the digest containing that bit of randomness along with the
 sequence), but you also mention keeping the random bits around for
 only a short time, which would indicate they aren't being used to
 verify the sequence, but produce the sequence via the hash.

 A digest is not unique, especially with the random bit of data thrown
 in.  For example, MD5 produces 128 bits, but can hash a string of any
 length.  There are far more than 2^128 possible input strings and only
 2^128 possible outputs, so some inputs must collide.  Therefore, MD5
 does not produce a unique value, though it is a reproducible value (the
 same input string will always produce the same output string).  You can
 replace MD5 with MHX (my hash X) and the number of bits with some other
 length and the results are still the same -- in other words, no hash
 will give unique results.

 The secret string concatenated with the unique number and then hashed
 can be used to guarantee that the number has not been tampered with,
 but the secret string would need to be constant to be able to catch
 tampering.  Otherwise, how can you tell if the hash is correct?
 --
 James Smith [EMAIL PROTECTED], 979-862-3725
 Texas A&M CIS Operating Systems Group, Unix





Re: Cheap and unique

2002-05-03 Thread Ken Williams


On Wednesday, May 1, 2002, at 06:46 AM, OCNS Consulting wrote:
 Of course srand seeds rand. And yes, it is a good way to generate
 random numbers for a one time application RUN.

The original poster is not looking for "random", he's looking 
for "unique".  These are in many ways *opposite* requirements, 
as the only reliable way to get unique IDs is to use a very 
deterministic procedure - a simple iterator is one of the 
easiest and best.

If you have the additional requirement that the unique values 
shouldn't be easily *guessable*, that becomes a very hard 
problem, precisely because "random" and "unique" are such poor 
friends.  Usually people just cheat by generating a large random 
ID such that the probability of it being already-used is low, 
and then they check all the previous IDs to make sure.


  -Ken




Re: Cheap and unique

2002-05-03 Thread David Jacobs


Good morning.

Ken is correct - I am not looking for "random", I am looking for "unique".

Unique and sequenced would be ideal, but it's tricky because if I use

$i++.$IP_address.$parameter.$PID
{$i is global, $IP_address is the server address, $parameter is a 
parameter that was just passed in, $PID is the apache process ID.}

That's a decent solution, but it's not quite sequential, because one 
Apache process could handle far more requests than another, and so 
a later request could have a lower $i.

Are there any code examples out there that use mod_unique_id.c ? I am 
new to mod_perl and couldn't quite get that. Looking at the code, it 
looks like mod_unique_id does basically the same thing but with a shared 
counter between the different servers.
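
There isn't much code to it once mod_unique_id is compiled in (or loaded 
with AddModule): it puts a per-request id into the subprocess environment. 
A sketch for a mod_perl 1 handler:

use Apache::Constants qw(OK);

sub handler {
    my $r = shift;
    # mod_unique_id sets UNIQUE_ID for every request; CGI-style code
    # sees the same value as $ENV{UNIQUE_ID}
    my $unique = $r->subprocess_env('UNIQUE_ID');
    $r->send_http_header('text/plain');
    $r->print($unique);
    return OK;
}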

Also what is the cheapest way to get a datestamp? (I'm assuming it's not 
`date %Y%M%D%H%S`)

--

I'm sorry to be such a noob, but when I've got a little more mod_perl 
under my belt I'll be answering questions up a storm!

David




Re: Cheap and unique

2002-05-03 Thread Stas Bekman

David Jacobs wrote:
 
 Good morning.
 
 Ken is correct - I am not looking for "random", I am looking for "unique".
 
 Unique and sequenced would be ideal, but it's tricky because if I use
 
 $i++.$IP_address.$parameter.$PID
 {$i is global, $IP_address is the server address, $parameter is a 
 parameter that was just passed in, $PID is the apache process ID.}
 
 That's a decent solution, but it's not quite sequential, because one 
 Apache process could handle far more requests than another, and so 
 a later request could have a lower $i.
 
 Are there any code examples out there that use mod_unique_id.c ? I am 
 new to mod_perl and couldn't quite get that. Looking at the code, it 
 looks like mod_unique_id does basically the same thing but with a shared 
 counter between the different servers.
 
 Also what is the cheapest way to get a datestamp? (I'm assuming it's not 
 `date %Y%M%D%H%S`)
 
 -- 
 
 I'm sorry to be such a noob, but when I've got a little more mod_perl 
 under my belt I'll be answering questions up a storm!

It looks like you didn't see my replies to your post. Please read them 
first.

http://marc.theaimsgroup.com/?l=apache-modperl&m=102022922719057&w=2
http://marc.theaimsgroup.com/?l=apache-modperl&m=102023259920560&w=2

__
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




RE: Cheap and unique

2002-05-01 Thread Homsher, Dave V.

David Jacobs wrote:
 
 I'm converting a few CGI scripts that used the PID as a cyclical unique
 number (in concert with TIMESTAMP - so it was TIMESTAMP.PID).
 
 Our goal is to find a replacement function that is extremely cheap
 (cheaper than say, random(100)) and will never repeat. Any ideas?
 Has anyone else faced this problem?

I use $$.$r->uri().$connections.time where $connections is an incremental 
value defined before sub handler {...} and do $connections++ w/in the 
handler block (also realizing that you're porting CGIs and will have to 
get the uri elsewhere).

For example:

package foo;

my $connections = 0;   # per-child counter, persists across requests

sub handler {
    my $r = shift;
    ...
    my $unique_id = $$ . $r->uri() . $connections . time;
    ...
    $connections++;    # bump for the next request this child serves
}

1;

Regards,
Dave
http://www.linkreminder.com 



Re: [Fwd: Re: Cheap and unique]

2002-05-01 Thread darren chamberlain

* David Jacobs [EMAIL PROTECTED] [2002-04-30 18:31]:
 A global counter hanging around is a good solution, but not perfect if
 we deploy on multiple servers. 

That depends on what you initialize the global to; if you do something
like the last octet of the IP of the vhost, and increment it by the PID
of the child each time you access it (rather than incrementing by 1),
you'll end up with a pretty useful global value.

(darren)

-- 
Your freedom to swing your fist ends at the tip of my nose.
-- Larry Niven



Re: Cheap and unique

2002-04-30 Thread Ged Haywood

Hi there,

On Tue, 30 Apr 2002, David Jacobs wrote:

 I'm converting a few CGI scripts that used the PID as a cyclical unique 
 number (in concert with TIMESTAMP - so it was TIMESTAMP.PID).
 
 Our goal is to find a replacement function that is extremely cheap 
 (cheaper than say, random(100)) and will never repeat. Any ideas? 

Have a look at Time::HiRes?

73,
Ged.
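
For reference, Time::HiRes gets you sub-second resolution, which shrinks 
the collision window of a timestamp-based id considerably; a two-line 
sketch:

use Time::HiRes qw(gettimeofday);

my ($sec, $usec) = gettimeofday();           # e.g. 1020227781, 483201
my $stamp = sprintf "%d%06d", $sec, $usec;   # microsecond-resolution stamp

On its own this still isn't guaranteed unique, so the PID/counter ideas 
elsewhere in the thread still apply.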





Re: Cheap and unique

2002-04-30 Thread Perrin Harkins

David Jacobs wrote:
 I'm converting a few CGI scripts that used the PID as a cyclical unique 
 number (in concert with TIMESTAMP - so it was TIMESTAMP.PID).
 
 Our goal is to find a replacement function that is extremely cheap 
 (cheaper than say, random(100)) and will never repeat. Any ideas? 

Yes, mod_unique_id will do that, as will Sys::UniqueID and Data::UUID on 
CPAN.  Of course that random function is probably very lightweight, but 
it's not actually unique.
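
Usage for the CPAN modules is one-liner territory; Data::UUID, for 
instance:

use Data::UUID;

my $ug   = Data::UUID->new;
my $uuid = $ug->create_str;   # e.g. 4162F712-1DD2-11B2-B17E-C09EFE1DC403

Data::UUID generates ids per the DCE UUID spec, which also makes them 
distinct across hosts.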

- Perrin




RE: Cheap and unique

2002-04-30 Thread OCNS Consulting

Check your Programming in PERL book. Specifically, the srand function.

RB

-Original Message-
From: David Jacobs [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 30, 2002 3:39 PM
To: [EMAIL PROTECTED]
Subject: Cheap and unique


I'm converting a few CGI scripts that used the PID as a cyclical unique 
number (in concert with TIMESTAMP - so it was TIMESTAMP.PID).

Our goal is to find a replacement function that is extremely cheap 
(cheaper than say, random(100)) and will never repeat. Any ideas? 
Has anyone else faced this problem?

tia
David




RE: Cheap and unique

2002-04-30 Thread Joe Breeden

If you look at the docs for mod_unique_id, it will generate a unique 
number in a properly configured server farm, making it a good candidate 
for this if you are worried about getting a unique number across several 
systems.

 -Original Message-
 From: Perrin Harkins [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, April 30, 2002 2:44 PM
 To: David Jacobs
 Cc: [EMAIL PROTECTED]
 Subject: Re: Cheap and unique
 
 
  David Jacobs wrote:
   I'm converting a few CGI scripts that used the PID as a cyclical unique 
   number (in concert with TIMESTAMP - so it was TIMESTAMP.PID).
   
   Our goal is to find a replacement function that is extremely cheap 
   (cheaper than say, random(100)) and will never repeat. Any ideas? 
  
  Yes, mod_unique_id will do that, as will Sys::UniqueID and Data::UUID on 
  CPAN.  Of course that random function is probably very lightweight, but 
  it's not actually unique.
 
 - Perrin
 
 



Re: Cheap and unique

2002-04-30 Thread Perrin Harkins

OCNS Consulting wrote:
 Check your Programming in PERL book. Specifically, the srand function.

'random' ne 'unique'

A random function could return the same number 10 times in a row.  It's 
very unlikely, but it could happen.  That's the definition of random.

- Perrin




RE: Cheap and unique

2002-04-30 Thread OCNS Consulting

You could try - Math::TrulyRandom CPAN module.

RB

-Original Message-
From: Perrin Harkins [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 30, 2002 4:08 PM
To: OCNS Consulting
Cc: David Jacobs; [EMAIL PROTECTED]
Subject: Re: Cheap and unique


OCNS Consulting wrote:
 Check your Programming in PERL book. Specifically, the srand function.

'random' ne 'unique'

A random function could return the same number 10 times in a row.  It's
very unlikely, but it could happen.  That's the definition of random.

- Perrin




RE: Cheap and unique

2002-04-30 Thread Andrew Ho

Hello,

OCNS> You could try - Math::TrulyRandom CPAN module.

Perrin's comments still apply. There is no guarantee that a random number
generator of any type (truly random or otherwise) will return unique
values. In fact, one would fully expect repetition after some amount of
time.

Humbly,

Andrew

--
Andrew Ho   http://www.tellme.com/   [EMAIL PROTECTED]
Engineer   [EMAIL PROTECTED]  Voice 650-930-9062
Tellme Networks, Inc.   1-800-555-TELLFax 650-930-9101
--




RE: Cheap and unique

2002-04-30 Thread OCNS Consulting

Of course srand seeds rand. And yes, it is a good way to generate
random numbers for a one time application RUN. CGI on the other hand
could pose a problem, as stated in the Programming in PERL book.

Salutations - RB

-Original Message-
From: Paul Johnson [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 30, 2002 4:35 PM
To: Perrin Harkins
Cc: OCNS Consulting; David Jacobs; [EMAIL PROTECTED]
Subject: Re: Cheap and unique


On Tue, Apr 30, 2002 at 04:08:00PM -0400, Perrin Harkins wrote:
 OCNS Consulting wrote:
 Check your Programming in PERL book. Specifically, the srand function.

 'random' ne 'unique'

 A random function could return the same number 10 times in a row.  It's
 very unlikely, but it could happen.  That's the definition of random.

'srand' ne 'rand' :-)

I suspect that Mr or Mrs Consulting was thinking about the seed to srand
that used to be required.  Not to say that that is a good solution to
this problem though.

--
Paul Johnson - [EMAIL PROTECTED]
http://www.pjcj.net




Re: Cheap and unique

2002-04-30 Thread Steve Piner



David Jacobs wrote:
 
 I'm converting a few CGI scripts that used the PID as a cyclical unique
 number (in concert with TIMESTAMP - so it was TIMESTAMP.PID).
 
 Our goal is to find a replacement function that is extremely cheap
 (cheaper than say, random(100)) and will never repeat. Any ideas?
 Has anyone else faced this problem?
 
 tia
 David

I'm just curious - what's wrong with the function you're already using?

Steve

-- 
Steve Piner
Web Applications Developer
Marketview Limited
http://www.marketview.co.nz



[Fwd: Re: Cheap and unique]

2002-04-30 Thread David Jacobs




I'm just curious - what's wrong with the function you're already using?

Steve


Mod_Perl hangs on to its PID, so it's no longer unique. (I _believe_)

mod_unique_id looks like a good solution, pending performance.

Thanks for your help so far, everyone!

David






Re: [Fwd: Re: Cheap and unique]

2002-04-30 Thread Alex Krohn

Hi,

 I'm just curious - what's wrong with the function you're already using?
 
 Mod_Perl hangs on to its PID, so it's no longer unique. (I _believe_)

TIMESTAMP . $$ . $GLOBAL++

might work just as well (as $GLOBAL will persist).

Cheers,

Alex

--
Alex Krohn [EMAIL PROTECTED]



Re: [Fwd: Re: Cheap and unique]

2002-04-30 Thread Steve Piner



David Jacobs wrote:
 
 
 I'm just curious - what's wrong with the function you're already using?
 
 Steve
 
 
 Mod_Perl hangs on to its PID, so it's no longer unique. (I _believe_)

But the timestamp will make it unique - as long as you're not serving
several requests per second.

If you are, you could use a counter as well as, or in place of, the
timestamp.

All I'm saying is that CGI by itself doesn't guarantee a unique PID -
your CGI's original author probably knew that, and incorporated the
timestamp to guarantee uniqueness.


 mod_unique_id looks like a good solution, pending performance.

Yeah, agreed.

 Thanks for your help so far, everyone!
 
 David

-- 
Steve Piner
Web Applications Developer
Marketview Limited
http://www.marketview.co.nz



Re: [Fwd: Re: Cheap and unique]

2002-04-30 Thread David Jacobs



Mod_Perl hangs on to its PID, so it's no longer unique. (I _believe_)


But the timestamp will make it unique - as long as you're not serving
several requests per second.
  

I'm building the system so I can be confident up to thousands of 
requests/second. Looks like mod_unique_id is good for around 65K 
requests a second, which is fine ;)

A global counter hanging around is a good solution, but not perfect if 
we deploy on multiple servers. It appears the mod_unique_id module uses 
a timestamp, a PID, a counter and the IP address, which is fine.

David





Re: [Fwd: Re: Cheap and unique]

2002-04-30 Thread Michael Robinton


 I'm just curious - what's wrong with the function you're already using?

 Mod_Perl hangs on to its PID, so it's no longer unique. (I _believe_)

TIMESTAMP . $$ . $GLOBAL++

I use the above.

If you create a global for the child process then that is adequate since
the PID belongs to the child and will be unique at least for a second
until the next time tick -- it's not plausible to create 65k children in
one second. If you need the variable global across multiple servers then
include a unique part of the host address as part of the variable in
addition to the time stamp.

Michael




Re: [Fwd: Re: Cheap and unique]

2002-04-30 Thread Stas Bekman

Michael Robinton wrote:
I'm just curious - what's wrong with the function you're already using?

Mod_Perl hangs on to its PID, so it's no longer unique. (I _believe_)
 
 
   TIMESTAMP . $$ . $GLOBAL++

Do not use plain concatenation, but sprintf 0-padding. Here is an example 
that can happen easily, which produces two identical ids in two different 
procs:

P  TIMESTAMP    $$    $GLOB  CONCAT          SPRINTF
A  1020227781   753   3      10202277817533  1020227781007533
B  1020227781   75    33     10202277817533  10202277810007533

As you can see, if you don't pad $$ with 0 to the max proc ID length (on 
some machines 63999) you can get two identical UUIDs, as in the CONCAT 
column above. The same can happen at the edge of the timestamp, when its 
first digit switches from 9 to 10, but that number is so big that this 
most likely won't be a problem.

So a much safer solution would be:

   $uuid = time . sprintf("%05", $$) . $GLOBAL++;

s/05/06/ or other if your system's process ID can be bigger than 5 digits.

If you don't modify $^T and you don't reuse the timestamp set for you in 
$r, this will save you a few more millisecs:

   $uuid = $^T . sprintf("%05", $$) . $GLOBAL++;

since $^T . sprintf("%05", $$) is unique across processes. It's not 
possible to create two processes with the same $$ at the same time.

$^T is the time the Perl interpreter has been started, unless it was 
modified later.

You can also move all this work into the cleanup handler, so that during 
the request you use a value that was already precalculated for the 
current request at the end of the previous one (just make sure to 
initialize it during the child init handler for the first request).
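
One way to arrange that, sketched against the mod_perl 1 API (the module 
shape is illustrative; register_cleanup() runs its callback at the end of 
each request):

package Apache::NextUUID;    # illustrative name
use strict;

my ($base, $seq, $next_id);

Apache->push_handlers(PerlChildInitHandler => sub {
    $seq     = 0;
    $base    = $^T . sprintf("%05d", $$);
    $next_id = $base . $seq++;    # precompute for the child's first request
    1;
});

sub id {
    my ($r) = @_;
    my $id = $next_id;            # already computed, nothing to do now
    # precompute the id for the next request once this one is served
    $r->register_cleanup(sub { $next_id = $base . $seq++; 1 });
    return $id;
}
1;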

p.s. this concept works on Unix for sure, it's possible that on some OSs 
it won't work.

p.p.s. in mod_perl 2.0 we have APR::UUID that does this work for you.
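
For the curious, the mod_perl 2.0 version is a one-liner; assuming the 
APR::UUID API as documented there:

use APR::UUID ();

my $uuid = APR::UUID->new->format;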

__
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: [Fwd: Re: Cheap and unique]

2002-04-30 Thread Stas Bekman

In my post I've missed the 'd' token in %05d

Here are a few possible solutions that will do all the work for you

Apache/UUID.pm
--
package Apache::UUID;
use strict;
my ($base, $seq);

die "Cannot push handlers" unless Apache->can('push_handlers');
init();

sub init {
    Apache->push_handlers(
        PerlChildInitHandler => sub {
            $seq  = 0;
            $base = $^T . sprintf("%05d", $$);
            1;
        });
}
sub id { $base . $seq++; }
#sub format { ... }
1;
__END__

startup.pl
--
use Apache::UUID; # must be loaded at the startup!

test.pl

use Apache::UUID;
print "Content-type: text/plain\n\n";
print Apache::UUID::id();

Since I've used the $^T token, the module must be loaded at startup, or 
someone may modify $^T. If you use time(), you don't need the child init 
handler (but you pay the overhead) and can simply have:

package Apache::UUID;
use strict;
my ($base, $seq);

sub id {
    $base ||= time . sprintf("%05d", $$);
    $base . $seq++;
}
1;

The nice thing about the child init handler is that at run time you just 
need the $base . $seq++ concatenation, but probably this second version is 
just fine.

Also, you probably want to add a format() function so you get ids of the 
same width.

Another improvement is to use pack(), which takes the place of sprintf 
and creates a number which is more compact and always of the same width:

package Apache::UUID;
use strict;
my $seq;
sub id { unpack "H*", pack "Nnn", time, $$, $seq++ }
1;

Another problem that you may need to tackle is predictability... but 
that's a different story.

__
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com