Am Montag, 21. Januar 2008 01:16 schrieb Morgan:
> Hi Guys,
>
> I have field that I want to put a unique identifier in. This unique Id i
> would like to be a composite key or simply a random number. What do you
> guys suggest for this, is there a particular method which works well for
> some of you?

That's a good question, I asked myself some weeks ago, here's how I solved 
this: 

In my case, I have database records that have sequential numbers as primary 
keys. These keys can be calculated by the database and are unique by design 
(as the primary index is unique).

This record should hold another field, which should be also unique and in a 
form of a 8-digit number. However, I'd rather not want this number to be 
sequential, it should look random. The first way would have been to simple 
generate a number via "random.randint(99999999)", look into the database, if 
it's already in and if not, insert it. However, to guarantee that the number 
is unique, one should create a unique index on this column. In case the 
number is already there, the database will raise an error, which has to be 
catched by the application. Another way would be to lock the table after the 
select, so that the rare case, that another application instance inserts the 
same number after my select, is avoided. So, the algorithm could look like 
this (in pseudo code):

# Variant 1 with exception handling
while 1:
  num = random.randint(99999999)
  try:
    insert into db_table (col1, col2, col_num, col3, ) % num
  except UniqueNum_IndexViolated:
    continue
  else:
    break

# Variant 2 with locking
while 1:
  num = random.randint(99999999)
  lock db_table
  result = select * from db_table where col_num = num
  if result: 
    continue
  else:
    insert into db_table (col1, col2, col_num, col3, ....) % num
    unlock db_table
  continue

My problem with variant (1) was that I could not find out how to lock a whole 
table with SQLAlchemy, moreover, each insert needs a table lock and a select, 
which is bad performance-wise. The problem with (2) was that I did not know 
how to catch this specific exception, as I can't simply except any database 
error but this specific index violation (which may be different on different 
databases).

My third idea, which I use now, is to calculate my "random" number out of my 
sequential, unique primary index, which is generated by the database during 
the insert. One helpful guy from #sqlalchemy helped me out with 
the "randomization" of the sequential number with this algorithm:

def mk_arb_seq(id):
    """ Return an arbitrary number. This number is calculated out of
    the given id. For that, it is multiplied by the large prime number"A".
    Then a modulo operation with prime "M" where M < A. If "A" is
    chosen as a non-prime, the sequence is not very arbitrary,
    therefore a prime is recommended. """

    M = 99999989
    A = 2760727302517

    return str((A*id) % M).zfill(len(str(M)))

The last problem with this is that I have no real mathematical proof for that 
algorithm, that "id" never maps to one number more than once. However, I 
simply tested this with a little program and it seems to work.

If you use the ORM, don't forget to do a session.flush() after adding the 
object to the session, as this will calculate the primary index. Then you can 
simply set col_num = mk_arb_seq(primary_index).

Best Regards,
Hermann


-- 
[EMAIL PROTECTED]
GPG key ID: 299893C7 (on keyservers)
FP: 0124 2584 8809 EF2A DBF9  4902 64B4 D16B 2998 93C7

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to