RE: [JBoss-user] Re: random keys
Sorry for this being off topic a bit.

> this is probably better discussed in some sort of EJB User Mailing List
> rather than JBoss itself.

True. I put it up since some databases don't have that concept. Also, there is no real bit twiddling needed:

#1 On startup:
- Load data from database into an Integer object (assumes last 8 bits are 0)
- Save value into object
- Increment by 256
- Store new value into database

#2 On key gen:
- Give current value stored in object to requestor
- Increment current value by 1
- If you've incremented 255 times, then it is time to get a new key (repeat step #1 above)

Just that conceptually you are incrementing the low side and keeping the high side of the integer the same. :)

There really is no large difference between using a counter and a table to store the high value... just that a table can be done in all databases, so if your application needs to be vendor/db neutral, then this can be a better mechanism. I think that to make it truly vendor/db neutral, you have to use an entity bean for the high value, but I'm not positive about that.

Alan

-----Original Message-----
From: David Jencks [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 26, 2001 5:36 PM
To: [EMAIL PROTECTED]
Subject: RE: [JBoss-user] Re: random keys

Hi,

You can get the same effect with the generators and sequences I am familiar with by requesting a large step or increment. Then you don't need to do bit twiddling. If your db supports generators/sequences, use them.

david jencks

On 2001.06.26 16:38:03 -0400 Wood, Alan wrote:

> > If you're going to run in multiple jvm's, the solution is a counter
> > table. Lock the counter table when you update the count. This doesn't
> > perform as badly as it sounds.
>
> Which would leave this alternative of course. It somehow seems like
> overkill to me, but it may very well be the option I land on. Thanks for
> the input.
>
> Cheers
> Bent D

An alternative to this is discussed in many forums (including theserverside.com). It sacrifices some key space for a bit of performance.
I'm just reading up on it, but the basics are:

- Make the key an integer with a 24-bit high index value and an 8-bit low index value. (You can use any bit split you would like; use 32/32 if you need lots of room in the key space.) The result would be a true 32-bit integer.
- Hold the last-used 24-bit high index value in the database, exactly as mentioned in the previous post.
- Create a SessionBean that will generate a unique key for you. Make it stateless, although it will hold state. (The state in this case is discardable... if the state is missing, a new one can be generated.)
- Either encapsulate the high value in an entity bean, or just use direct SQL in the SessionBean to load the high value when the bean is created. Since the SessionBean is stateless, it will probably be created in a pool (set it to a small number of instances) and only get initialized once per run of the EJB server. (Not a requirement, though.)
- Now generate the lower-order keys as you are called. Just increment the low value and append it to the high value to create the full integer (or in other words, add 1 to your saved key). Detect if you are going to go into the next high-value range, and if you are, reload the high value from the database (incrementing it when you do).

This method requires the following: you can atomically read the high value, increment it, and write it back at the database level (SELECT FOR UPDATE, lock the record, or whatever needs to be done; I believe if you put it into an entity bean, this will already be done for you if you make an operation (method) that does the increment and mark the bean for transaction level REQUIRES_NEW, or REQUIRES ??).

This allows many different SessionBeans across multiple VMs to generate unique keys (since they will all have different high values). It will result in some of the integer space not being used (due to shutdowns of the server, etc.)
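The mechanics described above can be sketched in plain Java. This is a minimal illustration rather than a SessionBean: the AtomicInteger stands in for the database-held high value, and all class and field names are made up for the example. A real implementation would reserve each high value in its own transaction (SELECT FOR UPDATE or an entity bean method, as the post says).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the 24-bit-high / 8-bit-low key scheme described above.
// The AtomicInteger stands in for the counter table in the database.
public class HiLoKeyGenerator {
    private static final int LOW_BITS = 8;
    private static final int BLOCK_SIZE = 1 << LOW_BITS;   // 256 keys per block

    // Stand-in for the high-value row in the counter table.
    private static final AtomicInteger dbHigh = new AtomicInteger(0);

    private int high;               // current 24-bit high block index
    private int low = BLOCK_SIZE;   // forces a block fetch on first use

    // Atomically reserve the next high value ("read, increment, write").
    private void fetchHighFromDb() {
        high = dbHigh.getAndIncrement();
        low = 0;
    }

    // Hand out the next key: the high block index shifted left, low appended.
    public synchronized int nextKey() {
        if (low == BLOCK_SIZE) {    // about to spill into the next range
            fetchHighFromDb();
        }
        return (high << LOW_BITS) | low++;
    }
}
```

Note that keys come out sequentially within a block, and the "database" is only touched once per 256 keys, which is the performance win the post describes.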
It should also allow pooling of the unique key generator so that you have less of a bottleneck there during entity creation. I'd still keep the pool size limited, though, so that your key space isn't used up at each server shutdown.

Keep in mind, I'm still learning, and if someone could correct me where I got something wrong, I'd much appreciate it. But I've read about this mechanism a few times and it seems solid enough.

Hope this helps,
Alan
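The multi-instance claim above (different SessionBeans, different VMs, no collisions) can be illustrated with two generator instances sharing one counter. This is a hypothetical sketch; the shared AtomicInteger again stands in for the counter table, and the names are illustrative.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Each Generator instance (think: one pooled SessionBean, possibly in a
// different VM) reserves its own block of 256 keys from the shared counter,
// so instances can hand out keys concurrently without ever colliding.
public class PooledKeyDemo {
    static final int BLOCK = 256;
    static final AtomicInteger counterTable = new AtomicInteger(0); // DB stand-in

    static class Generator {
        int next = 0, limit = 0;

        int nextKey() {
            if (next == limit) {                        // block exhausted
                int high = counterTable.getAndIncrement();
                next = high * BLOCK;
                limit = next + BLOCK;
            }
            return next++;
        }
    }

    public static void main(String[] args) {
        Generator a = new Generator();
        Generator b = new Generator();
        Set<Integer> keys = new HashSet<>();
        for (int i = 0; i < 300; i++) {                 // interleave the two pools
            keys.add(a.nextKey());
            keys.add(b.nextKey());
        }
        System.out.println(keys.size());                // 600 distinct keys
    }
}
```

It also shows the key-space waste mentioned above: any keys left unissued in a block when an instance goes away are simply lost, which is why keeping the pool small matters.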
Re: [JBoss-user] Re: random keys
[EMAIL PROTECTED] wrote:

> On Tue, Jun 26, 2001 at 05:32:43PM -0400, Michael Bilow wrote:
> > On 2001-06-26 at 21:45 +0200, [EMAIL PROTECTED] wrote:
> > > Why would this matter? Do databases assume that records with primary
> > > keys near one another will often be used together?
> >
> > Yes, this is why they are called primary keys. Traditionally, database
> > engines would try to entry-sequence records by primary key, and there
> > remains an expectation that access by primary key will always be the
> > fastest and most efficient mechanism for accessing a table.
>
> It seems strange to me that locality would be important in this case. The
> assumption that record number 5 and record number 6 are inherently linked
> more than record 5 and record 8793 are would certainly hold for some
> databases, but that it should be true in the general case (or even just
> often enough that it matters)? (From my understanding of relational
> databases)

Actually, it's not so much that the database assumes that they're linked as that the table is generally organized in some sort of B-Tree structure, where a record's location in the tree (what page it's on) is determined by the primary key. This way, when the database does a search by primary key, once it's found the key it doesn't need another IO to get the actual data. This just optimizes the case of finding by the primary key.

> I can only see the usefulness for binary search, but there you would
> presumably build index tables anyway, so the actual location of data
> doesn't matter.

Actual location matters only in terms of IO.

-danch

___
JBoss-user mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/jboss-user
Re: [JBoss-user] Re: random keys
On 2001-06-27 at 00:24 +0200, [EMAIL PROTECTED] wrote:

> On Tue, Jun 26, 2001 at 05:32:43PM -0400, Michael Bilow wrote:
> > On 2001-06-26 at 21:45 +0200, [EMAIL PROTECTED] wrote:
> > > Why would this matter? Do databases assume that records with primary
> > > keys near one another will often be used together?
> >
> > Yes, this is why they are called primary keys. Traditionally, database
> > engines would try to entry-sequence records by primary key, and there
> > remains an expectation that access by primary key will always be the
> > fastest and most efficient mechanism for accessing a table.
>
> It seems strange to me that locality would be important in this case. The
> assumption that record number 5 and record number 6 are inherently linked
> more than record 5 and record 8793 are would certainly hold for some
> databases, but that it should be true in the general case (or even just
> often enough that it matters)? I can only see the usefulness for binary
> search, but there you would presumably build index tables anyway, so the
> actual location of data doesn't matter.

The main reason why primary key access is expected to be more efficient is that experience has shown databases tend to be made up of two flavors of table: tables which are read frequently and written infrequently, which are usually searched on the same key, and tables which are inserted to frequently and read not too much more often than they are written.

An example is something like an order entry system, where orders are created in an orders table for customers in a customers table, to sell items that are in an items table. The items table will be written very rarely, only when new items are introduced, but will be read frequently. Although there might be occasional need to search the items table on some key other than the primary key, such as a description field, the vast majority of accesses from the point of view of the database engine will be to resolve references from other tables, and these will all be done by primary key.
For example, whenever an order is viewed, the orders table's references to items by primary key will have to be resolved through the items table. Because of this, optimizing for primary key will usually result in an order of magnitude performance improvement.

The customers table may be modified more frequently than the items table, but if there are regular customers then the customers table will still be modified much less frequently than the orders table. The orders table, in turn, is mostly modified by insertion operations. There might be occasions to modify an order record, say to note that an order has been shipped or that part of an order is backordered, but the basic common operations on an orders table will be to either insert a new order or locate all orders associated with some other entity, such as a customer.

Looking up all orders for a customer will require resolving through a secondary index on the orders table, but those references will themselves resolve back to primary keys in the orders table. So the end result is that all database accesses eventually become a search by primary key, and optimizing for that is invariably a huge win.

-- Mike
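The secondary-index-to-primary-key resolution described above can be sketched with two in-memory maps. This is only an illustration of the access pattern, not of any real database engine; all names and the toy schema are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration of the access pattern described above: a secondary index
// (customer id -> order ids) yields primary keys, and the order data itself
// still comes from a lookup by primary key.
public class OrderLookupDemo {
    // Primary "table": order id (primary key) -> order description.
    static final Map<Integer, String> ordersByPk = new HashMap<>();
    // Secondary index: customer id -> ids of that customer's orders.
    static final Map<Integer, List<Integer>> ordersByCustomer = new HashMap<>();

    static void insertOrder(int orderId, int customerId, String description) {
        ordersByPk.put(orderId, description);
        ordersByCustomer
            .computeIfAbsent(customerId, k -> new ArrayList<>())
            .add(orderId);
    }

    // Resolve through the secondary index, then fetch each row by primary key.
    static List<String> ordersFor(int customerId) {
        List<String> result = new ArrayList<>();
        for (int pk : ordersByCustomer.getOrDefault(customerId, List.of())) {
            result.add(ordersByPk.get(pk));
        }
        return result;
    }
}
```

Even the "find by customer" query ends as a series of primary-key fetches, which is the point Mike is making about why engines optimize that path.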