[firebird-support] Re: Benefit from Sequential Insert of High/Low PK

2013-01-29 Thread Jeff


--- In firebird-support@yahoogroups.com, Ann Harrison  wrote:

 On Mon, Jan 28, 2013 at 2:21 PM, Jeff  wrote:
 
 
 
  High/Low is a technique of generating primary keys client side in contrast
  to generating keys server side with fb generators. Here is a link to a
  discussion that probably could explain it better:
  http://stackoverflow.com/questions/282099/whats-the-hi-lo-algorithm
 
  The application I am trying to build is distributed system spread
  geographically, most of them on slow connection. Hence, the need for client
  side generated primary keys.
 
 
 I guess I don't quite follow that logic.   One low-to-no cost option is to
 generate your keys on server using an INSERT ... RETURNING to get the value
 back, if that's supported through your interface.
 

My own humble experience suggests that INSERT ... RETURNING is great for 
client-server apps, not for distributed apps, although it could be my mangled 
code and logic that is at fault here. Oh, BTW, RETURNING is not available for 
fb 1.5.


 
 Basically the High/Low key scheme appears to be a two-part primary key
 where the first part is obtained from the database, probably from a
 generator, and the second part is incremented in the client.  Assuming it's
 handled to avoid the confusion between 13/23 and 132/3, that should be OK.
  As long as each client is adding values in approximately the same part of
 the index, the performance should be fine, especially if you're expecting
 no more than a million records in three years.  Index load performance gets
 much more interesting when you expect millions of records a week.
 
 
 
 Having the most variable part of a key last leads to better index
 compression.
 
 
  More work though, and an integer data type might not suffice to accomodate
  the now very large value PK. Maybe should use Numeric(18,0)?
 
 
 Or - forgive me I work with several databases and sometimes forget what
 64-bit integers are called in each - use a BigInt.
 
 
  OTOH, if you say these are non issue with fb, then maybe I should just
  consider GUIDs or UUIDs for PKs and forget about the server altogether with
  regards to PK generation?
 
 
 GUIDs and UUIDs are normally generated in a way that defeats prefix
 compression.
 
 
 Good Luck,
 
 Ann
 
 
 
 
 [Non-text portions of this message have been removed]





[firebird-support] Re: Benefit from Sequential Insert of High/Low PK

2013-01-28 Thread Jeff



Ann, thanks for your answer. Thanks also goes to all others who have shared 
their knowledge. Please see inline.

--- In firebird-support@yahoogroups.com, Ann Harrison  wrote:

 On Fri, Jan 25, 2013 at 4:21 PM, Jeff  wrote:
 
 
  (Using FB 1.5)
 
 
 OK, that's scary right there.  Firebird 1.5 is ten years old. Newer

Not scary for me, it's rather rock solid. Amazing :) I'm poised at upgrading 
though.

 versions use approximately the same index structures but have lots of bugs
 fixed, including several important ones in the index code.
 
 
  Do inserts in FB benefit from ordered or sequential PK?
 
 
 Yes.  Unlike the databases that the article considers, Firebird stores data
 and indexes separately - even primary key indexes, so its talk about
 shuffling data is irrelevant.  But still there are advantages to
 inserting keys into an index in ascending (for ascending indexes) or
 descending (for descending indexes) order.  The most efficient way to
 generate primary keys is with a generator - also called a sequence in
 newer versions.
 
 
  Please allow me to clarify. I intend to use High/Low for table PKs. With
  this approach, it is very possible that PKs will not be in sequential order
  as they are inserted into the db. Will this be an issue in FB?
 
 
 I don't know what High/Low is, so that's hard to answer.  Firebird is
 designed to handle random inserts into indexes with reasonable efficiency,
 but building indexes sequentially creates a dense index at low insert cost.
  When creating a new index on an existing table, Firebird reads the table,
 sorting by they index key and builds the index from the sorted output.
 

High/Low is a technique of generating primary keys client side in contrast to 
generating keys server side with fb generators. Here is a link to a discussion 
that probably could explain it better:
http://stackoverflow.com/questions/282099/whats-the-hi-lo-algorithm

The application I am trying to build is distributed system spread 
geographically, most of them on slow connection. Hence, the need for client 
side generated primary keys. If the hi/lo technique works as expected and let's 
say for example there are a hundred client users doing inserts at random times 
of the day, then there will be instances where each batch of inserts are not in 
sequence with other client's batch of inserts. Fast forward three years and the 
table now holds hundreds of thousands or even a million record, while most or 
all clients are still working with their initial high values. Woe to the client 
with the first acquired (least) high value; his inserts would fall rather 
towards the beginning of the table/index, while the client with the last 
acquired (highest) high value would fall towards the end of the table/index 
(lucky?).

If my previous assumptions are correct, then it is worth making an effort to 
ensure that all inserts fall towards the end of the table/index by prefixing 
the generated PKs with a value based on the current time, for instance (all 
clients in the same time zone). More work though, and an integer data type 
might not suffice to accomodate the now very large value PK. Maybe should use 
Numeric(18,0)?

OTOH, if you say these are non issue with fb, then maybe I should just consider 
GUIDs or UUIDs for PKs and forget about the server altogether with regards to 
PK generation? 



 Why is that fast?  Two reasons, neither of which as to do with the
 (relatively) low cost of moving index entries in memory.
 
 The first is that random index entries tend to land on random pages in the
 index.  Unless you can hold the entire index (and everything else you need)
 in memory, that means that pages will be written and read multiple times.
 One of the key misunderstandings in databases is that an index is a
 good alternative to a sort because the cost of an index look up is K*n,
 while the cost of a sort is L*nlog(n).  It's rarely noticed that the value
 of K (the cost of a page read) is huge compared with log(n). 
  Filling an index page with ordered values and going on to the next page is
 more efficient that putting one entry on one page, the next on another, and
 so on, even if all the pages are in memory.
 
 The second reason to store rows in the order of their index (i.e. not just
 sorted, but sorted ascending for ascending, descending for descending) is
 prefix compression.  Firebird drop the first bytes of a key if they match
 the previous key value.  So if you store AA, AAB, AAC, ABA,
 what goes into the index is AA, 6B, 6C, 4BA.   Prefix compression makes
 the indexes dense, meaning that you read fewer index pages to get to the
 data you want.  Clearly if you store those rows in reverse order, the first
 key in is ABA. When you store AAC, the index contains AAC and
 the second key value becomes 5BA.  Each subsequent entry requires a change
 to the next older entry.


Thank you for your excellent explanation.

 
 I based this question

[firebird-support] Benefit from Sequential Insert of High/Low PK

2013-01-25 Thread Jeff
Hi all,

(Using FB 1.5)

Do inserts in FB benefit from ordered or sequential PK?
Please allow me to clarify. I intend to use High/Low for table PKs. With this 
approach, it is very possible that PKs will not be in sequential order as they 
are inserted into the db. Will this be an issue in FB?

I based this question on this article: 
http://www.codeproject.com/Articles/388157/GUIDs-as-fast-primary-keys-under-multiple-database

Thank you.



[firebird-support] Solution for a redundant join?

2012-10-24 Thread Jeff
The following simple query produces the results below:

SELECT SCHEDULE.DATE_TIME, TEAMS.TEAM FROM SCHEDULE 
LEFT JOIN TEAMS ON SCHEDULE.HOME_TEAMID=TEAMS.PRIMARYKEY 

***QUERY RESULTS (Showing 'Home Team' Column)***
12/01/2012 TeamA
12/21/2012 TeamB
12/25/2012 TeamC

How do I include the corresponding 'Away Team' column? Please see tables below:


***SCHEDULE TABLE***
PRIMARYKEY  HOME_TEAMID  AWAY_TEAMID  DATE_TIME
1   1212/01/2012
2   2312/21/2012
3   3112/25/2012

***TEAMS TABLE***
PRIMARYKEY  TEAM_NAME
1   TeamA
2   TeamB
3   TeamC

Please advise, thank you!



[firebird-support] Re: Solution for a redundant join?

2012-10-24 Thread Jeff


--- In firebird-support@yahoogroups.com, Leyne, Sean Sean@... wrote:
 
 Use tables aliases to include multiple references to a single table.
 
 
 SELECT 
   SCHEDULE.DATE_TIME, HomeTeam.TEAM, AwayTeam.Team
 FROM SCHEDULE 
   LEFT JOIN TEAMS HomeTeam ON HomeTeam.PRIMARYKEY = SCHEDULE.HOME_TEAMID
   LEFT JOIN TEAMS AwayTeam ON AwayTeam.PRIMARYKEY = SCHEDULE.AWAY_TEAMID
 
 
 Sean


Sean, aliases did the trick, thank you kindly!



[firebird-support] v 2.5 embedded is multi-user ?

2012-01-26 Thread Jeff
i'm evaluating firebird embedded for use in small workgroup setting.
i'm a little confused by documentation.

The embedded server in the Windows library, fbembed.dll, now uses 
Superclassic, not Superserver as previously,
thus unifying its model with that of local connection to Superclassic on POSIX. 
The database filelock
that previously restricted connections to a single application space is 
replaced by a global lock table that
allows simultaneous access to the same database from different embedded server 
modules

I see that it supports multi-processes.

how bout multi-user / DB on shared network drive ?

if so, any experience / suggestions regarding concurrency, file lock issues etc 
in windows environment