At 02:45 PM 9/10/2011, raja_s_patil wrote:

>We have decided to try out IBPhoenix IBReplicator for this. For me also this is 
>the first task of this kind. We thought that deciding the database size would be 
>the first step, then checking which RDBMS will support it would be next. Now since 
>FB 2.5 seems to be OK, we will evaluate IBReplicator. We can download the trial 
>version and test the replication schema. If needed, we can get queries solved in 
>its forum. In the next 10-15 days we are going to try out replication and 
>performance testing of FB before commencement of the actual project's development.

That's a wise move.  It is very flexible and I think you will find a solution 
to *your* requirements there.  Without going into detail (wrong forum) two 
important aspects will be 1) that you have primary keys defined for all tables 
and 2) you keep a key on every replicated record that distinguishes the branch. 
 Do join the Replicator forum and do study the help thoroughly.
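
Purely to illustrate those two points (the table and column names here are made 
up, not anything IBReplicator prescribes), a replicated table might look like 
this, with a proper primary key and a branch discriminator carried on every row:

  CREATE TABLE SALES_ORDER (
    ORDER_ID    BIGINT NOT NULL,
    BRANCH_ID   SMALLINT NOT NULL,   /* identifies the originating branch */
    ORDER_DATE  TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    AMOUNT      NUMERIC(15,2),
    CONSTRAINT PK_SALES_ORDER PRIMARY KEY (ORDER_ID, BRANCH_ID)
  );

Making BRANCH_ID part of the primary key is one way to guarantee that rows from 
different branches never collide when they arrive at the head office database.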

>--- In firebird-support@yahoogroups.com, "samcarleton" <scarleton@...> wrote:
>
>> Have you come up with a strategy to replicate the branch databases to the HO? 
>> If so, have you validated that it will work?

You can trial IBPReplicator on InterBase.  It supports IB, Firebird and Oracle. 
 Note, these three (but not MSSQLServer) have multiple record versioning.

> 
>> In my situation, we want to take data from multiple IB clients and put it 
>> into one central Microsoft SQL database. Microsoft has developed this very 
>> powerful and flexible system called Microsoft Sync Framework. Sync Framework 
>> can work with any RDBMS, assuming the RDBMS exposes enough info.
>> 
>> Take a look at the Synchronization Example at the bottom of this link: 
>> http://msdn.microsoft.com/en-us/sync/bb821992 It explains the basic concept 
>> of how Sync Framework syncs the data. The whole key to it is the "Update 
>> Tick Count" which is database wide. In the Microsoft world is the @@DBTS 
>> function. They refer to this function as either the timestamp or as the 
>> rowversion.

This seems to imply that "any RDBMS" holds each specific record in exactly the 
same place on disk for its entire life.  Amongst the non-desktop databases, 
MSSQLServer might be on its own in this simplistic assumption.  In an MRV 
database, one "global" version and any number of private versions of a record 
might exist on disk at any moment.  The next private version that commits 
successfully becomes the new "global" version that all new transactions 
subsequently see.  At that moment, the transactions holding the other 
co-existing private versions will (under almost all transaction configurations) 
still see a previous global version as the latest committed.  

The most recently committed version won't be visible to those other 
transactions until their own transactions commit.  It's not even as simple as 
*two* potentially different views of what is the latest committed.  
Transactions start and end constantly at random times. Meanwhile, the older and 
newer versions more than likely exist in different physical places on disk.
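
You can watch this happening with nothing more than two isql sessions under 
SNAPSHOT isolation (a sketch only, using the hypothetical SALES_ORDER table 
from above):

  /* Session A */
  SET TRANSACTION SNAPSHOT;
  SELECT AMOUNT FROM SALES_ORDER WHERE ORDER_ID = 1 AND BRANCH_ID = 1;
  /* returns the latest version committed before this transaction started */

  /* Session B, meanwhile */
  SET TRANSACTION SNAPSHOT;
  UPDATE SALES_ORDER SET AMOUNT = 99 WHERE ORDER_ID = 1 AND BRANCH_ID = 1;
  COMMIT;  /* this commit creates the new "global" version */

  /* Session A again, still inside its original transaction */
  SELECT AMOUNT FROM SALES_ORDER WHERE ORDER_ID = 1 AND BRANCH_ID = 1;
  /* still returns the old value: A's snapshot predates B's commit */
  COMMIT;

  SET TRANSACTION SNAPSHOT;
  SELECT AMOUNT FROM SALES_ORDER WHERE ORDER_ID = 1 AND BRANCH_ID = 1;
  /* only a fresh transaction sees 99 */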

>>  When syncing begins, the sync process needs to be able to get the "minimum 
>> active rowversion". This is the very last update tick count value used that 
>> has been committed.

A synch tool is a client.  Like any client, it sees the latest committed 
version at the moment its transaction starts and continues to see that version 
until its transaction ends. (The exception here would be a transaction with 
Read Committed isolation, which would be suicidal for a 
replication/synchronisation scenario so we won't even think about it.)

The physical address of a record is stored on the record, in an accessible 
field called RDB$DB_KEY.  The problem for your requirement is that RDB$DB_KEY 
is transient.  You have no way to tell *how* transient.  For the client, the  
RDB$DB_KEY (db_key for short) is sticky for the duration of a transaction.  The 
client has the option to request that the scope of the db_key be connection-wide 
(which is another way to commit database suicide, as it inhibits garbage 
collection).
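
A sketch of how a client sees it, again against the hypothetical SALES_ORDER 
table (the :FETCHED_DB_KEY parameter is supplied by your own client code, not 
by the engine):

  SET TRANSACTION SNAPSHOT;

  /* the db_key can be selected like any column */
  SELECT RDB$DB_KEY, ORDER_ID, AMOUNT
    FROM SALES_ORDER
   WHERE BRANCH_ID = 1;

  /* within the same transaction it is a fast, direct record locator */
  UPDATE SALES_ORDER
     SET AMOUNT = AMOUNT * 1.1
   WHERE RDB$DB_KEY = :FETCHED_DB_KEY;

  COMMIT;
  /* after COMMIT, a saved db_key may already point at a record version
     that has been garbage-collected or moved, so never hold one across
     transaction boundaries */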

>> 
>> In IB 7.5.1 I don't see any way to get at any value that could be used for 
>> this. There is a generator, but when you poll the generator, it always gives 
>> you the very latest value, even if that value is in an uncommitted 
>> transaction. If you use that value as the start of the next sync, all the 
>> rows in the uncommitted transactions will never be synced. If you pick a time 
>> that is too far in the past, next time the sync happens there will be 
>> duplicates.

Confusion here.  A generator value is always unique and won't be duplicated, 
even if it is generated and stored in a record version that never gets 
committed.  You wouldn't have duplicates unless you replicated the same record 
twice with the same generated "record version" on it.  
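
To illustrate (SEQ_ROWVER is a made-up sequence name):

  CREATE SEQUENCE SEQ_ROWVER;

  /* Generators/sequences live outside transaction control: an increment is
     immediate and permanent, whether or not the surrounding transaction
     ever commits, so a generated value is never handed out twice. */
  SELECT GEN_ID(SEQ_ROWVER, 1) FROM RDB$DATABASE;  /* takes the next value */
  SELECT GEN_ID(SEQ_ROWVER, 0) FROM RDB$DATABASE;  /* peeks at the current value,
                                                      including values consumed
                                                      by uncommitted transactions */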

>> 
>> Internally FB has to have this basic concept since the system works just 
>> fine when there are 15 active insert transactions and another transaction 
>> does a select; that select returns the 'safe' values. I am assuming that 
>> some concept along the lines of 'minimum active rowversion' is being used to 
>> determine what are the 'safe' rows to return.

The concept used by replicators for Firebird and IB is to keep a log of all 
changes, based on primary key, with timestamping.  When the replicator starts a 
replication, it reads the log and replicates to the target tables in the exact 
order that the changes occurred on the source.  The nett result of a 
replication is that the target records have the same "latest version" as was 
the "latest version" on the source when the transaction started.  You don't get 
duplicates, you don't rely on any uniqueness criteria other than those 
intrinsic to the data and there's no use for a "time-ticker".
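
Conceptually, the logging side looks something like the sketch below.  The 
names are hypothetical and this is not IBReplicator's actual metadata, just an 
illustration of the principle: every change is recorded with the table name, 
the primary key and a timestamp, and the replicator later replays the log in 
order against the target.

  CREATE TABLE REPL_LOG (
    LOG_ID      BIGINT NOT NULL PRIMARY KEY,
    TABLE_NAME  VARCHAR(63) NOT NULL,
    PK_VALUE    VARCHAR(200) NOT NULL,   /* primary key of the changed row */
    CHANGE_OP   CHAR(1) NOT NULL,        /* 'I', 'U' or 'D' */
    CHANGED_AT  TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
  );

  CREATE SEQUENCE SEQ_REPL_LOG;

  SET TERM ^ ;
  CREATE TRIGGER TR_SALES_ORDER_LOG FOR SALES_ORDER
  AFTER INSERT OR UPDATE OR DELETE
  AS
  DECLARE VARIABLE OP CHAR(1);
  BEGIN
    IF (INSERTING) THEN OP = 'I';
    ELSE IF (UPDATING) THEN OP = 'U';
    ELSE OP = 'D';

    INSERT INTO REPL_LOG (LOG_ID, TABLE_NAME, PK_VALUE, CHANGE_OP)
    VALUES (GEN_ID(SEQ_REPL_LOG, 1), 'SALES_ORDER',
            CAST(COALESCE(NEW.ORDER_ID, OLD.ORDER_ID) AS VARCHAR(20)) || ':' ||
            CAST(COALESCE(NEW.BRANCH_ID, OLD.BRANCH_ID) AS VARCHAR(10)),
            :OP);
  END^
  SET TERM ; ^

Because the log rows are created inside the same transaction as the change 
itself, a change that never commits never appears in the log, which is exactly 
why the "time-ticker" problem does not arise.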

>> 
>> So the question is: Does FB expose this type of information to make 
>> implementing a sync type of process easy or is this concept deep in the 
>> engine, never exposed, so one needs to hack the whole sync process?

You could look into RDB$DB_KEY to see whether it could give you anything you 
could use but it does not work like a ticker.  It identifies a unique address 
on disk. It's not hacking;  it has always been there and some tools actually  
show it to you on views and unkeyed tables.  You need to get yourself a clear 
idea of just what it represents and NOT rely on it persisting beyond the bounds 
of your transaction.  Myself, I don't think it's what you are looking for.

>> On the IB 7.5.1 project I am simply the developer trying to figure out Sync 
>> Framework, thus I have little say in long-term strategy. For the other project 
>> where I am looking at FB 2.5/3.0, I am in command, thus I am starting to look 
>> at other options.

Good luck with the Sync Framework thing, which I don't think I actually "get", 
since the boundaries of a record's status in a multi-user environment are 
necessarily very dynamic...hence I can't grok how real-time synchronisation 
could be achieved while other work is in progress.  Replication can't be 
anything but a few steps out, from source to target or target to source, 
because that's the nature of the beast that is the multi-user database.

And to R. S. Patil, we have hijacked your thread and for this, I apologise.

heLen
