I have several production databases with various data, and a reports database 
that pulls all the necessary data from them once a day.

Now there is a new requirement to have some of the data available in the 
reports database as soon as it is inserted in the production database. 
Specifically, the data in question lives in two particular tables. However, in 
addition to shipping the tables over, I also need to perform a bit of 
processing on the data as it arrives. Each transaction in production is 
represented by 1-3 rows in the two tables. One of the fields contains 
urlencoded data, which I need to decode and concatenate, so that each 
transaction ends up as a single row in another table.

E.g.

Table A brought from production:
key field 1 | key field 2 | num of segments | segment num | segment id |
------------+-------------+-----------------+-------------+------------|
abcde       | 134         | 3               | 1           | 999000     |
abcde       | 567         | 3               | 3           | 999000     |
abcde       | 890         | 3               | 2           | 999000     |
fghij       | 987         | 2               | 1           | 999001     |
fghij       | 654         | 2               | 2           | 999001     |
abcde       | -11         | 1               | 1           | 999003     |

Table B brought from production:

key field 1 | key field 2 | urlencoded data |
------------+-------------+-----------------+
abcde       | 134         | AAA             |
abcde       | 567         | CCC             |
abcde       | 890         | BBB             |
fghij       | 987         | fff             |
fghij       | 654         | ggg             |
abcde       | -11         | XXX             |

Here we have three transactions - one with three segments, one with two, and 
one with a single segment. The information that identifies which rows belong 
to the same transaction, and in what order, is in table A. The actual data to 
decode is in table B.

The result I need to produce is a new table like this:

key field 1 | key field 2 | segment id | decoded concatenated data |
------------+-------------+------------+---------------------------|
abcde       | 134         | 999000     | AAABBBCCC                 |
fghij       | 987         | 999001     | fffggg                    |
abcde       | -11         | 999003     | XXX                       |

That is, a single row for each transaction, with the key fields taken from the 
transaction's first segment, and the data decoded and concatenated. But I need 
this to be done in (near) real time - as the rows are added. Because of the 
decoding and concatenation requirements this can't be a view: there is no 
built-in SQL function that will do the required processing, and I'm not sure a 
view would be efficient for running reports anyway. So a new table it will be.
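
For concreteness, the result table would look something like this (all names 
here are made up, and the primary key assumes each segment id identifies 
exactly one transaction, which is what the sample data suggests):

CREATE TABLE transactions_decoded (
    key_field_1  text    NOT NULL,  -- taken from the first segment
    key_field_2  integer NOT NULL,  -- taken from the first segment
    segment_id   integer NOT NULL,  -- shared by all segments of a transaction
    decoded_data text    NOT NULL,  -- decoded, concatenated payload
    PRIMARY KEY (segment_id)
);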

I can replicate these two tables from production to the reports database using 
Londiste, which lets me replicate individual tables and is also supposed to 
allow triggers to run on the replicated tables.

What I thought of doing was to create a trigger on one of the tables, such 
that when data is inserted, it inserts the corresponding row into the "result" 
table, and to write this trigger in PL/Perl or PL/Python, which I'm guessing 
will let me do the required string manipulation.
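
The string manipulation itself looks easy enough - in PL/Python it should be 
little more than urllib's unquote. A minimal sketch (the function name is made 
up, and I'm assuming the data is ordinary %-encoding):

-- Decode each urlencoded segment and glue them together. The caller
-- is expected to pass the segments already sorted by segment number.
CREATE FUNCTION decode_and_concat(segments text[]) RETURNS text AS $$
    from urllib.parse import unquote
    return ''.join(unquote(s) for s in segments)
$$ LANGUAGE plpython3u IMMUTABLE;

E.g. SELECT decode_and_concat(ARRAY['AAA', 'BBB', 'CCC']) would return 
'AAABBBCCC'.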

The issue is that when an insert or an update fires, I can't tell whether all 
the segments of the same transaction have been written yet, and if only some 
of them have been, there is no guarantee about the order in which they are 
written. The question is whether such a trigger can be created at all. It 
needs to verify that all the parts are in the database - that is, run a query 
against the very table the trigger is firing on. If not all the data is there 
yet, it shouldn't insert anything. Is this feasible at all? How would you 
achieve it?
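
To make the question concrete, here is roughly what I have in mind - a sketch 
only, with made-up table and column names, assuming Londiste really does fire 
ordinary row triggers on the replicated copies (presumably the same trigger 
would be needed on both tables, since I can't know which table receives the 
last row of a transaction):

CREATE FUNCTION assemble_transaction() RETURNS trigger AS $$
    from urllib.parse import unquote

    new = TD["new"]

    # Fetch every segment of this row's transaction that has arrived
    # so far, joining table A (segment bookkeeping) with table B (the
    # urlencoded payload).
    plan = plpy.prepare("""
        SELECT a.key_field_1, a.key_field_2, a.segment_id,
               a.segment_num, a.num_segments, b.urlencoded_data
          FROM table_a a
          JOIN table_b b USING (key_field_1, key_field_2)
         WHERE a.segment_id = (SELECT segment_id
                                 FROM table_a
                                WHERE key_field_1 = $1
                                  AND key_field_2 = $2)
         ORDER BY a.segment_num
    """, ["text", "integer"])
    parts = plpy.execute(plan, [new["key_field_1"], new["key_field_2"]])

    # If any segment is still missing, do nothing; a later insert will
    # fire this trigger again and eventually complete the transaction.
    if parts.nrows() == 0 or parts.nrows() < parts[0]["num_segments"]:
        return None

    first = parts[0]  # segment_num = 1, thanks to the ORDER BY

    # Skip if this transaction has already been assembled (e.g. an
    # UPDATE re-fires the trigger after all segments have arrived).
    check = plpy.prepare(
        "SELECT 1 FROM transactions_decoded WHERE segment_id = $1",
        ["integer"])
    if plpy.execute(check, [first["segment_id"]]).nrows() > 0:
        return None

    decoded = ''.join(unquote(p["urlencoded_data"]) for p in parts)
    ins = plpy.prepare("""
        INSERT INTO transactions_decoded
               (key_field_1, key_field_2, segment_id, decoded_data)
        VALUES ($1, $2, $3, $4)
    """, ["text", "integer", "integer", "text"])
    plpy.execute(ins, [first["key_field_1"], first["key_field_2"],
                       first["segment_id"], decoded])
    return None
$$ LANGUAGE plpython3u;

CREATE TRIGGER assemble_after_insert
AFTER INSERT OR UPDATE ON table_b
FOR EACH ROW EXECUTE PROCEDURE assemble_transaction();

The idea is that the completeness check is just the count-versus-num_segments 
comparison; since every arriving row re-runs it, whichever row happens to come 
last is the one that completes the transaction, regardless of order.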

Thank you,
Herouth
