Thanks for your example, Chris. I will look into it as a long-term solution.

Partitioning tables as a strategy worked very well indeed. This will be my
short-/medium-term solution.
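
For anyone following along, here is a minimal sketch of the sort of
partitioning I mean (table and column names are made up for illustration): a
parent table with inheritance-based child tables, each carrying a CHECK
constraint so that constraint_exclusion can skip partitions a query cannot
match.

-- Parent table; in this scheme the rows live in the children, not here:
CREATE TABLE measurement (
    id        bigserial,
    logged_at timestamptz NOT NULL,
    payload   text
);

-- One child table per month, holding that month's rows:
CREATE TABLE measurement_2012_10 (
    CHECK (logged_at >= '2012-10-01' AND logged_at < '2012-11-01')
) INHERITS (measurement);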

Another strategy that I would like to evaluate as a short-/medium-term
solution is archiving a table's old records before purging them.
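
Concretely, I am picturing something along these lines (hypothetical table
names, and assuming an archive table with the same columns already exists),
so that the expiring rows are copied and deleted in a single statement:

-- Move rows older than two years into the archive table:
WITH purged AS (
    DELETE FROM measurement
    WHERE logged_at < now() - interval '2 years'
    RETURNING *
)
INSERT INTO measurement_archive
SELECT * FROM purged;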

I am aware that Oracle provides a tool that allows records to be exported
into a file / archive table before purging them, and another tool to
re-import those records.

Does PostgreSQL have similar tools to export records to a file and re-import
them?

If PostgreSQL does not have a tool to do this, does anyone have any ideas on
what file format would be ideal for easy re-importing into a PostgreSQL table
(e.g. a text file whose header row holds the column names and whose remaining
rows hold the records)?
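
In case it helps to clarify the question, this is the sort of round trip I am
after (paths and table names are hypothetical; \copy in psql would avoid the
need for server-side file access):

-- Export the rows to be purged as CSV with a header row of column names:
COPY (SELECT * FROM measurement WHERE logged_at < '2012-01-01')
    TO '/var/backups/measurement_2011.csv' WITH (FORMAT csv, HEADER true);

-- Re-import them later into a table with the same columns:
COPY measurement FROM '/var/backups/measurement_2011.csv'
    WITH (FORMAT csv, HEADER true);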

Thank you for your ideas.


On Mon, Oct 22, 2012 at 12:14 AM, Chris Travers <chris.trav...@gmail.com> wrote:

>
>
> On Sun, Oct 21, 2012 at 5:46 AM, Chitra Creta <chitracr...@gmail.com> wrote:
>
>> Thank you all.
>>
>> Ryan, would you mind sharing your one-time function to move it?
>>
>> Merlin, what are your suggestions to improve query performance?
>>
>> Shaun, thank you. I will look into facts and dimensions should all else
>> fail.
>>
>> Chris, would you mind giving me an example of what you mean by your log,
>> aggregate and snapshot approach? Also, with indexing, I believe composite
>> and partial indexes are better than plain single-column indexes, am I
>> correct? Do you have any recommendations as to which index type (e.g.
>> btree, hash) is better for which situations?
>>
>
> Sure.  Suppose I have an accounting system.
>
> I may record the amounts of the transactions in journal_entry and
> journal_line tables.  These will be write once, read many.  However, over
> time you will end up having to digest millions of records (given sufficient
> volume) to find out the balance of a checking account, and this is not
> really ideal.
>
> So to deal with this, I might, for example, add a table called
> account_checkpoint which might have the following fields:
>
> account_id
> end_date
> debits
> credits
> balance
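>
> In SQL terms, that might look roughly like this (column types are only an
> illustration; adjust them to fit the journal tables):
>
> CREATE TABLE account_checkpoint (
>     account_id int     NOT NULL,
>     end_date   date    NOT NULL,
>     debits     numeric NOT NULL,
>     credits    numeric NOT NULL,
>     balance    numeric NOT NULL,
>     PRIMARY KEY (account_id, end_date)
> );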
>
> And then, on closing of the books, I can snapshot the accumulated debits,
> credits, and balance to date.  If I need any of these numbers I can just
> grab the appropriate number from account_checkpoint and roll forward from
> end_date.  If I have too much volume I can have closings at a monthly level
> or whatever.
>
> The checkpoint table contains sufficient information for me to start a
> report at any point and end it at any other point without having to scan
> intervening checkpointed periods.  Additionally, if I want to purge old
> data, I can do so without losing current balance information.
>
> So what this approach does, in essence, is give you a way to purge without
> losing the aggregated information, and a way to skip portions of the table
> for aggregation purposes that you otherwise could not skip.  The big caveat
> is that you cannot insert any records effective before the most recent
> checkpoint (and if this is in doubt, you need to enforce it with a trigger).
>
> Best Wishes,
> Chris Travers
>
