So, as we discussed during the meeting, I have offloaded the data (for stg)
older than half a year to another database. This is how I did it (it probably
could have been done more efficiently, but hey, it worked, and I'm not a
postgres expert...):

$ pg_dump -Fc resultsdb_stg > resultsdb_stg.dump  # dump resultsdb_stg to a file
$ createdb -T template0 resultsdb_stg_archive  # create a new empty database called resultsdb_stg_archive
$ pg_restore -d resultsdb_stg_archive resultsdb_stg.dump  # load the data from the dump into the resultsdb_stg_archive db
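
(If the dump/restore ever turns out to be too slow, both tools can also run in
parallel - just a sketch, the directory-format path and the job count are made
up, not something I actually benchmarked:)

$ pg_dump -Fd -j 4 -f resultsdb_stg.dumpdir resultsdb_stg  # directory-format dump, 4 parallel jobs
$ pg_restore -d resultsdb_stg_archive -j 4 resultsdb_stg.dumpdir  # parallel restore from that dump
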
$ psql resultsdb_stg_archive
=# -- Get the newest result we want to keep in the archive
=# select id, job_id from result where submit_time < '2016-06-01' order by submit_time desc limit 1;
   id    | job_id
---------+--------
 7857664 | 308901

=# -- Since jobs can contain multiple results, select the first result with the 'next' job_id.
=# -- It could also be done as 'select id, job_id from result where job_id = 308902 order by id limit 1;',
=# -- but filtering on job_id > 308901 automagically handles a possible hole in the job id sequence.
=# select id, job_id from result where job_id > 308901 order by id limit 1;
   id    | job_id
---------+--------
 7857665 | 308902

=# -- delete all the result_data, results, and jobs,
=# -- starting from what we got in the previous query
=# delete from result_data where result_id >= 7857665;
=# delete from result where id >= 7857665;
=# delete from job where id >= 308902;

$ psql resultsdb_stg
=# -- since the dbs were 'cloned' at the beginning, delete the inverse set
=# -- of the data we deleted in the archive
=# delete from result_data where result_id < 7857665;
=# delete from result where id < 7857665;
=# delete from job where id < 308902;
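
For the record, a quick sanity check could look something like this (just a
sketch, not one of the steps I ran above) - after the deletes, each db should
only hold its own side of the boundary:

$ psql resultsdb_stg_archive
=# -- all of these should return 0 in the archive
=# select count(*) from result_data where result_id >= 7857665;
=# select count(*) from result where id >= 7857665;
=# select count(*) from job where id >= 308902;

$ psql resultsdb_stg
=# -- and these should return 0 in production
=# select count(*) from result_data where result_id < 7857665;
=# select count(*) from result where id < 7857665;
=# select count(*) from job where id < 308902;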



On Wed, Dec 7, 2016 at 2:19 PM, Josef Skladanka <jskla...@redhat.com> wrote:

>
>
> On Mon, Dec 5, 2016 at 4:25 PM, Tim Flink <tfl...@redhat.com> wrote:
>
>> Is there a way we could export the results as a json file or something
>> similar? If there is (or if it could be added without too much
>> trouble), we would have multiple options:
>>
>
> Sure, adding some kind of export should be doable
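
(Just thinking out loud - a minimal sketch of what a raw dump-style export
could look like, done directly in psql rather than through resultsdb itself;
a proper export would presumably go through the API, and the cutoff date is
only an example:)

$ psql resultsdb_stg
=# -- one JSON object per result row, written straight to a file
=# \copy (select row_to_json(r) from result r where submit_time >= '2016-06-01') to 'results_export.json'
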
>
>
>>
>> 1. Dump the contents of the current db, do a partial offline migration
>>    and finish it during the upgrade outage by export/importing the
>>    newest data, deleting the production db and importing the offline
>>    upgraded db. If that still takes too long, create a second postgres
>>    db containing the offline upgrade, switch over during the outage and
>>    import the new results since the db was copied.
>>
>>
> I slept just two hours, so this is a bit entangled for me. So - my initial
> idea was that we
>  - dump the database
>  - delete most of the results
>  - do migration on the small data set
>
> In parallel (or later on), we would
>  - create a second database (let's call it 'archive')
>  - import the un-migrated dump
>  - remove data that is in the production db
>  - run the lengthy migration
>
> This way, we have minimal downtime, and the data are available in the
> 'archive' db.
>
> With the archive db, we could either
> 1) dump the data and then import it to the prod db (again no downtime)
> 2) just spawn another resultsdb (archives.resultsdb?) instance that would
> operate on top of the archives
>
> I'd rather do the second, since it also has the benefit of being able to
> offload old data to the 'archive' database (which would/could be 'slow by
> definition'), while keeping the 'active' dataset small enough that it could
> all fit in memory for fast queries.
>
> What do you think? I guess we wanted to do something pretty similar; I
> just got a bit lost in what you wrote :)
>
>
>
>> 2. If the import/export process is fast enough, we might be able to do
>>    that instead of the in-place migration
>>
>
> My gut feeling is that it would be pretty slow, but I have no relevant
> experience.
>
> Joza
>
>