Re: [GENERAL] queries on xmin

2009-06-11 Thread Marko Kreen
On 6/11/09, Matt Amos wrote: > On Thu, Jun 11, 2009 at 2:48 PM, Marko Kreen wrote: > > On 6/11/09, Matt Amos wrote: > >> On Thu, Jun 11, 2009 at 1:13 PM, Brett Henderson wrote: > >> >> See pgq.batch_event_sql() function in Skytools [2] for how to > >> >> query txids between snapshots effici
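
The general shape of what Marko's pgq.batch_event_sql() pointer is about can be sketched with the stock txid functions. This is a rough illustration only, not the Skytools code; the changes table and its txid column are hypothetical stand-ins:

    -- :prev_snap and :cur_snap stand for txid_snapshot values saved on the
    -- previous and current dump runs
    SELECT c.*
      FROM changes c
     WHERE NOT txid_visible_in_snapshot(c.txid, :prev_snap)
       AND txid_visible_in_snapshot(c.txid, :cur_snap);

Because visibility is checked per transaction id, rows written by a long-running transaction are simply picked up by a later run once it commits, instead of being missed.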

Re: [GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
On Thu, Jun 11, 2009 at 2:48 PM, Marko Kreen wrote: > On 6/11/09, Matt Amos wrote: >> On Thu, Jun 11, 2009 at 1:13 PM, Brett Henderson wrote: >>  >> See pgq.batch_event_sql() function in Skytools [2] for how to >>  >> query txids between snapshots efficiently and without being affected >>  >> by l

Re: [GENERAL] queries on xmin

2009-06-11 Thread Marko Kreen
On 6/11/09, Matt Amos wrote: > On Thu, Jun 11, 2009 at 1:13 PM, Brett Henderson wrote: > >> See pgq.batch_event_sql() function in Skytools [2] for how to > >> query txids between snapshots efficiently and without being affected > >> by long transactions. > > > > I'll take a look. > > it was l

Re: [GENERAL] queries on xmin

2009-06-11 Thread Marko Kreen
On 6/11/09, Brett Henderson wrote: > Marko Kreen wrote: > > > 4-byte xids on btree may create data corruption. > > > > > Can you be more specific on this? I'm aware of xid being an unsigned > integer which means we need to deal with the cast resulting in negative > numbers. This means we have t

Re: [GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
On Thu, Jun 11, 2009 at 1:13 PM, Brett Henderson wrote: > Marko Kreen wrote: > Sorry, I'm not sure what you're suggesting with txid_current().  We're > currently using the txid_current_snapshot() method which returns us the > maximum transaction id plus in-flight transactions.  We specifically ex
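
For reference, the snapshot call Brett mentions decomposes like this with the stock accessor functions (the output shown is only illustrative):

    SELECT txid_current_snapshot();
    -- e.g. 1000:1020:1005,1011  (xmin : xmax : list of in-progress txids)

    SELECT txid_snapshot_xmin(txid_current_snapshot());  -- lowest txid still in progress
    SELECT txid_snapshot_xmax(txid_current_snapshot());  -- first txid not yet assigned
    SELECT txid_snapshot_xip(txid_current_snapshot());   -- the in-progress txids as a set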

Re: [GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
On Thu, Jun 11, 2009 at 12:59 PM, Brett Henderson wrote: > Greg Stark wrote: >> Another option to consider would be including a boolean column >> "dumped" defaulted to false. Then you could have a partial index on >> the primary key or date "WHERE NOT dumped". Then when you dump you can >> "SELECT
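
A minimal sketch of Greg's suggestion, against a made-up changes table (all names here are illustrative only):

    ALTER TABLE changes ADD COLUMN dumped boolean NOT NULL DEFAULT false;
    CREATE INDEX changes_undumped_idx ON changes (id) WHERE NOT dumped;

    -- each dump run: grab and flag the pending rows in a single statement
    UPDATE changes SET dumped = true WHERE NOT dumped RETURNING *;

The partial index only ever covers the not-yet-dumped rows, so it stays small, but as Brett's hesitations further down the thread make clear, the flag ties the table to a single dumper and requires write access to the database.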

Re: [GENERAL] queries on xmin

2009-06-11 Thread Brett Henderson
I've been working with Matt on this. Thanks for the suggestions. Greg Stark wrote: On Thu, Jun 11, 2009 at 11:25 AM, Matt Amos wrote: what problems are we going to create for ourselves if we create a btree index on xmin casted to int4? would it be as efficient to use a hash index, create a

Re: [GENERAL] queries on xmin

2009-06-11 Thread Greg Stark
On Thu, Jun 11, 2009 at 12:59 PM, Brett Henderson wrote: > I have a couple of hesitations with using this approach: > 1. We can only run the replicator once. > 2. We can only run a single replicator. > 3. It requires write access to the db. > > 1 is perhaps the biggest issue.  It means that we only

Re: [GENERAL] queries on xmin

2009-06-11 Thread Brett Henderson
Marko Kreen wrote: 4-byte xids on btree may create data corruption. Can you be more specific on this? I'm aware of xid being an unsigned integer which means we need to deal with the cast resulting in negative numbers. This means we have to split our range queries into several pieces when
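
Brett's point about the negative numbers, illustrated as a sketch; txid_int4 is a hypothetical column standing in for xmin reduced to a signed 32-bit value by whatever cast or function the schema uses:

    -- window does not cross 2^31: a single range works
    SELECT * FROM changes WHERE txid_int4 >= :low AND txid_int4 < :high;

    -- window crosses 2^31, so :high has wrapped negative: split into two pieces
    SELECT * FROM changes WHERE txid_int4 >= :low
    UNION ALL
    SELECT * FROM changes WHERE txid_int4 >= -2147483648 AND txid_int4 < :high;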

Re: [GENERAL] queries on xmin

2009-06-11 Thread Marko Kreen
On 6/11/09, Matt Amos wrote: > the openstreetmap project (http://osm.org/) recently moved from using > mysql to postgres and we're trying to improve some of our tools using > the new functionality that postgres provides. > > in particular, we are dumping changes to the database at short > inte

Re: [GENERAL] queries on xmin

2009-06-11 Thread Greg Stark
On Thu, Jun 11, 2009 at 11:25 AM, Matt Amos wrote: > > what problems are we going to create for ourselves if we create a > btree index on xmin casted to int4? would it be as efficient to use a > hash index, create a temporary table of txids that we're querying with > a hash index and do an explicit
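
The temporary-table variant Matt asks about might look roughly like this (a sketch only; the table and column names are made up):

    CREATE TEMP TABLE wanted_txids (txid int4 PRIMARY KEY);
    INSERT INTO wanted_txids VALUES (1001), (1002), (1003);  -- txids collected for this window

    SELECT c.*
      FROM changes c
      JOIN wanted_txids w ON w.txid = c.txid_int4;

With the txids materialised in a small table, the planner can hash-join them against the main table without any permanent index on xmin being kept.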

[GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
the openstreetmap project (http://osm.org/) recently moved from using mysql to postgres and we're trying to improve some of our tools using the new functionality that postgres provides. in particular, we are dumping changes to the database at short intervals (currently every minute, hour and day [
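
One way to make incremental dumps of this kind workable, sketched only and not necessarily what the project settled on, is to stamp each row with the writing transaction's id and save a snapshot per run, along the lines discussed in the replies above:

    ALTER TABLE changes ADD COLUMN txid bigint NOT NULL DEFAULT txid_current();
    -- each run records SELECT txid_current_snapshot(), and the next run selects
    -- the rows whose txid became visible since that saved snapshot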