On Wed, 2017-09-20 at 09:44 +0200, Magnus Hagander wrote:
> On Wed, Sep 20, 2017 at 8:53 AM, Nicklas Avén
> <nicklas.aven@jordogskog.no> wrote:
> > Hallo all
> >
> > I am thrilled about logical replication in PostgreSQL 10. My head has
> > started spinning about use cases.
> >
> > Would it be possible to use logical replication as a distribution
> > method of data?
>
> As an answer to the generic question: yes :)
>
> > I think about map data from national mapping authorities. The problem
> > is to get the updates of their data sets. Especially the open data
> > sets are now distributed as files (shape files) in Sweden and as
> > pg_dump in Norway.
> >
> > I guess it is beyond what logical replication is designed for, so I
> > ask what problems might arise for a scenario like:
> >
> > The distributor has a publication database with logical replication
> > publications of the tables. All users, probably thousands or more,
> > would subscribe to that publication to get an updated copy of the
> > data set.
> >
> > How would the publication server react? I guess the WAL files will
> > only be written once anyway?
>
> Yes. But they will need to be kept around until *all* subscribers
> have pulled down their changes. So even one subscriber falling behind
> will mean your WAL will never get cleaned up.
>
> Of course, you can keep some sort of watcher process that kills old
> replication slots.
>
> I am also not sure how well PostgreSQL will react to having thousands
> of replication slots. It's not what the system was designed for, I
> believe :)

Ok, I have to read up on how this works. I thought about it as a bucket
of WAL files that the subscribers just "grab". But of course there is
some bookkeeping to make things work. I guess there is also a mechanism
so a new subscriber can get a complete table after the publication has
been active for a long time?
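(For reference, the watcher process Magnus mentions could be a cron job along these lines. The slot name is made up; the catalog view and functions are the PostgreSQL 10 names:)

```sql
-- How much WAL each slot is holding back, in bytes:
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;

-- Drop a slot that has fallen too far behind, so its WAL can be
-- recycled (the subscriber behind it would have to re-sync):
SELECT pg_drop_replication_slot('some_stale_slot');
```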
I mean, the "cleaning up" of WAL files will not leave new subscribers missing what was written to the table long ago?
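(If I read the docs right, that case is covered by the initial table synchronization: a new subscription first copies the current table contents and only then starts streaming changes, so it does not depend on old WAL. A sketch, with made-up table and host names:)

```sql
-- On the distributor (publisher):
CREATE PUBLICATION mapdata FOR TABLE roads, buildings;

-- On each subscriber; copy_data = true (the default) copies the
-- current contents of the published tables before streaming starts:
CREATE SUBSCRIPTION mapdata_sub
    CONNECTION 'host=distributor.example.org dbname=maps'
    PUBLICATION mapdata
    WITH (copy_data = true);
```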
> You might be better off using logical decoding (which goes back to
> 9.4) to stream the data out, but not having each subscriber be a
> PostgreSQL subscriber. Either using it to generate some sort of "diff
> files" that can then be consumed downstream, or by distributing it
> via some kind of dedicated queuing system designed to handle that
> many downstreams.

The thing is that I am not involved in any of those organisations, just
a user that wants to give them some good reasons to consider
distributing through PostgreSQL :-)

> > My guess is that it will be a lower payload than today anyway, when
> > the whole data set has to be fetched to get updates.
>
> As always, that's going to depend on the number of updates. If the
> same thing is updated 1000 times, then with logical replication it
> gets transferred 1000 times. So there are definitely cases when
> there will be *more* traffic with logical replication, but in cases
> like yours I would guess it will be less.
>
> //Magnus
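(The "diff files" variant of that suggestion could be sketched with pg_recvlogical, which ships with PostgreSQL since 9.4. Database and slot names here are made up; test_decoding is just the example output plugin:)

```shell
# Create a logical decoding slot on the distributor's database:
pg_recvlogical -d maps --slot=diff_slot --create-slot --plugin=test_decoding

# Stream decoded changes into a file that can be published as a diff:
pg_recvlogical -d maps --slot=diff_slot --start -f diffs.txt
```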