Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-24 Thread François Beausoleil
Hi! Le 2013-04-17 à 14:15, Jeff Janes a écrit : On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil franc...@teksol.info wrote: Insert on public.persona_followers (cost=139261.12..20483497.65 rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1) Buffers:

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-24 Thread Amador Alvarez
I would also give it a try on turning on statistics on service_id and follower_id fields and tune collecting of distinct values for the optimizer. Cheers, Amador A. On Wed, Apr 24, 2013 at 9:04 AM, François Beausoleil franc...@teksol.infowrote: Hi! Le 2013-04-17 à 14:15, Jeff Janes a écrit

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread François Beausoleil
Le 2013-04-16 à 22:51, François Beausoleil a écrit : Hi all! I track Twitter followers in my database. I have the following table: # \d persona_followers Table public.persona_followers Column|Type | Modifiers

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Chris Curvey
INSERT INTO persona_followers(service_id, follower_id, valid_at) SELECT service_id, follower_id, NOW() FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import WHERE NOT EXISTS(SELECT * FROM persona_followers WHERE import.service_id = persona_followers.service_id AND

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Jeff Janes
On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil franc...@teksol.infowrote: Insert on public.persona_followers (cost=139261.12..20483497.65 rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1) Buffers: shared hit=33135295 read=4776921 - Subquery Scan on

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Moshe Jacobson
On Tue, Apr 16, 2013 at 10:51 PM, François Beausoleil franc...@teksol.infowrote: INSERT INTO persona_followers(service_id, follower_id, valid_at) SELECT service_id, follower_id, NOW() FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import WHERE NOT EXISTS(SELECT * FROM

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread François Beausoleil
Le 2013-04-17 à 14:15, Jeff Janes a écrit : On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil franc...@teksol.info wrote: Insert on public.persona_followers (cost=139261.12..20483497.65 rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1) Buffers:

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Jeff Janes
On Wed, Apr 17, 2013 at 1:19 PM, François Beausoleil franc...@teksol.infowrote: Le 2013-04-17 à 14:15, Jeff Janes a écrit : It looks like 12% of the time is being spent figuring out what rows to insert, and 88% actually doing the insertions. So I think that index maintenance is killing

[GENERAL] Most efficient way to insert without duplicates

2013-04-16 Thread François Beausoleil
Hi all! I track Twitter followers in my database. I have the following table: # \d persona_followers Table public.persona_followers Column|Type | Modifiers -+-+--- service_id | bigint