Re: [GENERAL] huge price database question..

2012-03-21 Thread Lee Hachadoorian
On Wed, Mar 21, 2012 at 12:57 PM, Andy Colson wrote: > On 3/21/2012 11:45 AM, Lee Hachadoorian wrote: >> On 20 March 2012 22:57, John R Pierce wrote: >>> avg() in the database is going to be a lot faster than copying the data into

Re: [GENERAL] huge price database question..

2012-03-21 Thread Steve Crawford
On 03/21/2012 09:34 AM, Jim Green wrote: On 21 March 2012 11:01, Steve Crawford wrote: Something sounds wrong, here. XID wraps around somewhere around 4-billion transactions which is a substantial multiple of the entire number of records you are trying to insert. Do you have any unusual vacuum

Re: [GENERAL] huge price database question..

2012-03-21 Thread Andy Colson
On 3/21/2012 11:45 AM, Lee Hachadoorian wrote: On Tue, Mar 20, 2012 at 11:28 PM, Jim Green wrote: On 20 March 2012 22:57, John R Pierce wrote: > avg() in the database is going to be a lot faster than copying the

Re: [GENERAL] huge price database question..

2012-03-21 Thread Adrian Klaver
On 03/21/2012 09:34 AM, Jim Green wrote: On 21 March 2012 11:01, Steve Crawford wrote: Something sounds wrong, here. XID wraps around somewhere around 4-billion transactions which is a substantial multiple of the entire number of records you are trying to insert. Do you have any unusual vacuum

Re: [GENERAL] huge price database question..

2012-03-21 Thread Lee Hachadoorian
On Tue, Mar 20, 2012 at 11:28 PM, Jim Green wrote: > On 20 March 2012 22:57, John R Pierce wrote: >> avg() in the database is going to be a lot faster than copying the data into memory for an application to process. > I see.. As an example, I ran average on a 700,000 row table with
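A minimal sketch of the server-side aggregation being discussed, assuming a single prices table with the columns mentioned in the thread (symbol, ts, open, high, low, close, volume); the names are illustrative, not taken from the posts:

  SELECT symbol, avg(close) AS avg_close
  FROM prices
  WHERE ts >= '2012-03-01' AND ts < '2012-04-01'
  GROUP BY symbol;

Only the aggregated results cross the wire, rather than every underlying row being copied into R or Perl first.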

Re: [GENERAL] huge price database question..

2012-03-21 Thread Jim Green
On 21 March 2012 11:01, Steve Crawford wrote: > Something sounds wrong, here. XID wraps around somewhere around 4-billion > transactions which is a substantial multiple of the entire number of records > you are trying to insert. Do you have any unusual vacuum settings? I have autovacuum=off in my
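For reference, a hedged sketch of how the vacuum settings in question could be inspected, and how autovacuum can be switched off for one table during a bulk load rather than globally (the prices table name is an assumption):

  SHOW autovacuum;
  SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';

  -- disable autovacuum for this table during the initial load, re-enable afterwards
  ALTER TABLE prices SET (autovacuum_enabled = false);
  -- ... bulk load ...
  ALTER TABLE prices SET (autovacuum_enabled = true);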

Re: [GENERAL] huge price database question..

2012-03-21 Thread Steve Crawford
On 03/20/2012 06:16 PM, Jim Green wrote: It looks like the alternatives are kind of complex to me; right now my approach (Perl DBI and prepared inserts) would take about 8-9 minutes to insert a day's data. I think I'll probably just stick with it and wait. The autovacuum process does a lot of IO and

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 23:01, Andy Colson wrote: > Of course, there are probably other usage patterns I'm not aware of.  And I > also am assuming some things based on what I've heard -- not from actual > experience. I am not an expert in SQL, so what I get out of postgresql is probably mostly SELECT, but as

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 22:57, John R Pierce wrote: > avg() in the database is going to be a lot faster than copying the data into > memory for an application to process. I see.. > > Also, you know there's a plR for postgres that lets you embed R functions in > the database server and invoke them in S

Re: [GENERAL] huge price database question..

2012-03-20 Thread David Kerr
On 03/20/2012 07:26 PM, Jim Green wrote: On 20 March 2012 22:21, David Kerr wrote: I'm imagining that you're loading the raw file into a temporary table that you're going to use to process / slice new data into your 7000+ actual tables per stock. Thanks! would "slice new data into

Re: [GENERAL] huge price database question..

2012-03-20 Thread Andy Colson
On 03/20/2012 08:54 PM, Brent Wood wrote: Also look at a clustered index on timestamp Brent Wood GIS/DBA consultant NIWA +64 (4) 4 386-0300 A clustered index is only "clustered" at the point in time you run the command. It won't remain that way, and with a really big table, you don't wanna
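In PostgreSQL terms, a "clustered index on timestamp" would look roughly like this (names are made up); as Andy points out, CLUSTER physically reorders the table once and is not maintained on later inserts, so it would have to be re-run periodically:

  CREATE INDEX prices_ts_idx ON prices (ts);
  CLUSTER prices USING prices_ts_idx;  -- one-time physical reorder, takes an exclusive lock
  ANALYZE prices;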

Re: [GENERAL] huge price database question..

2012-03-20 Thread Brent Wood
On 20 March 2012 21:40, David Kerr wrote: > On 03/20/2012 04:27 PM, Jim Green wrote: > > Greetings list! > I am pretty new to postgresql from mysql and did a fair

Re: [GENERAL] huge price database question..

2012-03-20 Thread Andy Colson
On 03/20/2012 09:49 PM, Jim Green wrote: On 20 March 2012 22:43, Andy Colson wrote: Do you ever plan on batch deleting a BUNCH of records? No, after historical data is populated, I'll only add data daily. No deletes.. Do you ever want to read all of one symbol (like, select avg(high) fro

Re: [GENERAL] huge price database question..

2012-03-20 Thread John R Pierce
On 03/20/12 7:49 PM, Jim Green wrote: yes it's possible but I would more likely grab the data into R and get the avg in R.. avg() in the database is going to be a lot faster than copying the data into memory for an application to process. Also, you know there's a plR for postgres that lets you
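A rough sketch of the PL/R idea, assuming the plr extension is installed and a hypothetical prices table; the function body is R executed inside the server (see the PL/R docs for the exact argument-passing rules):

  CREATE OR REPLACE FUNCTION r_median(float8[]) RETURNS float8 AS $$
    median(arg1)
  $$ LANGUAGE plr;

  SELECT symbol, r_median(array_agg(close::float8)) AS median_close
  FROM prices
  GROUP BY symbol;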

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 22:43, Andy Colson wrote: > Here is some copy/pasted parts: > > my @list = glob('*.gz'); > for my $fname (@list) > { >        $db->do('copy access from stdin'); >        open my $fh, "-|", "/usr/bin/zcat $fname" or die "$fname: $!"; >        while (<$fh>) >        { >            

Re: [GENERAL] huge price database question..

2012-03-20 Thread Andy Colson
On 03/20/2012 09:35 PM, Jim Green wrote: On 20 March 2012 22:25, Andy Colson wrote: I think the decisions: 1) one big table 2) one big partitioned table 3) many little tables would probably depend on how you want to read the data. Writing would be very similar. I tried to read through the t

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 22:25, Andy Colson wrote: > I think the decisions: > > 1) one big table > 2) one big partitioned table > 3) many little tables > > would probably depend on how you want to read the data.  Writing would be > very similar. > > I tried to read through the thread but didn't see how you

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 22:22, John R Pierce wrote: > your 7000 tables all contain the exact same information, with the only > difference being the stock ticker symbol, right? Then really, the single > table, perhaps partitioned by month or whatever, is the right way to go. > Any schema that makes yo
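A hedged sketch of the partitioned-by-month shape, using the inheritance-based partitioning PostgreSQL offered at the time (declarative partitioning came much later); it assumes a parent prices table with the columns from the original post:

  CREATE TABLE prices_2012_03 (
      CHECK (ts >= '2012-03-01' AND ts < '2012-04-01')
  ) INHERITS (prices);

  CREATE INDEX prices_2012_03_symbol_ts_idx ON prices_2012_03 (symbol, ts);

With constraint_exclusion enabled, queries that filter on the timestamp only scan the matching monthly children, so no single index has to cover the whole history.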

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 22:21, David Kerr wrote: > I'm imagining that you're loading the raw file into a temporary table that > you're going to use to > process / slice new data into your 7000+ actual tables per stock. Thanks! would "slice new data into your 7000+ actual tables per stock." be
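One possible reading of the staging-table approach, sketched with made-up table and file names: load the raw daily file into a throwaway table once, then route ("slice") rows into the per-symbol or per-partition tables from there:

  CREATE TEMP TABLE staging (LIKE prices);

  COPY staging FROM '/path/to/daily.csv' WITH CSV;

  -- route one symbol's rows into its own table
  INSERT INTO prices_aapl
  SELECT * FROM staging WHERE symbol = 'AAPL';

(COPY FROM a file runs on the server; from a client script the equivalent is psql's \copy or COPY FROM STDIN.)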

Re: [GENERAL] huge price database question..

2012-03-20 Thread Andy Colson
On 03/20/2012 09:12 PM, Jim Green wrote: On 20 March 2012 22:08, Jim Green wrote: On 20 March 2012 22:03, David Kerr wrote: \copy on 1.2million rows should only take a minute or two, you could make that table "unlogged" as well to speed it up more. If you could truncate / drop / create / lo

Re: [GENERAL] huge price database question..

2012-03-20 Thread John R Pierce
On 03/20/12 7:12 PM, Jim Green wrote: Also if I use copy, I would be tempted to go the one table route, or else I need to parse my raw daily file, separate it into individual symbol files, and copy to an individual table for each symbol (this sounds not very efficient).. your 7000 tables all contain

Re: [GENERAL] huge price database question..

2012-03-20 Thread David Kerr
On 03/20/2012 07:08 PM, Jim Green wrote: On 20 March 2012 22:03, David Kerr wrote: \copy on 1.2million rows should only take a minute or two, you could make that table "unlogged" as well to speed it up more. If you could truncate / drop / create / load / then index the table each time then yo
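Sketched out, that workflow might look like the psql session below; the table, index and file names are assumptions. An unlogged table skips WAL, and building the index after the load is cheaper than maintaining it row by row:

  CREATE UNLOGGED TABLE daily_load (LIKE prices);

  TRUNCATE daily_load;
  \copy daily_load FROM 'daily_prices.csv' WITH CSV
  CREATE INDEX daily_load_symbol_ts_idx ON daily_load (symbol, ts);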

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 22:08, Jim Green wrote: > On 20 March 2012 22:03, David Kerr wrote: > >> \copy on 1.2million rows should only take a minute or two, you could make >> that table "unlogged" >> as well to speed it up more.  If you could truncate / drop / create / load / >> then index the table each

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 22:03, David Kerr wrote: > \copy on 1.2million rows should only take a minute or two, you could make > that table "unlogged" > as well to speed it up more.  If you could truncate / drop / create / load / > then index the table each > time then you'll get the best throughput. Tha

Re: [GENERAL] huge price database question..

2012-03-20 Thread David Kerr
On 03/20/2012 06:50 PM, Jim Green wrote: On 20 March 2012 21:40, David Kerr wrote: On 03/20/2012 04:27 PM, Jim Green wrote: Greetings list! I am pretty new to postgresql from mysql and did a fairly extensive search of the list and came up with a few good ones but didn't find the exact same sit

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 21:54, Brent Wood wrote: > > Also look at a clustered index on timestamp Thanks, this looks very helpful. What do you think about the thousands-of-tables approach vs one table partitioned by month? I guess if I go with one table, the index would be too big to fit in RAM? Jim.

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 21:40, David Kerr wrote: > On 03/20/2012 04:27 PM, Jim Green wrote: > > Greetings list! > I am pretty new to postgresql from mysql and did a fairly extensive > search of the list and came up with a few good ones but didn't find > the exact same situation as I have now. so I am ven

Re: [GENERAL] huge price database question..

2012-03-20 Thread David Kerr
On 03/20/2012 04:27 PM, Jim Green wrote: Greetings list! I am pretty new to postgresql from mysql and did a fairly extensive search of the list and came up with a few good ones but didn't find the exact same situation as I have now, so I am venturing to ask here. I have daily minute stock price

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 20:19, Steve Crawford wrote: > On 03/20/2012 04:27 PM, Jim Green wrote: >> >> Greetings list! >> I am pretty new to postgresql from mysql > > Welcome. > >> I have daily minute stock price data from 2005 on and each day with >> columns timestamp, open,high,low,close,volume and a fe

Re: [GENERAL] huge price database question..

2012-03-20 Thread Jim Green
On 20 March 2012 19:45, Michael Nolan wrote: >> right now I have about 7000 tables for individual stocks and I use >> perl to do inserts, it's very slow. I would like to use copy or other >> bulk loading tool to load the daily raw gz data. but I need to split >> the file into per-stock fil

Re: [GENERAL] huge price database question..

2012-03-20 Thread Steve Crawford
On 03/20/2012 04:27 PM, Jim Green wrote: Greetings list! I am pretty new to postgresql from mysql Welcome. I have daily minute stock price data from 2005 on and each day with columns timestamp, open, high, low, close, volume and a few more. Each day's data is about 1.2 million rows. I want to import al

Re: [GENERAL] huge price database question..

2012-03-20 Thread Michael Nolan
> right now I have about 7000 tables for individual stocks and I use > perl to do inserts, it's very slow. I would like to use copy or other > bulk loading tool to load the daily raw gz data. but I need to split > the file into per-stock files first before I do bulk loading. I consider > this

[GENERAL] huge price database question..

2012-03-20 Thread Jim Green
Greetings list! I am pretty new to postgresql from mysql and did a fairly extensive search of the list and came up with a few good ones but didn't find the exact same situation as I have now, so I am venturing to ask here. I have daily minute stock price data from 2005 on and each day with columns
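For scale: roughly 1.2 million rows per trading day times ~250 trading days a year from 2005 onward is on the order of 2 billion rows. One possible single-table layout for data like this, purely as an illustration (none of these names come from the post):

  CREATE TABLE prices (
      symbol text        NOT NULL,
      ts     timestamptz NOT NULL,
      open   numeric,
      high   numeric,
      low    numeric,
      close  numeric,
      volume bigint
  );

  CREATE INDEX prices_symbol_ts_idx ON prices (symbol, ts);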