Well, I'll try that, but honestly I don't think it's the rule, as the rule is really simple: it's just one simple integer comparison...
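For reference, the setup being described is one master table, around 20 child tables, and a conditional rule whose predicate is a single integer comparison. A sketch of roughly that shape, with all table, column, and rule names guessed from the queries quoted later in this thread (not the actual schema), would be:

```sql
-- Illustrative sketch only: a master table with per-chromosome children,
-- where a conditional rule redirects inserts based on one integer comparison.
CREATE TABLE public.master (
    pid  integer,
    smid integer,
    val1 text,
    val2 text,
    chr  int2
);

CREATE TABLE public.master_chr1 () INHERITS (public.master);

CREATE RULE master_insert_chr1 AS
ON INSERT TO public.master
WHERE NEW.chr = 1              -- the "one simple integer comparison"
DO INSTEAD
    INSERT INTO public.master_chr1 VALUES (NEW.*);
```

Note that even when each predicate is trivial, the rule system rewrites the whole INSERT ... SELECT statement once per matching rule, so with ~20 conditional rules the source table can end up being scanned ~20 times. If that is what is happening here, the cost is in the repeated SELECT, not in the comparison itself.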
Q <[EMAIL PROTECTED]> wrote on 07/19/06 11:54 am:

> On 19/07/2006, at 6:32 PM, Christian Rengstl wrote:
>
>> The analyze is from the exact query, and I dropped the indexes
>> before the insert as well, without improvement. The target table is
>> also completely empty, and the insert is supposed to write, in
>> this case, more or less 8 million lines into the table. There is a
>> rule though, because I have an inheritance table structure with one
>> master table and around 20 child tables.
>
> I would say the problem is in the rule. Try doing the insert into a
> duplicate table with no rules or inheritance and see how long it takes.
>
> Perhaps you should provide the actual schema of the tables and rules that
> are involved in the query in question.
>
>> Q <[EMAIL PROTECTED]> wrote on 07/19/06 4:37 am:
>>>
>>> On 19/07/2006, at 4:24 AM, Christian Rengstl wrote:
>>>
>>>> Now, finally, after a long time I have the query plan for the whole
>>>> filled table. I hope somebody can explain to me why it takes so much
>>>> longer...
>>>
>>> These explain analyze results don't appear to be from the queries you
>>> posted previously. For these results to mean anything you need to
>>> include the EXACT queries you used to generate them.
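(For the archives: the duplicate-table test suggested above could be run along these lines, using the poster's own insert statement. `master_plain` is a made-up name, and the column list is taken from the queries quoted below:)

```sql
-- Create a structural copy of the target table: same columns,
-- but no rules, no inheritance, no indexes.
CREATE TABLE master_plain (LIKE public.master);

-- In psql, time the same statement against the plain copy:
\timing on
INSERT INTO master_plain (pid, smid, val1, val2, chr)
SELECT pid, smid, val1, val12,
       CAST(split_part((SELECT chr FROM public.temp_table LIMIT 1), '_', 2) AS int2)
FROM public.temp_table;
```

If this finishes in minutes while the ruled table takes hours, the rules (or the per-rule re-execution of the SELECT) are the likely culprit.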
>>>
>>>> Result  (cost=0.06..254643.06 rows=8044000 width=39) (actual time=0.056..655772.273 rows=8044000 loops=1)
>>>>   One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 1)
>>>>   InitPlan
>>>>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.003..0.003 rows=1 loops=1)
>>>>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=10) (actual time=0.002..0.002 rows=1 loops=1)
>>>>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.007 rows=1 loops=1)
>>>>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=10) (actual time=0.004..0.004 rows=1 loops=1)
>>>>   ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (actual time=0.002..191672.344 rows=8044000 loops=1)
>>>> Total runtime: 62259544.896 ms
>>>
>>> This is the query you want to be interested in; the others took no
>>> time at all.
>>>
>>> As a guess I would say the query is an INSERT INTO ... FROM
>>> SELECT ... WHERE (split_part(???, '_', 2))::smallint = 1 statement.
>>> The majority of the time appears to be taken up by the actual INSERT
>>> and not the SELECT part.
>>>
>>> How many rows are in the target table, and what indexes does it have?
>>> Does it have any triggers, check constraints, or rules applied to it?
>>> All these things can make the insert take longer as the number of
>>> rows already in the table increases.
>>>
>>> More than likely you have a target table with a LOT of rows and a
>>> bunch of indexes on it, and your disks are being thrashed because the
>>> indexes are not able to stay cached in RAM. At this point you should
>>> ensure your machine is not swapping to disk, and at the very least
>>> you should go through one of the many tuning guidelines available and
>>> ensure you have allocated the appropriate amount of memory to
>>> PostgreSQL for your needs.
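(The memory settings such tuning guidelines usually mean live in postgresql.conf. As an illustration only, using 8.1-era units, with placeholder values that would need to be adjusted to the machine's actual RAM:)

```
# postgresql.conf excerpt -- example values only, not recommendations
shared_buffers = 50000          # shared buffer cache, in 8 kB pages (~400 MB)
work_mem = 16384                # per-sort/per-hash memory, in kB
maintenance_work_mem = 262144   # speeds up CREATE INDEX after bulk loads, in kB
checkpoint_segments = 32        # fewer, larger checkpoints during heavy writes
```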
>>>
>>> You may also want to consider dropping the indexes before you do the
>>> INSERT and recreating them afterwards.
>>>
>>>>>>> "Christian Rengstl" <[EMAIL PROTECTED]> 13.07.06 8.37 Uhr:
>>>
>>>> Good morning list,
>>>>
>>>> the following query takes about 15 to 20 minutes for around 2
>>>> million lines in the file myfile.txt, but with 8 million lines it
>>>> takes around 5 hours, and I just don't understand why there is such
>>>> a huge discrepancy in performance.
>>>>
>>>> COPY public.temp_table FROM 'myfile.txt' WITH DELIMITER '\t';
>>>>
>>>> INSERT INTO public.master(pid, smid, val1, val2, chr)
>>>>     SELECT pid, smid, val1, val12,
>>>>            CAST(split_part((SELECT chr FROM public.temp_table LIMIT 1), '_', 2) AS int2)
>>>>     FROM public.temp_table;
>>>>
>>>> INSERT INTO public.values(smid, pos, chr)
>>>>     SELECT DISTINCT smid, pos,
>>>>            CAST(split_part((SELECT chr FROM public.temp_table LIMIT 1), '_', 2) AS int2)
>>>>     FROM public.temp_table;
>>>>
>>>> I came up with this query because I wanted to use the COPY command
>>>> to load huge files into the db, but I don't want to load all the
>>>> columns contained in the file into only one table; I want to copy some of
>>>> them into one table and some into a second table. As I found out with
>>>> logging, the data is loaded into temp_table within 15 minutes, but
>>>> to transfer it from the temp_table takes again only something like 10
>>>> minutes. Can it be that the cast takes up so much more time than
>>>> when reading and transferring 2 million lines?
>>>
>>> --
>>> Seeya...Q
>>>
>>> Quinton Dolan - [EMAIL PROTECTED]
>>> Gold Coast, QLD, Australia
>>> Ph: +61 419 729 806
>>>
>>> ---------------------------(end of broadcast)---------------------------
>>> TIP 2: Don't 'kill -9' the postmaster
>>
>> --
>> Christian Rengstl M.A.
>> Klinik und Poliklinik für Innere Medizin II
>> Kardiologie - Forschung
>> Universitätsklinikum Regensburg
>> B3 1.388
>> Franz-Josef-Strauss-Allee 11
>> 93053 Regensburg
>> Tel.: +49-941-944-7230
>
> --
> Seeya...Q

--
Christian Rengstl M.A.
Klinik und Poliklinik für Innere Medizin II
Kardiologie - Forschung
Universitätsklinikum Regensburg
B3 1.388
Franz-Josef-Strauss-Allee 11
93053 Regensburg
Tel.: +49-941-944-7230

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not match