I have tried pg_getcopydata; however, I was not able to make it faster than
my old approach. After many attempts it was still 15-20% slower.
My guess is that pg_getcopydata() might be significantly faster when
dumping a whole table (which I was not able to test, as the table in
question was too big). When dumping the result of an SQL query there seems
to be no advantage.
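For reference, a pg_getcopydata loop over a query result looks roughly like
this (a minimal sketch; $dsn and $sql_query are placeholders and all columns
are assumed to be numeric):

use strict;
use warnings;
use DBI;
use PDL;

my $dsn       = 'dbi:Pg:dbname=mydb';           # placeholder
my $sql_query = 'SELECT x, y, z FROM mytable';  # placeholder, numeric columns

my $dbh = DBI->connect($dsn, undef, undef, { RaiseError => 1 });

# COPY accepts a parenthesized query as well as a plain table name,
# so a query result can be streamed the same way as a whole table.
$dbh->do("COPY ($sql_query) TO STDOUT");

my (@rows, $line);
while ($dbh->pg_getcopydata($line) >= 0) {
  chomp $line;                                  # one tab-separated row per line
  push @rows, [ split /\t/, $line ];
}

my $pdl = pdl(\@rows);

The resulting piddle has the same shape as the selectall_arrayref() version
below, so the two approaches can be benchmarked against each other directly.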
I have also slightly updated my "maybe module" at
https://gist.github.com/kmx/6f1234478828e7960fbd
--
kmx
On 12.11.2014 23:54, kmx wrote:
Thanks; pg_getcopydata sounds very promising.
I'll try to implement an alternative solution based on pg_getcopydata and
compare it with my current approach.
--
kmx
On 12.11.2014 16:48, Vikas N Kumar wrote:
On 11/12/2014 07:43 AM, kmx wrote:
my $dbh = DBI->connect($dsn);
my $pdl = pdl($dbh->selectall_arrayref($sql_query));
But it does not scale well for very large data sets (millions of rows).
Hi KMX
If you're using PostgreSQL, you should use DBD::Pg's pg_getcopydata method
together with the "COPY mytable TO STDOUT" functionality for accessing millions
of rows. You can do this in async or sync mode. This will get you there
faster than selectall_arrayref(), and it lets you get the rows
without having to redesign your DB.
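A minimal sketch of the synchronous loop (mytable and process() are
placeholders):

$dbh->do('COPY mytable TO STDOUT');
my $line;
while ($dbh->pg_getcopydata($line) >= 0) {
  # $line holds one raw row: tab-separated values with a trailing newline
  process($line);   # placeholder for whatever you do with each row
}

The non-blocking variant is pg_getcopydata_async, which returns 0 instead of
blocking when no row is ready yet.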
SQLite has a stream API but I am not familiar with it.
--Vikas
_______________________________________________
Perldl mailing list
Perldl@jach.hawaii.edu
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl