Hi,
as I am not producing the DB data via PDL (in fact I am not producing that
data at all), it is not possible to store it as PDL binary data.
As for the performance:
- loading 3.4 million rows, 7 columns each, pdl type: double
- 41s** SQLite (from an SSD)
- 34s Postgres 9.2 (on localhost, non-SSD disk)
**) the first run usually takes about twice as long; subsequent runs seem to
benefit from caching or some other kung-fu I am not aware of
--
kmx
On 12.11.2014 15:14, Ingo Schmid wrote:
Hi,
if you can, I'd suggest storing the pdl as binary data in the
database, for best performance.
DBI converts everything else into perl strings, which you probably want
to avoid. How well does your approach scale?
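
To illustrate the binary-storage idea above, here is a minimal sketch. It
assumes a table named pdl_store with name, dims and data (BLOB) columns; that
table and both helper names are made up for this example. The bytes are stored
in the machine's native byte order, so this only round-trips between machines
with the same byte order.

```perl
use strict;
use warnings;
use PDL;
use DBI qw(:sql_types);

# Hypothetical helper: store a piddle's raw double data in a BLOB column.
# The table "pdl_store" (name, dims, data) is an assumption of this sketch.
sub store_pdl_blob {
    my ($dbh, $name, $pdl) = @_;
    my $phys  = $pdl->double->copy;        # private, contiguous copy as double
    my $bytes = ${ $phys->get_dataref };   # raw bytes, native byte order
    my $sth = $dbh->prepare(
        "INSERT INTO pdl_store (name, dims, data) VALUES (?, ?, ?)");
    $sth->bind_param(1, $name);
    $sth->bind_param(2, join(",", $phys->dims));
    $sth->bind_param(3, $bytes, SQL_BLOB);  # bind as binary, not as text
    $sth->execute;
    return;
}

# Hypothetical helper: rebuild the piddle from the stored dims and bytes.
sub fetch_pdl_blob {
    my ($dbh, $name) = @_;
    my ($dims, $bytes) = $dbh->selectrow_array(
        "SELECT dims, data FROM pdl_store WHERE name = ?", undef, $name);
    return unless defined $bytes;
    my $pdl = zeroes(double(), split(/,/, $dims));
    ${ $pdl->get_dataref } = $bytes;        # overwrite the data buffer
    $pdl->upd_data;                         # tell PDL the buffer has changed
    return $pdl;
}
```

This avoids the per-value string conversion, but it only helps when the data
is written by PDL in the first place.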
I've been thinking about that problem - but no more - for some time, and
would have ended up with something very similar to your module.
Ingo
On 11/12/2014 01:43 PM, kmx wrote:
Hi,
I want to ask what others use when they need to load data from a database
into a piddle.
Of course I know about the simple approach like this:
use PDL;
use DBI;
my $dbh = DBI->connect($dsn);
my $pdl = pdl($dbh->selectall_arrayref($sql_query));
But it does not scale well for very large data sets (millions of rows).
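
The one-shot approach builds a Perl array of arrays holding every row before
pdl() ever sees it. One possible way to reduce that overhead, sketched below,
is to fetch the rows in batches with DBI's fetchall_arrayref(undef, $max_rows)
and convert each batch to a piddle straight away. The helper name and the
default batch size are made up for this example, and this is not necessarily
what the module in the gist does.

```perl
use strict;
use warnings;
use PDL;
use DBI;

# Hypothetical helper: load a query result into a 2D piddle of doubles,
# holding at most $chunk_rows rows as Perl arrays at any one time.
sub load_pdl_chunked {
    my ($dbh, $sql, $chunk_rows) = @_;
    $chunk_rows ||= 50_000;                 # arbitrary default batch size

    my $sth = $dbh->prepare($sql);
    $sth->execute;

    my @chunks;
    # Each call returns up to $chunk_rows rows as an array of arrays.
    while (my $rows = $sth->fetchall_arrayref(undef, $chunk_rows)) {
        last unless @$rows;
        push @chunks, double($rows);        # dims: (columns, rows in batch)
        last if @$rows < $chunk_rows;       # short batch: result exhausted
    }
    return null() unless @chunks;           # empty result set

    # Join the batches along the row dimension (dim 1).
    my $first = shift @chunks;
    return @chunks ? $first->glue(1, @chunks) : $first;
}
```

Usage would be something like my $pdl = load_pdl_chunked($dbh, $sql_query, 100_000);
the result has dims (columns, rows), the same layout pdl() gives for an array
of rows.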
I have turned my solution into a "maybe module" that you can find here
https://gist.github.com/kmx/6f1234478828e7960fbd (see pod doc at the end)
Any suggestions, improvements or alternative solutions are welcome.
--
kmx
_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl