Hi,

since I am not producing the DB data via PDL (in fact I am not producing that data at all), it is not possible to store it as PDL binary data.

As for the performance:
- loading 3.4 million rows, 7 columns each, pdl type: double
- 41s** SQLite (from SSD disk)
- 34s Postgres 9.2 (at localhost, non-SSD disk)

**) the first run usually takes about twice as long; subsequent runs apparently benefit from caching or some other kung-fu I am not aware of

--
kmx

On 12.11.2014 15:14, Ingo Schmid wrote:

Hi,

if you can, I'd suggest storing the pdl as binary data in the database, for best performance.

DBI converts everything else into Perl strings, which you probably want to avoid. How well does your approach scale?
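
A minimal sketch of what storing a piddle as binary data could look like, assuming DBD::SQLite and an in-memory database. The table name pdl_store and its columns are hypothetical, chosen only for illustration. The raw bytes are in the machine's native byte order, so a blob written this way is only portable between compatible machines.

```perl
use strict;
use warnings;
use PDL;
use DBI qw(:sql_types);

# Assumption: in-memory SQLite database; "pdl_store" is a hypothetical table.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 });
$dbh->do("CREATE TABLE pdl_store (name TEXT, ncols INTEGER, nrows INTEGER, data BLOB)");

my $pdl = sequence(double, 7, 5);    # 7 columns x 5 rows of doubles

# get_dataref returns a reference to the piddle's raw data (native byte order)
my $bytes = ${ $pdl->get_dataref };

my $sth = $dbh->prepare("INSERT INTO pdl_store VALUES (?, ?, ?, ?)");
$sth->bind_param(1, "demo");
$sth->bind_param(2, $pdl->dim(0));
$sth->bind_param(3, $pdl->dim(1));
$sth->bind_param(4, $bytes, SQL_BLOB);    # bind as a blob, not as text
$sth->execute;

# Read the blob back and rebuild a piddle of the same type and shape
my ($ncols, $nrows, $blob) = $dbh->selectrow_array(
    "SELECT ncols, nrows, data FROM pdl_store WHERE name = ?", undef, "demo");
my $restored = zeroes(double, $ncols, $nrows);
${ $restored->get_dataref } = $blob;      # overwrite the raw data buffer
$restored->upd_data;                      # tell PDL the buffer has changed
```

This only helps when the writer of the data is also a PDL program; the type and dimensions have to be stored alongside the blob because the bytes alone do not carry them.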

I've been thinking about that problem for some time, though I never got beyond thinking, and I would have ended up with something very similar to your module.

Ingo

On 11/12/2014 01:43 PM, kmx wrote:

Hi,

I want to ask what others use when they need to load data from a database into a piddle.

Of course I know about the simple approach like this:

  use PDL;
  use DBI;
  my $dbh = DBI->connect($dsn);
  my $pdl = pdl($dbh->selectall_arrayref($sql_query));

But it does not scale well for very large data (millions of rows).
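
One possible way to reduce the memory pressure of the simple approach is to fetch rows in batches and copy each batch into a preallocated piddle, so that only one batch of Perl scalars exists at a time. The following is a sketch under stated assumptions, not necessarily what the module linked below does: it uses DBD::SQLite with an in-memory database, and the table name samples, its columns and the batch size are hypothetical.

```perl
use strict;
use warnings;
use PDL;
use DBI;

# Assumption: in-memory SQLite database; "samples" is a hypothetical table.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 });
$dbh->do("CREATE TABLE samples (a REAL, b REAL, c REAL)");
my $ins = $dbh->prepare("INSERT INTO samples VALUES (?, ?, ?)");
$ins->execute($_, $_ * 2, $_ * 3) for 1 .. 10;

my $chunk_size = 4;    # rows per batch; would be far larger for real data

# Count the rows first so the target piddle can be allocated once
my ($nrows) = $dbh->selectrow_array("SELECT COUNT(*) FROM samples");

my $sth = $dbh->prepare("SELECT a, b, c FROM samples ORDER BY a");
$sth->execute;
my $ncols = $sth->{NUM_OF_FIELDS};

my $pdl    = zeroes(double, $ncols, $nrows);
my $offset = 0;

# fetchall_arrayref with a row limit returns at most $chunk_size rows per call
while (my $rows = $sth->fetchall_arrayref(undef, $chunk_size)) {
    last unless @$rows;
    my $last = $offset + @$rows - 1;
    # copy this batch into the matching rows of the preallocated piddle
    $pdl->slice(":,$offset:$last") .= pdl($rows);
    $offset += @$rows;
}
```

The extra COUNT(*) query costs one more pass on the database side; if that is too expensive, the per-batch piddles could instead be collected and joined at the end.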

I have turned my solution into a "maybe module" that you can find here: https://gist.github.com/kmx/6f1234478828e7960fbd (see the POD documentation at the end).

Any suggestions, improvements or alternative solutions are welcome.

--
kmx


_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl


