On Jun 26, 6:47 am, Mag Gam wrote:
> Thank you everyone for the responses! I took some of your suggestions
> and my loading sped up by 25%.
What a useless post...

Mag Gam wrote:
> Sorry for the delayed response. I was trying to figure this problem
> out. The OS is Linux, BTW
Maybe I'm just being pedantic, but saying your OS is Linux means little,
as there are hundreds of variants (distros) of Linux. (Not to mention
that Linux is a kernel, not a full-blown OS.)

Mag> s = 0
Mag> # Takes the longest here
Mag> for y in fs:
Mag>     a = y.split(',')
Mag>     s = s + 1
Mag>     dset.resize(s, axis=0)
Mag> fs.close()
Mag> f.close()
Mag> This works but just takes a VERY long time.
Mag> Any way to optimize this?
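One common way to speed up a loop like the one quoted is to grow the
dataset in blocks instead of once per row, since every dset.resize()
call updates HDF5 metadata. A minimal sketch, assuming a gzipped CSV of
float columns (the file name, column count, and block size here are
illustrative, not from the thread):

import gzip

import h5py
import numpy as np

BLOCK = 10000  # rows per resize; the right value needs profiling

f = h5py.File("out.h5", "w")
# resizable dataset: start empty, unlimited rows, assumed 4 columns
dset = f.create_dataset("data", shape=(0, 4), maxshape=(None, 4),
                        dtype="f8", chunks=True)

fs = gzip.open("input.csv.gz", "rt")  # hypothetical input path
buf = []
rows = 0
for line in fs:
    buf.append([float(v) for v in line.split(",")])
    if len(buf) == BLOCK:
        dset.resize(rows + BLOCK, axis=0)  # one resize per block
        dset[rows:rows + BLOCK] = np.asarray(buf)
        rows += BLOCK
        buf = []
if buf:  # flush the partial final block
    dset.resize(rows + len(buf), axis=0)
    dset[rows:] = np.asarray(buf)
fs.close()
f.close()

This cuts the resize calls from one per row to one per ten thousand
rows, and it also turns millions of tiny writes into a few large ones.
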
Sorry for the delayed response. I was trying to figure this problem
out. The OS is Linux, BTW.
Here is some code I have:
import numpy as np
import gzip
import h5py
import re
import sys, string, time, getopt
import os
src = sys.argv[1]
fs = gzip.open(src)
x = src.split("/")
filena

Terry Reedy wrote:
> Mag Gam wrote:
>> Yes, the system has 64Gig of physical memory.
> drool ;-).

Well, except that, depending on what OS he's using, the size of one
process may well still be limited to 2GB...

Chris

Mag,

If your source data is clean, it may also be faster for you to parse
your input files directly instead of using the csv module, which may
add some overhead.

Check out the struct module and/or use the split() method of strings.
We do a lot of ETL processing with flat files and on a slow single core
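
As a rough sketch of that suggestion, here is split()-based parsing
next to the csv-module equivalent (the sample line and the float
conversion are assumptions, not from the thread):

import csv

line = "1.5,2.5,3.5,4.5\n"

# csv module: handles quoting and escaping, at a small per-row cost
row_csv = next(csv.reader([line]))

# plain split: fine when the data is known to be clean (no quoted commas)
row_split = line.rstrip("\n").split(",")

assert row_csv == row_split
values = [float(v) for v in row_split]

struct is the analogous tool for fixed-width binary records; for
comma-delimited text, split() is usually the simpler win.
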
Mag Gam wrote:
> Yes, the system has 64Gig of physical memory.

drool ;-).

> What I meant was, is it possible to load to an HDF5 data format
> (basically a NumPy array) without reading the entire file at first? I
> would like to splay to disk beforehand so it would be a bit faster
> instead of having 2 copies in memory.

Yes, the system has 64Gig of physical memory.

What I meant was, is it possible to load to an HDF5 data format
(basically a NumPy array) without reading the entire file at first? I
would like to splay to disk beforehand so it would be a bit faster
instead of having 2 copies in memory.
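
One way to read "splay to disk beforehand" is: size the dataset on disk
first, then stream rows into it, so the parsed data never exists as a
second full copy in memory. A sketch under assumptions (two passes over
the gzipped file, a fixed number of float columns; SRC and NCOLS are
made up for illustration):

import gzip

import h5py
import numpy as np

SRC = "input.csv.gz"   # hypothetical path
NCOLS = 4              # assumed column count

# pass 1: count rows so the dataset can be created at full size on disk
with gzip.open(SRC, "rt") as fs:
    nrows = sum(1 for _ in fs)

with h5py.File("out.h5", "w") as f:
    dset = f.create_dataset("data", shape=(nrows, NCOLS), dtype="f8")
    # pass 2: stream rows straight into the on-disk dataset; only one
    # line is in memory at a time (batching rows would be faster still)
    with gzip.open(SRC, "rt") as fs:
        for i, line in enumerate(fs):
            dset[i] = np.array(line.split(","), dtype="f8")

The second pass pays the gzip decompression cost again, which is the
trade against a resizable dataset that needs no row count up front.
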
Do you even HAVE 14 gigs of memory? I can imagine that if the OS needs to
start writing to the page file, things are going to slow down.
On Mon, 22 Jun 2009 23:17:22 -0400, Mag Gam wrote:
> Hello All,
>
> I have a very large 14 GB CSV file and I am planning to move all of my
> data to hdf5.
[...]
> I was wondering if anyone knows of any techniques to load this file
> faster?
Faster than what? What are you using to load the file?

Hello All,

I have a very large 14 GB CSV file and I am planning to move all of my
data to hdf5. I am using h5py to load the data. The biggest problem I
am having is that I am putting the entire file into memory and then
creating a dataset from it. This is very inefficient and it takes over
4 hours to create.
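
The approach being described probably looks something like this sketch
(the file name, dtype, and gzip compression are assumptions): the whole
parsed file sits in RAM before anything reaches HDF5, which is what the
earlier replies try to avoid.

import gzip

import h5py
import numpy as np

# everything is parsed into one giant in-memory array first ...
with gzip.open("input.csv.gz", "rt") as fs:  # hypothetical path
    data = np.array([line.split(",") for line in fs], dtype="f8")

# ... and only then written out, so peak memory is the whole parsed file
with h5py.File("out.h5", "w") as f:
    f.create_dataset("data", data=data)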