On Thursday 12 April 2007 18:18, Michael Hoffman wrote:
> Francesc Altet wrote:
> > On Monday 09 April 2007 15:57, Michael Hoffman wrote:
> >> As a followup to my previous message, I have realized that I am supposed
> >> to tune the lustre filesystem for large files. Hopefully that will solve
> >> my performance problems.
> >
> > Maybe. A good crosscheck would be to copy the file to a local filesystem
> > and test the performance. If you still see high latency, please explain
> > what hierarchy you have given your data and I'll try to provide you
> > with more feedback.
>
> Well, I tried that and it was still really slow. So I tried balancing
> the tree by creating groups named _00 through _ff, from the first octet
> of the MD5 digest of the dataset name. This afforded a considerable
> speedup in opening even on a remote filesystem:
>
> $ time python -c 'import tables; tables.openFile("original.h5")'
> Closing remaining opened files... original.h5... done.
>
> real 2m25.643s
> user 0m1.271s
> sys 0m1.379s
>
> $ time python -c 'import tables; tables.openFile("balanced.h5")'
> Closing remaining opened files... balanced.h5... done.
>
> real 0m2.186s
> user 0m0.158s
> sys 0m0.106s
>
> So perhaps sticking to <4096 nodes per group (or here, <256) is still a
> good idea. I'm thankful that I don't need to move to multiple files
> which would have been a real pain. It would be nice if this sort of
> thing were done automatically but that would probably be best handled
> upstream in HDF5.
I see. So, in the end, the PerformanceWarning that used to be issued some
time ago when too many nodes were put in a single group was not a bad idea...
In any case, could you describe in more detail what your tree structure is
in 'original.h5' and how you changed it for 'balanced.h5'? I'd like to
figure out what's going on there, so as to see whether it is worth bringing
the PerformanceWarning back.
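If I read the scheme correctly, the name of each bucket group would be
derived roughly as in the sketch below. This is only my guess at it: the
helper names bucket_group_name and balanced_path are hypothetical, and the
UTF-8 encoding of the dataset name and the lowercase hex formatting are
assumptions on my part, not necessarily what was actually done.

```python
import hashlib


def bucket_group_name(dataset_name):
    """Return a group name in the range _00 .. _ff.

    The name is built from the first octet of the MD5 digest of the
    dataset name (assumption: the name is encoded as UTF-8 and the octet
    is written as two lowercase hex digits).
    """
    digest = hashlib.md5(dataset_name.encode("utf-8")).digest()
    return "_%02x" % digest[0]


def balanced_path(dataset_name):
    """Return the node path of a dataset placed under its bucket group."""
    return "/%s/%s" % (bucket_group_name(dataset_name), dataset_name)


if __name__ == "__main__":
    # Hypothetical dataset names, only to show where each one would land.
    for name in ("dataset_a", "dataset_b", "dataset_c"):
        print(balanced_path(name))
```

With a layout like this the root group holds at most 256 children, and each
bucket holds about 1/256 of the datasets if the digests spread evenly.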
Cheers,
--
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"
_______________________________________________
Pytables-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/pytables-users