Re: [Hdf-forum] ANN: HDF5 for Python 2.4.0 BETA

John Readey Tue, 11 Nov 2014 13:29:11 -0800

Hey Ray,

For this first release, the focus will be mostly on the API definition rather
than performance.  For example, data is being sent as JSON-formatted text.  I
don't think it should be an issue to support Base64 encoding for data
reads/writes in a future release.  The client can specify the desired format in
the Content-Type HTTP header.
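
To give a rough feel for the difference between the two encodings, here is a
minimal sketch using only the Python standard library. The sample values are
made up for illustration, and this is not the server's actual wire format; it
only compares decimal JSON text against Base64-encoded raw doubles.

```python
import base64
import json
import struct

# Hypothetical sample data: 1000 float64 values.
values = [i / 7.0 for i in range(1000)]

# JSON: every number is rendered as decimal text.
json_payload = json.dumps(values).encode("utf-8")

# Base64: pack raw 8-byte little-endian doubles, then encode them as ASCII.
fmt = "<%dd" % len(values)
raw = struct.pack(fmt, *values)
b64_payload = base64.b64encode(raw)

# Decoding the Base64 payload recovers the values bit for bit.
decoded = list(struct.unpack(fmt, base64.b64decode(b64_payload)))

print("JSON bytes:  ", len(json_payload))
print("Base64 bytes:", len(b64_payload))
```

Base64 adds a fixed overhead of about one third over the raw bytes, while the
size of the JSON text depends on how many digits each value needs.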


Similarly, I'm not doing anything special for reader/writer concurrency: the
server serializes all requests.  That is clearly not suitable for a production
service that will see a lot of traffic.
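
One common way to serialize requests is to guard every operation with a single
lock. The sketch below shows that general technique only; it is an assumption,
not the server's code, and `SerializedStore` is a hypothetical in-memory
stand-in for the underlying file.

```python
import threading


class SerializedStore:
    """Toy stand-in for a file: one lock guards every request."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def read(self, key):
        # Readers take the same lock as writers, so nothing runs concurrently.
        with self._lock:
            return self._data.get(key)

    def increment(self, key):
        # The read-modify-write happens entirely under the lock.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1


def worker(store, count):
    for _ in range(count):
        store.increment("hits")


store = SerializedStore()
threads = [threading.Thread(target=worker, args=(store, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(store.read("hits"))
```

This keeps the data consistent, but every reader waits behind every writer,
which is why the approach does not scale to heavy traffic.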

 I'd be interested in hearing what performance requirements people have for an 
HDF server: bandwidth in/out, latency, request volume, etc.   Depending on the 
specifics, there are different approaches for achieving performance targets.

I hadn't heard about the issue with ever-growing HDF5 files.  Well, one nice
aspect of the server-based approach is that you can consolidate any maintenance
workflows, e.g., periodically running h5repack on files in the server.
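
A periodic repack job could look roughly like the sketch below. It assumes the
`h5repack` command-line tool is installed and on the PATH, and that nothing
else has the file open while it runs; the helper names and the `.repacked`
suffix are hypothetical.

```python
import os
import subprocess


def repack_command(src, dst):
    """Build the h5repack invocation: h5repack <input file> <output file>."""
    return ["h5repack", src, dst]


def repack_in_place(path):
    """Write a compacted copy of `path` with h5repack, then swap it in."""
    tmp = path + ".repacked"  # hypothetical temporary name
    try:
        # check=True raises CalledProcessError if h5repack fails.
        subprocess.run(repack_command(path, tmp), check=True)
        # Replace the original only after the repack succeeded.
        os.replace(tmp, path)
    finally:
        # Clean up a leftover temporary file after a failure.
        if os.path.exists(tmp):
            os.remove(tmp)
```

Calling `repack_in_place("data.h5")` from a scheduled task would reclaim the
space left behind by deleted objects, at the cost of rewriting the whole file.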

John

From: Ray Polachikov <[email protected]>
Reply-To: [email protected]
Date: Tuesday, November 11, 2014 at 4:47 AM
To: [email protected]
Cc: [email protected]
Subject: Re: ANN: HDF5 for Python 2.4.0 BETA

Hi John and Stuart,

Thanks for the hint. I'm aware of this limitation. The wrapper classes do 
open/close the underlying file for every single operation. I found the overhead 
of this to be negligible (relative to the actual I/O operations).

HDF5 Server sounds promising. It's great that some progress is being made in
this area. I experimented with array-based database servers such as SciDB, but,
to date, data I/O is much slower than with HDF5. One problem is that the SciDB
Python API is HTTP-based and, hence, numerical data is encoded as text.

Very much looking forward to seeing your code. I wonder how you dealt with
those reader/writer concurrency issues. I also wonder if you found a solution
to the problem that deleting nodes in an HDF5 file does not reduce the file
size, i.e., files are ever-growing. In my opinion, this is a nasty limitation
of HDF5.

Ray


