Dear Martijn,

Yes, this is very promising. Thank you for bringing this to my attention.

Michaël

2017-12-05 21:34 GMT+01:00 Martijn Jasperse <[email protected]>:

> Dear Michaël,
> Have you tried using the core driver with a file image? Seems to me that
> this is what you want to do, see H5Pset_file_image
> <https://support.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetFileImage>.
> This enables you to "open" the file data in memory and then retrieve it
> again after you've finished operations, using H5Fget_file_image.
>
> We have previously used this for networked HDF5-based data transfer;
> admittedly with small data instead of big data, but the disk access
> overhead was unacceptable in that case too.
>
> Cheers,
> Martijn
>
> On 6 December 2017 at 03:43, Michaël Melchiore <[email protected]> wrote:
>
>> Dear Andrey,
>>
>> While Apache Spark does aim to work in memory when possible, my need
>> is not specific to Spark. Many alternatives to Spark can also perform
>> in-memory processing (Apache Storm, Apache Flink, Google Dataflow...).
>> I have registered for more information regarding the Spark Connector but
>> I am not sure it is what I am looking for.
>>
>> Kind regards,
>>
>> Michaël
>>
>> 2017-12-05 15:11 GMT+01:00 Андрей Парамонов <[email protected]>:
>>
>>> Hello Michaël!
>>>
>>> 04.12.2017 21:23, Michaël Melchiore wrote:
>>>
>>>> I am building an application which operates on NetCDF data using Big Data
>>>> technologies.
>>>>
>>>> My design aims to avoid writing data to disk unnecessarily. Instead,
>>>> I want to operate in memory as much as possible. The challenge is data
>>>> (de)serialization for distributed communication between computing nodes.
>>>>
>>>> Since NetCDF4 and HDF5 already provide a portable data format, a simple
>>>> and efficient design would simply access and then exchange the raw binary
>>>> data over the network.
>>>>
>>>> Currently, I cannot access this buffer without creating files. I am
>>>> investigating the use of the Apache Commons VFS RAM file system to trick
>>>> NetCDF into working in memory.
>>>>
>>>> However, a suggestion on the NetCDF Java mailing list (see ticket
>>>> MQO-415619) was to build an alternative to the core driver. I feel this is
>>>> the more desirable course of action, as it improves the existing
>>>> solutions instead of working around their limitations.
>>>>
>>>> Do you think this approach is feasible? Any starting pointers would be
>>>> appreciated!
>>>>
>>>
>>> I am not an HDF5 expert, but I would suggest that you check
>>> https://www.hdfgroup.org/downloads/spark-connector/
>>> It would be great if you could share your experience and whether the Spark
>>> connector helped you to implement in-memory processing.
>>>
>>> Best wishes,
>>> Andrey Paramonov
>>>
>>>
>>> _______________________________________________
>>> Hdf-forum is for HDF software users discussion.
>>> [email protected]
>>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>>> Twitter: https://twitter.com/hdf5
