Github user hntd187 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-11317
@avulanov Also, we're going to have to add a dependency on the
HDF5 library with this; I think this should be handled the way netlib is handled with
the user
Github user hntd187 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-112948835
@avulanov How about I take the read and you take the write?
In an ideal world we should be able to take the implementation from here
https://github.com/h5py
Github user hntd187 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-111598205
@avulanov Would you like to split some of this work up or do you want to
tackle this alone?
Github user hntd187 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-111337485
@avulanov Can Spark even read an HDF5 file, or would we have to write that
as well? I can't donate any professional time to this conversion
problem, but I may
Github user hntd187 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-111320997
@avulanov To be perfectly honest, does the modern-ness of the dataset
really matter? This dataset has been a standard for a long time in this area so
it seems perfectly
Github user hntd187 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-110499728
Hi everyone,
My colleagues and I are very interested in this implementation and have
read most of the discussion going on here about it
Github user hntd187 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-110517022
@avulanov If it would help speed this up, I can test or benchmark on
some EC2 instances we have, which run on Mesos. If you want to give a general
dataset to use we