On Wednesday, 14 January 2015 at 16:27:17 UTC, Laeeth Isharc wrote:

struct File { Location _location; alias _location this; ... }

// group.d
public import commonfg;
struct File { Location _location; alias _location this; ... }

// commonfg.d { ... }
enum isContainer(T) = is(T: File) || is(T : Group);
auto method1(T)(T obj, args) if (isContainer!T) { ... }
auto method2(T)(T obj, args) if (isContainer!T) { ... }

I guess two of my gripes with UFCS is (a) you really have to


// another hdf-specific thing here but a good example in general is that some functions return you an id for an object which is one of the location subtypes (e.g. it could be a File or could be a Group depending on run-time conditions), so it kind of feels natural to use polymorphism and classes for that, but what would you do with the struct approach? The only thing that comes to mind is Variant, but it's quite meh to use in practice.

Void unlink(File f){}
Void unlink(Group g){}

For simple cases maybe one can keep it simple, and despite the Byzantine interface what one is trying to do when using HDF5 is not intrinsically so complex.
So your solution is copying and pasting the code?

But now repeat that for 200 other functions and a dozen more types that can be polymorphic in weirdest ways possible...

If you are simply have a few lines calling the API and the validation is different enough for file and group (I haven't written unlink yet) then why not (and move proper shared code out into helper functions). The alternative is a long method with lots of conditions, which may be the best in some cases but may be harder to follow.

I do like the h5py and pytables approaches. One doesn't need to bother too much with the implementation when using their library. However, what I am doing is quite simple from a data perspective - a decent amount of it, but it is not an interesting problem from a theoretical perspective - just execution. Now if you are higher octane as a user you may be able to see what I cannot. But on the other hand, the Pareto principle applies, and in my view a library should make it simple to do simple things. One can't get there if the primary interface is a direct mapping of the HDF5 hierarchy, and I also think that is unnecessary with D.

But I very much appreciate your work as the final result is better for everyone that way, and you are evidently a much longer running user of D than me. I never used C++ as it just seemed too ugly! and I suspect the difference in backgrounds is shaping perspectives.

What do you think the trickiest parts are with HDF5? (You mention weird polymorphism).



Laeeth
I don't think you've read h5py source in enough detail :) It's based HEAVILY on duck typing. In addition, it has way MORE classes than the C++ hierarchy does. E.g., the high-level File object actually has these parents: File : Group, Group : HLObject, MutableMappingWithLock, HLObject : CommonStateObject and internally the File also keeps a reference to file id which is an instance of FileID which inherits from GroupID which inherits from ObjectID, do I need to continue? :) PyTables, on the contrary is quite badly written (although it works quite well and there are brilliant folks on the dev team like francesc alted) and looks like a dump of C code interweaved with hackish Python code.

In h5py you can do things like file["/dataset"].write(...) --> this just wouldn't work as is in a strictly typed language since the indexing operator generally returns you something of a Location type (or an interface, rather) which can be a group/datatype/dataset which is only known at runtime. Out of all of them, only the dataset supports the write method but you don't know it's going to be a dataset. See the problem? I don't want the user code to deal with any of the HDF5 C API and/or have a bunch of if conditions or explicit casts which is outright ugly. Ideally, it would work kind of like H5PY, abstracting the user away from refcounting, error code checking after each operation, object type checking and all that stuff.

Reply via email to