David Cournapeau wrote:

> Maybe I am naive, but I think a worthy goal would be a minimal C++
> library which wraps ndarray, without thinking about SWIG, boost and co
> first.
That's exactly what I had in mind. If you have something that works well with ndarray, then SWIG et al. can work with it. In principle, if you can do the transition nicely with hand-written wrappers, then you can do it with the automated tools too.

> I don't know what other people are looking for, but for me, the
> interesting things with using C++ for ndarrays would be (in this order
> of importance):
> 1 much less error prone memory management

Less than what? std::valarray etc. all help with this.

> 2 a bit more high level than plain C ndarrays (syntactic sugar
> mostly: keyword args, overloaded methods and so on)

Yes.

> 3 more high level things for views

I think views are key.

> 4 actual computation (linear algebra, SVD, etc...)

This is last on my list -- the key is the core data type. I may be an unusual user, but what I expect is that in a given pile of code, we need one or two linear algebra routines, so I don't mind hand-wrapping LAPACK. Not that it wouldn't be nice to have it built in, but it's not a deal breaker. In any case, it should be separate: a core set of array objects, and a linear algebra (or whatever else) package built on top of it.

> 3 would be a pain to do right without e.g. boost::multiarray;

Yes, it sure would be nice to build it on an existing code base, and boost::multiarray seems to fit.

> One huge advantage of being independent of external libraries would be
> that the wrapper could then be included in numpy, and you could expect
> it everywhere.

That would be nice, but may be too much work.

I'm really a C++ newbie, but it seems like the key here is the view semantics -- and perhaps the core solution is to have a "data block" class: all it would hold is a pointer to a block of data and a reference counter. Each array object would then be a view of one of those -- each new array object that used a given instance would increase the ref count, and decrease it on deletion. The data block would destroy itself when its refcount went to zero.
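To make the data-block idea concrete, here is a minimal sketch (all names hypothetical, not from any existing library). It uses std::shared_ptr to supply the reference count, so the block dies exactly when the last array or view referencing it goes away:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch: a shared, reference-counted block of doubles.
// std::shared_ptr supplies the refcount; the block is freed when the
// last View holding it is destroyed.
struct DataBlock {
    std::vector<double> data;
    explicit DataBlock(std::size_t n) : data(n, 0.0) {}
};

// A strided 1-D view into a DataBlock.  Copying a View bumps the
// refcount; any number of views can share (and mutate) one block.
class View {
    std::shared_ptr<DataBlock> block_;
    std::size_t offset_, stride_, size_;
public:
    explicit View(std::size_t n)
        : block_(std::make_shared<DataBlock>(n)),
          offset_(0), stride_(1), size_(n) {}

    // Make a sub-view sharing the same underlying block.
    View slice(std::size_t start, std::size_t count, std::size_t step = 1) const {
        View v(*this);
        v.offset_ = offset_ + start * stride_;
        v.stride_ = stride_ * step;
        v.size_   = count;
        return v;
    }

    double&       operator[](std::size_t i)       { return block_->data[offset_ + i * stride_]; }
    const double& operator[](std::size_t i) const { return block_->data[offset_ + i * stride_]; }
    std::size_t size() const { return size_; }
    long use_count() const { return block_.use_count(); }
};
```

With this layout you can destroy the "original" array and the view stays valid, since the block only goes away when the last reference does -- which is exactly the semantics described above.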
(Is this how numpy works now?) Even if this makes sense, I have no idea how compatible it would be with numpy and/or python.

boost::multiarray does not seem to take this approach. Rather, it has two classes: a multi_array, responsible for its own data block, and a multi_array_ref, which uses a view on another multi_array's data block. This is getting close, but it means that when you create a multi_array_ref, the original multi_array needs to stay around. I'd rather have a much more flexible system, where you could create an array, create a view of that array, then destroy the original, and have the data block go away only when you destroy the view. This could cause complications if you started with a huge array and made a view into a tiny piece of it -- the whole data block would stick around -- but that would be up to the user to think about.

>> Would it make sense to use this approach in C++? I suspect not -- all
>> your computational code would have to deal with it.

> Why not making one non template class, and having all the work done
> inside the class instead ?
>
> class ndarray {
> private:
>     ndarray_imp<double> a;
> };

Hmm. That could work (as far as my limited C++ knowledge tells me), but the type is still fixed at compile time -- which may be OK -- and is C++-ish anyway.

> If you have an array with several views on it, why not just enforcing
> that the block data address cannot change as long as you have a view ?

Maybe I'm missing what you're suggesting, but this would lock in the original array once any views were on it -- that would greatly restrict flexibility. My suggestion above may help, but I think maybe I could just live without re-sizing.

> This should not be too complicated, right ? I don't use views that much
> myself in numpy (other than implicitly, of course), so I may be missing
> something important here

Implicitly, we're all using them all the time -- which is why I think views are key.
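For what the quoted suggestion might look like fleshed out, here is a sketch (names hypothetical): the templated implementation stays private, and client code only ever sees a plain non-template class:

```cpp
#include <cstddef>
#include <memory>

// Hypothetical sketch of the non-template-facade idea quoted above:
// the element type is fixed to double inside the class, so users of
// ndarray never see (or instantiate) the template themselves.
template <typename T>
class ndarray_imp {
    std::unique_ptr<T[]> data_;
    std::size_t size_;
public:
    explicit ndarray_imp(std::size_t n) : data_(new T[n]()), size_(n) {}
    T& at(std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
};

class ndarray {
    ndarray_imp<double> a_;   // the template is an implementation detail
public:
    explicit ndarray(std::size_t n) : a_(n) {}
    double& operator[](std::size_t i) { return a_.at(i); }
    std::size_t size() const { return a_.size(); }
};
```

The trade-off is just what the reply above says: the element type is baked in at compile time, so you don't get numpy-style runtime dtypes -- but you also don't expose templates in the public interface.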
Alexander Schmolck wrote:

> I'd ideally like something that I can more or less transparently
> pass and return data between python and C++ and I want to use numpy arrays on
> the python side. It'd also be nice to have reference semantics and reference
> counting working fairly painlessly between both sides.

Can the python gurus here comment on how possible that is?

> as I said I expect that most
> data I deal with will be pretty large, so overheads from creating python
> objects aren't likely to matter that much.

I'm not so much worried about the overhead as the dependency -- to use your words, it would feel perverse to be including Python.h for a program that wasn't using python at all.

>> Our case is such: We want to have a nice array-like container that we
>> can use in C++ code that makes sense both for pure C++, and interacts
>> well with numpy arrays, as the code may be used in a pure C++ app, but
>> we also want to test it, script it, etc. from Python.
>
> Yes, that's exactly what I'm after. What's your current solution for this?

We're trying to build it now. The old code used Mac OS Handles; those have been converted to std::valarrays, and we're working on wrapping those for use with numpy arrays -- which, at the moment, looks like copying the data back and forth. That's fine for testing code, but maybe not OK for production work.

>> did you check out boost::multiarray? I didn't see that on your list.

> Since I'm mostly going to use
> matrices (and vectors, here and there), maybe ublas, which does provide useful
> numeric functionality, is a better choice.

Well, one of the lessons I learned from numpy is that I'm much happier with a general purpose n-d array than with a "matrix" and "vector". The latter can be built on top of the former if you want (like it is in numpy). How compatible are multiarray and ublas matrices? It kind of looks like boost isn't really a single project, so things that could be related may not be.
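The copy-free alternative to the back-and-forth copying described above would be a non-owning view over memory that somebody else already manages (a std::valarray's buffer, or a block handed over from numpy). A minimal sketch of that idea -- all names hypothetical:

```cpp
#include <cstddef>
#include <valarray>

// Hypothetical sketch: a non-owning 1-D view over memory that somebody
// else manages (a std::valarray, a numpy buffer, a plain C array).
// It never allocates or frees, so the owner must outlive the view.
class ArrayRef {
    double*     data_;
    std::size_t size_;
public:
    ArrayRef(double* data, std::size_t n) : data_(data), size_(n) {}

    // Borrow a valarray's buffer (valid only while the valarray lives).
    explicit ArrayRef(std::valarray<double>& v)
        : data_(&v[0]), size_(v.size()) {}

    double&     operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

    double sum() const {
        double s = 0.0;
        for (std::size_t i = 0; i < size_; ++i) s += data_[i];
        return s;
    }
};
```

This is exactly the "pointer to existing data" shape that most C++ containers won't give you, because (as discussed below) it means the container is no longer managing the memory -- the lifetime burden shifts back to the caller.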
Hmmm -- if my concept above works, then all you need is for your n-d arrays and your matrices and vectors to all share the same "data block" class.

> I must say I find it fairly painful
> to figure out how to do things I consider quite basic with the matrix/array
> classes I come across in C++ (I'm not exactly a C++ expert, but still);

Neither am I -- but I think it's the nature of C++!

> I
> also can't seem to find a way to construct an ublas matrix or vector from
> existing C-array data.

This functionality seems to be missing from many (most) of these C++ containers. I suspect it's the memory management issue. One of the points of these containers is to take care of memory management for you -- if you pass in a pointer to an existing data block, it's not managing your memory any more.

>> It would be nice to just have that (is MTL viable?)
>
> No idea -- as far as I can tell the webpage is broken, so I can't look at the
> examples (http://osl.iu.edu/research/mtl/examples.php3).

Too many dead or sleeping projects....

> Yes. C++ copying semantics seem completely braindamaged to me.

It's the memory management issue again -- C++ doesn't have it built in, so it's built in to each class instead.

>>> <http://thread.gmane.org/gmane.comp.python.c++/11559/focus=11560>
>> That does look promising -- and it used boost::multiarrays
>
> Yes (and also ublas vectors and matrices). Unfortunately, the author just
> noted in the c++-sig that he's unlikely to work on the code again --

Darn, but

> it might still make a good starting point for someone

The advantage of open source!

Full disclosure: I have neither the skills nor the time to actually implement any of these ideas. If no one else does, then I guess we're just blabbing -- not that there is anything wrong with blabbing!

-Chris

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion