On Tue, 2006-09-26 at 07:46 -0700, James Dennett wrote: > > the way STL handles > > this is very inefficient. > It's within a factor of 2 of optimal, no? What do you mean by > "very" inefficient?
See below. > > Using mmap/realloc is much faster, > > but C++ does not provide a move or 'relocate' method > > for objects: realloc doesn't work with copy ctor. > > > It seems a fair bet that C++0x will support "move semantics" > which will make std::vector more efficient in the case of objects > which can be cheaply moved but not cheaply copied. But that isn't what's required .. they haven't really done their homework ;( Suppose you have a large vector and want to extend it, say with 'push_back'. When you go past the available store, what happens? Some new storage is allocated which is a bit longer, then the existing elements are copied to it, finally the old store is deleted. This invalidates existing iterators. Now, you have two choices here: waste a heap of memory allocating lots more storage, but hopefully avoid having to copy the array again for some time. OR you can allocate a small amount of extra store, optimising space at the expense of possibly copying the array multiple times. Either way it is horrendously inefficient compared with C. Using 'realloc', C can extend an array without copying or moving ANY storage by simply allocating new address space and remapping the pages of the old array. With extended functionality, one can allocate an array AND reserve address space beyond the end, so that the realloc, at least up to the reserved limit, doesn't invalidate any iterators .. it just maps virtual store on demand at the end of the array, into already allocated address space. This is very efficient, since storage use is optimal -- at most one page of store is wasted. And it is very fast, since the cost of the allocation is a small constant required for an mmap system call. So why can't STL vector do this? The answer is in the semantics. Up to the 'reserved' space, it must allocate memory, not just address space (it isn't allowed to fail). But past the reserved space, it has to allocate a temporary copy of the array, doubling overall storage requirements, and then it has to actually copy all the elements. So performance is linear in array length. The reason this MUST be the case is that C++ doesn't provide the correct generic operations: copy (the copy constructor) is the WRONG operation here. Move is ALSO the WRONG operation. Move can be very fast for some types: for example moving a C++ string is much faster than copying because the underlying char array does not need to be copied when you move a string. It can be just left alone because the old string is now gone. But this is NOT what is required for mmap. What you want is an 'in place after the fact adjustment' or an 'in place before the fact adjustment'. This is a the *residual* of a move minus a memcpy. in other words, its the operation for a move excluding any bit blitting. The reason is that the move is actually done by remapping pages, which does not require any blitting. Of course this remap operation only applies to objects not containers. It turns out that this operation is mandatory for use of C++ objects with modern copying collectors. Such collectors do not generally give a hoot about containers, they manage only single objects. They also automatically adjust all known pointers, so the 'remap' is basically only required if there are hidden pointers. It's really NOT clear how to make this all work properly, which is one reason why the development of extensible arrays in Felix is stalled. It isn't so hard to make it work for Felix objects. But for opaque C++ objects, extra information is required .. I'm not sure what, or even if it is possible to provide it. Note that Felix does already have an arena store with a compactor. It uses memcpy and adjusts known pointers. It will fail for some opaque objects, for example an opaque doubly linked list. In this case a 'remap' operation would fix the problem by adjusting the pointer to the list head in the first object. For a circular list, there are two pointers -- external to the node being moved or remapped -- which need to be adjusted. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ Felix-language mailing list Felix-language@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/felix-language