Re: StringNg review: MemBlob and BodyPipe
On 03/26/2011 03:44 AM, Kinkie wrote: > IMVHO body pipelining should go through SBuf, not MemBlob, but that's > for body pipelining to work out. Currently, MemBlob's append-only design prevents efficient message body pipelining. Since SBuf relies on MemBlob, SBuf inherits the same problem (and adds more problems due to poor used area tracking). We will need to decide whether to enhance MemBlob/SBuf to improve pipelining support or switch to a pipeline that uses a "bunch of buffers" instead of a single one. This is a topic for another thread though. >>> >>> Yes. Unfortunately I don't understand what is the use case you have in >>> mind :( >> >> See BodyPipe and BodyPipe::theBuf use specifically. >> >>> My original thought was to use SBuf's append/consume pair for this. >> >> Unlike MemBuf used by BodyPipe, SBuf::consume() does not free any space and, >> hence, cannot be used for efficient body pipelining (i.e., using a >> relatively small buffer to transfer a possibly large body using asynchronous >> producer and consumer). > > I understand now. > We could probably define some heuristics to handle this case. > Something like (on top of my mind): > if after consume() we have > - a large-ish backing MemBlob (e.g. > 8kb) > - with a large unused prefix space (e.g. >4kb) > > Then we cow(). > Would this enable SBuf to be used for BodyPipes in your opinion? Yes, probably, but the question is not just how to make SBuf usable for BodyPipe, but which use case we want to optimize SBuf for. If we follow the above "consume-causes-memcpy" algorithm, we are virtually guaranteed to memcpy bytes of any message body that exceeds I/O buffer size. And copy about N times where N is the content length divided by the "large unused prefix size". Thus, the "consume-causes-memcpy" algorithm is bad for body processing performance. It is debatable whether it is worse than the current BodyPipe code, but it is unlikely to be much better. If we optimize for body forwarding needs instead of preserving the current implementation, we would need to implement a circular buffer so that consume frees space without memmove or memcpy. This would be a lot faster and a lot more complex than the current BodyPipe and SBuf code. Both approaches would have to cow() in consume(), of course, but the hope is that cow() would be a no-op after the headers are processed. Still, you are likely to do at least one large memcpy per message just to keep all the strings pointing to the headers happy. A third approach is to use a series of buffers instead of a single buffer. This approach avoids memcpy and avoids cow() in consume(), so it is the most efficient one. It can be implemented on top of the existing SBuf so that we do not make SBuf more complex and so that BodyPipe is usually compatible with other SBuf users (usually avoiding extra conversion copies). This third approach requires custom (or inefficient) processing when a string crosses buffer boundaries though. In practice, such crossings should be very rare, but we would still need to write code to handle them (I would opt for inefficient but simple code in this case). To summarize, we have four options: current: StringNG is unusable for BodyPipe; must copy to convert to/from consume-causes-memcpy: lots of memcpy overhead; few StringNG changes circular-buffer: at least one memcpy overhead; many StringNG changes many-buffers: usually no memcpy overhead; many StringNG/core additions We can move to consume-causes-memcpy now, but it adds more SBuf complexity and heuristic headaches without an obvious performance advantage. I am not sure, but it may even throw Squid performance back, and we already have many performance regressions to deal with. It feels like consume-causes-memcpy may not be worth it. I think the long-term goal should be the many-buffers design, but I am not sure. If others agree, we can just accept the fact that BodyPipe and SBuf are not compatible for now and move on, until the many-buffers approach is implemented. If StringNG becomes a performance degradation point because of this, it will have to be introduced after v3.2 becomes stable. Thank you, Alex.
Re: StringNg review: MemBlob
On 10/19/2010 08:40 AM, Kinkie wrote: On Tue, Oct 19, 2010 at 4:35 PM, Alex Rousskov wrote: On 10/19/2010 08:08 AM, Kinkie wrote: On Mon, Oct 18, 2010 at 7:56 PM, Alex Rousskov wrote: Once MemBlob is committed, please consider re-integrating it with memory pools. This is the biggest remaining XXX, IMO. I've checked and confirmed why MemPools integration was disabled: it's due to an inherent asymmetry in memAllocString/memFreeString: if mempools are not initialized, memAllocString will happily allocate the string using xcalloc. When the time comes to deallocate such strings, it will fail to find a pool to dealloc from and fail an assertion. The only way I can think of to cleanly resolve this is quite disruptive, as it means making MemPools a proper singleton obect to make sure they are initialized before being used: it's a complex project on its own which I'd list as a dependency before integrating MemPools back in. What happens if you call Mem::Init() from memAllocString()? I am sure a few things will need to be fixed (e.g., Mem::Init() should exit if called twice), but it does not sound disruptive to me. Tried that, using a string-pool as singleton-like initialization indicator. There are some prerequisites to be called in turn. I have not investigated further; I can investigate further if you wish. I think this must be fixed, but I do not think this has to be fixed before MemBlob is committed. IMVHO body pipelining should go through SBuf, not MemBlob, but that's for body pipelining to work out. Currently, MemBlob's append-only design prevents efficient message body pipelining. Since SBuf relies on MemBlob, SBuf inherits the same problem (and adds more problems due to poor used area tracking). We will need to decide whether to enhance MemBlob/SBuf to improve pipelining support or switch to a pipeline that uses a "bunch of buffers" instead of a single one. This is a topic for another thread though. Yes. Unfortunately I don't understand what is the use case you have in mind :( See BodyPipe and BodyPipe::theBuf use specifically. My original thought was to use SBuf's append/consume pair for this. Unlike MemBuf used by BodyPipe, SBuf::consume() does not free any space and, hence, cannot be used for efficient body pipelining (i.e., using a relatively small buffer to transfer a possibly large body using asynchronous producer and consumer). Hope this clarifies, Alex.
Re: StringNg review: MemBlob
On Tue, Oct 19, 2010 at 4:35 PM, Alex Rousskov wrote: > On 10/19/2010 08:08 AM, Kinkie wrote: >> >> On Mon, Oct 18, 2010 at 7:56 PM, Alex Rousskov wrote: >> >>> Once MemBlob is committed, please consider re-integrating it with memory >>> pools. This is the biggest remaining XXX, IMO. >> >> I've checked and confirmed why MemPools integration was disabled: it's >> due to an inherent asymmetry in memAllocString/memFreeString: if >> mempools are not initialized, memAllocString will happily allocate the >> string using xcalloc. When the time comes to deallocate such strings, >> it will fail to find a pool to dealloc from and fail an assertion. >> >> The only way I can think of to cleanly resolve this is quite >> disruptive, as it means making MemPools a proper singleton obect to >> make sure they are initialized before being used: it's a complex >> project on its own which I'd list as a dependency before integrating >> MemPools back in. > > What happens if you call Mem::Init() from memAllocString()? I am sure a few > things will need to be fixed (e.g., Mem::Init() should exit if called > twice), but it does not sound disruptive to me. Tried that, using a string-pool as singleton-like initialization indicator. There are some prerequisites to be called in turn. I have not investigated further; I can investigate further if you wish. > > >> IMVHO body pipelining should go through SBuf, not MemBlob, but that's >> for body pipelining to work out. > > Currently, MemBlob's append-only design prevents efficient message body > pipelining. Since SBuf relies on MemBlob, SBuf inherits the same problem > (and adds more problems due to poor used area tracking). We will need to > decide whether to enhance MemBlob/SBuf to improve pipelining support or > switch to a pipeline that uses a "bunch of buffers" instead of a single one. > This is a topic for another thread though. Yes. Unfortunately I don't understand what is the use case you have in mind :( My original thought was to use SBuf's append/consume pair for this. Thanks again! -- /kinkie
Re: StringNg review: MemBlob
On 10/19/2010 08:08 AM, Kinkie wrote: On Mon, Oct 18, 2010 at 7:56 PM, Alex Rousskov wrote: Once MemBlob is committed, please consider re-integrating it with memory pools. This is the biggest remaining XXX, IMO. I've checked and confirmed why MemPools integration was disabled: it's due to an inherent asymmetry in memAllocString/memFreeString: if mempools are not initialized, memAllocString will happily allocate the string using xcalloc. When the time comes to deallocate such strings, it will fail to find a pool to dealloc from and fail an assertion. The only way I can think of to cleanly resolve this is quite disruptive, as it means making MemPools a proper singleton obect to make sure they are initialized before being used: it's a complex project on its own which I'd list as a dependency before integrating MemPools back in. What happens if you call Mem::Init() from memAllocString()? I am sure a few things will need to be fixed (e.g., Mem::Init() should exit if called twice), but it does not sound disruptive to me. IMVHO body pipelining should go through SBuf, not MemBlob, but that's for body pipelining to work out. Currently, MemBlob's append-only design prevents efficient message body pipelining. Since SBuf relies on MemBlob, SBuf inherits the same problem (and adds more problems due to poor used area tracking). We will need to decide whether to enhance MemBlob/SBuf to improve pipelining support or switch to a pipeline that uses a "bunch of buffers" instead of a single one. This is a topic for another thread though. Cheers, Alex.
Re: StringNg review: MemBlob
On Mon, Oct 18, 2010 at 7:56 PM, Alex Rousskov wrote: > Hi Kinkie, > > Attached is am untested patch with some final polishing changes for > MemBlob, based on lp:~kinkie/squid/stringng revision 9517. I think the > MemBlob class should be committed to trunk after these changes are tested > and polished (I may have missed a few minor problems when moving or renaming > stuff). The rest of the branch code would need to be synced as well. Polished and build-tested. Unless someone objects, I'm going to merge MemBlob to trunk this evening (GMT). It lacks unit-tests (SBuf's unit tests provide them indirectly). > > The patch preamble documents the major changes. > > While working on this, I realized that the "MemBlob::size" member is only an > approximation, and that MemBlob class is probably not usable for efficient > message body pipelining (BodyPipe and friends). We will probably change > MemBlob API later to accommodate users other than strings, but that can > wait. I adjusted the append call to remove the use of pointers which may > make future changes less disruptive. Fixed SBuf accordingly. > Once MemBlob is committed, please consider re-integrating it with memory > pools. This is the biggest remaining XXX, IMO. I've checked and confirmed why MemPools integration was disabled: it's due to an inherent asymmetry in memAllocString/memFreeString: if mempools are not initialized, memAllocString will happily allocate the string using xcalloc. When the time comes to deallocate such strings, it will fail to find a pool to dealloc from and fail an assertion. The only way I can think of to cleanly resolve this is quite disruptive, as it means making MemPools a proper singleton obect to make sure they are initialized before being used: it's a complex project on its own which I'd list as a dependency before integrating MemPools back in. IMVHO body pipelining should go through SBuf, not MemBlob, but that's for body pipelining to work out. -- /kinkie
Re: StringNg review: MemBlob
On Mon, Oct 18, 2010 at 7:56 PM, Alex Rousskov wrote: > Hi Kinkie, > > Attached is am untested patch with some final polishing changes for > MemBlob, based on lp:~kinkie/squid/stringng revision 9517. I think the > MemBlob class should be committed to trunk after these changes are tested > and polished (I may have missed a few minor problems when moving or renaming > stuff). The rest of the branch code would need to be synced as well. revno 9518 covers one of the changes you requested: it turns MemBlob::calcAllocSize into a nonmember static inline call in MemBlob.cc. > > The patch preamble documents the major changes. > > While working on this, I realized that the "MemBlob::size" member is only an > approximation, and that MemBlob class is probably not usable for efficient > message body pipelining (BodyPipe and friends). We will probably change > MemBlob API later to accommodate users other than strings, but that can > wait. I adjusted the append call to remove the use of pointers which may > make future changes less disruptive. > > Once MemBlob is committed, please consider re-integrating it with memory > pools. This is the biggest remaining XXX, IMO. I'd postpone that to after merging SBuf in. The biggest issues I have met when trying to do that were related to initialization (IIRC): - static variables can't use mempools - some nonstatic MemBlobs get allocated before mempools are initialized. While this can IIRC be detected at initialization time, what happens at destruction time? Do the blobs need to remember whether they were pool-allocated or not? Or do mempools handle gracefully an attempt to free something which was not obtained from a pool? The fact that mempools have only been partly converted to c++ doesn't really help :( More to come as I check the patch(es). Thanks for taking the time Alex; I'm really happy to see things progressing. -- /kinkie
Re: StringNg review: MemBlob
Hi Kinkie, Attached is am untested patch with some final polishing changes for MemBlob, based on lp:~kinkie/squid/stringng revision 9517. I think the MemBlob class should be committed to trunk after these changes are tested and polished (I may have missed a few minor problems when moving or renaming stuff). The rest of the branch code would need to be synced as well. The patch preamble documents the major changes. While working on this, I realized that the "MemBlob::size" member is only an approximation, and that MemBlob class is probably not usable for efficient message body pipelining (BodyPipe and friends). We will probably change MemBlob API later to accommodate users other than strings, but that can wait. I adjusted the append call to remove the use of pointers which may make future changes less disruptive. Once MemBlob is committed, please consider re-integrating it with memory pools. This is the biggest remaining XXX, IMO. Thank you, Alex. Got rid of MemBlob.cci because one member should not have been inlined (it is complex code, with complex linking dependencies, that is used as a part of an expensive call) and the other members are one-liners that are better inlined directly (less time spent finding and understanding the code). Replaced raw pointers with offsets in canAppend(). Current MemBlob users already use offsets to compute those pointers anyway. And future MemBlobs may support sliding use window via ever-incrementing offsets, which will make raw pointer use even less appropriate. This change effectively removes the implicit "am I using the blob I think I am using?" check. I think it is OK to remove that because if some user is so seriously confused, we should not try to mask the problem at the canAppend() level. Added clear() method so that users can reset the blob without modifying size member directly. Users still need access to reference count to know when clearing is safe, unfortunately. We may add something like clearIfLonely() later. Documented the two blob areas and explained how they change. Documented lack of sharing protections beyond blob refcounting. Documented that "size" member is only approximate because it is not decreased when users leave. Polished creation/destruction messages to match find-alive patterns. Polished method and method argument names. Polished comments, made comments more consistent, and applied a few Source Formatting rules. === modified file 'src/Makefile.am' --- src/Makefile.am 2010-10-15 10:22:29 + +++ src/Makefile.am 2010-10-18 17:05:43 + @@ -543,8 +543,7 @@ String.cci \ SquidString.h \ SquidTime.h \ - SBuf.cci \ - MemBlob.cci + SBuf.cci BUILT_SOURCES = \ cf_gen_defines.cci \ === modified file 'src/MemBlob.cc' --- src/MemBlob.cc 2010-10-14 22:31:46 + +++ src/MemBlob.cc 2010-10-18 17:17:18 + @@ -37,68 +37,55 @@ #include #endif -MemBlobStats MemBlob::stats; -InstanceIdDefinitions(MemBlob, "MemBlob"); - -MemBlobStats::MemBlobStats() : alloc(0), live(0), append(0) +#define MEMBLOB_USES_MEM_POOLS 0 + +MemBlobStats MemBlob::Stats; +InstanceIdDefinitions(MemBlob, "blob"); + + +/* MemBlobStats */ + +MemBlobStats::MemBlobStats(): alloc(0), live(0), append(0) {} -const MemBlobStats& -MemBlob::GetStats() -{ -return stats; -} - std::ostream& -MemBlobStats::dump(std::ostream& os) const +MemBlobStats::dump(std::ostream &os) const { os << -"MemBlob allocations: " << alloc << -"\nMemBlob live references: " << live << -"\nMemBlob append operations: " << append << -"\nMemBlob total allocated volume: " << liveBytes << -"\nlive MemBlobs mean size: " << +"MemBlob created: " << alloc << +"\nMemBlob alive: " << live << +"\nMemBlob append calls: " << append << +"\nMemBlob currently allocated size: " << liveBytes << +"\nlive MemBlob mean current allocation size: " << (static_cast(liveBytes)/(live?live:1)) << std::endl; return os; } -std::ostream& -MemBlob::dump (std::ostream &os) const -{ -os << "id @" << (void *)this -<< "mem:" << (void*) mem -<< ",size:" << capacity -<< ",used:" << size -<< ",refs:" << RefCountCount() << "; "; -return os; -} +/* MemBlob */ MemBlob::MemBlob(const MemBlob::size_type reserveSize) : mem(NULL), capacity(0), size(0) // will be set by memAlloc { -debugs(MEMBLOB_DEBUGSECTION,9, id << "constructed, @" - << static_cast(this) - << ", reserveSize=" << reserveSize); +debugs(MEMBLOB_DEBUGSECTION,9, HERE << "constructed, this=" + << static_cast(this) << " id=" << id << + << " reserveSize=" << reserveSize); memAlloc(reserveSize); } -MemBlob::MemBlob(const char * memBlock, const MemBlob::size_type memSize) : +MemBlob::MemBlob(const char *buffer, const MemBlob::size_type bufSize) : mem(NULL), capacity(0), size(0) // will be set by memAlloc { -debugs(MEMBLOB_DEBUGSECTION,9, id << "constructed, @" - << static_cast
Re: StringNg review: MemBlob
On Thu, Oct 7, 2010 at 12:49 AM, Alex Rousskov wrote: > Hi Kinkie, Hi alex! > > Here is another review you have requested. This review is based on > lp:~kinkie/squid/stringng revision 9504. Thanks >> #include "base/TextException.h" > > Remove if not needed in MemBlob.h Moved. >> class MemBlobStats { > > ... >> >> }; > > Not reviewed per Kinkie's request. Should now be OK. I've removed the reallocs stat, which doesn't really belong) to MemBlob, as it's a SBuf construct. >> /** >> * Refcounted memory buffer, content-agnostic. Meant to only be used >> * by SBuf. >> */ > > Remove "Meant to only be used by SBuf." This class should be generally > usable. Removed. Removed all references to SBuf in MemBlob.* >> class MemBlob: public RefCountable { >> public: > > Still missing source formatting. Done. >> char *mem; ///< Raw memory block >> size_type capacity;///< Size of the memory block >> size_type size; ///< How much of the memory is actually used > > Start comment with a small letter or end with a period. I recommend the > former for short comments that are usually not full sentences. Done. >> //do not meddle with. Use getStats to obtain a copy for dumping >> static MemBlobStats stats; >> static const MemBlobStats& getStats(); > > Both missing documentation. What not to do instructions are not enough. With the removal of nreallocs stats can be (and was) made private. > I do not recall whether we have discussed this already, but if you can make > the stats member private, please do. > > getStats() name should start with a capital letter since it is static. Done. >> /** >> * Create a new MemBlob, containing a copy of the >> * contents of the buffer of size memSize >> * pointed to by memBlock. It is up to the caller to >> * manage the memBlock >> */ >> MemBlob(const char * memBlock, const size_type memSize); > > Remove "It is up to ..." because const pointer already implies that. ok > > >> /** check whether the supplied pointer is the first unused byte in the >> MemBlob >> */ > >> _SQUID_INLINE_ bool isAtEnd(const char *ptr) const; > > > nitpick: the "supplied pointer" is not a "byte". > > Please document whether this returns true when there are no unused MemBblob > bytes. I still do not like this interface itself though. There must be a > cleaner way to do what you want. How about this: > > /// whether writing to the [start, start + n) area is currently OK > bool writable(const char *start, size_type n) const; > > This would both solve the problem with the lack of unused bytes and check > for exclusive control of the area. Would that work for the current callers? This is what canAppend does. What I'll do is make private the isAtEnd and willFit methods (they're only really used by canAppend). I would prefer not to hand-inline them tho, for readability reasons. > Moreover, do we need this method, willFit(), and canAppend()? They all do > pretty much the same kind of checks, from different angles, with different > bugs, under very different names. This needs to be polished, but I do not > know which end to start from. Can you reduce the problem space? Done, I hope. >> /** check whether we can append data at the tail of the MemBlob >> * >> * \param toAppend number of bytes to be appended >> * \return true there is at least toAppend free space available in the >> MemBlob >> */ >> _SQUID_INLINE_ bool willFit(const size_type toAppend) const; > > . > >> /** return the number of bytes available at the tail of the MemBlob >> * >> */ > > /// the number of allocated but unused bytes at the end of the blob > >> _SQUID_INLINE_ size_type getAvailableSpace() const; > > Wrong name. The method does not give the caller the available space, only > its size. Here are some combinations that work better: > > size(), spaceSize(), capacity() > contentSize(), spaceSize(), capacity() > usedSize() and availableSize(), capacity() Fixed. availableSize() is the winner. > >> /** check whether we can append data at the tail of the MemBlob >> * >> * \param toAppend number of bytes to be appended >> * \return true there is at least toAppend free space available in the >> MemBlob >> */ >> _SQUID_INLINE_ bool willFit(const size_type toAppend) const; > > . > >> /** check if there is enough unused space at the tail of the MemBlob >> * >> * \return true the append would fit >> * \param bufEnd first unused byte after the end of the SBuf being >> appended to >> * \param toAppend how many bytes would be nice to append >> */ >> _SQUID_INLINE_ bool canAppend(char const * bufEnd, const size_type >> toAppend) const; > > MemBlob should not know about SBuf; please remove references to SBuf. > > If we are really _appending_ to MemBlob, we should not have to specify the > end of the used buffer area because MemBlob knows where it ends. See above > for general commen
Re: StringNg review: MemBlob
On Wed, Sep 22, 2010 at 5:42 PM, Alex Rousskov wrote: > On 09/22/2010 03:30 AM, Kinkie wrote: >> >> On Tue, Sep 21, 2010 at 4:18 AM, Alex Rousskov wrote: >>> >>> On 09/03/2010 09:21 AM, Kinkie wrote: >>> You'll find the branch at lp:~kinkie/squid/stringng > > /** * Low-level buffer. It's responsible for allocating and managing * the blocks of memory. It's refcounted, but it doesn't know * which SBufs hold references to it. */ >>> >>> >>> Consider: >>> /// A refcounted fixed-size buffer. Tracks the portion used by opaque >>> content. >> >> How about: >> "Refcounted memory buffer, content-agnostic. Meant to only be used by >> SBuf." ? > > Fixed-size is an key property unless you think it may change soon. > Content-agnostic is fine, I guess. Done. > As for "meant to only be used by Foo", I am still against such restrictions > on use in general and in this context in particular. We have discussed this > before. Use-cases may determine the initial design sketch and optimizations, > but the final APIs should be usable in any context that needs them rather > than restricted to one specific user. MemBlob is no exception IMO. Removed the comment. I hope to avoid overdesign. class MemBlob: public RefCountable { public: typedef RefCount Pointer; typedef unsigned int size_type; >>> >>> Would be nice to guarantee support for 5GB and larger blobs. >> >> Do we need them now? It'll be trivial to extend when needed. > > I may need them soon. Based on past experience, I am not convinced that > replacing 'int' with some other type will be trivial. I just tried it: changed the typedef in MemBlob and SBuf to int64_t and adapted SQUIDSBUFPRINT, found one glitch but after fixing it build and test-suite worked without a hitch. Until the need arises, I'd prefer to stick to int32_t in order to avoid overdesign, performance penalties (64-bit math is said to be bad on many 32-bit platforms) and memory waste. >>> I would not leave these data members public, but it is your call. >> >> This class originally was private to SBuf. I've now clarified in the >> class definition that it's not meant for general use; its only client >> should be SBuf. > > Which is a wrong restriction made worse by wrongly guessed implications. As > SBuf review shows, we are already paying the price for the lack of proper > boundaries between the two classes. It makes some areas of the SBuf code > confusing at best. Removed all such implications; I agree that the boundary betweeh MemBlob and SBuf is blurry in some places; the decisions were not taken lightly and in my very humble opinion pay off in terms of performance gains. I hope that the documentation improvements you're suggesting will clarify the situation. Once the API is finalized, I'll further document in the Wiki (there is already a design blueprint, but it is out of date) _SQUID_INLINE_ void deleteSelf(); >>> >>> I hope this can be removed. If not, it must be documented. The >>> corresponding >>> comments in .cci do not make sense to me and do not match RefCount.h in >>> trunk, as far as I can tell. >> >> http://old.squid-cache.org/Doc/Prog-Guide/prog-guide-24.html >> http://markmail.org/message/kt7357pwttmzrey3 >> >> Maybe the documentaiton is outdated? > > That old documentation is wrong. Use the source code. Remove deleteSelf(). Done.The only references to such construct right now are in BasicUser, NegotiateUser and NTLMUser (out of scope for me) _SQUID_INLINE_ size_type freeSpace() const; >>> >>> Missing documentation. >> >> Renamed to getAvailableSpace. Documentation was in the .cci, moved it > > Wrong name. The method does not give the caller the available space, only > its size. Here are some combinations that work better: > > size(), spaceSize(), capacity() > contentSize(), spaceSize(), capacity() > usedSize() and availableSize(), capacity() availableSize() it is then :) MemBlobStats MemBlob::stats; >>> >>> If some static code wants to allocate MemBlob, will its stats not get >>> counted, depending on static initialization order? >> >> Yes. I don't think that's much of an issue; > > It becomes an issue when we get bug reports about negative or too-big > counters. Or when somebody adds an assert() that the counter is correct. > Both have happened before. all counters are unsigned. I'm adding documentation telling not to rely on them. The only way I can think of to disenfranchise from the static allocation trap is to use a singleton, which would make bookkeeping a relatively complex operation (much more complex than incrementing an integer, anyways). Once we put this in context that I'd remove most counters once SBuf is fully in... >> stats are mostly meant to >> >> help us diagnose the effectiveness of the solution, and in fact I >> wouldn't be against #ifdef-ing them out in production code. > > Sure, nobody cares about effectiveness of the code in production as long as > it was effec
Re: StringNg review: MemBlob
On Wed, Sep 22, 2010 at 5:31 PM, Amos Jeffries wrote: > On 22/09/10 21:30, Kinkie wrote: >> >> On Tue, Sep 21, 2010 at 4:18 AM, Alex Rousskov >> wrote: >>> >>> On 09/03/2010 09:21 AM, Kinkie wrote: >>> You'll find the branch at lp:~kinkie/squid/stringng >>> >>> I decided to take a look at MemBlob because it is used by SBuf which was >>> reviewed a few days ago. It would be nice to get a fixed SBuf committed, >>> but >>> that would require committing the classes it uses. >>> >>> Committing smaller self-contained parts may be an overall better strategy >>> than trying to fix everything first and then do one huge commit. >>> >>> So here is a MemBlob review: >> > >>> >>> Remove the "buf" prefix until you have some other size and capacity >>> members >>> to distinction from. Besides, this is not a "buf". This is a "blob" :-) >> >> Yup. Relics... I've been working on this for more than two years now. >> >>> I would not leave these data members public, but it is your call. >> >> This class originally was private to SBuf. I've now clarified in the >> class definition that it's not meant for general use; its only client >> should be SBuf. > > If still relevant that would be reason for protected: and friend > declarations in the class definition instead of public:. I'm fine either ways, the only client I'm concerned with is SBuf. _SQUID_INLINE_ void deleteSelf(); >>> >>> I hope this can be removed. If not, it must be documented. The >>> corresponding >>> comments in .cci do not make sense to me and do not match RefCount.h in >>> trunk, as far as I can tell. >> >> http://old.squid-cache.org/Doc/Prog-Guide/prog-guide-24.html >> http://markmail.org/message/kt7357pwttmzrey3 >> >> Maybe the documentaiton is outdated? >> > > Seems to be. The RefCountable classes no longer have "deleteSelf". removed. _SQUID_INLINE_ bool isAtEnd(const char *ptr) const; >>> >>> Missing documentation. >>> Why is this needed? Some transitional call? If yes, document as such. >> >> It is needed to know whether a the tail of a SBuf is at the tail of a >> shared MemBlob. Such a SBuf can append without needing a cow, if >> there's enough headroom in the MemBlob. > > So, a char* way of saying: > > if (SBuf::off_+Sbuf::len_ == MemBlob::size && > SBuf::off_+Sbuf::len_ < MemBlob::capacity) { > ... > > with canAppend being a duplicate as above but <= MemBlob::capacity+N canAppend is exactly "isAtEnd && willFit" (isAtEnd is always true if RefCount==1) > or rather isAtEnd being a case of canAppend(0) Well, yes: willFit(0) is always true by definition (you can always append nothing). >>> This reads like a command to free some space. >>> Consider: spaceSize() >>> spaceSize() would be more consistent with other Squid code as well. >>> >>> _SQUID_INLINE_ bool canGrow(const size_type howmuch) const; >>> >>> Missing documentation. >>> Usually unsafe and wastefull. See similar comments for SBuf. >> >> It'd be unsafe in a concurrent (multithreaded) environment; how would >> it be unsafe in a single-threaded env? >> > > Squid won't be single-threaded for much longer. At least on the timescale > for string-level changes. Most String/Buf implementations avoid locking, there's too much overhead to be spent for such a low-level function, it's usually better dealt with on an higher level. _SQUID_INLINE_ bool canAppend(char const * bufEnd, const size_type howmuch) const; >>> >>> Missing documentation. >>> Especially since it is not clear why canAppend() needs a pointer. >>> Looking at the .cci implementation, I question why would the caller >>> keep the bufEnd pointer if MemBlob manages it already? I do not think >>> this >>> call or duplicated "bufEnd" pointer is needed. >> >> if many SBufs reference the same MemBlob, an append operation to one >> of them does not need a cow if the appended-to SBuf is at the tail of >> the MemBlob and there is enough free space at the end of the MemBlob >> to hold the data being appended. >> This canAppend call checks exactly that. the SBuf knows its own tail, >> while the MemBlob knows the tail of the used portion of itself. >> > > see isAtEnd() comment above. > /** * constructor, creates an empty MemBlob of size>= requested * \param size the minimum size to be guarranteed from the MemBlob */ MemBlob::MemBlob(const MemBlob::size_type size) { debugs(MEMBLOB_DEBUGSECTION,9,HERE<< "size="<< size); memAlloc(size); } >>> >>> I would recommend initializing mem and sizes to NULL/0s explicitly in all >>> the constructors. It does not cost much and may save a few hours of >>> debugging if things go wrong. >> >> Isn't setting them in memAlloc enough? Besides the actual size may be >> different (or would you mean to set them twice?) > > In the constructor init list. > > You can then enforce the MUST and ONLY requirements with assert(mem==NULL) > etc at the start of memAlloc. Done. Thanks. -- /kinkie
Re: StringNg review: MemBlob
Hi Kinkie, Here is another review you have requested. This review is based on lp:~kinkie/squid/stringng revision 9504. #include "base/TextException.h" Remove if not needed in MemBlob.h class MemBlobStats { ... }; Not reviewed per Kinkie's request. /** * Refcounted memory buffer, content-agnostic. Meant to only be used * by SBuf. */ Remove "Meant to only be used by SBuf." This class should be generally usable. class MemBlob: public RefCountable { public: Still missing source formatting. char *mem; ///< Raw memory block size_type capacity;///< Size of the memory block size_type size; ///< How much of the memory is actually used Start comment with a small letter or end with a period. I recommend the former for short comments that are usually not full sentences. //do not meddle with. Use getStats to obtain a copy for dumping static MemBlobStats stats; static const MemBlobStats& getStats(); Both missing documentation. What not to do instructions are not enough. I do not recall whether we have discussed this already, but if you can make the stats member private, please do. getStats() name should start with a capital letter since it is static. /** * Create a new MemBlob, containing a copy of the * contents of the buffer of size memSize * pointed to by memBlock. It is up to the caller to * manage the memBlock */ MemBlob(const char * memBlock, const size_type memSize); Remove "It is up to ..." because const pointer already implies that. /** check whether the supplied pointer is the first unused byte in the MemBlob */ > _SQUID_INLINE_ bool isAtEnd(const char *ptr) const; nitpick: the "supplied pointer" is not a "byte". Please document whether this returns true when there are no unused MemBblob bytes. I still do not like this interface itself though. There must be a cleaner way to do what you want. How about this: /// whether writing to the [start, start + n) area is currently OK bool writable(const char *start, size_type n) const; This would both solve the problem with the lack of unused bytes and check for exclusive control of the area. Would that work for the current callers? Moreover, do we need this method, willFit(), and canAppend()? They all do pretty much the same kind of checks, from different angles, with different bugs, under very different names. This needs to be polished, but I do not know which end to start from. Can you reduce the problem space? > /** check whether we can append data at the tail of the MemBlob > * > * \param toAppend number of bytes to be appended > * \return true there is at least toAppend free space available in the MemBlob > */ > _SQUID_INLINE_ bool willFit(const size_type toAppend) const; . /** return the number of bytes available at the tail of the MemBlob * */ /// the number of allocated but unused bytes at the end of the blob _SQUID_INLINE_ size_type getAvailableSpace() const; Wrong name. The method does not give the caller the available space, only its size. Here are some combinations that work better: size(), spaceSize(), capacity() contentSize(), spaceSize(), capacity() usedSize() and availableSize(), capacity() /** check whether we can append data at the tail of the MemBlob * * \param toAppend number of bytes to be appended * \return true there is at least toAppend free space available in the MemBlob */ _SQUID_INLINE_ bool willFit(const size_type toAppend) const; . /** check if there is enough unused space at the tail of the MemBlob * * \return true the append would fit * \param bufEnd first unused byte after the end of the SBuf being appended to * \param toAppend how many bytes would be nice to append */ _SQUID_INLINE_ bool canAppend(char const * bufEnd, const size_type toAppend) const; MemBlob should not know about SBuf; please remove references to SBuf. If we are really _appending_ to MemBlob, we should not have to specify the end of the used buffer area because MemBlob knows where it ends. See above for general comments and suggestions to fix the will/can/is* mess: We need fewer append-related members with a consistent naming scheme. /** append a c-string by copy and explicit size Consider: * Appends exactly n bytes starting with S. Does not check for * null-termination of S. Throws if there is not enough space. * It is responsibility of the caller to make sure that there is * enough space What happens if there is not? I think we should throw. This needs to be documented. See above for a suggestion. * \param S the char* to be appended at the end of MemBlob "at the end of MemBlob" can be interpreted as the end of allocated area. Use "after the used area" or similar. * \param Ssize the length of the array to be appended
Re: StringNg review: MemBlob
On 09/22/2010 03:44 AM, Kinkie wrote: On Wed, Sep 22, 2010 at 8:47 AM, Amos Jeffries wrote: On 22/09/10 16:31, Alex Rousskov wrote: On 09/20/2010 08:18 PM, Alex Rousskov wrote: On 09/03/2010 09:21 AM, Kinkie wrote: You'll find the branch at lp:~kinkie/squid/stringng ... Comments for MemBlob.cci r9472: Found one more: * \note memcpy is used as the copying function. This is safe, * because it is guarranteed by design that the copied-to * memory area is only owned by the appended-to string, * and thus doesn't overlap with the appended-from area */ void MemBlob::append(const char *S, const MemBlob::size_type Ssize) { memcpy(mem+bufUsed,S,Ssize); bufUsed+=Ssize; ++stats.append; } The \note is correct, No. As you pointed out to me earlier with SBuf the S here may be the result of: a.clear(); a.append(a.mem, 1); Using SBuf's raw memory access functions is dangerous, and this is well documented. It gives users rope, can't be blamed for how they use it.. This has little to do with raw memory access, actually. If a is SBuf and a.mem is a.buf(), then a.clear(); a.append(a.buf(), 1); is wrong not because some areas overlap (they actually do not, see my earlier response) but because a.size() is less than 1 after a.clear() and so the caller should not be using "1" as the size-to-append parameter. In other words, there is a bug, but the bug is not in the append(), but in the caller code. I just realized that SBuf uses length() instead of standard size(). I hope you got the idea though. :-( Alex.
Re: StringNg review: MemBlob
On 09/22/2010 12:47 AM, Amos Jeffries wrote: On 22/09/10 16:31, Alex Rousskov wrote: On 09/20/2010 08:18 PM, Alex Rousskov wrote: On 09/03/2010 09:21 AM, Kinkie wrote: You'll find the branch at lp:~kinkie/squid/stringng ... Comments for MemBlob.cci r9472: Found one more: * \note memcpy is used as the copying function. This is safe, * because it is guarranteed by design that the copied-to * memory area is only owned by the appended-to string, * and thus doesn't overlap with the appended-from area */ void MemBlob::append(const char *S, const MemBlob::size_type Ssize) { memcpy(mem+bufUsed,S,Ssize); bufUsed+=Ssize; ++stats.append; } The \note is correct, No. As you pointed out to me earlier with SBuf the S here may be the result of: a.clear(); a.append(a.mem, 1); The situation here is different, I believe, but I do not understand your example. What is "a"? MemBlob does not have clear(). SBuf does not have mem... If s* are SBufs and you meant s1.clear(); s1.append(s1.buf(), s1.size()) then it is fine because s1.size() will be zero. If you meant s1.clear(); s1.append(s2.buf(), s2.size()) then it is fine because s1.append will COW if s1 and s2 share MemBlob and s2.size() is positive. If you mean blob."clear"; blob.append(blob.mem, blob.size()) then it is fine because blob.size() will be zero. ... but we should not mention "string" in MemBlob.cci. MemBlob does not (or should not) know what its users are. There may be a better way to express the same thought. Consider: \none memcpy() is safe because we copy to an unused area *that* is better. There is still no guarantee it wont overflow on the destination. memcpy() makes no mention about when happens when one of the pointers is NULL, but experience shows a lot of segfaults. The MemBlob append() code should not call memcpy() with a NULL destination pointer, of course, but that is not related to the \note. My understanding is that blob.mem is never NULL. The MemBlob append() code must check for overflows. If I did not comment on that before, please consider this a "please fix" comment. Kinkie, while you are at it, please also avoid calling memcpy/move() when size-to-append is zero, just in case. memcpy() docs state "always copies exactly num bytes" and that the buffers should not overlap. memmove() is the safe one which should be used if there is any doubt about overlapping memory. But even that can't validate against overflows. Per \note, I think we are fine with memcpy() here but counter-examples are more than welcome. I assume memcpy() is significantly faster in some environments we care about. On the other hand, if memmove() implementations we care about always quickly check for overlaps and actually call memcpy() when possible, then we should just use memmove()! Thank you, Alex.
Re: StringNg review: MemBlob
On 09/22/2010 03:30 AM, Kinkie wrote: On Tue, Sep 21, 2010 at 4:18 AM, Alex Rousskov wrote: On 09/03/2010 09:21 AM, Kinkie wrote: You'll find the branch at lp:~kinkie/squid/stringng /** * Low-level buffer. It's responsible for allocating and managing * the blocks of memory. It's refcounted, but it doesn't know * which SBufs hold references to it. */ Consider: /// A refcounted fixed-size buffer. Tracks the portion used by opaque content. How about: "Refcounted memory buffer, content-agnostic. Meant to only be used by SBuf." ? Fixed-size is an key property unless you think it may change soon. Content-agnostic is fine, I guess. As for "meant to only be used by Foo", I am still against such restrictions on use in general and in this context in particular. We have discussed this before. Use-cases may determine the initial design sketch and optimizations, but the final APIs should be usable in any context that needs them rather than restricted to one specific user. MemBlob is no exception IMO. class MemBlob: public RefCountable { public: typedef RefCount Pointer; typedef unsigned int size_type; Would be nice to guarantee support for 5GB and larger blobs. Do we need them now? It'll be trivial to extend when needed. I may need them soon. Based on past experience, I am not convinced that replacing 'int' with some other type will be trivial. I would not leave these data members public, but it is your call. This class originally was private to SBuf. I've now clarified in the class definition that it's not meant for general use; its only client should be SBuf. Which is a wrong restriction made worse by wrongly guessed implications. As SBuf review shows, we are already paying the price for the lack of proper boundaries between the two classes. It makes some areas of the SBuf code confusing at best. _SQUID_INLINE_ void deleteSelf(); I hope this can be removed. If not, it must be documented. The corresponding comments in .cci do not make sense to me and do not match RefCount.h in trunk, as far as I can tell. http://old.squid-cache.org/Doc/Prog-Guide/prog-guide-24.html http://markmail.org/message/kt7357pwttmzrey3 Maybe the documentaiton is outdated? That old documentation is wrong. Use the source code. Remove deleteSelf(). _SQUID_INLINE_ size_type freeSpace() const; Missing documentation. Renamed to getAvailableSpace. Documentation was in the .cci, moved it Wrong name. The method does not give the caller the available space, only its size. Here are some combinations that work better: size(), spaceSize(), capacity() contentSize(), spaceSize(), capacity() usedSize() and availableSize(), capacity() MemBlobStats MemBlob::stats; If some static code wants to allocate MemBlob, will its stats not get counted, depending on static initialization order? Yes. I don't think that's much of an issue; It becomes an issue when we get bug reports about negative or too-big counters. Or when somebody adds an assert() that the counter is correct. Both have happened before. > stats are mostly meant to help us diagnose the effectiveness of the solution, and in fact I wouldn't be against #ifdef-ing them out in production code. Sure, nobody cares about effectiveness of the code in production as long as it was effective in the lab :-). /** * constructor, creates an empty MemBlob of size>= requested * \param size the minimum size to be guarranteed from the MemBlob */ MemBlob::MemBlob(const MemBlob::size_type size) { debugs(MEMBLOB_DEBUGSECTION,9,HERE<< "size="<< size); memAlloc(size); } I would recommend initializing mem and sizes to NULL/0s explicitly in all the constructors. It does not cost much and may save a few hours of debugging if things go wrong. Isn't setting them in memAlloc enough? Besides the actual size may be different (or would you mean to set them twice?) Initialize them to NULLs and 0s before calling memAlloc. Such initializations cost little and avoid having an object with uninitialized members. Thank you, Alex.
Re: StringNg review: MemBlob
On 22/09/10 21:30, Kinkie wrote: On Tue, Sep 21, 2010 at 4:18 AM, Alex Rousskov wrote: On 09/03/2010 09:21 AM, Kinkie wrote: You'll find the branch at lp:~kinkie/squid/stringng I decided to take a look at MemBlob because it is used by SBuf which was reviewed a few days ago. It would be nice to get a fixed SBuf committed, but that would require committing the classes it uses. Committing smaller self-contained parts may be an overall better strategy than trying to fix everything first and then do one huge commit. So here is a MemBlob review: Remove the "buf" prefix until you have some other size and capacity members to distinction from. Besides, this is not a "buf". This is a "blob" :-) Yup. Relics... I've been working on this for more than two years now. I would not leave these data members public, but it is your call. This class originally was private to SBuf. I've now clarified in the class definition that it's not meant for general use; its only client should be SBuf. If still relevant that would be reason for protected: and friend declarations in the class definition instead of public:. _SQUID_INLINE_ void deleteSelf(); I hope this can be removed. If not, it must be documented. The corresponding comments in .cci do not make sense to me and do not match RefCount.h in trunk, as far as I can tell. http://old.squid-cache.org/Doc/Prog-Guide/prog-guide-24.html http://markmail.org/message/kt7357pwttmzrey3 Maybe the documentaiton is outdated? Seems to be. The RefCountable classes no longer have "deleteSelf". _SQUID_INLINE_ bool isAtEnd(const char *ptr) const; Missing documentation. Why is this needed? Some transitional call? If yes, document as such. It is needed to know whether a the tail of a SBuf is at the tail of a shared MemBlob. Such a SBuf can append without needing a cow, if there's enough headroom in the MemBlob. So, a char* way of saying: if (SBuf::off_+Sbuf::len_ == MemBlob::size && SBuf::off_+Sbuf::len_ < MemBlob::capacity) { ... with canAppend being a duplicate as above but <= MemBlob::capacity+N or rather isAtEnd being a case of canAppend(0) This reads like a command to free some space. Consider: spaceSize() spaceSize() would be more consistent with other Squid code as well. _SQUID_INLINE_ bool canGrow(const size_type howmuch) const; Missing documentation. Usually unsafe and wastefull. See similar comments for SBuf. It'd be unsafe in a concurrent (multithreaded) environment; how would it be unsafe in a single-threaded env? Squid won't be single-threaded for much longer. At least on the timescale for string-level changes. _SQUID_INLINE_ bool canAppend(char const * bufEnd, const size_type howmuch) const; Missing documentation. Especially since it is not clear why canAppend() needs a pointer. Looking at the .cci implementation, I question why would the caller keep the bufEnd pointer if MemBlob manages it already? I do not think this call or duplicated "bufEnd" pointer is needed. if many SBufs reference the same MemBlob, an append operation to one of them does not need a cow if the appended-to SBuf is at the tail of the MemBlob and there is enough free space at the end of the MemBlob to hold the data being appended. This canAppend call checks exactly that. the SBuf knows its own tail, while the MemBlob knows the tail of the used portion of itself. see isAtEnd() comment above. /** * constructor, creates an empty MemBlob of size>= requested * \param size the minimum size to be guarranteed from the MemBlob */ MemBlob::MemBlob(const MemBlob::size_type size) { debugs(MEMBLOB_DEBUGSECTION,9,HERE<< "size="<< size); memAlloc(size); } I would recommend initializing mem and sizes to NULL/0s explicitly in all the constructors. It does not cost much and may save a few hours of debugging if things go wrong. Isn't setting them in memAlloc enough? Besides the actual size may be different (or would you mean to set them twice?) In the constructor init list. You can then enforce the MUST and ONLY requirements with assert(mem==NULL) etc at the start of memAlloc. Amos -- Please be using Current Stable Squid 2.7.STABLE9 or 3.1.8 Beta testers wanted for 3.2.0.2
Re: StringNg review: MemBlob
On Wed, Sep 22, 2010 at 8:47 AM, Amos Jeffries wrote: > On 22/09/10 16:31, Alex Rousskov wrote: >> >> On 09/20/2010 08:18 PM, Alex Rousskov wrote: >>> >>> On 09/03/2010 09:21 AM, Kinkie wrote: >>> You'll find the branch at lp:~kinkie/squid/stringng >> >> ... >>> >>> Comments for MemBlob.cci r9472: >> >> Found one more: >> >>> * \note memcpy is used as the copying function. This is safe, >>> * because it is guarranteed by design that the copied-to >>> * memory area is only owned by the appended-to string, >>> * and thus doesn't overlap with the appended-from area >>> */ >>> void >>> MemBlob::append(const char *S, const MemBlob::size_type Ssize) >>> { >>> memcpy(mem+bufUsed,S,Ssize); >>> bufUsed+=Ssize; >>> ++stats.append; >>> } >> >> The \note is correct, > > No. As you pointed out to me earlier with SBuf the S here may be the result > of: > a.clear(); a.append(a.mem, 1); Using SBuf's raw memory access functions is dangerous, and this is well documented. It gives users rope, can't be blamed for how they use it.. > > ... > > >> but we should not mention "string" in MemBlob.cci. >> MemBlob does not (or should not) know what its users are. There may be a >> better way to express the same thought. Consider: >> >> \none memcpy() is safe because we copy to an unused area > > *that* is better. There is still no guarantee it wont overflow on the > destination. memcpy() makes no mention about when happens when one of the > pointers is NULL, but experience shows a lot of segfaults. Changed to reflect this. MemBlob.mem can never be NULL; we could add a guard if the user passes a NULL char*, and treat it as a NOP. > memcpy() docs state "always copies exactly num bytes" and that the buffers > should not overlap. > memmove() is the safe one which should be used if there is any doubt about > overlapping memory. But even that can't validate against overflows. Can change to memmove; is there a performance hit? -- /kinkie
Re: StringNg review: MemBlob
On Tue, Sep 21, 2010 at 4:18 AM, Alex Rousskov wrote: > On 09/03/2010 09:21 AM, Kinkie wrote: > >> You'll find the branch at lp:~kinkie/squid/stringng > > I decided to take a look at MemBlob because it is used by SBuf which was > reviewed a few days ago. It would be nice to get a fixed SBuf committed, but > that would require committing the classes it uses. > > Committing smaller self-contained parts may be an overall better strategy > than trying to fix everything first and then do one huge commit. > > So here is a MemBlob review: MemBlob and SBuf are meant to be used together. They split the burden of managing memory correcly. > Comments for src/MemBlob.h r9472: > >> #include "MemPool.h" Needed for MEMPROXY_CLASS etc > Remove if not needed. > >> #include "RefCount.h" >> >> /** >> * A container for various MemBlob class-wide statistics >> */ >> class MemBlobStats { >> public: >> u_int64_t alloc; ///number of allocation operations > > Class allocations or buffer allocations? Clarified. MemBlob allocations >> u_int64_t realloc; ///number of re-allocation operations > > . > >> u_int64_t live; ///number of live references to bufs > > How can you know the number of references? Do you mean the number of MemBob > instances in existence? Number of MemBlob instances. Clarified. > >> u_int64_t append; ///number of append operations (whtout grow) > > spelling > Besides, MemBob does not grow! Fixed >> u_int64_t allocatedBytes; ///the total size of the allocated storage > > Conflicts with the previous use of "allocation". If that is not changed, > use liveBytes here. Done. >> /** >> * Dumps statistics to an ostream. Use an ostringstream to >> * capture them to a string >> */ >> std::ostream& dump(std::ostream& os) const; > > Consider removing reference to ostringstream or moving it to some .dox file. > It does not belong to interface documentation. Fixed. >> MemBlobStats(); >> }; > > . > >> /** >> * Low-level buffer. It's responsible for allocating and managing >> * the blocks of memory. It's refcounted, but it doesn't know >> * which SBufs hold references to it. >> */ > > MemBlob does not manage blocks, it manages one block. It does not know many > things (do not document what is not there). Fixed > > Consider: > /// A refcounted fixed-size buffer. Tracks the portion used by opaque > content. How about: "Refcounted memory buffer, content-agnostic. Meant to only be used by SBuf." ? >> class MemBlob: public RefCountable { >> public: >> typedef RefCount Pointer; >> typedef unsigned int size_type; > > Would be nice to guarantee support for 5GB and larger blobs. Do we need them now? It'll be trivial to extend when needed. >> char *mem; /// The block of memory > > Missing "<" in Doxygen comments? Yes. I wasn't aware of the construct. > > Consider: ///< raw allocated memory > >> size_type bufSize; /// Size of the allocated memory >> size_type bufUsed; /// How much of the memory is actually used > > Other Squid code is using capacity and size for this. Std::string also uses > size() for user-facing buffer length. Please be consistent. Changed, even though I find the name "size" confusing in this case. > Remove the "buf" prefix until you have some other size and capacity members > to distinction from. Besides, this is not a "buf". This is a "blob" :-) Yup. Relics... I've been working on this for more than two years now. > I would not leave these data members public, but it is your call. This class originally was private to SBuf. I've now clarified in the class definition that it's not meant for general use; its only client should be SBuf. >> _SQUID_INLINE_ void deleteSelf(); > > I hope this can be removed. If not, it must be documented. The corresponding > comments in .cci do not make sense to me and do not match RefCount.h in > trunk, as far as I can tell. http://old.squid-cache.org/Doc/Prog-Guide/prog-guide-24.html http://markmail.org/message/kt7357pwttmzrey3 Maybe the documentaiton is outdated? >> MemBlob(const size_type size); > > Missing "explicit". It is better to document whether it allocates anything. Fixed. >> MemBlob(char * memBlock, const size_type memSize); > > Not sure why this is needed, but needs to be documented, especially with > regard to memBlock ownership and destruction rules. Documented. > Please keep in mind that without a default constructor (and reserve()), it > would be impossible to have containers of blobs. Not needed. We want containers of SBufs, a MemBlob is more akin to a "smart struct". >> _SQUID_INLINE_ bool isAtEnd(const char *ptr) const; > > Missing documentation. > Why is this needed? Some transitional call? If yes, document as such. It is needed to know whether a the tail of a SBuf is at the tail of a shared MemBlob. Such a SBuf can append without needing a cow, if there's enough headroom in the MemBlob. >> _SQUID_INLINE_ size_type freeSpace(
Re: StringNg review: MemBlob
On 22/09/10 16:31, Alex Rousskov wrote: On 09/20/2010 08:18 PM, Alex Rousskov wrote: On 09/03/2010 09:21 AM, Kinkie wrote: You'll find the branch at lp:~kinkie/squid/stringng ... Comments for MemBlob.cci r9472: Found one more: * \note memcpy is used as the copying function. This is safe, * because it is guarranteed by design that the copied-to * memory area is only owned by the appended-to string, * and thus doesn't overlap with the appended-from area */ void MemBlob::append(const char *S, const MemBlob::size_type Ssize) { memcpy(mem+bufUsed,S,Ssize); bufUsed+=Ssize; ++stats.append; } The \note is correct, No. As you pointed out to me earlier with SBuf the S here may be the result of: a.clear(); a.append(a.mem, 1); ... but we should not mention "string" in MemBlob.cci. MemBlob does not (or should not) know what its users are. There may be a better way to express the same thought. Consider: \none memcpy() is safe because we copy to an unused area *that* is better. There is still no guarantee it wont overflow on the destination. memcpy() makes no mention about when happens when one of the pointers is NULL, but experience shows a lot of segfaults. memcpy() docs state "always copies exactly num bytes" and that the buffers should not overlap. memmove() is the safe one which should be used if there is any doubt about overlapping memory. But even that can't validate against overflows. The note belongs to the append() implementation, not its description. Amos -- Please be using Current Stable Squid 2.7.STABLE9 or 3.1.8 Beta testers wanted for 3.2.0.2
Re: StringNg review: MemBlob
On 09/20/2010 08:18 PM, Alex Rousskov wrote: On 09/03/2010 09:21 AM, Kinkie wrote: You'll find the branch at lp:~kinkie/squid/stringng ... Comments for MemBlob.cci r9472: Found one more: * \note memcpy is used as the copying function. This is safe, * because it is guarranteed by design that the copied-to * memory area is only owned by the appended-to string, * and thus doesn't overlap with the appended-from area */ void MemBlob::append(const char *S, const MemBlob::size_type Ssize) { memcpy(mem+bufUsed,S,Ssize); bufUsed+=Ssize; ++stats.append; } The \note is correct, but we should not mention "string" in MemBlob.cci. MemBlob does not (or should not) know what its users are. There may be a better way to express the same thought. Consider: \none memcpy() is safe because we copy to an unused area The note belongs to the append() implementation, not its description. Thank you, Alex.
Re: StringNg review: MemBlob
On 09/03/2010 09:21 AM, Kinkie wrote: You'll find the branch at lp:~kinkie/squid/stringng I decided to take a look at MemBlob because it is used by SBuf which was reviewed a few days ago. It would be nice to get a fixed SBuf committed, but that would require committing the classes it uses. Committing smaller self-contained parts may be an overall better strategy than trying to fix everything first and then do one huge commit. So here is a MemBlob review: Comments for src/MemBlob.h r9472: > #include "MemPool.h" Remove if not needed. > #include "RefCount.h" > > /** > * A container for various MemBlob class-wide statistics > */ > class MemBlobStats { > public: > u_int64_t alloc; ///number of allocation operations Class allocations or buffer allocations? > u_int64_t realloc; ///number of re-allocation operations . > u_int64_t live; ///number of live references to bufs How can you know the number of references? Do you mean the number of MemBob instances in existence? > u_int64_t append; ///number of append operations (whtout grow) spelling Besides, MemBob does not grow! > u_int64_t allocatedBytes; ///the total size of the allocated storage Conflicts with the previous use of "allocation". If that is not changed, use liveBytes here. > /** > * Dumps statistics to an ostream. Use an ostringstream to > * capture them to a string > */ > std::ostream& dump(std::ostream& os) const; Consider removing reference to ostringstream or moving it to some .dox file. It does not belong to interface documentation. > MemBlobStats(); > }; . > /** > * Low-level buffer. It's responsible for allocating and managing > * the blocks of memory. It's refcounted, but it doesn't know > * which SBufs hold references to it. > */ MemBlob does not manage blocks, it manages one block. It does not know many things (do not document what is not there). Consider: /// A refcounted fixed-size buffer. Tracks the portion used by opaque content. > class MemBlob: public RefCountable { > public: > typedef RefCount Pointer; > typedef unsigned int size_type; Would be nice to guarantee support for 5GB and larger blobs. > char *mem; /// The block of memory Missing "<" in Doxygen comments? Consider: ///< raw allocated memory > size_type bufSize; /// Size of the allocated memory > size_type bufUsed; /// How much of the memory is actually used Other Squid code is using capacity and size for this. Std::string also uses size() for user-facing buffer length. Please be consistent. Remove the "buf" prefix until you have some other size and capacity members to distinction from. Besides, this is not a "buf". This is a "blob" :-) I would not leave these data members public, but it is your call. > _SQUID_INLINE_ void deleteSelf(); I hope this can be removed. If not, it must be documented. The corresponding comments in .cci do not make sense to me and do not match RefCount.h in trunk, as far as I can tell. > MemBlob(const size_type size); Missing "explicit". It is better to document whether it allocates anything. > MemBlob(char * memBlock, const size_type memSize); Not sure why this is needed, but needs to be documented, especially with regard to memBlock ownership and destruction rules. Please keep in mind that without a default constructor (and reserve()), it would be impossible to have containers of blobs. > _SQUID_INLINE_ bool isAtEnd(const char *ptr) const; Missing documentation. Why is this needed? Some transitional call? If yes, document as such. > _SQUID_INLINE_ size_type freeSpace() const; Missing documentation. This reads like a command to free some space. Consider: spaceSize() spaceSize() would be more consistent with other Squid code as well. > _SQUID_INLINE_ bool canGrow(const size_type howmuch) const; Missing documentation. Usually unsafe and wastefull. See similar comments for SBuf. After reading MemBuf.cc: MemBlob cannot grow, apparently! Replace this with something like willFit(sizeToAppend) or canAppend(sizeToAppend). s/howmuch/byHowMuch/ or s/howmuch/toHowMuch/, depending on what this does. > _SQUID_INLINE_ bool canAppend(char const * bufEnd, const size_type howmuch) const; Missing documentation. Especially since it is not clear why canAppend() needs a pointer. Looking at the .cci implementation, I question why would the caller keep the bufEnd pointer if MemBlob manages it already? I do not think this call or duplicated "bufEnd" pointer is needed. > _SQUID_INLINE_ void append(const char *S, const size_type Ssize); Missing documentation. Comments for MemBlob.cci r9472: * Please move methods longer than one line to .cc. * Please move public method comments to .h per current documentation rules. * MemBlob::append() should check boundaries and throw. Comments for MemBlob.cc r9472: > #include "config.h" > #include "MemBlob.h" > #include