On 08/19/2015 09:47 AM, Amos Jeffries wrote: > On 19/08/2015 4:50 p.m., Alex Rousskov wrote: >> On 08/15/2015 12:20 AM, Amos Jeffries wrote: >>> For now this class is specifically and intentionally dumping out the >>> (old) format for cachemgr.cgi. Other third-party tools are considered >>> only so far as the cachemgr ambiguous syntax was published in various >>> parts in various places. As long as the output still matches that they >>> should be fine (or better off).
>> IMO, made-up after-the-fact syntax rules are irrelevant (because >> "nobody" knows about them, and they are imprecise). Cachemgr.cgi itself >> is irrelevant (because we control it). Admin scripts parsing old output >> are relevant. Thus, we should either keep the old output pretty much "as >> is" or replace it with something sufficiently better to justify the >> change pains for the admins. Adjusting output (while claiming >> compatibility with some syntax rules we made up) is a bad approach >> because it makes admins unhappy while not making us happy. > "The Definitive Guide" 14.2. All the way through its talking about > cachemgr.cgi display "columns". > > To find the detail of what a input "column" actually is refer to > cachemgr.cgi code itself. Specificaly the munge_other_line() function > for converting report text/plain to text/html. This code has not changed > in any significant way since the 1990's. > > * Any line excluding a \t is a free-form text line. Lets call that ... > comment, kv-pair, LF. > > * table is a series of \t delimited cells. Lets call that ... 'table-row'. > > * If the first cell of the first line of a table includes a ':' its a > header row. Lets call that ... 'table' for the whole set of rows, and we > will need a optional 'label' for the pre-colon bit in the first table-row. > The Guide is also talking about how several reports including > 'redirectors' are *identical* in format to 'idns' ... not today, they > use tab-delimiting. idns got screwed up years later and now uses space > delimiting and looks like crap in CGI display as a result. > > > The kv-pairs is not part of the Guide + CGI documentation. Those parts I > took from your own emails and IRC chats educating me an kinkie about > what the report format was supposed to be. This part of the email thread was discussing whether the existence of admin scripts (rather than various imprecise syntax rules or cachemgr.cgi code) should be the primary factor in our decision making. How is the above information relevant to that discussion? > It turns out that no, those were just PRE (free-form) lines "more easily > machine interpretable than human readable" as the Guide says. But it > does make sense to have them, and many reports look like they were > created with that in mind so I added the kv-pair descriptor to the > grammar too. You object to that? Do I object to adding kv-pair to the grammar? No. I did not even review those low-level details! I trust you with roughly approximating the actual output. And a rough approximation is sufficient [and the only thing possible] here IMO. > The lack of a written copy of the grammar to date does not preclude its > existence prior to now. In practical terms, it pretty much does preclude it actually: Whenever dozens of poorly coordinated developers spend years producing informal output, one can virtually guarantee that no useful formal grammar can describe what they did. > Human language is a good example there, it > "obviously" has a grammar even though an entire field of research was > created to find out what it is, a long time after use began. Human language is a good example of where no useful formal grammar exists. There are various useful approximations, of course, but I do not think anybody is even trying to confine human language into a useful formal grammar for the last 30 years or so. >>>>> +class PayloadFormatter >>>> ... >>>>> + Packable &outBuf; >>>>> +}; >>>> >>>> Since the majority of PayloadFormatter methods and callers are going to >>>> assemble and format pieces of text and numbers, I think this should >>>> become an ostream. >> >>> You want to go back to PackableStream now? >> >> If possible, the new page/report/payload building interface should be >> ostream-enabled, not PackableStream-enabled. IIRC, that's what I >> sketched a few emails back. > > So how does the data get from std::ostreambuf to the client if its not > going near a Packable (ie StoreEntry). I do not think ostream should use std::ostreambuf if that is what you meant. It should use our own buffer, backed by Packable (or StoreEntry). However, cache manager code would not know that and, hence, there should be no linking problems associated with that knowledge. > You ask to data-copy into the stream buffer, then data-copy a second > time to the StoreEntry buffer. No, I do not. The stream buffer should use StoreEntry as storage [when the output goes to Store]. > PackableStream is an ostream child providing StoreEntry as the buffer. > So PackableStream data-copies to the StoreEntry buffer. Yes. > Which would you expect provides better performance: > one or two data-copies ? One copy. (The design choices here are actually not driven by performance, but by the requirement to avoid buffering of huge cache manager responses in RAM. However, the same custom ostream buffer design happens to eliminate the extra copy as a positive performance side effect). >> Said that, ostream is the wrong primary interface for assembling >> payload/pages/reports. IMO, you should reintroduce ostream capabilities, >> but we should not be [going back to] assembling primary >> payload/pages/reports using ostream. More on that below. > The code proposal you submitted has Formatter inherting from ostream. Yes, Page is a child of ostream in my sketch. > The only use of Formatter is by Action. Yes, at least in the scope of this discussion. > Ergo you intend Action to be > using a ostream interface to assemble top-level report data. No. Page provides a set of methods for high-level assembly, including: /// starts recording a list of values virtual Page &beginList() = 0; /// finishes recording a list of values virtual Page &endList() = 0; /// starts recording a list of values virtual Page &beginListItem() = 0; /// finishes recording a list of values virtual Page &endListItem() = 0; /// starts recording a key:value pair with a given key virtual Page &beginKv(const SBuf &key) = 0; /// finishes recording a key:value pair virtual Page &endKv() = 0; > Either ostream is a private internal to Formatter for low-level use > only, or it is available (and eventually used) at high-level Action. It is a public interface for assembling some opaque/atomic values for formal output such as YAML. Whether inheriting from ostream is the best approach to accomplish that, I am not sure, but it is probably the simplest one. The alternative to consider is to provide a method that returns an ostream reference. There are some examples below. > Formatter with its clean API distinct from StoreEntry and Action does > not pull in the dependencoes I was worried about with libmem. Yes, of course. >>> Because StoreEntryStream would place a Store.h dependency almost as >>> widely as squid.h dependency. >> The _implementation_ of the ostream-based interface may use something >> like PackableStream or even StoreEntryStream without introducing any >> excessive dependencies AFAICT. In the sketch I posted, only one line of >> code new about std::cout backing while the rest was using ostream. > You argued strongly for *not* doing it exactly that way only the other week. I did not. I argued against using ostream to format strict syntax (e.g., YAML) output. Later, I provided a sketch how to avoid doing that while still supporting ostream-based assembly of YAML atomic values. I now included specific examples further below. > Your response has been to propose an object ... which is an ostream > child. And would thus be used like PackableStream in my first proposal. The difference between the Page class and PackableStream is that Page provides methods for high-level formal formatting (such as YAML). PackableStream does not and should not have such methods. You should look at the whole Page class declaration, not just at its first line (where ostream inheritance is declared). >>>> Almost all the code that uses outBuf and PayloadFormatter methods in the >>>> patch already looks labored/awkward (often worse than the trivial code >>>> it replaces!), and would look much better if replaced by the usual >>>> ostream "<<" expressions. >> >>> Now you see the benefit of ostream. >> >> My attitude towards ostream has not changed. Forming payloads/pages >> /reports using ostream as the top-level interface was wrong and still is. > > Your objection was something along the lines of "do not use an ostream > in Action". > > So you proposed the Formatter class, an ostream child to be used by Action. > > WTF? When you only pay attention the first few words of my paragraph or my class declaration, it is easy to arrive at strange conclusions. Here is my initial objection, prefaced with its context sample: > Mgr::MenuAction::dump(std::ostream &p) > { > p.fill(' '); // space pad the menu fields for readability > p.setf(std::ios::left); > for (auto &a : CacheManager::GetInstance()->menu()) { > p << '\t' // menu is a table > << std::setw(22) << a->name << '\t' > << std::setw(32) << a->desc << '\t' > << std::setw(8) << CacheManager::GetInstance()->ActionProtection(*a) > << '\n'; > } > p.unsetf(std::ios::left); > } >> It feels wrong that you want Actions to output using strict syntax but >> give them an "free form" or "anything goes" std::ostream as a tool to >> accomplish that. Should not Action dumping methods receive a specialized >> class that makes it easy to produce the output you want and hard (if not >> impossible) to make syntax mistakes? The Page class in my sketch is an example of that "specialized" class I was talking about in the above paragraph. It makes formal output "easy" by providing methods that ensure correct syntax. Here is the sketch code implementing something similar to the MenuAction::dump() above: > page.beginList(); > for (auto name: {"5min", "60min", "disk_io", "events", "shutdown"}) { > page.beginListItem(). > beginHash(). > kv(name, "menu item description to be extracted from > Action::desc"). > kv("protection", "public"). // ActionProtection(*a) > endHash(). > endListItem(); > } > page.endList(); We can discuss how "hard" it should be for Action to get access to Page's ostream interface. In the sketch, the developer would have to use an ostream method or operator to get access to ostream features. For example: page.beginKv("statistics_foo") << something.mean() << endKv; page.beginkv("worker_id") << "disker" << DiskerId << endKv; If that usage of a low-level interface is not visible enough, we can hide ostream access behind a dedicated method returning an ostream reference: class Page { public: ... /// use this to assemble opaque/atomic values ostream &raw(); ... } It might be used along these lines: page.beginKv("statistics_foo").raw() << something.mean() << endKv; page.beginkv("worker_id").raw() << "disker" << DiskerId << endKv; Needless to say, there is some danger in providing such raw access. If a developer is not careful, they might introduce a syntax error inside what they saw were perfectly fine atomic values. We can protect against that and offer automatic escaping of bad characters (among other things), but that would require more work. I suspect such protections would not necessarily require Page interface changes; if I am right, we can delay their implementation until it is clear they are actually needed. >> Making ostream available for opaque and low-level parts of payload is a >> good approach. My earlier sketch was probably too small to illustrate >> this point, but it does provide high-level report structuring methods >> while keeping ostream available for low-level stuff. More needs to be >> done to polish that split based on actual typical use cases. > The low-level stuff that makes sense using ostream is all internal to > Formatter. We seemed to be agreeing on that much. We do not agree on the "all internal part". IMO, ostream should be exposed so that Actions can assemble opaque values using it. I hope the specific examples above illustrate what I mean by that. > What I'm not getting is why you insist this ostream API be exposed and > used by Action now but that PackableStream was inappropriate when it > could simply have had new members added and been identical to your > proposed Formatter. First of all, the first patch I was reviewing did not pass PackableStream to Actions. It passed ostream. We cannot add methods to ostream. Second, adding formal-syntax formatting methods to PackableStream (and passing PackableStream to Actions) would be wrong. The PackableStream class should be dedicated to providing a Packable-backed ostream. It should not know anything about Cache Manager reports and their formatting. We may use PackableStream outside Cache Manager. > The problem in the memory libraries is the *StoreEntry*. In particular > its class declaration head and things it pulls in. > > Action depends deeply on StoreEntry. So there is no Action dumper in > libmem. Just currently some pool methods taking (ostream&) and free-form > dumping out "tables". > > StoreEntry was made a Packable child for the purpose of using the > Packable API, possibly via PackableStream. To store stuff into a > StoreEntry (or just a Membuf/SBuf) in more complex Formatter-like ways > without having libmem be aware it was anything other than a Packable. > > If we are going to use Formatter in the libmem dumper code. Then it has > to be detatched from StoreEntry completely. That also means detatching > it from Action which depends deeply on StoreEntry. Yes, I suspect we are [still] in agreement here. > And that it can only > use the Packable API, possibly via PackableStream (aka ostream&). I hope that Page/Formatter does not need to know about Packable, just ostream, but there may be some corner cases I am not aware of. >>>>> + // a comment >>>>> + virtual void notice(const SBuf &) = 0; >>>> >>>> Let's try to define the purpose of this method more precisely so that we >>>> know when it is being used [in]correctly. For example: >>>> >>>> /// a free-form comment or informational text that >>>> /// is not meant to be further parsed by automation tools >> >> >>> No. >>> >>> notice = "A free-form informational text block that is meant for display >>> without further processing." >>> >>> comment = "A free-form informational text that is not meant for >>> automated tool processing or display. May be discarded by automation." >> >> You have documented notice() as "a comment". Now you give two different >> definitions for those two words. Please make up your mind and adjust the >> notice() description accordingly. After that, it should be clear whether >> we agree what notice() is. > *Do you agree to those definitions* before I go changing any more code > around them? Do I agree that any notice and any comment fit your definitions? No. Fortunately, we are not writing word definitions for a dictionary but documenting a specific class method, so we only need to agree on what our notice() should do. I have proposed how to describe the notice() method. You said "No" and provided two different descriptions for the two words you previously used as _synonyms_. You did not say which of those two descriptions applies to the notice() method. There was no explanatory text accompanying the two descriptions either (unless you count "No" as such). I hope you can see why I asked you to pick one of the two descriptions so that we can focus the discussion on it. I believe your "notice" description is closer to what we need than your "comment" description. The key here, IMO, is that notice() text is not meant to be parsed further by automation tools. If we do not want to prohibit comparisons, then a tool may even compare the received notice to some notices it knows about, but only as an opaque blob of text, with no internal structure. Do I like "display without further processing"? No, I do not like that phrasing because it is not clear whether * further processing for display purposes is allowed; * other use (e.g., comparison with other notices) is prohibited; * automated reaction without display is prohibited. In my imperfect description example, I tried to avoid those pitfalls by focusing on prohibiting just one key activity: parsing of the notice value [to extract some usable parts]. If we prohibit that, then we are free to use any format/arrangement for the notice text, but must resist the temptation to include some information that automation scripts will want to parse. For example, the following are "bad" notices (Squid should not generate them): uptime: 1 hour built with libecap v1.1 reconfiguration finished in 5 seconds error: I/O error while responding to a cache manager request and the following are "good" notices: reconfiguration finished unsupported action unknown cache manager output format > For now notice() is just informational with a requirement that it > reaches the report reader. There is no other implications on teh > processing tool. If we require that a notice reaches a human report reader, does it mean that a script receiving a notice cannot act on it autonomously? If yes, I do not think we want that. If no, then I recommend avoiding such language as "report reader". > I'm not sure how to emphasise *informational* any better than using it > as the active noune in the descriptive. ALL-CAPS seems over the top. > Any suggestions? I do not think we should try to regulate that aspect of notice usage. What is "informational" (as in "FYI, X happened, but feel free to disregard"?) for one, could be very important (as in "do Y if you receive notice X") for another recipient. >> Going forward, please do not use "final" unless really necessary. Treat >> it like we treat "throw()" declarations. > You mean spotted liberally around the new code in the form of a wrapper > keyword like Must() ? No, I mean avoided like the now-deprecated "throw()" exception specification: http://en.cppreference.com/w/cpp/language/except_spec http://www.gotw.ca/publications/mill22.htm Proposed rule of thumb: If correct inheritance or overriding is possible, do not use "final", even if you do not see any good reasons to inherit or override. HTH, Alex. _______________________________________________ squid-dev mailing list squid-dev@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-dev