Re: [PATCHES] equal() perf tweak
Bruce Momjian wrote: Is there a TODO here? I'm just evaluating the performances of two version the one proposed by Neil and the one in a stl::list fashion. Tom suggested also to see if is the case to implement the function size() constant time or not. Regards Gaetano Mendola. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PATCHES] equal() perf tweak
Neil Conway [EMAIL PROTECTED] writes: Interesting. I've heard in some shops it is standard policy to order the fields in all structs by their descending sizes (making allowances for platform-specific variations), so as to reduce padding. Do you think it would be worthwhile to systematically make this kind of reorganization throughout the backend? Not really. I'll go with ordering fields in a logical fashion (related fields together) every time. But when there's no particular semantic reason to choose one ordering over another, you might as well look at padding concerns for a tiebreaker. In this case there isn't any particular logical reason AFAICS to prefer whether length appears before or after head/tail, so why not pick the more compact form? (Note that I put a higher premium on avoiding padding in on-disk structures. But for transient in-memory structures, it definitely seems like a secondary consideration.) regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] equal() perf tweak
Tom Lane wrote: Gaetano Mendola [EMAIL PROTECTED] writes: [ use list with master node and circular linking ] now is very late here ( 04:17 AM ) so I wrote the wrong code ( not checked ) but show the idea of avoid if. This saves an if during addition of a list item, at the cost of greater expense during list traversal. (Instead of a loop termination test like ptr != NULL, you need something like ptr != master_node. This may not take an extra cycle, but only if you can afford to dedicate a register to holding the value of master_node within the loop. That hurts, especially on architectures with not very many registers.) Well, if the same architecture have not so many register I guess don't have a double prefetch instruction queue and an if branch will hurt/destroy that queue, am I wrong ? I'm going to post on comp.lang.c++.moderated and comp.std.c++ in order to have better argumentations Anyway, we should build the code that way and then do profiling with and without support for constant-cost length(). I totally agree with you. Regards Gaetano Mendola ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] equal() perf tweak
Tom Lane wrote: Gaetano Mendola [EMAIL PROTECTED] writes: [ use list with master node and circular linking ] now is very late here ( 04:17 AM ) so I wrote the wrong code ( not checked ) but show the idea of avoid if. This saves an if during addition of a list item, at the cost of greater expense during list traversal. (Instead of a loop termination test like ptr != NULL, you need something like ptr != master_node. This may not take an extra cycle, but only if you can afford to dedicate a register to holding the value of master_node within the loop. That hurts, especially on architectures with not very many registers.) Well, if the same architecture have not so many register I guess don't have a double prefetch instruction queue and an if branch will hurt/destroy that queue, am I wrong ? I'm going to post on comp.lang.c++.moderated and comp.std.c++ in order to have better argumentations Anyway, we should build the code that way and then do profiling with and without support for constant-cost length(). I totally agree with you. Regards Gaetano Mendola ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PATCHES] equal() perf tweak
Ok, I've attached new versions of list.c and pg_list.h -- this is just a *rough sketch* of the new List code -- it definitely won't compile, the intent is just to concretely specify the new List design. Also, I've only updated the important list functions: I stopped at nth(). (I've attached the whole files, rather than diffs, because I thought that would be easier to read...) Any comments would be welcome -- I'm sure I've mucked a few things up. Also, since it doesn't compile and I haven't done any testing, there are probably some thinkos typos in the code. On Tue, 2003-11-04 at 11:40, Tom Lane wrote: Hmm ... taking that one step further, you could probably avoid the need for two palloc's in the first lcons/lappend if the List header were combined with the first ListCell. This would make for some grottiness in ldelete when removing the first list entry, but that's probably a good tradeoff. I haven't implemented this (yet), or the aset.c changes you suggested. -Neil /*- * * pg_list.h * POSTGRES generic list package * * * Portions Copyright (c) 1996-2003, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * * $Id: pg_list.h,v 1.41 2003/08/22 20:34:33 tgl Exp $ * *- */ #ifndef PG_LIST_H #define PG_LIST_H #include nodes/nodes.h /* * node definitions * */ /*-- * Value node * * The same Value struct is used for five node types: T_Integer, * T_Float, T_String, T_BitString, T_Null. * * Integral values are actually represented by a machine integer, * but both floats and strings are represented as strings. * Using T_Float as the node type simply indicates that * the contents of the string look like a valid numeric literal. * * (Before Postgres 7.0, we used a double to represent T_Float, * but that creates loss-of-precision problems when the value is * ultimately destined to be converted to NUMERIC. Since Value nodes * are only used in the parsing process, not for runtime data, it's * better to use the more general representation.) * * Note that an integer-looking string will get lexed as T_Float if * the value is too large to fit in a 'long'. * * Nulls, of course, don't need the value part at all. *-- */ typedef struct Value { NodeTag type; /* tag appropriately (eg. T_String) */ union ValUnion { long ival; /* machine integer */ char *str; /* string */ } val; } Value; #define intVal(v) (((Value *)(v))-val.ival) #define floatVal(v) atof(((Value *)(v))-val.str) #define strVal(v) (((Value *)(v))-val.str) typedef struct ListCell { union { void *ptr_value; int int_value; Oid oid_value; } elem; struct ListCell *next; } ListCell; typedef struct List { NodeTag type; ListCell *head; ListCell *tail; int length; } List; #defineNIL ((List *) NULL) /* * accessor macros * * The general naming convention is that the base name xyz() is for the * pointer version, xyzi() is for integers, xyzo() is for Oids. We don't * bother with multiple names if the same routine can handle all cases. * */ #define length(l) ((l)-length) #define value(cell) ((cell)-elem.ptr_value) #define ivalue(cell) ((cell)-elem.int_value) #define ovalue(cell) ((cell)-elem.oid_value) #define lfirst(l) value((l)-head) #define lfirsti(l) ivalue((l)-head) #define lfirsto(l) ovalue((l)-head) #define llast(l) value((l)-tail) #define lsecond(l) value((l)-head-next) #define lthird(l) value((l)-head-next-next) #define lfourth(l) value((l)-head-next-next-next) /* * foreach - * a convenience macro which loops through the list */ #define foreach(_cell_,_list_) \ for (_cell_ = (_list_)-head; _elt_-next != NULL; _elt_ = _elt-next) /* * Convenience macros for building fixed-length lists */ #define makeList1(x1)lcons(x1, NIL) #define makeList2(x1,x2) lcons(x1, makeList1(x2)) #define makeList3(x1,x2,x3) lcons(x1, makeList2(x2,x3)) #define makeList4(x1,x2,x3,x4) lcons(x1, makeList3(x2,x3,x4)) #define makeListi1(x1)lconsi(x1, NIL) #define makeListi2(x1,x2) lconsi(x1, makeListi1(x2)) #define makeListo1(x1)lconso(x1, NIL) #define makeListo2(x1,x2) lconso(x1, makeListo1(x2)) /* * function prototypes in nodes/list.c */ extern Value *makeInteger(long i); extern Value *makeFloat(char *numericStr); extern Value *makeString(char *str); extern Value *makeBitString(char *str); extern List *lcons(void *datum, List *list); extern List *lconsi(int datum, List *list); extern List *lconso(Oid datum, List *list); extern List *lappend(List *list, void *datum); extern List *lappendi(List *list, int datum); extern List
Re: [PATCHES] equal() perf tweak
Neil Conway [EMAIL PROTECTED] writes: #define length(l) ((l)-length) #define value(cell) ((cell)-elem.ptr_value) #define ivalue(cell) ((cell)-elem.int_value) #define ovalue(cell) ((cell)-elem.oid_value) Couple of objections here: one is that defining such common words as length or value as macros is going to break everything in sight (starting with the List struct itself ;-)). Also, if we are going to use NIL to represent an empty list, length() is really going to need to be like (l) ? l-length : 0 Coupling that objection with the fear of multiple evaluations of arguments (not sure doing so would break anything, but definitely not sure that it wouldn't), I think that length() will have to continue to be a function not a macro. Possibly for the gcc case we could make it a static inline function. As for value() and friends, use a less generic name like cellvalue() ... and please respect the naming convention stated just five lines earlier. #define lfirst(l) value((l)-head) #define lfirsti(l)ivalue((l)-head) #define lfirsto(l)ovalue((l)-head) No, lfirst and friends will need to apply to ListCells not to Lists, else the coding of foreach loops will break everywhere. I think there are many more places that will want lfirst to apply to a ListCell than will want it to apply to a raw List. You will probably want to invent a function (not macro for fear of multiple evals) like ListCell *firstcell(l) { return l ? l-head : NULL; } and use this (not a direct reference to l-head) in foreach(). Then the places that do want a true lfirst() on a list will need to be rewritten as lfirst(firstcell(list)). Not sure what to do about the lsecond/lthird/lfourth macros. Those are not used much. Maybe we could rename them to ListSecond etc and then define ListFirst as lfirst(firstcell(list)) for the places that do want it to work that way. #define foreach(_cell_,_list_) \ for (_cell_ = (_list_)-head; _elt_-next != NULL; _elt_ = _elt-next) The test condition still needs to be _elt_ != NULL. * XXX a few of the following functions are duplicated to handle * List of pointers and List of integers separately. Some day, * someone should unify them.- ay 11/2/94 How much of this NOTE can go away now? List *retval; retval = palloc(sizeof(*retval)); Can't we use makeNode()? /* * Add a new, and as yet undefined, head element to the list, which * must be non-NIL. The list is mutated in place. The caller should * fill in the value of the new head element. */ static List * new_head_elem(List *list) I don't think you want to do things this way; it won't scale up to the merge-the-first-element-into-the-List-header approach. What would probably be better is to have a combined build-the-List-header-and-first-cell subroutine. Also, don't put in cases to handle initially-empty lists, because there won't be any. check_invariant() could reasonably also test that list-tail-next is NULL. nconc(List *l1, List *l2) { /* * XXX: memory allocation is ambiguous here: does the caller free * the memory for l1 or l2? */ At present, the caller doesn't pfree anything, since the result list needs every cell in both inputs. I think what we will want nconc to do is pfree the second List header; in the event that the second List's first element is palloc'd with the header, it will have to be re-allocated (or maybe better, just re-use the storage in place). You'll need to look at all the uses of nconc (fortunately there are not many) to verify that the second List is never again used as an independent entity. If it is, then we can't use nconc anyway unless we first copy the second List, as later mods to either list could destroy the invariant for the other list. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] equal() perf tweak
Gaetano Mendola [EMAIL PROTECTED] writes: Why instead of reinvent the whell not use, or at least do a C port of stl::list ? Because (a) implementing a linked list is pretty trivial (b) the only difficult part is getting the semantics / API right. I don't see how std::list would help with (b), and (a) negates the benefit of importing the code from elsewhere. We'd also have to gut std::list, since we wouldn't be able to make use of C++ templates. That said, if you know of any specific techniques from std::list implementations that would be useful, please let me know. PS: My 2 cents: I don't like too much have the lenght inside the list struct. Why not? -Neil ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] equal() perf tweak
Neil Conway wrote: Gaetano Mendola [EMAIL PROTECTED] writes: Why instead of reinvent the whell not use, or at least do a C port of stl::list ? Because (a) implementing a linked list is pretty trivial (b) the only difficult part is getting the semantics / API right. I don't see how std::list would help with (b), and (a) negates the benefit of importing the code from elsewhere. We'd also have to gut std::list, since we wouldn't be able to make use of C++ templates. That said, if you know of any specific techniques from std::list implementations that would be useful, please let me know. An interesting think that stl::list do is to never do an if branch during an insert/remove phase, an stl::list is a struct with a Master Node: struct node { void* value; node* next; node* prev; } struct list { node* master_node; }; here respect your proposal a node is saved. an empty list is initialized with: master_node = [ ... create a node ... ] master_node-next = master_node; master_node-prev = master_node; and the insertion phase sound like: void AddHead(list *l, node *e) { node *where = l-master_node-next; e-next = where; e-prev = where-prev; where-prev-next = e; where-prev = e; } void AddTail(list *l, node *e) { node *where = l-master_node; e-next = where; e-prev = where-prev; where-prev-next = e; where-prev = e; } now is very late here ( 04:17 AM ) so I wrote the wrong code ( not checked ) but show the idea of avoid if. PS: My 2 cents: I don't like too much have the lenght inside the list struct. Why not? What if you will never call the size() on a list doing massive insert ? May be we need two list, depending on the way we are going to use it? I'm too much C++ oriented and another very weak argument is: for the same reason that STL (at least in gcc) doesn't have it: http://gcc.gnu.org/ml/gcc-bugs/1998-12/msg00270.html may be the third point not apply to us. Regards Gaetano Mendola ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] equal() perf tweak
Gaetano Mendola [EMAIL PROTECTED] writes: [ use list with master node and circular linking ] now is very late here ( 04:17 AM ) so I wrote the wrong code ( not checked ) but show the idea of avoid if. This saves an if during addition of a list item, at the cost of greater expense during list traversal. (Instead of a loop termination test like ptr != NULL, you need something like ptr != master_node. This may not take an extra cycle, but only if you can afford to dedicate a register to holding the value of master_node within the loop. That hurts, especially on architectures with not very many registers.) Offhand I do not see a win here. Node addition implies a palloc which means you are going to be jumping control all over the place anyway; there's no way that eliminating one if in that path of control is going to be a meaningful improvement. Saving a cycle or two during list traversal could be a meaningful improvement, though. By the same token, keeping a length count in the master node might be a waste of cycles, but it's going to be a pretty tiny waste compared to the other overhead of adding or removing a node. The potential savings from making length() be a constant-time operation seem like a win to me. Anyway, we should build the code that way and then do profiling with and without support for constant-cost length(). regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] equal() perf tweak
Tom Lane [EMAIL PROTECTED] writes: This does suggest that it might be worth making the struct layout be int NodeTag; int length; foo *head; foo *tail; since this would avoid some padding overhead on a machine where pointers are 8 bytes and need 8-byte alignment. It probably doesn't help given the current implementation of palloc, but why throw away padding space? Interesting. I've heard in some shops it is standard policy to order the fields in all structs by their descending sizes (making allowances for platform-specific variations), so as to reduce padding. Do you think it would be worthwhile to systematically make this kind of reorganization throughout the backend? (Of course, we'd need to be weary of code that depends on order of the fields in structs, naturally -- such as the NodeTag must be the first field rule.) -Neil ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] equal() perf tweak
Neil Conway [EMAIL PROTECTED] writes: This code is inefficient, however: length() requires iterating through the entire list, so this code scans both lists twice. The attached patch improves this by implementing equal() with a single scan of both lists. You have effectively reverted the code to its previous slow state. The problem with what you've done is that it recursively applies equal() to the list contents, which may take a very large number of cycles, only to eventually fail because one list is a prefix of the other. The point of the current coding is to detect that condition before we recurse. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] equal() perf tweak
On Mon, 2003-11-03 at 10:04, Tom Lane wrote: You have effectively reverted the code to its previous slow state. Erm, the original version of this code in CVS (from ~7 years ago) is the following: case T_List: { List *la = (List*)a; List *lb = (List*)b; List *l; if (a==NULL b==NULL) return (true); if (length(a)!=length(b)) return (false); foreach(l, la) { if (!equal(lfirst(l), lfirst(lb))) return (false); lb = lnext(lb); } retval = true; } break; i.e. it is effectively the same as the code in CVS HEAD. To what previous state does this patch revert the code? The problem with what you've done is that it recursively applies equal() to the list contents, which may take a very large number of cycles, only to eventually fail because one list is a prefix of the other. The point of the current coding is to detect that condition before we recurse. I don't understand: granted, this applies equal() to each element of the list, but why would that be particularly slow? equal() applied to the lfirst() of a list doesn't invoke equal() on a T_List node (the tag of the lfirst() of the list is the data type of the elements of the list), so there should only be a single level of recursion. I was going to benchmark this, but pulling the list code out of the rest of the source is a bit hairy. Let me know if you think this would be helpful. -Neil ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] equal() perf tweak
Neil Conway [EMAIL PROTECTED] writes: On Mon, 2003-11-03 at 10:04, Tom Lane wrote: You have effectively reverted the code to its previous slow state. Erm, the original version of this code in CVS (from ~7 years ago) is the following: Hm. I coulda sworn that at some point I changed that code from essentially-what-you're-proposing to the current state. Memory is misserving me I guess. I don't understand: granted, this applies equal() to each element of the list, but why would that be particularly slow? The point is that the elements of the lists could be large, complicated structures that will take noticeable amounts of time to compare. A typical case would be that the elements are targetlist structures, with a minimum of three contained nodes each with several fields. equal() on that will take much longer than just counting the presence of the list element. I will grant that there are situations where your proposed change would be a win --- in particular, for long lists wherein the first few elements differ, it'd be faster to do it as you suggest. But Postgres mostly deals with relatively short lists, in which situation the compare-the-lengths step is quite cheap and can save a fair amount of per-node comparison processing. So I don't think it'd be a win overall to change. FYI, equali() (now in list.c) used to look pretty nearly like the code in equal(), but now looks exactly as you propose for equal(). The difference is that in equali(), we know for certain that comparing two list members is a very cheap operation. In equal() it's likely to be nontrivial. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]