18-Dec-2002 04:58 Bela Lubkin wrote:
> Leonid Pauzner wrote:

One comment on the next page.

>> Three patches against GridText.c attached
>> (please check whether uuencoded patches now apply cleanly).

> All apply cleanly!

>> patch #1: bela.dif
>> changes proposed by Bela Lubkin, to optimize ALLOC_IN_POOL macro
>> substitution, pack bitfields in HTStyleChanges more compactly on some systems.
>> Applied against clean dev.11 in either way.

> This is the only one of the three patches that I've actually attempted
> to analyze.  It's true that it implements some suggestions of mine, but
> not the main one, which was that I still think that ALLOC_IN_POOL should
> be a function, not a macro.  It's a lot of code to be duplicating.  It
> would require a little work to preserve the type polymorphism, but not
> much (as long as you're willing to restrict the polymorphism to the
> types you already intend to use it with).

> Large macros are generally performance-negative.  Once upon a time
> it may have been a win to avoid the overhead of establishing a call
> frame, doing a call and return, etc.  Those times are gone.

Bela, I completely agree with you. I want to say only the following:

- ALLOC_IN_POOL appeared as a macro in 1999, when it was implemented for
  color styles. So it is inherited code; I just cleaned it up a bit and
  left it as is.

- The pool-allocation code is, of course, significantly faster than
  malloc, but it is not a hot spot: it is used in split_line() and
  HText_BeginAnchor(), i.e. once per line or anchor (and we have
  *functions* HText_getLastChar() and HText_AppendCharacter() called for
  each character). It is also used in ~5 other places in GridText.c,
  which are called very seldom. I think the code growth is not that
  dramatic compared to lynx in general :)
  ALLOC_IN_POOL may be rewritten as a function, but not before Tom
  releases a dev version that works (no leaks or memory overruns).

- The above patch optimizes some repeated arithmetic seen in your macro
  substitution listing, no more.

> Now the interesting factors are things like: is this code hot in the CPU
> cache?
> If it's a function, only one copy has to be hot in the cache, so
> your cache footprint is smaller.  Your code is less likely to be pushed
> out by something else; something else is less likely to be pushed out by
> you.  At about the same level: does the CPU's branch prediction table
> remember how the decisions went last time through this code?  Not if
> you've scattered copies of the same code all over the place -- it has
> no clue that they're from the same source, have the same basic behavior
> patterns.

> Next level up: are the virtual address translations for the pages that
> contain your code (one copy as a function, or many as a macro) in the
> CPU's TLB?

> Are the pages themselves in core or do they have to be paged in from the
> executable image?  How big is the image itself?  Will paging these pages
> in cause some of your own data pages to be swapped out; will you get in
> a fight with some other process, trading control of certain physical
> memory pages back and forth?

> Macros that contain significant amounts of code thwart many levels of
> hardware optimization.

> >Bela<

; To UNSUBSCRIBE: Send "unsubscribe lynx-dev" to [EMAIL PROTECTED]
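
For illustration only, here is a minimal sketch of the function-based
approach discussed in this thread. It is not the actual GridText.c code:
every name in it (pool_chunk, pool_alloc, POOL_NEW, POOL_UNITS,
pool_free_all) is hypothetical, and the chunk size is an arbitrary
assumption. The allocation logic lives in one function, so only one copy
of it exists in the executable, and a one-line macro supplies the cast
and sizeof to keep the typed call sites.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical alignment unit: every allocation is rounded up to it. */
typedef union { void *p; long l; double d; } pool_align_t;

#define POOL_UNITS 1000         /* assumed chunk capacity, in units */

typedef struct pool_chunk {
    struct pool_chunk *prev;    /* previously filled chunk, or NULL */
    size_t used;                /* units already handed out */
    pool_align_t data[POOL_UNITS];
} pool_chunk;

/* The single shared copy of the allocation logic.
 * Returns NULL if the request is empty, too large, or malloc fails. */
static void *pool_alloc(pool_chunk **pool, size_t nbytes)
{
    pool_chunk *c;
    size_t units;
    void *result;

    assert(pool != NULL);
    if (nbytes == 0 || nbytes > POOL_UNITS * sizeof(pool_align_t))
        return NULL;
    units = (nbytes + sizeof(pool_align_t) - 1) / sizeof(pool_align_t);

    c = *pool;
    if (c == NULL || POOL_UNITS - c->used < units) {
        /* Current chunk is missing or full: chain a fresh one. */
        pool_chunk *fresh = malloc(sizeof(*fresh));
        if (fresh == NULL)
            return NULL;
        fresh->prev = c;
        fresh->used = 0;
        *pool = c = fresh;
    }
    result = &c->data[c->used];
    c->used += units;
    return result;
}

/* Thin typed wrapper: the only part that still has to be a macro.
 * It does not guard against overflow in (count) * sizeof(type). */
#define POOL_NEW(pool, type, count) \
    ((type *) pool_alloc((pool), (count) * sizeof(type)))

/* Release every chunk at once; individual objects are never freed. */
static void pool_free_all(pool_chunk **pool)
{
    while (*pool != NULL) {
        pool_chunk *prev = (*pool)->prev;
        free(*pool);
        *pool = prev;
    }
}
```

A call site would then read something like
POOL_NEW(&pool, int, 4): the macro expands to a single function call, so
the code growth per use is a call instruction rather than a full copy of
the allocation logic.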
