OK, here's a review of how caching worked for WikiTalk in 1.8... (probably 95%
right, since it's been a couple of years since I wrote this). But I did
spend several months working on this, and I was quite proud of the result. In
particular, I was surprised that, given the dynamic nature of scripting, I
was able to get a comprehensive caching architecture working.
First, there's an old but basically right description here:
http://www.flexwiki.com/default.aspx/FlexWiki/CachingUpdate.html
It's a bit dated, but you can read it and get the basic idea.
Caching involves two things:
1) funneling requests for things through something that can store a value in
the cache and return a cached value when it's available
2) knowing when and how to invalidate entries
1 - The Cache
The first part is typically easy. In 1.8, the Federation held the Cache. The
Cache was fairly simple. It could hold a variety of different types of
things (e.g., the formatted content of a page, a list of topics in a namespace,
a list of all namespaces, a list of all topics that contain a certain property,
etc.). For each of the kinds of thing that could be held by the cache, there
was localized code that knew about the shape of keys in the cache (specially
formatted strings) and what it could expect to find at certain locations in the
cache. The key string formats are generated in the KeyForXXX() methods in
FederationCacheManager.
As an example, the code that formats up a topic and answers back HTML for the
main section of the topic (as opposed to the borders which are cached
individually) knows that the cache keys for formatted topics look something
like FORMATTED_TOPIC.<FULLY-QUALIFIED-TOPIC-NAME> and that key structure is
built into the code. So when a formatted topic body is needed, the function
being executed assembles up the cache key string like this and asks the cache
if the entry's there. If it is, it's returned right away. If it's not, the
heavy lifting is done to format it up and then the result is first stashed in
the cache for next time (almost always but see below) and then returned.
WikiTalk doesn't do anything special here, except for how it enables cache
rules about what can be cached at all and about invalidation. See below.
A very instructive thing to do is to go look at the cache viewer in the /admin
tools in a running instance of 1.8. You can see the strings used for keys and
the sort of things held in the cache entries. They are VERY self-explanatory.
2-3 minutes of poking around will tell you a lot.
2 - Invalidation
OK... So invalidation is driven by events that happen in the federation. For
example, changing a topic, creating a namespace, etc. There are about ten
events that can cause entries in the cache to be invalidated. The Federation
generates events when these things happen, but the Federation is very loosely
coupled into the invalidation system. In particular, the Federation just
generates events for various things but has no idea that they are being
monitored by the FederationCacheManager. The FederationCacheManager keeps
various lists of keys to invalidate in the cache when certain events happen.
And it invalidates them without the cache knowing where the invalidations are
coming from. So the FederationCacheManager is the intermediary. It gets
notified when certain things happen in the Federation and then invalidates the
appropriate entries in the cache. For example, when a specific topic FOO gets
changed, the federation notifies the FCM, and then the FCM looks up the list of
cache keys to invalidate when that topic gets changed and invalidates them. It
doesn't know why. It just does it because it was magically instructed earlier
to do so.
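In rough code, the intermediary looks something like this (the shapes and
names are my reconstruction, not the actual 1.8 signatures):

    // Sketch of the FCM as intermediary: it maps federation event keys to
    // lists of cache keys, and removes those keys blindly when told to.
    using System.Collections.Generic;

    interface IFederationCache            // stand-in for the real cache
    {
        object Get(string key);
        void Put(string key, object value);
        void Remove(string key);
    }

    class FederationCacheManager
    {
        private readonly IFederationCache cache;

        // e.g. "TopicChanged:NS.Foo" -> { "FORMATTED_TOPIC.NS.Foo", ... }
        private readonly Dictionary<string, List<string>> keysByEvent =
            new Dictionary<string, List<string>>();

        public FederationCacheManager(IFederationCache cache)
        {
            this.cache = cache;
        }

        // Registration: "invalidate this cache key when this event fires."
        public void AssociateKeyWithEvent(string cacheKey, string eventKey)
        {
            List<string> cacheKeys;
            if (!keysByEvent.TryGetValue(eventKey, out cacheKeys))
            {
                cacheKeys = new List<string>();
                keysByEvent[eventKey] = cacheKeys;
            }
            cacheKeys.Add(cacheKey);
        }

        // Dispatch: the FCM doesn't know why the keys were registered; it
        // just removes them when the matching federation event fires.
        public void OnFederationEvent(string eventKey)
        {
            List<string> cacheKeys;
            if (keysByEvent.TryGetValue(eventKey, out cacheKeys))
            {
                foreach (string cacheKey in cacheKeys)
                    cache.Remove(cacheKey);
            }
        }
    }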
So who instructed it? In short, when the cache entry was created, the FCM was
also told what sort of events should invalidate the entry. As an example, when
the formatted body for topic FOO is put into the cache, the FCM is told that
when FOO is changed that that particular cache entry should be invalidated. A
more interesting example might be: when some WikiTalk asks the Federation for a
list of all namespaces, the list is built and then stuck in the cache under a
certain key. Then the FCM is told that when the namespace creation/deletion
events fire from the Federation that this certain key should be invalidated in
the cache. It sounds complex, but the knowledge is all localized where the
cache data is generated and used.
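At cache-fill time, the registration side of that looks roughly like this
(assuming the FCM sketch above; LoadNamespacesFromStores is a hypothetical
helper):

    // Sketch: populate the entry, then tell the FCM which federation events
    // should blow it away. The knowledge stays local to this code.
    List<string> GetAllNamespaces()
    {
        const string key = "ALL-NAMESPACES";
        List<string> namespaces = (List<string>)cache.Get(key);
        if (namespaces == null)
        {
            namespaces = LoadNamespacesFromStores();   // hypothetical; slow
            cache.Put(key, namespaces);
            fcm.AssociateKeyWithEvent(key, "NamespaceCreated");
            fcm.AssociateKeyWithEvent(key, "NamespaceDeleted");
        }
        return namespaces;
    }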
3 - WikiTalk and Invalidation
OK, so how does this work for WikiTalk? You'll recall that by default, the
cache entry for the formatted body of a topic is set up to be invalidated when
the topic-changed event is fired for that topic. But that's just the base
thing that can cause the page to be invalidated. Depending on the WikiTalk on
the page, there may be additional federation events that could cause the page
to be invalidated.
Let's take an example. Imagine a topic that includes one WikiTalk fragment
that enumerates all the namespaces in a federation. As the formatting for this
page happens (the first time, when there's nothing cached), we create an
"invalidation collection" of events that would force the cached formatted HTML
for the page to be invalidated. That collection starts with only one type of
event: the topic-changed event for that topic. But then part way through the
formatting of the page we fire up the WikiTalk interpreter on the WikiTalk
fragment. As the WikiTalk code runs, various calls into the C#
WikiTalk-implementing classes get made including one to the federation to get
the list of all namespaces. As a side-effect of calling this method, the
"invalidation collection" gets another event stuck into it that says "this guy
needs to be invalidated if namespaces get added or deleted." After all the
formatting is done, all of the information in this invalidation collection is
enumerated and used to set up the FCM so that it will properly invalidate the
cached formatted topic when EITHER the topic gets changed OR when namespaces
are added or removed.
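In sketch form (again, made-up names rather than the real 1.8 code), the
sequence is:

    // Sketch: collect invalidation triggers while formatting one topic.
    List<string> invalidatingEvents = new List<string>();
    invalidatingEvents.Add("TopicChanged:" + topic.FullyQualifiedName);  // base rule

    // Formatting runs the WikiTalk interpreter; the C# implementations of
    // WikiTalk methods add triggers as a side effect. For example, the
    // all-namespaces call would add "NamespacesAddedOrDeleted" to the list.
    string html = FormatWithWikiTalk(topic, invalidatingEvents);

    // After formatting, register every collected trigger with the FCM:
    string key = "FORMATTED_TOPIC." + topic.FullyQualifiedName;
    cache.Put(key, html);
    foreach (string eventKey in invalidatingEvents)
        fcm.AssociateKeyWithEvent(key, eventKey);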
That's the basic idea but it expands a bit:
* The WikiTalk that executes can generate many, many invalidation rules
(called CacheRules - see the code). If a page has WikiTalk that reads and
displays the value of a property called Foo on Page1 and Bar on Page2, then the
cache rules will be such that if Page1 or Page2 gets changed, the cached
dependent page will be invalidated. All of these CacheRules are collected
together using the Composite pattern during the formatting of a page (see the
sketch after this list).
* Some WikiTalk is such that you can't cache. The example that started
this thread (DateTime.Now) is a perfect example. In these cases, the
invalidation collection will contain an instance of CacheRuleNever which trumps
all the other rules and says "I can't ever be cached."
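For flavor, here's a minimal sketch of the rule composition; CacheRuleNever is
the real class name, but the rest of these shapes are my reconstruction, not
the actual source:

    // Sketch: CacheRules composed via the Composite pattern. A single
    // CacheRuleNever anywhere in the tree makes the whole result uncacheable.
    using System.Collections.Generic;

    abstract class CacheRule
    {
        public virtual bool IncludesNeverRule { get { return false; } }
    }

    class CacheRuleNever : CacheRule
    {
        public override bool IncludesNeverRule { get { return true; } }
    }

    class CompositeCacheRule : CacheRule
    {
        private readonly List<CacheRule> children = new List<CacheRule>();

        public void Add(CacheRule rule) { children.Add(rule); }

        public override bool IncludesNeverRule
        {
            get
            {
                foreach (CacheRule child in children)
                    if (child.IncludesNeverRule) return true;  // Never trumps all
                return false;
            }
        }
    }

If the composite reports IncludesNeverRule once formatting finishes, the
result just never goes into the cache.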
4 - Bulk updates
For performance, it's useful to not send hundreds of similar invalidation
events along from the Federation. More importantly, the FCM can do some major
optimization if it gets handed a pile of updates from the Federation all in a
bundle. To make this work, there's a single object called a FederationUpdate
that can represent an arbitrarily complex SET of updates from the federation.
The FederationUpdateGenerator actually bundles all the updates from the
Federation into a single FederationUpdate and that's sent along en masse to the
FCM.
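Roughly (FederationUpdate and FederationUpdateGenerator are the real names;
the member shapes and KeysRegisteredFor are my reconstruction):

    // Sketch: one FederationUpdate carries a whole batch of changes, so the
    // FCM can build a de-duplicated set of keys and invalidate each one once.
    using System.Collections.Generic;

    class FederationUpdate
    {
        public readonly List<string> ChangedTopics = new List<string>();
        public readonly List<string> CreatedNamespaces = new List<string>();
        // ...one collection per kind of federation change...
    }

    void ProcessUpdate(FederationUpdate update)
    {
        HashSet<string> doomed = new HashSet<string>();   // de-duplicates keys
        foreach (string topic in update.ChangedTopics)
            doomed.UnionWith(KeysRegisteredFor("TopicChanged:" + topic));
        if (update.CreatedNamespaces.Count > 0)
            doomed.UnionWith(KeysRegisteredFor("NamespaceCreated"));
        foreach (string key in doomed)
            cache.Remove(key);   // each affected key is removed exactly once
    }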
And a comment on a specific scenario you list below:
> For example, a WikiTalk script that scans the whole namespace looking
> for "Summary" properties and displaying them in a table can only be cached
> until anything at all in the namespace changes.
Right. Which is why in 1.8 the caching system knows a lot about properties.
The cache contains things like "all topics that have property 'Summary'".
And why the federation generates add/delete/change events for properties as
well as topics.
In 1.8, if I have WikiTalk that asks for all topics with a given property, and
then does it again, the underlying calculation is cached. Importantly, this is
caching using the same architecture above, but NOT ABOUT CACHING THE FORMATTED
TOPIC. There's a cache key like TOPICS-WITH-PROPERTY-Summary that holds the
list (and will be properly invalidated). This not only makes the operation
fast from WikiTalk but also just plain fast if it needs to be invoked natively
inside the engine.
A scenario that illustrates why just caching at the topic level doesn't work is
a page that generates a number of dynamic lists that are all based on the same
underlying primitive (e.g., "all topics in namespace FOO").
If this underlying primitive gets called ten times to generate ten lists and
it's a slow primitive, then it'll take 10X for the page. If the primitive is
cached, it'll cost 1X.
OK, so what to do?
In my view, the above design was a good design. I haven't followed enough of
the details of the 2.0 restructuring to say at the class level if it would
still be right, but I suspect it would. With one big issue: the one Craig
raises below about how security and caching play together. Imagine some
WikiTalk that displays a list of all topics in the current namespace but Sue
and Tom both have different permissions for the various topics. If Sue can see
topics A, B and C and Tom can see A, B and D, the caching becomes tricky. I
don't know the right solution, but something that jumps out at me right away is
that the cache keys could be extended to encode the dependency on the
permissions checks that were passed (e.g., dependent data like a user name).
In this example, that would mean that instead of a cache key like
TOPICS-IN-NAMESPACE-X, you would end up with TOPICS-IN-NAMESPACE-X(Sue) and
TOPICS-IN-NAMESPACE-X(Tom). Not all cache entries would depend on the specific
user (or maybe they would, not sure).
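A tiny, purely hypothetical sketch of what that key extension might look like
(nothing like this exists in the code today):

    // Hypothetical: scope the key by the user whose permission checks were
    // applied, so Sue's cached list and Tom's don't collide.
    string KeyForTopicsInNamespace(string ns, string userName)
    {
        return "TOPICS-IN-NAMESPACE-" + ns + "(" + userName + ")";
    }
    // KeyForTopicsInNamespace("X", "Sue") => "TOPICS-IN-NAMESPACE-X(Sue)"
    // KeyForTopicsInNamespace("X", "Tom") => "TOPICS-IN-NAMESPACE-X(Tom)"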
It's a start of a thought, anyway.
My overall point would be that I think the caching architecture was actually
pretty loosely coupled, and as long as the right events could be generated from
the federation again (I assume they aren't anymore), a lot of the old code
could be made operational again - if we could solve the basic architecture
question of how the caching and security interact.
/David
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:flexwiki-
> [EMAIL PROTECTED] On Behalf Of Craig Andera
> Sent: Wednesday, August 29, 2007 12:30 PM
> To: 'FlexWiki Users Mailing List'
> Subject: Re: [Flexwiki-users] Performance analysis
>
> > I'm still trying to get the tests against my giant (1000 namespace)
> > corpus running with the perf tool. I suspect we'll find more of those
> > more-than-twice-as-slow pages in a small number of cases.
> >
> > The real question to me is how do we feel about those cases. If 98% of
> > the pages are "generally as fast as 1.8" but 2% are so slow as to be
> > "unusable" (because they are >10x slower), what do we think about that?
> > My view of course is that those are a pretty serious problem, because if
> > people have them then they're probably part of the all-up solution for
> > them, and if some of it stops working then some of it stops working.
> > It's a little bit like saying if 95% of the dialog boxes in Visual
> > Studio were usable but 5% were super slow/unusable, what would we think?
>
> I have to say I generally agree with this assessment, although I'm not sure
> Visual Studio should be our quality metric...and yes, I realize there are
> two meanings in that statement: I meant it in both senses. :)
>
> Here's my general thinking as to why we're where we're at:
>
> When I added security, that was bound to slow things down a bit because it
> does extra checking on just about every core operation. There are generally
> several core operations per page. So that probably accounts for a bit of the
> slowness. But I think the bigger deal is what I did to the WikiTalk engine.
>
> When I went to add security, the biggest motivator for the rearchitecture
> was the caching that was present. There was a fair amount of it, and it was
> really intertwined with other aspects of the code. Obviously, caching and
> security are related: I don't want to serve David a copy of a page that was
> rendered for Craig if David is not supposed to see it. As a result, the
> pipelined architecture does this (more or less):
>
> NamespaceManager => Security => Caching => Property Parsing =>
>     Built-in topics => Filesystem Store
>
> So we check permissions against what comes back from cache, not the other
> way around. And that works well.
>
> The issue is that that whole content pipeline is a full level below the
> WikiTalk stuff, which interacts with NamespaceManager and Federation, not
> the content pipeline. So incorporating *content* caching into the WikiTalk
> engine would result in code that cuts across layers, which is the mess I was
> trying to avoid in the first place.
>
> One option is to add another layer of caching up at the web layer.
> Specifically, *output* caching. So we'd get a rendered page, and then if
> nothing has changed, we could just spew it again on the next request. This
> would make just about every page way, way, way, way faster, once it had been
> rendered the first time. The problem is, it's slightly hard to get right.
> The caching itself is pretty easy, but the cache expiration stuff is not.
> For example, a WikiTalk script that scans the whole namespace looking for
> "Summary" properties and displaying them in a table can only be cached until
> anything at all in the namespace changes. And any change whatsoever to
> _ContentBaseDefinition invalidates everything else...in case there was a
> security change at the namespace level.
>
> I can see how to do that, but it's not a small job. Also, I'm not sure it'll
> really solve the problem of "unusably slow pages", because for infrequently
> accessed pages, or even just pages that fall out of cache, they're still
> going to take just as long to render as they do in the current code. On top
> of that, we've got a *scripting engine* in the mix. Cache expiration based
> on changes to the content is all well and good when the content
> deterministically produces the output, but you can't cache anything else. So
> if WikiTalk had the equivalent of DateTime.Now, you obviously can't reliably
> cache that.
>
> Now, maybe it's the case that WikiTalk can only deterministically produce
> output based on content. I don't know. Does anyone? If so, it makes life
> easier. But even if not, then the other option we can look at is making the
> WikiTalk execution engine smarter. I believe this is what was in 1.8 - there
> were certainly a bunch of annotations on the WikiTalk code about what could
> and could not be cached. Presumably, it let us cache the results of
> particular operations, like an Array.Sort of the topics in a namespace. I
> never really understood that stuff, which didn't really matter, since it had
> to go to accommodate the new architecture. But maybe it could be added back
> in. David - perhaps you could fill me in a bit on how that stuff worked.
>
> In the meantime, I'll take a look at the old code to see if I can figure out
> how it worked. I don't really have any other ideas at this point: my
> attempts at profiling the existing bottlenecks with the free tools I have
> available have failed.