Hello buds,

It's been a while since I've been active on the list, but I figured I'd give a holler and see if anyone had any suggestions for an application design problem I've run into.
I have a large number of text fields across many tables in a fairly large database, all of which can be manipulated in any number of ways. Some common manipulations would be scrubbing strings for display on the web (XHTML compliance and XSS avoidance), censoring of "bad" words, rich text, etc. Once you mix and match all of the various text manipulations, you end up with a large number of versions of the same chunk of text, and you need access to all of them based on a plethora of variables such as user options, access interface, etc. On top of that, some fields can be edited, and I'd like to keep copies of the entire revision history, which adds another level of complexity.

Originally I thought of some sort of memory caching solution, but the main goal is to come up with something scalable, and there are currently a few gigabytes of text that this would apply to, so if anything the cached entries would probably need to expire. It's possible that I could have some mixture of short-term memory cache and long-term disk cache, as disk/database space isn't a large concern.

Another issue is manipulation function versioning: when a new word is added to the censor function, for example, you want to purge the cache of all of the censored text created by the previous version.

Maybe I'm just over-complicating the entire thing, but redoing this sort of manipulation on every request on a high-traffic site seems like a gigantic duplication of CPU-intensive work that could (and should) be avoided. I've come up with a lot of solutions, but none of them seem very elegant. I'm trying to avoid a lot of excess DB queries and SQL joins.

I've done some searching around, and it seems like anyone who has solved this problem hasn't discussed it publicly. I thought maybe someone dealing with locale on a large scale might have come up with a good solution, but since locale data is mostly static, it doesn't seem to apply in most cases. So has anyone dealt with something similar, or is there an obvious solution that I'm missing?
I'd be interested in hearing the opinions of some of the more seasoned NYPHPers. Thanks in advance for any advice!

-Max
_______________________________________________
New York PHP Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk