Wow. That is a substantial block of text. Are you trying to say you are mad because some App Engine app is proxying your site?
On Thu, Mar 31, 2011 at 20:48, Álvaro Degives-Más <adegives...@gmail.com> wrote:
> Hi Nick - and by extension, Barry as well (unfortunately I appear to have
> sent my reply directly to him - my apologies as I didn't CC myself so I
> can't share what exactly I wrote!)
>
> First of all, rest assured that my concerns are not necessarily with Google
> App Engine, but rather the species of search engine related API development
> frameworks that rely on that particular address space, perhaps more commonly
> referred to as cloud leveraged app platforms.
>
> The problem is that search engines - such as Google's - are routinely
> polluted; that is not attributable to negligence but it's the same sad
> reality nonetheless. Such polluted entries (e.g. certain queries) are used
> as a vector for tampering with other, external properties. No amount of
> "sanitization" can counter the fundamental lack of a "permissible URL
> tokenizing" framework, i.e. something which communicates in a uniform manner
> to all interested parties (i.e. the Google family) what a "permissible" URL
> looks like.
>
> Sadly, the robots.txt syntax and the meta tag nofollow,noindex both lack
> this "syntax whitelisting" feature; they are not prescriptive ("only crawl
> and index the URLs that look like this, and ignore the rest"). Of course,
> with many if not most standard on-site search queries, it is possible to
> script page headers that include nofollow,noindex metatags. But many other
> kinds of dynamic content aren't easily "wrapped" with such headers.
>
> And that is where abuse of poisoned search engine indexes comes into play.
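To make the "syntax whitelisting" idea in the quoted paragraphs concrete, here is a minimal sketch in Python. It assumes a site owner keeps a list of permissible path patterns and marks every other URL with the X-Robots-Tag response header. The pattern list and the function name robots_header_for are hypothetical and not from the original message. The header only advises crawlers that choose to honor it; it does nothing against ill-behaved bots.

```python
import re
from urllib.parse import urlsplit

# Hypothetical whitelist: the only path shapes this site wants indexed.
PERMISSIBLE_PATTERNS = [
    re.compile(r"/"),
    re.compile(r"/about/?"),
    re.compile(r"/articles/[a-z0-9-]+/?"),
]


def robots_header_for(url):
    """Return extra response headers for a requested URL.

    Permissible URLs get no extra header. Everything else, including
    any URL that carries a query string, is marked noindex, nofollow.
    """
    parts = urlsplit(url)
    path = parts.path or "/"  # treat a bare host as the root path
    permitted = not parts.query and any(
        pattern.fullmatch(path) for pattern in PERMISSIBLE_PATTERNS
    )
    if permitted:
        return {}
    return {"X-Robots-Tag": "noindex, nofollow"}


if __name__ == "__main__":
    for candidate in ["/articles/hello-world", "/search?q=cheap+pills"]:
        print(candidate, robots_header_for(candidate))
```

A request handler could merge the returned dictionary into its response headers, so dynamic content that is hard to wrap with meta tags would still carry the directive.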
>
> Just as I can't hunt down every non-canonical URL in the Google index,
> flagging issues case-by-case is not only not effective (if only because my
> logs demonstrate that) but practically prohibitive as well (I assume you can
> imagine that I'm not interested in hunting down all search engine based
> botnet traffic and relating that to individual sources) so my alternative is
> to simply shut down access to search engines. I don't have the time or the
> resources to play whack a mole with the ever increasing scourge of botnets.
> Incidentally, a look at traffic evolution in my traffic logs and a cursory
> look at some well-known email spam statistics suggests that indeed there's a
> quantum shift afoot, shifting from email to (particularly) smaller web
> property targeting for invasive "advertising" methods by the miscreants out
> there.
>
> And that is exactly what I have chosen to do: the well-behaved search
> engines (Google, Bing, Yahoo) are informed via robots.txt that they are not
> welcome, and their indexes are cleared out; the ill-behaved ones are blocked
> and upon sight rigorously reported to blacklists.
>
> Until there is something available which gives website proprietors
> (especially the small to medium sized ones!) a trivial and effective means
> to control which content is accessible for storage and further processing in
> the cloud, the internet will continue to shrink.
>
> Indeed, with heavy heart. But I don't have the resources to keep my
> web-based property open to "play nice" with worthwhile endeavors such as
> Google App Engine, while a notorious minority of criminals (I openly prefer
> the "terrorist" moniker) runs amok with virtual impunity. And so, I set a
> tight regime for wrapper security scripts (e.g. ZB Block, which I find quite
> effective and flexible).
>
> Hopefully you now understand better; it's not that I mistrust Google, or
> Google App Engine in particular.
> I just can't afford to be available for
> well-intended fun and games while carrying the weight of incidental abuse at
> my own expense.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to google-appengine@googlegroups.com.
> To unsubscribe from this group, send email to
> google-appengine+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.