I think the phrase "search engine related API development frameworks"
might be the key to the misunderstanding here-- App Engine and Google
Search are simply products of the same company. It's like calling
Microsoft Windows a "search related operating system" simply because
Microsoft also makes Bing.

App Engine is just a service for hosting web sites and web apps;
anything you could do on App Engine, you could also do on GoDaddy,
Bluehost, Rackspace, Azure, 1&1, Webfaction, Dreamhost, Slicehost,
Heroku, Engine Yard, A Small Orange, Amazon EC2, etc.

Things I *would* call "search engine related API development frameworks":

- http://www.rollyo.com/
- Google Custom Search (and its API)
- Yahoo BOSS: http://developer.yahoo.com/search/boss/




On Thu, Mar 31, 2011 at 8:48 PM, Álvaro Degives-Más
<adegives...@gmail.com> wrote:
> Hi Nick - and by extension, Barry as well (unfortunately I appear to have
> sent my reply directly to him - my apologies as I didn't CC myself so I
> can't share what exactly I wrote!)
>
> First of all, rest assured that my concerns are not necessarily with Google
> App Engine, but rather the species of search engine related API development
> frameworks that rely on that particular address space, perhaps more commonly
> referred to as cloud leveraged app platforms.
>
> The problem is that search engines - such as Google's - are routinely
> polluted; that is not attributable to negligence but it's the same sad
> reality nonetheless. Such polluted entries (e.g. certain queries) are used
> as a vector for tampering with other, external properties. No amount of
> "sanitization" can counter the fundamental lack of a "permissible URL
> tokenizing" framework, i.e. something which communicates in a uniform manner
> to all interested parties (i.e. the Google family) what a "permissible" URL
> looks like.
>
> Sadly, the robots.txt syntax and the meta tag nofollow,noindex both lack
> this "syntax whitelisting" feature; they are not prescriptive ("only crawl
> and index the URLs that look like this, and ignore the rest"). Of course,
> with many if not most standard on-site search queries, it is possible to
> script page headers that include nofollow,noindex metatags. But many other
> kinds of dynamic content aren't easily "wrapped" with such headers.
>
> And that is where abuse of poisoned search engine indexes comes into play.
>
> Just as I can't hunt down every non-canonical URL in the Google index,
> flagging issues case-by-case is not only not effective (if only because my
> logs demonstrate that) but practically prohibitive as well (I assume you can
> imagine that I'm not interested in hunting down all search engine based
> botnet traffic and relating that to individual sources) so my alternative is
> to simply shut down access to search engines. I don't have the time or the
> resources to play whack-a-mole with the ever-increasing scourge of botnets.
> Incidentally, a look at traffic evolution in my logs and a cursory look at
> some well-known email spam statistics suggest that there is indeed a major
> shift afoot, away from email and toward targeting (particularly) smaller web
> properties with invasive "advertising" methods by the miscreants out
> there.
>
> And that is exactly what I have chosen to do: the well-behaved search
> engines (Google, Bing, Yahoo) are informed via robots.txt that they are not
> welcome, and their indexes are cleared out; the ill-behaved ones are blocked
> and upon sight rigorously reported to blacklists.
>
> Until there is something available which gives website proprietors
> (especially the small to medium sized ones!) a trivial and effective means
> to control which content is accessible for storage and further processing in
> the cloud, the internet will continue to shrink.
>
> Indeed, with heavy heart. But I don't have the resources to keep my
> web-based property open to "play nice" with worthwhile endeavors such as
> Google App Engine, while a notorious minority of criminals (I openly prefer
> the "terrorist" moniker) runs amok with virtual impunity. And so, I set a
> tight regime for wrapper security scripts (e.g. ZB Block, which I find quite
> effective and flexible).
>
> Hopefully you now understand better; it's not that I mistrust Google, or
> Google App Engine in particular. I just can't afford to be available for
> well-intended fun and games while carrying the weight of incidental abuse at
> my own expense.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to google-appengine@googlegroups.com.
> To unsubscribe from this group, send email to
> google-appengine+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>



-- 
Ross M Karchner
