On Thursday, June 6, 2013 3:06:58 PM UTC-5, LC wrote:
>
> Well, my question is quite general... I want to write a web app that, 
> every time the user clicks a button, connects to 5 or 6 other 
> websites, reads 2 or 3 pages from each one, does some processing on the 
> data read from these sites, and then shows some result to the user. At 
> first I thought to create a JSP to do the job, but then I thought that maybe 
> doing everything (the connection to the websites and data processing) on 
> the client side using JavaScript would save me a lot of server-side 
> resources (and money).



Depends on what you're processing.

If these other sites expose JSONP endpoints, or otherwise allow 
cross-origin AJAX requests ( http://en.wikipedia.org/wiki/Ajax_(programming) ), 
then you can fetch and process the responses in JavaScript on the client 
side. It's quite easy to do (there are a bunch of AJAX tutorials on the web) 
and relatively fast, since the user's browser does the data retrieval and 
processing instead of your server.

However, if the sites you're reading from serve regular web pages 
(not JSONP responses), then you have to use server-side code. The reason is 
that JavaScript in the browser is sandboxed: the same-origin policy does not 
let a script read just any arbitrary document on the web. You'll need to 
write a scraper ( http://en.wikipedia.org/wiki/Web_scraping ) to urlfetch the 
sites and process the HTML. Fortunately there are libraries for this: 
BeautifulSoup ( http://www.crummy.com/software/BeautifulSoup/ ) for Python 
and HtmlUnit ( http://htmlunit.sourceforge.net/ ) for Java.
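
To make the server-side scraping idea concrete, here is a minimal sketch in 
Python. It is only an illustration under a few assumptions: it uses the 
standard library's html.parser rather than BeautifulSoup so that it runs 
without installing anything, the names TitleAndLinks, parse_page and fetch 
are made up for this example, and the URL and User-Agent string are 
placeholders. Pulling out the title and links stands in for whatever 
processing you actually need.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TitleAndLinks(HTMLParser):
    """Hypothetical helper: collects the <title> text and every <a href>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case and attrs as (name, value) pairs.
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page(html):
    """Return (title, links) extracted from an HTML string."""
    parser = TitleAndLinks()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.links


def fetch(url, timeout=10):
    """Download a page on the server side and return it as text."""
    # The User-Agent value is a placeholder chosen for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Placeholder URL; substitute the pages you actually need to read.
    title, links = parse_page(fetch("http://example.com/"))
    print(title, len(links))
```

Keeping fetch and parse_page separate means the parsing can be tried out on 
saved HTML without touching the network, and the fetch step can be swapped 
for whatever your hosting environment provides.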


-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

My Go side project: http://invalidmail.com/
