I like using PhantomJS for scraping, though I haven't used it on Heroku yet.

I see there is a buildpack for it:
https://github.com/stomita/heroku-buildpack-phantomjs

This example might be helpful:
http://benjaminbenben.com/2013/07/28/phantomjs-webserver/
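
Since your app is Ruby, here is a rough sketch of how I'd call PhantomJS from it. I haven't run this on Heroku, so treat it as a starting point: it assumes the buildpack puts a `phantomjs` binary on the PATH, and the names `RENDER_JS`, `phantomjs_command` and `render_html` are just ones I made up for illustration. The embedded script only uses PhantomJS's `webpage` and `system` modules.

```ruby
require "open3"
require "tempfile"

# PhantomJS-side script (JavaScript). It loads the URL passed as the first
# argument and prints the DOM as it stands once the page has loaded, i.e.
# after the page's own scripts have had a chance to run.
# Caveat: content built by slow AJAX calls may need an extra wait here.
RENDER_JS = <<~JS
  var system = require('system');
  var page = require('webpage').create();
  page.open(system.args[1], function (status) {
    if (status !== 'success') {
      phantom.exit(1);
    } else {
      console.log(page.content);
      phantom.exit(0);
    }
  });
JS

# Hypothetical helper: build the argv used to launch PhantomJS.
# Assumption: "phantomjs" is on the PATH (e.g. installed by a buildpack).
def phantomjs_command(script_path, url, binary: "phantomjs")
  [binary, script_path, url]
end

# Hypothetical helper: return the rendered HTML for url, or raise if
# PhantomJS exits with a non-zero status.
def render_html(url, binary: "phantomjs")
  Tempfile.create(["render", ".js"]) do |file|
    file.write(RENDER_JS)
    file.flush
    stdout, stderr, status =
      Open3.capture3(*phantomjs_command(file.path, url, binary: binary))
    raise "phantomjs failed for #{url}: #{stderr}" unless status.success?
    stdout
  end
end

# Usage (needs a real phantomjs binary):
#   html = render_html("http://google.com")
```

From there you could hand the returned HTML to whatever parser you already use, so no Firefox binary is needed on the dyno.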


On Mon, Dec 20, 2010 at 4:11 AM, Josal <[email protected]> wrote:

> Hi, guys.
>
> I want to scrape an HTML site that uses JavaScript to generate its
> contents, so I can't use the mechanize gem or similar ones. I've tried
> rdom and taka with johnson, but still had some problems (I could give
> you more details). The best and easiest option I have at the moment is
> to use watir (or selenium, or celerity for JRuby). I've selected watir
> because it's simple, either the watir gem or the watir-webdriver gem.
> I like them. But I have two problems:
>
> - I want to deploy the app on Heroku but I get the error: "Could not
> find Firefox binary (os=linux)".
> - I don't know if it's possible to use the watir logic without the
> browser binary (and without opening it in the background).
>
> I currently have an answer here:
>
> http://stackoverflow.com/questions/3597118/can-you-deploy-watir-on-heroku-to-generate-html-snapshots-if-so-how
>
> but I just wanted to confirm the options I have.
>
> I wrote a watir-webdriver example, which works well locally, to
> illustrate the simple process (in this case the HTML is not dynamically
> generated, of course; it's only an example):
>
>   require "rubygems"
>   require "watir-webdriver"
>   require "watir-webdriver/extensions/wait"
>
>   browser = Watir::Browser.new :firefox
>   browser.goto "http://google.com"
>   browser.text_field(:name, 'q').set "watir-webdriver"
>   browser.button(:name, 'btnG').click
>
> Maybe the only option I have is to use EC2, but it's a pity because I
> only need to scrape JavaScript-generated HTML and I want to keep on
> using Heroku. I love it!
>
> What do you think is the best gem for doing this on Heroku? Or is
> there no option, so that I have to use EC2 just to open a browser,
> losing the Heroku goodness?
>
> Thanks in advance
>
> --
> You received this message because you are subscribed to the Google Groups
> "Heroku" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/heroku?hl=en.
>
>


-- 
Thanks,
-John

