Re: Any plans for some sort of shared storage?
Yuri, great question. It's mostly an issue of scale. Let me explain a bit about how Heroku works.

At a high level, each dyno is its own process on a railgun (that's our EC2 server instance for Ruby processing). Your app can have one or more dynos, running across a large cluster of railgun servers. The individual dyno and railgun processes are volatile: they come and go based on load, EC2 fuzziness, etc.

To use EBS, we would have to somehow connect the right store to the right dyno at the right time, with that changing second by second across many tens of thousands of dynos. On top of that, when one app starts getting high traffic, you'd find that the EBS volume would be WAY too slow to write to while still maintaining performance, so adding dynos would actually start to slow your application down. An EBS volume can only handle ~200 writes/second. Next we'd be talking about RAIDing EBS volumes (which is what we do for our database), but that just gets crazy if you need to constantly move the EBS mount point.

This boils down to a "shared nothing" discussion (http://en.wikipedia.org/wiki/Shared_nothing_architecture). Yes, the DB is a shared point, but we've been able to come up with some constraints that make it performant. We haven't figured out a way to do that with file systems yet.

Oren

On Oct 7, 2009, at 12:53 PM, Yuri Niyazov wrote:
> Heroku runs on ec2, no? What prevents exposure of an EBS block to an
> app?
>
> On Tue, Oct 6, 2009 at 5:32 PM, Oren Teich wrote:
>> Hi Neil,
>> There aren't any plans for shared storage right now. I wish I could
>> say otherwise. The big challenge is that we're able to achieve our
>> scaling and overall cool features by imposing certain constraints. A
>> read-only file system is one of those. If we had a shared filesystem,
>> we wouldn't be able to offer the scalability or performance that makes
>> our platform so unique. There are some interesting plugins that some
>> 3rd parties have in development that will capture filesystem requests
>> and send them to S3 instead. Hopefully as these come out they might
>> help address some of the constraints for you.
>>
>> Believe me, I wish we could figure out a way to give you a writeable
>> filesystem!
>>
>> Good luck,
>> Oren
>>
>> On Oct 6, 2009, at 2:53 AM, Neil wrote:
>>> One of the big issues we have with Heroku as a business at the moment
>>> is the read-only file system and the fact that this then renders some
>>> of our legacy apps, and our CMS of choice, BrowserCMS, unable to be
>>> deployed without some serious hacker-age.
>>>
>>> Question is, are there any plans at all for some sort of shared
>>> storage, where I can somehow define a directory or two that should be
>>> shared across all dynos serving my application, as well as across all
>>> deploys of my application?
>>>
>>> For me, this is a major thing, and it would be good to remove one of
>>> the big barriers that people might have.

You received this message because you are subscribed to the Google Groups "Heroku" group.
To post to this group, send email to heroku@googlegroups.com
To unsubscribe from this group, send email to heroku+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/heroku?hl=en
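To put rough numbers on the write ceiling Oren describes, here is a back-of-the-envelope sketch in Ruby. It assumes the ~200 writes/second figure applies to one shared volume and that every dyno writes to it equally; the dyno counts and the name `writes_per_dyno` are made up for illustration.

```ruby
# Back-of-the-envelope: a single shared volume has a fixed write budget,
# so each added dyno gets a smaller share of it.
EBS_WRITES_PER_SECOND = 200.0 # approximate figure quoted in the thread

# Writes/second available to each dyno if all dynos share one volume equally.
def writes_per_dyno(dyno_count)
  raise ArgumentError, "dyno_count must be positive" unless dyno_count.positive?

  EBS_WRITES_PER_SECOND / dyno_count
end

[1, 4, 20, 100].each do |dynos|
  puts format("%3d dynos -> %.1f writes/sec per dyno", dynos, writes_per_dyno(dynos))
end
```

Under these assumptions the per-dyno budget shrinks as dynos are added, which is the "adding dynos slows you down" effect described in the message.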
Re: Any plans for some sort of shared storage?
Heroku runs on ec2, no? What prevents exposure of an EBS block to an app?
Re: Reading Email
Cron runs on a separate single process. It doesn't matter how many dynos you have, you'll only ever have one cron process running. If you're seeing other behavior, let us know!

Oren

On Oct 7, 2009, at 8:13 AM, Yuri Niyazov wrote:
> Also, I forgot the following fun fact about Heroku's cron service.
> This was true when I investigated it; might still be true now - not
> sure.
>
> Since your app runs on X Heroku VMs, where X is often > 1, then, when
> you use Heroku's cron, the cronjob is executed on each box
> simultaneously -> unless you do something clever (and I was unable to
> figure out what that something clever is), X email processor instances
> run at the same time. If you need guarantee that each email is
> processed once only, this will screw it up for you.
Re: Reading Email
It almost looks like you should have cron on another box create Delayed Jobs to do this. This solves your "30 second timeout" problem but introduces another complexity. Maybe this is a necessary complexity.

As for the Google GET issue, I would just not put a link to the URL on any page, and put a Disallow in your robots.txt:

    User-agent: *
    Disallow: /process-emails.html

Or just leave it out entirely; that way no one knows it exists. Security through obscurity is usually "not good enough", but in this case I am not sure that it matters.
Re: Reading Email
Also, I forgot the following fun fact about Heroku's cron service. This was true when I investigated it; it might still be true now, I'm not sure.

Since your app runs on X Heroku VMs, where X is often > 1, when you use Heroku's cron, the cronjob is executed on each box simultaneously. Unless you do something clever (and I was unable to figure out what that something clever is), X email processor instances run at the same time. If you need a guarantee that each email is processed only once, this will screw it up for you.
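If the same job can ever fire more than once at the same time, one common way to get once-only handling is to have each run claim a message ID before processing it. Below is a minimal single-process sketch of that idea. The in-memory set is only a stand-in for whatever shared store the app really has (for example a database table with a unique index), and `ClaimRegistry` and `claim` are hypothetical names, not part of Heroku or Rails.

```ruby
require "set"

# Tracks which message IDs have been claimed. In a real multi-dyno app the
# claim would have to live in shared storage (e.g. a database row with a
# unique index); this in-memory version only illustrates the idea.
class ClaimRegistry
  def initialize
    @claimed = Set.new
    @lock = Mutex.new
  end

  # Returns true only for the first caller that claims a given ID.
  # Set#add? returns nil when the ID is already present.
  def claim(msg_id)
    @lock.synchronize { !@claimed.add?(msg_id).nil? }
  end
end

registry = ClaimRegistry.new
processed = []

# Two "cron runs" see the same inbox; each message is still handled once.
2.times do
  [101, 102, 103].each do |msg_id|
    processed << msg_id if registry.claim(msg_id)
  end
end

puts processed.inspect
```

The design choice is that the claim, not the processing, is the atomic step, so a second run that sees the same message simply skips it.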
Re: Reading Email
I haven't checked out the online cron services yet, but there's another issue that I had to solve, and I don't know whether they would support this or not:

Heroku limits the execution time of every request to 30 seconds, and a request that takes longer than that is abruptly interrupted. This means that the magic URL handler has to be written in such a way that it doesn't take longer than 30 seconds. I decided to take the dirty-hack approach to this: the URL handler processes two emails at a time (let's say that 30 seconds is almost always enough to open an IMAP connection, do a search, and download the text of two emails). However, the URL handler also reports what it did: it returns the IDs of the messages handled in this batch as JSON, and an empty list means there is nothing left to process. So:

    upto = 2  # batch size that fits within the 30-second limit
    msg_id_list = imap.search(["NOT", "DELETED"])
    msg_id_list = msg_id_list[0, upto] if upto
    msg_id_list.each do |msg_id|
      m = imap.fetch(msg_id, "RFC822")[0].attr["RFC822"]
      process m  # presumably flags the message as deleted so the next search skips it
    end
    render :json => msg_id_list.to_json

and then in the script on the cron-box:

    begin
      msg_id_list = call_url.parse_json  # call_url is shorthand for the HTTP request
    end until msg_id_list.empty?

As for the issue of Google indexing your URL: make sure that a GET request returns a blank page and only a POST actually executes the cronjob. And, of course, you can always protect that URL via basic auth or an authenticity token.
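The cron-box loop can be made concrete along these lines. This is a sketch only: the endpoint URL is a placeholder, `post_batch` and `drain_mailbox` are hypothetical names, and the HTTP call is passed in as a callable so the loop can be exercised without a network. It assumes the handler answers a POST with a JSON array of the message IDs it processed.

```ruby
require "json"
require "net/http"
require "uri"

# POSTs to the handler and returns the parsed JSON array of message IDs
# that were processed in that batch. The URL passed in is a placeholder.
def post_batch(url)
  response = Net::HTTP.post_form(URI(url), {})
  JSON.parse(response.body)
end

# Calls the handler until it reports an empty batch and returns the number
# of messages processed. max_calls guards against looping forever if the
# handler never drains the mailbox.
def drain_mailbox(fetch_batch, max_calls: 100)
  total = 0
  max_calls.times do
    batch = fetch_batch.call
    return total if batch.empty?

    total += batch.size
  end
  total
end

# Real use (hypothetical URL):
#   drain_mailbox(-> { post_batch("https://example.com/process-emails") })

# Offline demonstration with canned responses standing in for the app.
canned = [[1, 2], [3, 4], [5], []]
puts drain_mailbox(-> { canned.shift })
```

The `max_calls` cap is an addition of this sketch, not something from the thread; it keeps a misbehaving handler from turning the cron job into an endless loop.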
Re: Reading Email
> so I have a separate box with actual crond on it, and
> it has a script that hits a specific URL on my app on heroku every x
> minutes to process email.

There are services that do it for you (i.e. periodically call your magic URL):
http://www.onlinecronservices.com/

But be careful: this URL could be called by anybody and could even get indexed by Google. To protect the app, you might allow only certain IPs (the IP of your online cron service) to call this URL.

There's also the "poor man's cron" approach I've seen in Drupal: http://drupal.org/project/poormanscron - but it's a bit crazy.

Cheers,
Wojciech

> On Tue, Oct 6, 2009 at 3:06 PM, Carl Fyffe wrote:
>
> > Rails makes it so easy to send emails. Receiving emails isn't that
> > difficult either, but requires a cron or daemon. What is the best way
> > to do this on Heroku today?
>
> > Carl
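Two safeguards come up in this thread: only run the job on POST, and only accept calls from known addresses. They can be combined in one check, sketched below. The addresses are placeholders from the reserved documentation ranges, and `allowed_trigger?` is a hypothetical helper, not part of Rails or Heroku. How the caller's address reaches the app (directly or via a forwarded header) is an assumption to verify for your setup.

```ruby
require "ipaddr"

# Placeholder addresses (reserved documentation ranges); substitute the
# address(es) your cron box or cron service actually calls from.
ALLOWED_SOURCES = [
  IPAddr.new("192.0.2.10"),
  IPAddr.new("198.51.100.0/24")
].freeze

# True only for POST requests coming from an allowed source address.
# Unparseable addresses are rejected rather than raising.
def allowed_trigger?(request_method, remote_ip)
  return false unless request_method.to_s.upcase == "POST"

  ip = IPAddr.new(remote_ip)
  ALLOWED_SOURCES.any? { |range| range.include?(ip) }
rescue IPAddr::InvalidAddressError
  false
end

puts allowed_trigger?("POST", "198.51.100.7")
puts allowed_trigger?("GET", "198.51.100.7")
puts allowed_trigger?("POST", "203.0.113.5")
```

A GET from a crawler fails the first test and an unknown caller fails the second, so a stray hit on the URL would not start the email job.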