Re: Any plans for some sort of shared storage?

2009-10-07 Thread Oren Teich

Yuri,

Great question.  It's mostly an issue of scale.  Let me explain a bit
about how Heroku works.  At a high level, each dyno is its own
process on a railgun (that's our EC2 server instance for Ruby
processing).  Your app can have one or more dynos, running across a
large cluster of railgun servers.  The individual dyno and railgun
processes are volatile, depending on load, EC2 fuzziness, etc.  To use
EBS, we would have to somehow connect the right store to the right dyno
at the right time, with that changing second by second across many tens
of thousands of dynos.  On top of that, when one app starts getting
high traffic, you'd find that EBS would be WAY too slow to write to
while still maintaining performance, so adding dynos would actually
start to slow your application down.  An EBS volume can only handle
~200 writes/second.  Next we'd be talking about RAIDing EBS volumes
(which is what we do for our database), but that just gets crazy if
you need to constantly move the EBS mount point.
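
To put the ~200 writes/second figure in perspective, here is a rough
back-of-the-envelope sketch in Ruby.  Only the 200 writes/second ceiling
comes from the paragraph above; the per-dyno write rate (20/second) is a
made-up number for illustration, not a Heroku measurement.

```ruby
# Back-of-the-envelope sketch: how a fixed write ceiling is shared by dynos.
# EBS_WRITES_PER_SEC is the figure quoted above; the per-dyno demand used
# below is a hypothetical number chosen only for illustration.
EBS_WRITES_PER_SEC = 200.0

# Fraction of the requested writes that the shared volume can absorb.
def served_fraction(dynos, writes_per_dyno_per_sec)
  demand = dynos * writes_per_dyno_per_sec
  return 1.0 if demand <= EBS_WRITES_PER_SEC
  EBS_WRITES_PER_SEC / demand
end

[5, 10, 20, 40].each do |dynos|
  pct = (served_fraction(dynos, 20) * 100).round
  puts "#{dynos} dynos: #{pct}% of writes served without queueing"
end
```

Under these assumptions the volume saturates at 10 dynos; every dyno
added after that only lengthens the write queue.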

This boils down to a "shared nothing" discussion
(http://en.wikipedia.org/wiki/Shared_nothing_architecture).
Yes, the DB is a shared point, but we've been able to come up with
some constraints that make it performant.  We haven't figured out a
way to do that with file systems yet.

Oren

On Oct 7, 2009, at 12:53 PM, Yuri Niyazov wrote:

>
> Heroku runs on ec2, no? What prevents exposure of an EBS block to an  
> app?
>
> On Tue, Oct 6, 2009 at 5:32 PM, Oren Teich  wrote:
>>
>> Hi Neil,
>> There aren't any plans for shared storage right now.  I wish I could
>> say otherwise.  The big challenge is that we're able to achieve our
>> scaling and overall cool features by imposing certain constraints.  A
>> read-only file system is one of those.  If we had a shared filesystem,
>> we wouldn't be able to offer the scalability or performance that makes
>> our platform so unique.  There are some interesting plugins that some
>> third parties have in development that will capture filesystem requests
>> and send them to S3 instead.  Hopefully as these come out they might
>> help address some of the constraints for you.
>>
>> Believe me, I wish we could figure out a way to give you a writeable
>> filesystem!
>>
>> Good luck,
>> Oren
>>
>> On Oct 6, 2009, at 2:53 AM, Neil wrote:
>>
>>>
>>> One of the big issues we have with Heroku as a business at the moment
>>> is the read-only file system and the fact that this then renders some
>>> of our legacy apps, and our CMS of choice, BrowserCMS, unable to be
>>> deployed without some serious hacker-age.
>>>
>>> Question is, are there any plans at all for some sort of shared
>>> storage, where I can somehow define a directory or two that should be
>>> shared across all dynos serving my application, as well as across all
>>> deploys of my application?
>>>
>>> For me, this is a major thing, and it would be good to remove one of
>>> the big barriers that people might have.



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Heroku" group.
To post to this group, send email to heroku@googlegroups.com
To unsubscribe from this group, send email to 
heroku+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/heroku?hl=en
-~--~~~~--~~--~--~---



Re: Any plans for some sort of shared storage?

2009-10-07 Thread Yuri Niyazov

Heroku runs on EC2, no? What prevents exposure of an EBS block to an app?





Re: Reading Email

2009-10-07 Thread Oren Teich

Cron runs as a single, separate process.  No matter how many dynos
you have, you'll only ever have one cron process running.

If you're seeing other behavior, let us know!

Oren

On Oct 7, 2009, at 8:13 AM, Yuri Niyazov wrote:

>
> Also, I forgot the following fun fact about Heroku's cron service.
> This was true when I investigated it; might still be true now - not
> sure.
>
> Since your app runs on X Heroku VMs, where X is often > 1, then, when
> you use Heroku's cron, the cronjob is executed on each box
> simultaneously -> unless you do something clever (and I was unable to
> figure out what that something clever is), X email processor instances
> run at the same time. If you need a guarantee that each email is
> processed only once, this will screw it up for you.





Re: Reading Email

2009-10-07 Thread Carl Fyffe

It almost looks like you should have cron on another box create
Delayed Jobs to do this. This solves your "30 second timeout" problem
but introduces another complexity. Maybe this is a necessary
complexity. As for the Google indexing issue, I would just not link to
the URL from any page, and put a Disallow in your robots.txt:

User-agent: *
Disallow: /process-emails.html

Or just leave it out entirely, that way no one knows it exists.
Security through obscurity is usually "not good enough" but in this
case I am not sure that it matters.
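
The "enqueue now, process later" idea can be sketched with Ruby's
standard library alone. This is not the Delayed Job API: EmailQueue,
enqueue and drain are hypothetical names, and a real setup would keep
the jobs somewhere persistent (such as the database) rather than in
memory.

```ruby
# Minimal in-memory sketch of "enqueue now, process later".
# NOT Delayed Job's API: EmailQueue, enqueue and drain are hypothetical names.
class EmailQueue
  def initialize
    @jobs = Queue.new # thread-safe FIFO from Ruby core (Thread::Queue)
  end

  # Called from the web request: returns immediately, so the request
  # finishes well inside the 30 second limit.
  def enqueue(msg_id)
    @jobs << msg_id
    :queued
  end

  # Called from a background worker: handles everything queued so far
  # and returns the IDs it processed, in order.
  def drain
    processed = []
    until @jobs.empty?
      msg_id = @jobs.pop
      yield msg_id
      processed << msg_id
    end
    processed
  end
end

queue = EmailQueue.new
[101, 102, 103].each { |id| queue.enqueue(id) }
handled = queue.drain { |id| puts "processing message #{id}" }
puts "handled #{handled.length} messages"
```

The extra complexity is the worker itself: something outside the
request cycle has to call drain.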






Re: Reading Email

2009-10-07 Thread Yuri Niyazov

Also, I forgot the following fun fact about Heroku's cron service.
This was true when I investigated it; might still be true now - not
sure.

Since your app runs on X Heroku VMs, where X is often > 1, then, when
you use Heroku's cron, the cronjob is executed on each box
simultaneously -> unless you do something clever (and I was unable to
figure out what that something clever is), X email processor instances
run at the same time. If you need a guarantee that each email is
processed only once, this will screw it up for you.
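
If several processors could ever run at once, one common guard is to
have each of them atomically claim a message before handling it. Below
is a minimal sketch of that idea. ClaimStore is a hypothetical in-memory
stand-in; across separate machines the claim would have to live
somewhere shared, for example a database table with a unique index on
the message ID (my assumption, not something Heroku provides for this).

```ruby
require "set"

# Hypothetical in-memory claim store. It only protects threads inside one
# process; a shared store would be needed across separate boxes.
class ClaimStore
  def initialize
    @claimed = Set.new
    @lock = Mutex.new
  end

  # Returns true only for the first caller to claim a given message ID.
  # Set#add? returns nil when the element is already present.
  def claim(msg_id)
    @lock.synchronize { !@claimed.add?(msg_id).nil? }
  end
end

store = ClaimStore.new
processed = Queue.new

# Three simulated cron instances racing over the same four messages.
workers = 3.times.map do
  Thread.new do
    [1, 2, 3, 4].each do |msg_id|
      processed << msg_id if store.claim(msg_id)
    end
  end
end
workers.each(&:join)

puts "processed #{processed.size} messages"
```

Each message ID is claimed by exactly one of the three workers, so every
message is processed once no matter how the threads interleave.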





Re: Reading Email

2009-10-07 Thread Yuri Niyazov

I haven't checked out the online cron services yet, but there's
another issue that I had to solve, and I don't know whether they would
support this or not:

Heroku limits the execution time of every request to 30 seconds each,
and a request that takes longer than that is abruptly interrupted.
This means that the magic URL handler has to be written in such a way
that it doesn't take longer than 30 secs; I decided to take the
dirty-hack approach to this: the URL handler processes two emails at a
time (let's say that 30 seconds is almost always enough to open an
IMAP connection, do a search, and download the text of two emails).
However, the URL handler returns the IDs of the messages it just
processed, so the caller can tell when nothing is left to do. So:

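  upto = 2
  # IDs of all messages not yet flagged as deleted
  msg_id_list = imap.search(["NOT", "DELETED"])
  # handle at most `upto` messages per request to stay under 30 seconds
  msg_id_list = msg_id_list[0, upto] if upto
  msg_id_list.each do |msg_id|
    m = imap.fetch(msg_id, "RFC822")[0].attr["RFC822"]
    process m
  end
  render :json => msg_id_list.to_json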


and then in the script on the cron-box:

  # Ruby has no do...until; begin...end until runs the body at least once
  begin
    msg_id_list = call_url.parse_json
  end until msg_id_list.empty?
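
For anyone who wants something closer to runnable for the cron-box side,
here is a sketch using only the standard library. The names post_batch
and drain, the example URL, and the max_calls cap are my own additions
for illustration; the cap is there so a handler that never returns an
empty list cannot keep the cron job looping forever.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: POSTs to the handler and parses the JSON array of
# message IDs it returns. The URL you pass in is up to you.
def post_batch(url)
  response = Net::HTTP.post_form(URI.parse(url), {})
  JSON.parse(response.body)
end

# Calls the block until it returns an empty list or max_calls is reached.
# Returns the number of calls made.
def drain(max_calls: 50)
  calls = 0
  loop do
    ids = yield
    calls += 1
    break if ids.empty? || calls >= max_calls
  end
  calls
end

# Real use (not run here):
#   drain { post_batch("https://example.invalid/process-emails") }

# Stubbed demonstration: two full batches, one partial, then nothing left.
batches = [[1, 2], [3, 4], [5], []]
puts "calls made: #{drain { batches.shift }}"
```

Passing the fetch in as a block keeps the loop testable without any
network access.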


As for Google indexing your URL: make sure that the GET
request returns a blank page, and that only the POST actually executes
the cronjob. And, of course, you can always protect that URL via
basic auth or an authenticity token.
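
A sketch of the "blank GET, authenticated POST" idea, written as a plain
Rack-style object (anything with call(env) that returns status, headers
and body) so it needs no gems. CronEndpoint, the credentials and the job
block are all hypothetical, and a production check would compare the
credentials in constant time.

```ruby
# Rack-style sketch: GET returns a blank page, POST runs the job only when
# the HTTP Basic credentials match. All names here are hypothetical.
class CronEndpoint
  def initialize(user, password, &job)
    # "m0" packs as Base64 without a trailing newline
    @expected = "Basic " + ["#{user}:#{password}"].pack("m0")
    @job = job
  end

  def call(env)
    # Anything that is not a POST (including a crawler's GET) gets a blank page.
    unless env["REQUEST_METHOD"] == "POST"
      return [200, { "content-type" => "text/html" }, [""]]
    end

    # POST must carry the expected Authorization header.
    unless env["HTTP_AUTHORIZATION"] == @expected
      return [401, { "www-authenticate" => 'Basic realm="cron"' }, ["unauthorized"]]
    end

    [200, { "content-type" => "application/json" }, [@job.call.to_s]]
  end
end

app = CronEndpoint.new("cron", "s3cret") { "[]" }
auth = "Basic " + ["cron:s3cret"].pack("m0")

puts app.call({ "REQUEST_METHOD" => "GET" }).first
puts app.call({ "REQUEST_METHOD" => "POST" }).first
puts app.call({ "REQUEST_METHOD" => "POST", "HTTP_AUTHORIZATION" => auth }).first
```

The job block only ever runs on the last of the three calls.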





Re: Reading Email

2009-10-07 Thread Wojciech

> so I have a separate box with actual crond on it, and
> it has a script that hits a specific URL on my app on heroku every x
> minutes to process email.

There are services that do it for you (i.e. periodically call your
magic URL):
http://www.onlinecronservices.com/

But be careful: this URL could be called by anybody and could even get
indexed by Google. You might allow only certain IPs (the IP of your online
cron service) to call this URL to protect the app.
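
A rough sketch of such an IP check, using Ruby's standard ipaddr
library. The addresses are documentation-range placeholders, not the
real IPs of any cron service, and allowed_ip? is a hypothetical helper
name.

```ruby
require "ipaddr"

# Hypothetical allowlist: documentation-range addresses (RFC 5737) standing
# in for the real IPs of an online cron service.
ALLOWED = [IPAddr.new("203.0.113.10"), IPAddr.new("198.51.100.0/24")].freeze

# True if the caller's address matches any allowed address or network.
def allowed_ip?(remote_ip)
  addr = IPAddr.new(remote_ip)
  ALLOWED.any? { |net| net.include?(addr) }
rescue IPAddr::InvalidAddressError
  false # unparseable addresses are rejected
end

["203.0.113.10", "198.51.100.77", "192.0.2.1"].each do |ip|
  puts "#{ip}: #{allowed_ip?(ip) ? 'allowed' : 'rejected'}"
end
```

In the app, the magic URL handler would run this check against the
request's remote address before doing any work.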

There's also this "poor man's cron" approach I've seen in Drupal:
http://drupal.org/project/poormanscron - but it's a bit crazy.

Cheers,
Wojciech

>
> On Tue, Oct 6, 2009 at 3:06 PM, Carl Fyffe  wrote:
>
> > Rails makes it so easy to send emails. Receiving emails isn't that
> > difficult either, but requires a cron or daemon. What is the best way
> > to do this on Heroku today?
>
> > Carl
>
>