[google-appengine] Implementing a scalable streaming data collector in GAE

2013-09-11 Thread Ezequiel Muns
The application I want to implement is extremely simple conceptually: calls to an HTTP endpoint will record all GET parameters plus a timestamp as a line in a huge log file. My key performance requirements are scalability in a range from 400 QPS to 10k QPS while maintaining fairly low latency (<
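As a rough illustration of the "GET parameters plus a timestamp as a line" idea, here is a minimal sketch in Python. The function name `format_log_line`, the tab-separated layout, and the ISO-8601 UTC timestamp are assumptions made for this example; the original post does not specify a line format.

```python
import time
from urllib.parse import parse_qsl


def format_log_line(query_string, now=None):
    """Build one log line: a UTC timestamp followed by each GET parameter.

    `now` is seconds since the epoch; when None, the current time is used.
    The tab-separated key=value layout is an assumption for illustration.
    """
    timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    # keep_blank_values=True so parameters such as "flag=" are not dropped.
    params = parse_qsl(query_string, keep_blank_values=True)
    fields = [timestamp] + ["%s=%s" % (key, value) for key, value in params]
    return "\t".join(fields)
```

Calling `format_log_line("a=1&b=two")` yields a single line that could be appended to whatever storage backend ends up being chosen; how that append scales is the actual question of the thread and is not addressed by this sketch.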

Re: [google-appengine] Implementing a scalable streaming data collector in GAE

2013-09-11 Thread Barry Hunter
A possibly off-the-wall idea. Have a NOOP handler. Literally it does nothing aside from return the correct HTTP status code. Then extract the data from the standard App Engine logs! Google have obviously built a scalable logging infrastructure. Use it. The logs include the GET parameters and more
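A NOOP handler of this kind could be sketched as a plain WSGI callable. The name `noop_app` and the choice of `204 No Content` are assumptions for illustration; the post only says "the correct HTTP Status code" and does not name a framework.

```python
def noop_app(environ, start_response):
    """WSGI handler that does no work and returns an empty 204 response.

    The query string in `environ` is deliberately ignored: the idea is
    that the platform's request log already records the request URL,
    so the handler itself stores nothing.
    """
    start_response("204 No Content", [])
    return [b""]
```

Because the handler touches no datastore, memcache, or file, its latency is essentially the platform's request overhead; the data would then be recovered later by reading or exporting the request logs.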

Re: [google-appengine] Implementing a scalable streaming data collector in GAE

2013-09-11 Thread Barry Hunter
Also maybe BigQuery could be used for the 'long term storage'? http://blog.streak.com/2012/07/export-your-google-app-engine-logs-to.html

Re: [google-appengine] Implementing a scalable streaming data collector in GAE

2013-09-11 Thread Vinny P
On Wed, Sep 11, 2013 at 8:43 AM, Barry Hunter wrote:
> Also maybe BigQuery could be used for the 'long term storage'?
I use BigQuery myself, and I can second Barry's comment. It's easy to integrate with App Engine logs and analyze them.
On Sun, Sep 8, 2013 at 10:20 PM, Ezequiel Muns wrot

Re: [google-appengine] Implementing a scalable streaming data collector in GAE

2013-09-11 Thread Nick
Like App Engine, BigQuery is a bit of a work in progress and always moving. We use BigQuery to store data for reports; however, it does have some gotchas: it doesn't like many small writes (i.e. streaming records), and this has a huge performance impact on queries. (Note this may also have changed sinc

Re: [google-appengine] Implementing a scalable streaming data collector in GAE

2013-09-12 Thread Ezequiel Muns
Using the logs sounds like an interesting possibility. They would be amazingly scalable (if one thinks about the fact that every single request on App Engine is saving a record). I may have to look at that, as memcache (even dedicated) seems to occasionally hit a long enough snag to lose a cons