[google-appengine] Re: 1 application, multiple datastores
> Ok - I understand (maybe), I don't think it matches what 106 is asking for though

It doesn't support 106, but that wasn't the goal. The goal was to show that one could support application-driven datastore choice with an appropriate amount of security. The call to support 106 would be different, but its existence would not mean that an application using change_to_application_userstore() was any less secure. Both (and others) require different application configuration as well.

For sharing a datastore between apps, I'd go through an app that managed said shared datastore, but that's something best left up to the designer - it isn't a platform-level decision.

> Which of the above are you proposing?

I'm still not proposing anything. I'm pointing out that GAE can reasonably support a wide range of application-to-datastore access patterns.

On Jan 7, 5:28 am, hawkett wrote:
> > Huh? How can you make a "wrong call" that doesn't have any parameters?
> >
> > Here's the application code:
> > {operations on application-wide datastore}
> > change_to_application_userstore() # note - no parameters
> > {operations on user-specific datastore}
> > {return to user}
>
> Ok - I understand (maybe), I don't think it matches what 106 is asking for though - none of these data stores appear to be accessible between applications - they all appear to be tied to a single application - or are you saying the user-specific data store is portable between applications? i.e. my application can access it via db APIs, and so can yours, provided the user is logged in?
>
> If you don't intend portability of the user store, I agree that the risk is different, and much lower, because the partitioning mechanism does at least exist, and the chance of a bug is *much* lower because the actual db query is likely to be different. When we were talking about cross-app queries, the db schemas in each data store were likely to be the same, which made the risk of data exposure very high.
[google-appengine] Re: 1 application, multiple datastores
Hi Joseph,

I've previously made this point here - http://groups.google.com/group/google-appengine/browse_thread/thread/6319dceae6ec73e7/4d4d464c25537bda?lnk=gst&q=#4d4d464c25537bda - so Google - people really do recommend against GAE because your roadmap is so impenetrable. Joseph, you may recognise a little of Bob in yourself :)

I am a C -> C++ -> Java -> J2EE guy over many years, so Python is not my language of choice - I only started using Python with App Engine. I'm not here because of comfort with a technology I know, I'm here because of business requirements. I think you'll find as the next few years progress that IT houses everywhere will be pushing back against massive in-house platforms. Amazon manages the hardware side of that pushback, Google manages hardware *and* software.

The killer feature with GAE is the platform offering - with Amazon you have to roll your own software platform, which is time-consuming and maintenance-intensive. Yes, Amazon provides a hardware virtualisation service that lets you do that, and I use it for a number of things. But for a lean architecture, which leads to agility, which makes business happy, you want as much as possible in the not-my-problem basket - i.e. in GAE.

If you've spent much time in the world of relational databases, then BigTable is an absolute killer feature. It collapses Scalability, Performance, Fault Tolerance, and ORM into a simple offering with sensible mechanisms for transaction management. All of this stuff ends up as not-my-problem. With Amazon, most of it is my problem to some degree.

To a degree, I think the argument that if it isn't on the (laughably undefined) roadmap then you have to go elsewhere is a bit hasty for a beta software platform. Granted, Google is always beta, but GAE is in early beta by their standard - you can't even pay for it yet. This is an ideal time to try and influence the architectural direction of the platform.

That said, I think Google should realise that if they don't publish a proper roadmap (especially for must-have platform features - e.g. SLA, asynch etc.), then people are going to recommend dumping GAE in favour of something else.

The only feature request (that I am proposing) here is customer data partitioning. It is not a big deal, especially considering that this is already in place: you just have to deploy one app per customer, which is a bit tedious. It doesn't seem like a sufficient reason to jump ship and roll my own everything at Amazon. It seems to me like an opportunity to say 'Hey, could you make this aspect, which already exists, a little easier?'

It seems to me like a ludicrously obvious step for Google to set up an application marketplace, like the iPhone app store, and open that market to all of their Google Apps customers. It is a license for Google to print money - more so than the iPhone, because we are talking business customers, not joe-public. If it isn't on their (secret) roadmap, then my name is Fred. This is the direction I have suggested for the customer-based data partitioning in http://code.google.com/p/googleappengine/issues/detail?can=2&q=945

Finally, it is obvious that Andy and I are coming at this from different (strongly held) positions. That is usually a good thing, as long as both parties are honest in their attempts to understand the other.

Colin (not Fred)

On Jan 7, 2:17 pm, "bowman.jos...@gmail.com" wrote:
> Guys, I think you need to take a step back and look at this from a higher level.
>
> Appengine supplies you with an instance in a cloud that includes a customized Python set, and a BigTable backend. It does not support multiple BigTable backends and design-wise I doubt it ever will. There comes a time when you have to look at your application and determine what is the right environment for it to be built in to meet your business requirements.
[google-appengine] Re: 1 application, multiple datastores
Guys, I think you need to take a step back and look at this from a higher level.

Appengine supplies you with an instance in a cloud that includes a customized Python set, and a BigTable backend. It does not support multiple BigTable backends and design-wise I doubt it ever will. There comes a time when you have to look at your application and determine what is the right environment for it to be built in to meet your business requirements. In this case, it does not sound like appengine in and of itself is going to meet those requirements.

Generally business requirements dictate the speed at which your product must become available for use. Google has a published roadmap for appengine, and support for multiple BigTable instances per application is not on it, and they have not even implied it's something they have any interest in implementing.

So, at this point, I'd suggest you look at other alternatives in order to meet your business requirements.

- Separate it by table within BigTable as has been suggested.
- Pull everything back in-house and build server(s) capable of supporting your application with the requirements you have, such as MySQL running a different database for each of your users.
- Examine other cloud db storage options to see if they can meet your requirements, such as the offering from Amazon. Though, while you could use appengine combined with that solution, I would question how quickly you'd hit the urlfetch quota limits.
- Examine all the offerings at Amazon and other cloud providers such as Aptana to see if any of them are a better fit for your requirements.

Sometimes you have to stop and realize that business/security requirements will dictate the technology you need to use, rather than personal preference/comfort with technology you know. Over a decade of preaching Linux while supporting Exchange and Citrix on Windows has pounded this into my head.
[google-appengine] Re: 1 application, multiple datastores
> Huh? How can you make a "wrong call" that doesn't have any parameters?
>
> Here's the application code:
> {operations on application-wide datastore}
> change_to_application_userstore() # note - no parameters
> {operations on user-specific datastore}
> {return to user}

Ok - I understand (maybe), I don't think it matches what 106 is asking for though - none of these data stores appear to be accessible between applications - they all appear to be tied to a single application - or are you saying the user-specific data store is portable between applications? i.e. my application can access it via db APIs, and so can yours, provided the user is logged in?

If you don't intend portability of the user store, I agree that the risk is different, and much lower, because the partitioning mechanism does at least exist, and the chance of a bug is *much* lower because the actual db query is likely to be different. When we were talking about cross-app queries, the db schemas in each data store were likely to be the same, which made the risk of data exposure very high. In the implementation you now describe, the user data store and the application data store probably have substantially different schemas. The datastores with the same schema (user) are partitioned. I can see value in this approach, although it does add complexity.

Essentially you are recommending strict data partitioning (aka 945) plus a shared application datastore?

If you intend for the user data store to be portable between apps, then I have problems with that approach. I think it should use a specific data API, and not db-level access. There's too much unwarranted trust involved between the apps - i.e. you have to trust that I read/write the db properly, as does everyone else - I imagine over time such a shared database would get very 'dirty'. If you use an API then it can enforce structure and data integrity through validation. The portable user datastore (if that is what you are suggesting) is a good idea, but I think it is something that Google has already implemented to some degree with their social data API - i.e. a bunch of data attached to your identity. I guess it depends on your implementation how useful this is.

To me, the portability of data and data partitioning should be treated separately.

The other thing to note is that in order to map users to data partitions, you need one of two things -
1. An API that your application can use to do so - accidentally map the wrong user to the wrong data store = data exposure problem.
2. Some form of platform-supplied user provisioning - aka 945

Which of the above are you proposing?

On Jan 6, 2:46 pm, Andy Freeman wrote:
> > I guess one of us will be surprised then :) - I would be surprised if Gmail, Sites, Blogger, Picasa, Orkut etc. all operated in an open space and avoided data exposure through code implemented in each of those applications.
>
> If the separation is by name and ordinary "file" access control, the "code implemented" consists of the name of the datastore for the application plus some application configuration that has to happen regardless. I'm pretty sure that Google thinks that their folks can open an application-specific datastore name reliably. And, if they fail, they're talking to a datastore with the wrong structure.
>
> Or, are you thinking that those applications use a different datastore per external user? (If "separate datastore per user" is the usage pattern, BigTable requires far less concurrency support than the report mentions.)
>
> > - and does not give DB level access to it. So I think just by observing Google's current architecture, it makes sense that they wouldn't break with that tradition at the application level for GAE.
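The point above about a data API enforcing structure through validation can be illustrated with a small sketch. Everything here is hypothetical: `ProfileAPI`, its field list and the in-memory dict are invented for illustration and are not part of App Engine or any Google data API. The idea is only that apps sharing a user store never touch it directly, so a malformed write is rejected before it can make the shared data 'dirty'.

```python
class ProfileAPI:
    """Hypothetical data API in front of a shared user store (names invented)."""

    # Assumed schema for illustration only: field name -> required type.
    ALLOWED_FIELDS = {"display_name": str, "age": int}

    def __init__(self):
        self._store = {}  # user id -> validated profile dict

    def write(self, user_id, field, value):
        expected = self.ALLOWED_FIELDS.get(field)
        if expected is None:
            raise ValueError("unknown field: %s" % field)
        if not isinstance(value, expected):
            raise TypeError("%s must be %s" % (field, expected.__name__))
        self._store.setdefault(user_id, {})[field] = value

    def read(self, user_id):
        # Hand back a copy so callers cannot mutate stored data directly.
        return dict(self._store.get(user_id, {}))


api = ProfileAPI()
api.write("alice", "display_name", "Alice")

try:
    api.write("alice", "age", "forty")  # wrong type: rejected by the API
    rejected = False
except TypeError:
    rejected = True
```

With raw db-level access, the bad `age` value would simply have been stored; behind the API it never reaches the shared data.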
[google-appengine] Re: 1 application, multiple datastores
> I guess one of us will be surprised then :) - I would be surprised if Gmail, Sites, Blogger, Picasa, Orkut etc. all operated in an open space and avoided data exposure through code implemented in each of those applications.

If the separation is by name and ordinary "file" access control, the "code implemented" consists of the name of the datastore for the application plus some application configuration that has to happen regardless. I'm pretty sure that Google thinks that their folks can open an application-specific datastore name reliably. And, if they fail, they're talking to a datastore with the wrong structure.

Or, are you thinking that those applications use a different datastore per external user? (If "separate datastore per user" is the usage pattern, BigTable requires far less concurrency support than the report mentions.)

> - and does not give DB level access to it. So I think just by observing Google's current architecture, it makes sense that they wouldn't break with that tradition at the application level for GAE. And not just because it's tradition, but because it is rooted in sound architectural principles

What "db level access" are you talking about? The result of that open call is used by every other BigTable operation, including all db operations performed at the datastore. Unless GAE works differently, the runtime has access to that result.

> > Not so fast. Who said anything about application visible tokens? In fact, it could be just "change_to_application_userstore", where a userstore is an ordinary GAE datastore. This could easily be written so it doesn't take any parameters from application code, which makes it just as secure as an "open datastore" call done at process startup.
>
> And regardless, you can easily introduce the cited bug based on your clarification. Simply make the wrong call to 'change_to_datastore', and you still have the exposure problem. When your code is responsible for selecting the datastore, you can introduce the bug. This is fairly obvious.

Huh? How can you make a "wrong call" that doesn't have any parameters?

Here's the application code:

    {operations on application-wide datastore}
    change_to_application_userstore() # note - no parameters
    {operations on user-specific datastore}
    {return to user}

The runtime knows which user is making the request and the mapping from said user to an application-specific datastore. The application doesn't specify the user and doesn't even know the name of the datastore.

There are only two mistakes that the application writer can make - calling change_to_application_userstore() too early or too late.

If the change_to_application_userstore() call is too late, the application will try to perform some user-specific operations on the application-wide datastore, but those will likely fail because its structure is completely different. Note that the application doesn't have access to any data from the user's datastore at that point.

If the change_to_application_userstore() call is too early, the application will try to perform some application-generic operations on the user's datastore, but those will likely fail for the same reason as above. Moreover, this can't leak user data because the application only has access to the user's datastore at that point.

> You are still asserting that application code carries the same robustness profile as a platform code.

No, I'm not. I'm pointing out that the platform includes the run-time and that run-time can provide meaningful services in this area. If it's already providing related services, and I'm pretty sure that it is calling "open_application_datastore" with some application-specific key on startup, this doesn't change the risk profile.

Do you really want to argue that the platform code in the run-time has a significantly different "robustness profile" than platform code running on a different server? (If I'm correct about it already providing related services, you're actually arguing about the relative robustness of related run-time code.) Would platform code running in a different process on the same machine have yet another robustness profile?

On Jan 6, 4:57 am, hawkett wrote:
> > > > How do you know how the current GAE code actually works?
> > >
> > > I read the API docs - how do you manage it?
> >
> > I'm not the one asserting that there are hard boundaries between GAE datastores that the GAE run-time can't pierce.
>
> Neither am I - I am asserting that there are hard boundaries that you or I can't pierce, and that is a feature of the security architecture. The API docs bear out that assertion. I do *expect* that data partitioning is a DB layer feature, but as I said previously, I don't know that.
>
> > It is generally believed that GAE is built on top of BigTable, which has a lot of internal Google users. I don't know that all of them can work with only one datastore; I'd guess that several require to
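The call sequence described in this message can be modelled as a toy Python sketch. This is not App Engine code: the `Runtime` class, its attributes and the dict-backed "datastores" are all assumptions made up for illustration, and `change_to_application_userstore` exists only as the hypothetical call discussed in this thread. The sketch shows the one property being argued: because the call takes no parameters, the application cannot name another user's store.

```python
class Runtime:
    """Hypothetical stand-in for the GAE runtime; not a real App Engine API."""

    def __init__(self, app_store, user_stores, current_user):
        self._app_store = app_store        # application-wide datastore
        self._user_stores = user_stores    # user id -> that user's datastore
        self._current_user = current_user  # set by the platform at login
        self._active = app_store           # every request starts here

    def change_to_application_userstore(self):
        # No parameters: the runtime, not the application, picks the store.
        self._active = self._user_stores[self._current_user]

    def get(self, key):
        # Application code only ever sees the currently active store.
        return self._active.get(key)


runtime = Runtime(
    app_store={"motd": "hello"},
    user_stores={"alice": {"inbox": ["a1"]}, "bob": {"inbox": ["b1"]}},
    current_user="alice",
)

before = runtime.get("inbox")   # too early: the app-wide store has no inbox
runtime.change_to_application_userstore()
after = runtime.get("inbox")    # now served from alice's store only
leaked = runtime.get("motd")    # app-wide data is no longer reachable
```

In this model a mistimed call yields a missing value rather than another user's data, which is the "too early or too late" failure mode described above. Whether a real runtime could offer the same guarantee is exactly what the thread is debating.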
[google-appengine] Re: 1 application, multiple datastores
> > > How do you know how the current GAE code actually works?
> >
> > I read the API docs - how do you manage it?
>
> I'm not the one asserting that there are hard boundaries between GAE datastores that the GAE run-time can't pierce.

Neither am I - I am asserting that there are hard boundaries that you or I can't pierce, and that is a feature of the security architecture. The API docs bear out that assertion. I do *expect* that data partitioning is a DB layer feature, but as I said previously, I don't know that.

> It is generally believed that GAE is built on top of BigTable, which has a lot of internal Google users. I don't know that all of them can work with only one datastore; I'd guess that several require to access multiple datastores simultaneously. So, if there is a BigTable-level "only one datastore" and/or "can't switch" restriction, I'd be very surprised if was universal or could only be pierced by suid applications.

I guess one of us will be surprised then :) - I would be surprised if Gmail, Sites, Blogger, Picasa, Orkut etc. all operated in an open space and avoided data exposure through code implemented in each of those applications. That seems a ludicrous architecture to me - which is my point in this thread I guess. It makes much more sense to me to have the partitioning logic at the DB level (like a standard database tablespace), and for those applications to leverage that. Then they expose APIs to access their data at the application level - not use the DB APIs. Google does, in fact, expose APIs for data access - http://code.google.com/apis/gdata/ - and does not give DB level access to it. So I think just by observing Google's current architecture, it makes sense that they wouldn't break with that tradition at the application level for GAE. And not just because it's tradition, but because it is rooted in sound architectural principles.

> Not so fast. Who said anything about application visible tokens? In fact, it could be just "change_to_application_userstore", where a userstore is an ordinary GAE datastore. This could easily be written so it doesn't take any parameters from application code, which makes it just as secure as an "open datastore" call done at process startup.
>
> Or, it could support one token, so the application has access to the "default" datastore and a datastore determined by such a call. Again, that call need not take parameters from application code.

I think this is getting away from the 106 proposal now, which states - 'This feature request is about allowing cross app queries using the db APIs only'

And regardless, you can easily introduce the cited bug based on your clarification. Simply make the wrong call to 'change_to_datastore', and you still have the exposure problem. When your code is responsible for selecting the datastore, you can introduce the bug. This is fairly obvious.

> This could easily be written so it doesn't take any parameters from application code, which makes it just as secure as an "open datastore" call done at process startup.

You are still asserting that application code carries the same robustness profile as platform code. This is clearly not the case. If there are N applications implementing the application API, vs just the platform implementing the platform API, then it is a simple matter of statistics to show that you will get at least N times as many bugs. In fact it will be much more than N, because the volume of testing on the platform will be N times greater, and their implementation process will be much more rigorous than most applications'. Without doing the analysis, I would expect the platform fragility (e.g. fragility = defects per month) to decrease exponentially as N increases. Using the application API, I expect fragility would remain roughly constant, and unrelated to N.

But there is a hidden bigger problem - if fragility remains constant on a per-app basis, then customers see app engine as a minefield - which apps are well implemented? The one they choose could be a broken one. How would they know? This means across the board, the risk of data exposure _from the customer perspective_ is much worse if partitioning logic is performed in application code.

What do you think of the possibility of being able to decide when you deploy your app how strict the data partitioning should be? In the marketplace concept, the customer could be made aware of the strictness of data partitioning when they sign up.

My main concern is protecting customer data, and giving customers confidence in the data security of the GAE platform. This is how I read the intent of the original poster as well.

On Jan 6, 3:40 am, Andy Freeman wrote:
> > > > As it stands GAE does not allow cross data store queries, and from my perspective that is an aspect of the security architecture. 106 wants that aspect 'relaxed'.
> > >
> > > How do you know how the current GAE code actually works
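The "N implementations" argument above can be made concrete with a small calculation. This is only a sketch under stated assumptions: each independent implementation of the partitioning logic is assumed to carry the same probability `p` of containing an exposure bug, and the values 0.01 and 100 are invented for illustration, not measurements. It does not model the separate claim about platform fragility falling as testing volume grows.

```python
def chance_of_any_exposure_bug(p, n):
    """Probability that at least one of n independent implementations is buggy."""
    return 1 - (1 - p) ** n


# Assumed numbers, for illustration only.
one_platform = chance_of_any_exposure_bug(0.01, 1)    # partitioning done once
hundred_apps = chance_of_any_exposure_bug(0.01, 100)  # redone in every app
```

Under these assumptions the single platform implementation carries a 1% chance of an exposure bug, while the chance that at least one of 100 apps gets it wrong is roughly 63%, which is the customer's "minefield" view of the marketplace.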
[google-appengine] Re: 1 application, multiple datastores
> > > As it stands GAE does not allow cross data store queries, and from my perspective that is an aspect of the security architecture. 106 wants that aspect 'relaxed'.
> >
> > How do you know how the current GAE code actually works?
>
> I read the API docs - how do you manage it?

I'm not the one asserting that there are hard boundaries between GAE datastores that the GAE run-time can't pierce.

It is generally believed that GAE is built on top of BigTable, which has a lot of internal Google users. I don't know that all of them can work with only one datastore; I'd guess that several need to access multiple datastores simultaneously. So, if there is a BigTable-level "only one datastore" and/or "can't switch" restriction, I'd be very surprised if it was universal or could only be pierced by suid applications.

Why are you so certain that there are enough Google-internal applications that require "just one datastore" and/or "can't switch" that they'd bake that option into BigTable? If there aren't any, you get to argue why they'd add it just for GAE even though GAE can provide that segregation in its run-time.

FWIW, while I haven't seen Google's BigTable API, the published info, labs.google.com/papers/bigtable-osdi06.pdf, mentions an "open" call. Yes, there are probably access restrictions on said open call, but what are the odds that there's a user per GAE application and said application's datastore is only accessible to said user?

> > 106 or any of the variants that I've mentioned would merely make "open datastore" available through some appropriate safeguards and would be just as secure as the current system.
>
> Let's examine the token idea - and assume you have obtained N tokens securely.

Not so fast. Who said anything about application visible tokens? In fact, it could be just "change_to_application_userstore", where a userstore is an ordinary GAE datastore. This could easily be written so it doesn't take any parameters from application code, which makes it just as secure as an "open datastore" call done at process startup.

Or, it could support one token, so the application has access to the "default" datastore and a datastore determined by such a call. Again, that call need not take parameters from application code.

On Jan 5, 6:34 am, hawkett wrote:
> On Dec 30 2008, 3:14 pm, Andy Freeman wrote:
> > > No, I would prefer GAE to implement the system completely, using existing elements.
> >
> > I was unaware of the weight that your preferences have.
>
> Isn't that what a feature request is? Should I raise feature requests for other people's preferences? What a strange statement.
>
> > I note that your implementation requires new elements, namely additions to app.yaml.
>
> And you fail to note that I said it was not a requirement. e.g. you could achieve the same thing when you deploy the app (e.g. when you choose to tie it to a domain or not), or via configuration in admin console.
>
> > > As it stands GAE does not allow cross data store queries, and from my perspective that is an aspect of the security architecture. 106 wants that aspect 'relaxed'.
> >
> > How do you know how the current GAE code actually works?
>
> I read the API docs - how do you manage it?
>
> > 106 or any of the variants that I've mentioned would merely make "open datastore" available through some appropriate safeguards and would be just as secure as the current system.
>
> Let's examine the token idea - and assume you have obtained N tokens securely. You can easily introduce a bug in your application code that uses the wrong token for the wrong end-user. Secure access, buggy exposure of customer data. Your idea does not prevent the cited bug, because it is not an alternative to strict data partitioning. This is not a solution to the concerns of the original poster.
[google-appengine] Re: 1 application, multiple datastores
On Dec 30 2008, 3:14 pm, Andy Freeman wrote:
> > No, I would prefer GAE to implement the system completely, using existing elements.
>
> I was unaware of the weight that your preferences have.

Isn't that what a feature request is? Should I raise feature requests for other people's preferences? What a strange statement.

> I note that your implementation requires new elements, namely additions to app.yaml.

And you fail to note that I said it was not a requirement. e.g. you could achieve the same thing when you deploy the app (e.g. when you choose to tie it to a domain or not), or via configuration in admin console.

> > As it stands GAE does not allow cross data store queries, and from my perspective that is an aspect of the security architecture. 106 wants that aspect 'relaxed'.
>
> How do you know how the current GAE code actually works?

I read the API docs - how do you manage it?

> 106 or any of the variants that I've mentioned would merely make "open datastore" available through some appropriate safeguards and would be just as secure as the current system.

Let's examine the token idea - and assume you have obtained N tokens securely. You can easily introduce a bug in your application code that uses the wrong token for the wrong end-user. Secure access, buggy exposure of customer data. Your idea does not prevent the cited bug, because it is not an alternative to strict data partitioning. This is not a solution to the concerns of the original poster. Perhaps I have misunderstood your implementation?

> I don't know Google's code either, but it is generally believed that BigTable is used in many internal Google applications. The easy way to make BigTable available to applications is via such a routine called by application-space code. To the extent that GAE's datastore is "just" a BigTable wrapper

I think you are probably over-simplifying the meaning of BigTable. BigTable is indeed used by many internal applications (as I understand it), and as previously stated, I would expect (don't know) that the data segregation required to achieve this would not be implemented by each of those internal applications, but by lower level features in BigTable. Move common use cases to the platform level.

On Dec 30 2008, 3:14 pm, Andy Freeman wrote:
> > No, I would prefer GAE to implement the system completely, using existing elements.
>
> I was unaware of the weight that your preferences have.
>
> I note that your implementation requires new elements, namely additions to app.yaml.
>
> > It would allow some great additions, such as a common billing and payment engine - something most app developers would love to have taken off their plate.
>
> There are lots of other implementations that have that property, as well as the others described below.
>
> > As it stands GAE does not allow cross data store queries, and from my perspective that is an aspect of the security architecture. 106 wants that aspect 'relaxed'.
>
> How do you know how the current GAE code actually works?
>
> One possible implementation that satisfies every currently observable behavior involves an "open datastore" routine that is passed the name of the relevant datastore and called by Google code that lives in application space. This routine returns a token that is used by every datastore access routine. (A given process may access the datastore on behalf of urls that require login as well as ones that don't so whatever mechanism connects a process to a datastore probably does not require any user credentials. However, "open datastore" may use app-specific credentials baked into the application by google's set up code.) There are a number of places where "open datastore" could be called.
[google-appengine] Re: 1 application, multiple datastores
> No, I would prefer GAE to implement the system completely, using existing elements.

I was unaware of the weight that your preferences have.

I note that your implementation requires new elements, namely additions to app.yaml.

> It would allow some great additions, such as a common billing and payment engine - something most app developers would love to have taken off their plate.

There are lots of other implementations that have that property, as well as the others described below.

> As it stands GAE does not allow cross data store queries, and from my perspective that is an aspect of the security architecture. 106 wants that aspect 'relaxed'.

How do you know how the current GAE code actually works?

One possible implementation that satisfies every currently observable behavior involves an "open datastore" routine that is passed the name of the relevant datastore and called by Google code that lives in application space. This routine returns a token that is used by every datastore access routine. (A given process may access the datastore on behalf of urls that require login as well as ones that don't so whatever mechanism connects a process to a datastore probably does not require any user credentials. However, "open datastore" may use app-specific credentials baked into the application by google's set up code.) There are a number of places where "open datastore" could be called.

106 or any of the variants that I've mentioned would merely make "open datastore" available through some appropriate safeguards and would be just as secure as the current system.

I don't know Google's code either, but it is generally believed that BigTable is used in many internal Google applications. The easy way to make BigTable available to applications is via such a routine called by application-space code. To the extent that GAE's datastore is "just" a BigTable wrapper

On Dec 30, 6:17 am, hawkett wrote:
> > "The system" in this case is the combination of the GAE platform and an application running on said platform.
>
> No, I would prefer GAE to implement the system completely, using existing elements. How? In app.yaml, you specify that your application supports mapping multiple google apps user spaces to your app. Currently it only allows one. This is an application marketplace type concept. When my app is added, a new data partition is created for their users. Most importantly I am the administrator of the app and all of the data partitions - this is different to deploying my app to their GAE account - it still resides in my account, and the customer has no administrative rights beyond what I give them in my application code. I need only have one app deployed.
>
> With this model, registration, user provisioning, authentication and data partitioning are all handled external to my application code using building blocks that are already present in the GAE offering. The only change to my application code from right now is an entry in app.yaml. It's not even particularly complicated - especially for me, the application developer. I can imagine implementations that don't even require the app.yaml entry.
>
> I'll admit (as I'm sure you will) that this thread has led me to think more deeply about the implementation of the use case from my original post, but the above is not excessive, and is much preferable to application code. It would allow some great additions, such as a common billing and payment engine - something most app developers would love to have taken off their plate.
>
> Yet another feature this would allow - version migration for customers. I deploy separate versions of my app, and have the ability to move customer data partitions between app deployments.
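The "open datastore" routine described in this message can be sketched as follows. This is only an illustration of the idea as stated in the thread (a routine that is passed a datastore name and returns a token that every access routine then requires); it is not Google's code, and `open_datastore`, `put`, `get`, and the in-memory dictionaries are all hypothetical names chosen for the example.

```python
import secrets

# Hypothetical names throughout; this is not Google's code.
_DATASTORES = {"example-app": {}}   # datastore name -> key/value data
_TOKENS = {}                        # opaque token -> datastore name


def open_datastore(name):
    """Called by platform set-up code, not by the application developer."""
    if name not in _DATASTORES:
        raise KeyError(name)
    token = secrets.token_hex(16)   # unguessable handle for this datastore
    _TOKENS[token] = name
    return token


def put(token, key, value):
    """Write through a token; an unknown token raises KeyError."""
    _DATASTORES[_TOKENS[token]][key] = value


def get(token, key):
    """Read through a token; an unknown token raises KeyError."""
    return _DATASTORES[_TOKENS[token]].get(key)


token = open_datastore("example-app")
put(token, "greeting", "hello")
```

Under this model, whether an application can reach more than one datastore depends only on who is allowed to call `open_datastore` and with what names, which is the question the two posters are disagreeing about.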
[google-appengine] Re: 1 application, multiple datastores
> "The system" in this case is the combination of the GAE platform and an application running on said platform.

No, I would prefer GAE to implement the system completely, using existing elements. How? In app.yaml, you specify that your application supports mapping multiple google apps user spaces to your app. Currently it only allows one. This is an application marketplace type concept. When my app is added, a new data partition is created for their users. Most importantly I am the administrator of the app and all of the data partitions - this is different to deploying my app to their GAE account - it still resides in my account, and the customer has no administrative rights beyond what I give them in my application code. I need only have one app deployed.

With this model, registration, user provisioning, authentication and data partitioning are all handled external to my application code using building blocks that are already present in the GAE offering. The only change to my application code from right now is an entry in app.yaml. It's not even particularly complicated - especially for me, the application developer. I can imagine implementations that don't even require the app.yaml entry.

I'll admit (as I'm sure you will) that this thread has led me to think more deeply about the implementation of the use case from my original post, but the above is not excessive, and is much preferable to application code. It would allow some great additions, such as a common billing and payment engine - something most app developers would love to have taken off their plate.

Yet another feature this would allow - version migration for customers. I deploy separate versions of my app, and have the ability to move customer data partitions between app deployments. An obvious use case is that some customers may be happy to try new features in beta, others may want to wait for release versions. It is worth noting that google apps essentially supports this feature currently with the checkbox indicating that you want the latest features.

These are all major development efforts that carry significant risks to your customers, and are mostly diversions from core creative application development. Common use cases should be moved to the platform layer, freeing the developer to actually build their application. This, I think, is a good summary of the stated goals of the GAE platform.

I'll add the above implementation as a suggestion to 945 to clear up any misunderstanding about platform vs application.

> The GAE security architecture is not based on "not allowing cross data store queries". It's based on authenticated access to partitioned datastores, which is a very different thing.

I did say -

'...security architecture of GAE is based on trustable external authentication, data partitioning, mapping that data partition to the authenticated entity, and not allowing cross data store queries'

I realise they are different things, that's why I listed them separately. As it stands GAE does not allow cross data store queries, and from my perspective that is an aspect of the security architecture. 106 wants that aspect 'relaxed'.

While I don't think GAE will implement cross data store queries using the data API (I still think exposing an application API to access said data, or supporting one data partition for many apps is the right choice), a possible implementation that would be acceptable to me is adding an entry to app.yaml specifying how strict data partitioning should be for an application. For my use case I would choose the strictest option, and for yours something less so. It's not ideal, as an error in app.yaml could lead to the cited bug, but the risk profile is much lower, and it is more easily auditable.
On Dec 30, 12:53 am, Andy Freeman wrote:
> > 'The system spawns a virtual instance of the app - or at least allows mapping a single datastore partition to the authenticated entity. You could extend it by allowing multiple datastores per authenticated entity and choosing the appropriate one at authentication time.'
>
> > I haven't mentioned application code at all. If you have interpreted 'the system' to mean my application code, then I think you are being disingenuous.
>
> "The system" in this case is the combination of the GAE platform and an application running on said platform.
>
> > What's the point of a feature request for my own application code?
>
> Oh really? The reason that this requires a feature request is that it isn't (currently) possible for an application running on GAE to request the creation of another datastore. (One could call an outside agent to request another application, but )
>
> > Do you support request 106?
>
> Yes.
>
> > Do you oppose 945?
>
> Not sure.
>
> > At the moment, I am getting the idea you support 106, but not the implication that it would support queries across datastores.
>
> 106 allows an application to access multiple datas
[google-appengine] Re: 1 application, multiple datastores
> 'The system spawns a virtual instance of the app - or at least allows mapping a single datastore partition to the authenticated entity. You could extend it by allowing multiple datastores per authenticated entity and choosing the appropriate one at authentication time.'
>
> I haven't mentioned application code at all. If you have interpreted 'the system' to mean my application code, then I think you are being disingenuous.

"The system" in this case is the combination of the GAE platform and an application running on said platform.

> What's the point of a feature request for my own application code?

Oh really? The reason that this requires a feature request is that it isn't (currently) possible for an application running on GAE to request the creation of another datastore. (One could call an outside agent to request another application, but )

> Do you support request 106?

Yes.

> Do you oppose 945?

Not sure.

> At the moment, I am getting the idea you support 106, but not the implication that it would support queries across datastores.

106 allows an application to access multiple datastores, so why would I think that it doesn't?

Note that the ability of an application to access multiple datastores does not imply the ability to access arbitrary datastores.

Note also that the ability to access multiple datastores could be satisfied via a "datastore login" API used by the application which would be as secure as anything done by the platform before the application starts. (Both schemes can be exploited by malicious code. Both are only as secure as the platform's login.)

> I am also understanding that you oppose the data segregation from 945 because you think it doesn't serve a purpose.

I'm skeptical of 945 because it's a lot of mechanism. There are many ways to get data segregation using the existing partitioning.

> This is despite the fact that the entire security architecture of GAE is based on trustable external authentication, data partitioning, mapping that data partition to the authenticated entity, and not allowing cross data store queries.

The GAE security architecture is not based on "not allowing cross data store queries". It's based on authenticated access to partitioned datastores, which is a very different thing.

One could have authenticated access to partitioned datastores AND cross datastore queries. One could have authenticated access to choice of partitioned datastore but not have cross datastore queries. One could have authenticated access to choice of partitioned datastores and allow cross datastore queries. One could even have an "authenticated choice" mechanism that allowed cross datastore queries for some datastores and not others.

> Are you saying the current GAE security architecture is wrong?

No.

> Or just that they should get rid of the data partitioning to deliver feature 106?

No.

On Dec 26, 5:50 am, hawkett wrote:
> > Huh? You were requesting the ability to spawn a new datastore and to have the login scheme for a given pile of application code pick the datastore. The above is about methods for separating datastores and whether the method for separating them should depend on how the datastore is chosen.
>
> I assume you are talking about this statement from my first post? -
>
> 'The system spawns a virtual instance of the app - or at least allows mapping a single datastore partition to the authenticated entity. You could extend it by allowing multiple datastores per authenticated entity and choosing the appropriate one at authentication time.'
>
> I haven't mentioned application code at all. If you have interpreted 'the system' to mean my application code, then I think you are being disingenuous. What's the point of a feature request for my own application code?
[google-appengine] Re: 1 application, multiple datastores
> They're trusting you with their data. Why should they trust your code not to email all of the data they input to malicious people if they can't trust you to write code that keeps their data separate enough from other people's data?

Because one is a malicious attack, and one is a bug. From a security perspective you address them in totally different ways.

> If they don't trust you but use your product anyway, they're stupid, whether Google provides you the tools to do better security than you can do yourself or not.

You're kidding, right?

On Dec 23, 4:51 pm, Geoffrey Spear wrote:
> On Dec 22, 5:03 pm, hawkett wrote:
>
> > 2. Not seeing it from the customer's perspective. What they see is that every app on GAE is a roll your own data security effort - what a nightmare - how are they to tell which app was written well and which wasn't? How would they even begin to assess the risk profile - do they have to audit your company's development practices?
>
> They're trusting you with their data. Why should they trust your code not to email all of the data they input to malicious people if they can't trust you to write code that keeps their data separate enough from other people's data? If they don't trust you but use your product anyway, they're stupid, whether Google provides you the tools to do better security than you can do yourself or not.
[google-appengine] Re: 1 application, multiple datastores
> Huh? You were requesting the ability to spawn a new datastore and to have the login scheme for a given pile of application code pick the datastore. The above is about methods for separating datastores and whether the method for separating them should depend on how the datastore is chosen.

I assume you are talking about this statement from my first post? -

'The system spawns a virtual instance of the app - or at least allows mapping a single datastore partition to the authenticated entity. You could extend it by allowing multiple datastores per authenticated entity and choosing the appropriate one at authentication time.'

I haven't mentioned application code at all. If you have interpreted 'the system' to mean my application code, then I think you are being disingenuous. What's the point of a feature request for my own application code? The feature request has the term 'data segregation' in its title, and doesn't include the proposed extension (as this would add significant additional complexity). Anyway, when I request functionality in 'the system' in a GAE feature request, I am talking about GAE, not my own application code. If you're talking about some other statement I made, then please say what it is.

> With no post-login check, the application runs using whatever datastore the login procedure finds acceptable.

Yes, it does. This is not our application code, and we trust it. If you don't trust it, modify it or choose a different authentication mechanism that you do trust.

> If the login procedure fails or the datastore layer serves up the wrong datastore, the application still does its thing.

Raise and fix the bug in the authentication/db layer.

What is your actual position Andy? Do you support request 106? Do you oppose 945? At the moment, I am getting the idea you support 106, but not the implication that it would support queries across datastores. I am also understanding that you oppose the data segregation from 945 because you think it doesn't serve a purpose. This is despite the fact that the entire security architecture of GAE is based on trustable external authentication, data partitioning, mapping that data partition to the authenticated entity, and not allowing cross data store queries. Are you saying the current GAE security architecture is wrong? Or just that they should get rid of the data partitioning to deliver feature 106? If this is your position, then it seems totally unsustainable to me.

On Dec 24, 11:45 pm, Andy Freeman wrote:
> >> You're hoping that the partitioning for a given datastore depends on how google allows access to said datastore
>
> > Exactly - that is the feature request I am proposing.
>
> Huh? You were requesting the ability to spawn a new datastore and to have the login scheme for a given pile of application code pick the datastore. The above is about methods for separating datastores and whether the method for separating them should depend on how the datastore is chosen.
>
> > I don't agree. You should trust your authentication mechanism - this is a trust relationship. If you don't trust it, then you need to address that problem, not write additional application code which adds to the complexity of your security implementation. Complexity in your security implementation increases risk, not decreases it. Note this is not an argument against defense in depth - it is an argument for simplicity in each implementation layer.
>
> Let's look at these alternatives.
>
> With no post-login check, the application runs using whatever datastore the login procedure finds acceptable. If the login procedure fails or the datastore layer serves up the wrong datastore, the application still does its thing.
>
> Post-validate may catch either of those errors.
[google-appengine] Re: 1 application, multiple datastores
>> You're hoping that the partitioning for a given datastore depends on how google allows access to said datastore
>
> Exactly - that is the feature request I am proposing.

Huh? You were requesting the ability to spawn a new datastore and to have the login scheme for a given pile of application code pick the datastore. The above is about methods for separating datastores and whether the method for separating them should depend on how the datastore is chosen.

> I don't agree. You should trust your authentication mechanism - this is a trust relationship. If you don't trust it, then you need to address that problem, not write additional application code which adds to the complexity of your security implementation. Complexity in your security implementation increases risk, not decreases it. Note this is not an argument against defense in depth - it is an argument for simplicity in each implementation layer.

Let's look at these alternatives.

With no post-login check, the application runs using whatever datastore the login procedure finds acceptable. If the login procedure fails or the datastore layer serves up the wrong datastore, the application still does its thing.

Post-validate may catch either of those errors. (Of course, the post-validate could fail as well and allow access when it shouldn't, but that just leaves you no worse off than you were without the check.) Yes, the post-validate may block execution when it shouldn't, but that's likely to be because the datastore layer is misbehaving, delivering wrong data. The application may have failed eventually anyway when running with a misbehaving datastore layer, but detection during validation is better because the application doesn't get a chance to corrupt user-data.

On Dec 24, 10:17 am, hawkett wrote:
> > You're hoping that the partitioning for a given datastore depends on how google allows access to said datastore
>
> Exactly - that is the feature request I am proposing.
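The post-login check argued for in this message could look roughly like the sketch below. It assumes, purely for illustration, that each datastore carries an owner marker under a reserved key (`__owner__`) and that the datastore behaves like a dictionary; neither assumption comes from the thread or from App Engine, and `post_validate`, `handle_request`, and `WrongDatastoreError` are hypothetical names.

```python
# Hypothetical sketch; the names below are illustrative, not App Engine APIs.
class WrongDatastoreError(Exception):
    """Raised when the datastore handed to a request is not the user's."""


def post_validate(user_id, datastore):
    """Check that the datastore handed over after login belongs to user_id."""
    owner = datastore.get("__owner__")
    if owner != user_id:
        raise WrongDatastoreError(f"expected owner {user_id!r}, found {owner!r}")


def handle_request(user_id, datastore):
    """Validate first, then do the application's work on the datastore."""
    post_validate(user_id, datastore)   # fail before touching user data
    datastore["visits"] = datastore.get("visits", 0) + 1
    return datastore["visits"]


alice_store = {"__owner__": "alice"}
bob_store = {"__owner__": "bob"}
```

If the login or datastore layer served `bob_store` to a request for `"alice"`, `handle_request` would raise before writing anything, which is the "doesn't get a chance to corrupt user-data" property claimed above. The counter-argument in the thread is that this check is itself extra application code that can contain bugs.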
[google-appengine] Re: 1 application, multiple datastores
> You're hoping that the partitioning for a given datastore depends on
> how google allows access to said datastore

Exactly - that is the feature request I am proposing. It seems likely to me that GAE uses a data partitioning feature of BigTable (maybe not, I don't know, but to me it seems the right place to implement a data partitioning function) - they should expand the way GAE uses that BigTable feature to offer the functionality I am requesting.

> If your customers are serious, they must, regardless of how your
> application is deployed, regardless of who handles login/access
> management. Login code isn't the only risk.

Perhaps, but the threshold is significantly lowered - customers are more likely to undertake an audit (rather than go to a competitor) if they can see you are using platform features for security - I stand by the assertion that 100% of customers who engage you for the first time will prefer you to be using the platform over custom application code - especially for security.

> And, if you're serious about login code, you must validate the login
> result. That is, once it is determined that a given user running your
> application should use a given datastore, the application then must
> look the datastore that it is trying to use and verify that it is
> actually the correct datastore for that user

I don't agree. You should trust your authentication mechanism - this is a trust relationship. If you don't trust it, then you need to address that problem, not write additional application code which adds to the complexity of your security implementation. Complexity in your security implementation increases risk, not decreases it. Note this is not an argument against defense in depth - it is an argument for simplicity in each implementation layer.
We are talking about the authentication layer, and the db access layer, and both should be platform concerns, not application concerns (at least from my perspective) - certainly they are currently platform concerns in GAE, and I would like them to stay that way. It is very important to note that the functionality is *nearly* there already - i.e. restricting access to users from a google apps account - it has strict data partitioning, authentication and db access are platform concerns, user provisioning administration etc. is already there in google apps. The only thing missing is a method of automatically spawning a new application in response to a customer registration (and the 10 app limit). The architecture of GAE right now is totally in line with what I am talking about, and I have no doubt that this is for all the reasons I have listed, and many I haven't even thought of. Consequently I doubt that you will ever be given the functionality you are looking for - i.e. accessing multiple datastores from the same application instance. I'll ask again - would a feature that allowed you to map the same datastore to multiple application instances satisfy your use-case? It does stretch the data partitioning thing a bit, but might be workable from a platform configuration perspective. On Dec 23, 7:22 pm, Andy Freeman wrote: > > In fact, given that Google already > > have a data partitioning mechanism for applications, I wouldn't be > > surprised if it was even lower level than the GAE platform, and part > > of the BigTable implementation. > > You're hoping that the partitioning for a given datastore depends on > how google allows access to said datastore. In particular, you're > hoping that the partitioning for datastores using a feature where a > given application can pick between a set of datastores is different > than the partitioning when a given application has access to exactly > one datastore. > > That's unlikely. 
If google decides to implement such a feature, it > would be silly to also introduce a different mechanism for > partitioning datastores. > > > How would they even begin to assess the risk profile - do > > they have to audit your company's development practices? > > If your customers are serious, they must, regardless of how your > application is deployed, regardless of who handles login/access > management. Login code isn't the only risk. > > And, if you're serious about login code, you must validate the login > result. That is, once it is determined that a given user running your > application should use a given datastore, the application then must > look the datastore that it is trying to use and verify that it is > actually the correct datastore for that user. Platform login code > can't do that check. And, the platform's login doesn't provide much > information to the application for it to do such a check. > > Yes, I realize that customers have different risk and cost > sensitivities so there must be some right around the points that you > like. However, that's a long way from saying that such points > dominate. > > On Dec 22, 2:03 pm, hawkett wrote: > > > > No, I don't agree. Even if we ignore the admin
[google-appengine] Re: 1 application, multiple datastores
> In fact, given that Google already > have a data partitioning mechanism for applications, I wouldn't be > surprised if it was even lower level than the GAE platform, and part > of the BigTable implementation. You're hoping that the partitioning for a given datastore depends on how google allows access to said datastore. In particular, you're hoping that the partitioning for datastores using a feature where a given application can pick between a set of datastores is different than the partitioning when a given application has access to exactly one datastore. That's unlikely. If google decides to implement such a feature, it would be silly to also introduce a different mechanism for partitioning datastores. > How would they even begin to assess the risk profile - do > they have to audit your company's development practices? If your customers are serious, they must, regardless of how your application is deployed, regardless of who handles login/access management. Login code isn't the only risk. And, if you're serious about login code, you must validate the login result. That is, once it is determined that a given user running your application should use a given datastore, the application then must look the datastore that it is trying to use and verify that it is actually the correct datastore for that user. Platform login code can't do that check. And, the platform's login doesn't provide much information to the application for it to do such a check. Yes, I realize that customers have different risk and cost sensitivities so there must be some right around the points that you like. However, that's a long way from saying that such points dominate. On Dec 22, 2:03 pm, hawkett wrote: > > No, I don't agree. Even if we ignore the admin console hole, "strict > > data partitioning" is a fantasy in an environment where data lives on > > the same hardware. The google code for handling multiple datastores > > could go wonky. Or, their user login code could do the wrong thing. 
> > I think you are making a number of mistakes - > > 1. Believing that the risk profile of platform code is the same as > your application code (and the 1000's of other developers that roll > their own data security solution because it isn't part of the > platform). If Google offers it as part of the platform, then it has > been tested by those 1000's of developers, by their customers and by > Google as a major part of a strategic platform offering by a company > with enormous resources. You state - 'Since the risk of your login > code doing the wrong thing is unacceptable, it's unclear why the risk > of their code doing the wrong thing is any more acceptable' - it is > *absolutely* clear that the risks are not even in the same ballpark. > > 2. Not seeing it from the customer's perspective. What they see is > that every app on GAE is a roll your own data security effort - what a > nightmare - how are they to tell which app was written well and which > wasn't? How would they even begin to assess the risk profile - do > they have to audit your company's development practices? If Google > offers it as part of the platform the customer knows that every app > shares the same implementation, and (assuming you agree with the first > point) a far less risky one. Maybe you can write more or equally > stable code than Google, but the customer has know way of knowing that > - I'll bet you that 100% of customer's that approach you for the first > time would rather hear that data security is supplied as part of the > Google platform than by your application code and not because "that's > nice" - but because this feature *does* affect the real security of > their data. > > 3. Thinking strict data partitioning "is a fantasy in an environment > where data lives on the same hardware". I did say strict, not > physical. I think you are missing the value of the software platform > again - it is not the same thing as application code that runs on the > platform. 
You appear to be making a simple distinction between > hardware and software. The application platform is inherently more > robust than your application code. In fact, given that Google already > have a data partitioning mechanism for applications, I wouldn't be > surprised if it was even lower level than the GAE platform, and part > of the BigTable implementation. That would make it even more robust > than the GAE platform code. Which is ludicrously robust compared to > your application code. > > 4. Thinking risk reduction is not valuable unless it results in total > risk removal. Security is all about the management and mitigation of > risk - not necessarily the removal of it. Remove it if you can, > obviously - but generally this is an unlikely outcome. We've > identified three threats (admin console error, external app doing > admin, bug in application code) - and you are saying that unless we > remove the first two, why remove the third. It's false logic, and > poo
[google-appengine] Re: 1 application, multiple datastores
On Dec 22, 5:03 pm, hawkett wrote:
> 2. Not seeing it from the customer's perspective. What they see is
> that every app on GAE is a roll your own data security effort - what a
> nightmare - how are they to tell which app was written well and which
> wasn't? How would they even begin to assess the risk profile - do
> they have to audit your company's development practices?

They're trusting you with their data. Why should they trust your code not to email all of the data they input to malicious people if they can't trust you to write code that keeps their data separate enough from other people's data? If they don't trust you but use your product anyway, they're stupid, whether Google provides you the tools to do better security than you can do yourself or not.
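To make "code that keeps their data separate" concrete, here is a minimal Python sketch of application-level partitioning. It is not a GAE API: `TenantStore`, `store_for_user`, the `memberships` mapping, and the in-memory dict standing in for per-customer datastores are all hypothetical names assumed for illustration. The idea is that, once a user has been validated, the application gets a handle bound to one customer's partition, and every query goes through that handle.

```python
# Hypothetical sketch: application-level partitioning, NOT a GAE API.
# An in-memory dict stands in for the per-customer datastores.


class TenantStore:
    """Handle bound to exactly one customer's partition."""

    def __init__(self, partitions, customer_id):
        # Hold a reference to this customer's partition only, so the
        # methods below never touch another customer's entities.
        self._entities = partitions.setdefault(customer_id, {})
        self.customer_id = customer_id

    def put(self, key, value):
        self._entities[key] = value

    def get(self, key):
        return self._entities.get(key)

    def query(self, predicate):
        # Runs over this partition only; returns matching keys, sorted.
        return sorted(k for k, v in self._entities.items() if predicate(v))


def store_for_user(partitions, memberships, user_id):
    """Pick the partition for a validated user; refuse unknown users."""
    customer_id = memberships.get(user_id)
    if customer_id is None:
        raise PermissionError("user %r has no customer partition" % user_id)
    return TenantStore(partitions, customer_id)


if __name__ == "__main__":
    partitions = {}
    # Assumed result of a login step: user -> customer mapping.
    memberships = {"alice": "acme", "bob": "globex"}

    acme = store_for_user(partitions, memberships, "alice")
    globex = store_for_user(partitions, memberships, "bob")
    acme.put("plan", "acme business plan")
    globex.put("plan", "globex business plan")

    # Each handle only sees its own customer's entity.
    print(acme.get("plan"))
    print(globex.query(lambda v: "acme" in v))
```

A design like this narrows the place where a bug could mix customers down to `store_for_user`, but it is still application code rather than a platform guarantee, which is the distinction being argued over in this thread.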
[google-appengine] Re: 1 application, multiple datastores
> No, I don't agree. Even if we ignore the admin console hole, "strict
> data partitioning" is a fantasy in an environment where data lives on
> the same hardware. The google code for handling multiple datastores
> could go wonky. Or, their user login code could do the wrong thing.

I think you are making a number of mistakes -

1. Believing that the risk profile of platform code is the same as that of your application code (and the 1000's of other developers that roll their own data security solution because it isn't part of the platform). If Google offers it as part of the platform, then it has been tested by those 1000's of developers, by their customers and by Google as a major part of a strategic platform offering by a company with enormous resources. You state - 'Since the risk of your login code doing the wrong thing is unacceptable, it's unclear why the risk of their code doing the wrong thing is any more acceptable' - it is *absolutely* clear that the risks are not even in the same ballpark.

2. Not seeing it from the customer's perspective. What they see is that every app on GAE is a roll-your-own data security effort - what a nightmare - how are they to tell which app was written well and which wasn't? How would they even begin to assess the risk profile - do they have to audit your company's development practices? If Google offers it as part of the platform the customer knows that every app shares the same implementation, and (assuming you agree with the first point) a far less risky one. Maybe you can write more or equally stable code than Google, but the customer has no way of knowing that - I'll bet you that 100% of customers that approach you for the first time would rather hear that data security is supplied as part of the Google platform than by your application code and not because "that's nice" - but because this feature *does* affect the real security of their data.

3. Thinking strict data partitioning "is a fantasy in an environment where data lives on the same hardware". I did say strict, not physical. I think you are missing the value of the software platform again - it is not the same thing as application code that runs on the platform. You appear to be making a simple distinction between hardware and software. The application platform is inherently more robust than your application code. In fact, given that Google already have a data partitioning mechanism for applications, I wouldn't be surprised if it was even lower level than the GAE platform, and part of the BigTable implementation. That would make it even more robust than the GAE platform code. Which is ludicrously robust compared to your application code.

4. Thinking risk reduction is not valuable unless it results in total risk removal. Security is all about the management and mitigation of risk - not necessarily the removal of it. Remove it if you can, obviously - but generally this is an unlikely outcome. We've identified three threats (admin console error, external app doing admin, bug in application code) - and you are saying that unless we remove the first two, why remove the third. It's false logic, and poor security.

5. Confusing an admin error or malicious attack with a software bug. You state - "My point is that that's not true. If I have access to multiple admin consoles (for maintenance reasons), I can combine the results that I get from each of the consoles, effectively giving me the ability to query against multiple datastores. I can do this with a program that 'runs' the admin consoles or I can do it by hand". Can you point me to the admin console API you'd use to 'run' it via a program - or are you talking about screen scraping? Both situations are totally different scenarios, and risk profiles, to being able to introduce a bug into your application code that exposes customer data.

6. Thinking that with strict data partitioning you *will* be able to introduce a bug into *your* application code that exposes multiple customers' data to multiple other customers via the Datastore API. And this is the key - from the customer's perspective - yes they have to worry about an admin error, or some secondary application you might write, or a malicious attack, or a defect in the platform code, or a disgruntled employee - but they don't have to worry about the application code they use every day, and that is one of their biggest risk points eliminated - the largest part of your company offering. Security is about risk mitigation and management. If you still disagree, can you please explain to me how the bug would manifest in your application code (I'm talking about code that runs on GAE, with strict data partitioning).

In the end, I think it all comes down to point 1, and an understanding that software security is all about risk mitigation and management. Control what you can, have contingency for what you can't. If you agree with point 1, then you understand my position. If you don't agree w
[google-appengine] Re: 1 application, multiple datastores
> For the admin console, I'm saying you can only use this feature to run
> against each datastore in isolation.

My point is that that's not true. If I have access to multiple admin consoles (for maintenance reasons), I can combine the results that I get from each of the consoles, effectively giving me the ability to query against multiple datastores. I can do this with a program that "runs" the admin consoles or I can do it by hand.

> And I can't do that if administrators can run ad-hoc unsecured queries
> across customer data stores. (well, maybe they are secured, but only
> by your application code)

Since the console is application code

> Do you agree that the cited bug would not occur with strict data
> partitioning, and could occur if issue 106 was actioned?

No, I don't agree. Even if we ignore the admin console hole, "strict data partitioning" is a fantasy in an environment where data lives on the same hardware. The google code for handling multiple datastores could go wonky. Or, their user login code could do the wrong thing. Since the risk of your login code doing the wrong thing is unacceptable, it's unclear why the risk of their code doing the wrong thing is any more acceptable.

> And finally - I am looking for features that allow me to give my
> customers confidence, not erode it.

That's nice, but the feature in question doesn't affect the real security of your customer's data. If multiple customers have data on the same piece of hardware, some code has to manage the separation. If it's unacceptable for your code to do so

> It seems to me you are saying that if there is *any* mechanism that
> could compromise customer data, then why bother worrying about it?

Not at all. I'm saying that if a given mechanism is an unacceptable risk under one name, it's an unacceptable risk under all names. I'm also saying that if you're putting in a screen door (admin consoles), it's somewhat silly to worry about weatherstripping said door.
On Dec 22, 10:21 am, hawkett wrote: > > I'm okay with that constraint. My point is that if the application > > has an admin console or an admin user, one can write a query that runs > > across multiple datastores by writing code that accesses said > > datastores through their admin consoles and/or users. > > For the admin console, I'm saying you can only use this feature to run > against each datastore in isolation. Pick the datastore, run the > query. It's fair to say the admin console security model is another > problem that GAE needs to sort out > (e.g.http://code.google.com/p/googleappengine/issues/detail?id=91), but I > would hope that when it is sorted out, I can assign admin rights on > different data stores to different users in my organisation. > > For the admin user option, I am expecting that the admin user is > unique to each data store, not one admin user for all customers. The > picture I am painting is that you administer your customer data > instances individually, not as an aggregate. > > I want to be able to make a statement like this about my application > running on GAE (note especially the data security section at the > bottom) > > http://www.rallydev.com/products/deployment_solutions/security/ > > And I can't do that if administrators can run ad-hoc unsecured queries > across customer data stores. (well, maybe they are secured, but only > by your application code) > > And finally - I am looking for features that allow me to give my > customers confidence, not erode it. Saying that their data is > partitioned from other customer's achieves that goal. That doesn't > mean their data is perfectly safe - there would be any number of other > means by which their data could be exposed to their competitors, but I > can guarantee them that they their business plan is not going to > suddenly appear on the welcome screen of a competitor due to a bug in > my application code. 
The cited bug is a perfect example of this sort > of thing actually happening, and of a situation that would be closed > off with effective data partitioning. > > Do you agree that the cited bug would not occur with strict data > partitioning, and could occur if issue 106 was actioned? If you are > looking for a distinction, then this is it. To be perfectly clear, I > see this bug as an example of multiple customers having their data > exposed to multiple other customers - this is not a bug that would > occur by someone making a mistake in admin console (when you can only > query customer datastores in isolation). > > It seems to me you are saying that if there is *any* mechanism that > could compromise customer data, then why bother worrying about it? > > There is a *lot* of work for GAE to do to get to the point where an > app on their infrastructure can make a claim like that - e.g. I can't > believe only 6 people have starred this issue for example - > > http://code.google.com/p/googleappengine/issues/detail?id=501 > > I suspect it is because people
[google-appengine] Re: 1 application, multiple datastores
> I'm okay with that constraint. My point is that if the application
> has an admin console or an admin user, one can write a query that runs
> across multiple datastores by writing code that accesses said
> datastores through their admin consoles and/or users.

For the admin console, I'm saying you can only use this feature to run against each datastore in isolation. Pick the datastore, run the query. It's fair to say the admin console security model is another problem that GAE needs to sort out (e.g. http://code.google.com/p/googleappengine/issues/detail?id=91), but I would hope that when it is sorted out, I can assign admin rights on different data stores to different users in my organisation.

For the admin user option, I am expecting that the admin user is unique to each data store, not one admin user for all customers. The picture I am painting is that you administer your customer data instances individually, not as an aggregate.

I want to be able to make a statement like this about my application running on GAE (note especially the data security section at the bottom)

http://www.rallydev.com/products/deployment_solutions/security/

And I can't do that if administrators can run ad-hoc unsecured queries across customer data stores. (well, maybe they are secured, but only by your application code)

And finally - I am looking for features that allow me to give my customers confidence, not erode it. Saying that their data is partitioned from other customers' data achieves that goal. That doesn't mean their data is perfectly safe - there would be any number of other means by which their data could be exposed to their competitors, but I can guarantee them that their business plan is not going to suddenly appear on the welcome screen of a competitor due to a bug in my application code. The cited bug is a perfect example of this sort of thing actually happening, and of a situation that would be closed off with effective data partitioning.
Do you agree that the cited bug would not occur with strict data partitioning, and could occur if issue 106 was actioned? If you are looking for a distinction, then this is it. To be perfectly clear, I see this bug as an example of multiple customers having their data exposed to multiple other customers - this is not a bug that would occur by someone making a mistake in admin console (when you can only query customer datastores in isolation). It seems to me you are saying that if there is *any* mechanism that could compromise customer data, then why bother worrying about it? There is a *lot* of work for GAE to do to get to the point where an app on their infrastructure can make a claim like that - e.g. I can't believe only 6 people have starred this issue for example - http://code.google.com/p/googleappengine/issues/detail?id=501 I suspect it is because people only look at the first page of issues - which completely debunks the idea that Google should be using stars in its issues list to prioritise its work schedule. On Dec 22, 4:47 pm, Andy Freeman wrote: > > > Are you ok with the constraint that a query can not be run across > > multiple data stores? If we can agree on that, then I'd say we are > > doing pretty well. > > I'm okay with that constraint. My point is that if the application > has an admin console or an admin user, one can write a query that runs > across multiple datastores by writing code that accesses said > datastores through their admin consoles and/or users. > > No, such a query doesn't run in the application itself. However, a > query in an application that validates the user, determines which > datastore to use, and then runs all queries within that datastore also > doesn't access multiple datastores even if it does use an API feature > that could be used to access multiple datastores if said application > were written differently. > > I still have no interest in running a query across multiple datastores > and have never suggested otherwise. 
> > I'm trying understand why a feature that lets the application > programmer determine which datastore to use is an unacceptable way to > support "one code base, customer-specific datastores" if it's okay to > have an admin console and/or applications that have an admin user. > > Yes, it's convenient to have google manage all login stuff, but that > means that you don't have any control. If they're your customers > > On Dec 22, 4:25 am, hawkett wrote: > > > > > You use the example of maintenance and fixes > > > > on behalf of customers - when would that require querying across two > > > > customer's data stores? > > > > I never said or implied that it did. > > > Issue 106 proposes '...cross app queries using the db APIs only' - > > which to me means you can easily introduce a bug like the one > > originally posted - i.e. querying across two customer's data stores. > > Apologies if I understood your responses to be in support of this > > approach when they were not. Perhaps you could
[google-appengine] Re: 1 application, multiple datastores
> Are you ok with the constraint that a query can not be run across
> multiple data stores? If we can agree on that, then I'd say we are
> doing pretty well.

I'm okay with that constraint. My point is that if the application has an admin console or an admin user, one can write a query that runs across multiple datastores by writing code that accesses said datastores through their admin consoles and/or users.

No, such a query doesn't run in the application itself. However, a query in an application that validates the user, determines which datastore to use, and then runs all queries within that datastore also doesn't access multiple datastores even if it does use an API feature that could be used to access multiple datastores if said application were written differently.

I still have no interest in running a query across multiple datastores and have never suggested otherwise.

I'm trying to understand why a feature that lets the application programmer determine which datastore to use is an unacceptable way to support "one code base, customer-specific datastores" if it's okay to have an admin console and/or applications that have an admin user.

Yes, it's convenient to have google manage all login stuff, but that means that you don't have any control. If they're your customers

On Dec 22, 4:25 am, hawkett wrote: > > > You use the example of maintenance and fixes > > > on behalf of customers - when would that require querying across two > > > customer's data stores? > > > I never said or implied that it did. > > Issue 106 proposes '...cross app queries using the db APIs only' - > which to me means you can easily introduce a bug like the one > originally posted - i.e. querying across two customer's data stores. > Apologies if I understood your responses to be in support of this > approach when they were not. Perhaps you could elaborate your use > case in a little more detail. > > Are you ok with the constraint that a query can not be run across > multiple data stores?
If we can agree on that, then I'd say we are > doing pretty well. > > For accessing another application's data store from your code, I would > (and have) recommended exposing an API that you can access via HTTP. > I believe this is what Google has suggested in this post > > http://groups.google.com/group/google-appengine/browse_thread/thread/... > > which is quoted in Issue 106. > > If you do have a use case where you do want/need to run queries across > customer data stores, then I would have that customer data in the same > data store - i.e. what do you need the partition for in the first > place? > > Unfortunately the idea of a data partition and an application > partition are the same thing at the moment with GAE, so perhaps you > need the partition for quota and billing purposes, which forces you to > have separate data stores when you don't want them. In that case I > would raise a feature request for multiple applications to be able to > share a single data store - would this satisfy what you are trying to > achieve? > > On Dec 22, 3:03 am, Andy Freeman wrote: > > > > > I'm paraphrasing you. You've written repeatedly that a feature that > > allows an application to choose the datastore on which it operates can > > not be used for your purposes. The argument appears to be that an > > application that uses such a feature can theoretically access multiple > > datastores and is therefore unacceptable, even if that application is > > written so it validates the user and then chooses which datastore to > > access and only accesses one datastore after doing so. > > > However, you're happy if a user's data can be accessed through a > > google admin console or via an admin user. > > > The reason that I find that distinction strained is that GAE > > applications and the google admin console can be driven > > programmatically. 
As a result, one can easily write code using those > > facilities that simultaneously accesses multiple datastores, which is > > your reason for rejecting the "choose which datastore to access" > > feature. > > > > You use the example of maintenance and fixes > > > on behalf of customers - when would that require querying across two > > > customer's data stores? > > > I never said or implied that it did. > > > On Dec 21, 4:13 pm, hawkett wrote: > > > > Who are you quoting? > > > > The Google admin console should not be capable of querying across > > > multiple customer data stores. I repeat - application code can not > > > execute a query across multiple customer data stores - did I offer a > > > distinction somewhere? Admin console *would* allow you to run queries > > > against each of your customer data stores in isolation. I expect it > > > would use a common, non-public, platform API (i.e. making data > > > security part of the platform) to access the logical partitions. > > > > What is your use-case? You use the example of maintenance and fixes > > > on behalf of customers - when would that require
[google-appengine] Re: 1 application, multiple datastores
> > You use the example of maintenance and fixes > > on behalf of customers - when would that require querying across two > > customer's data stores? > > I never said or implied that it did. Issue 106 proposes '...cross app queries using the db APIs only' - which to me means you can easily introduce a bug like the one originally posted - i.e. querying across two customer's data stores. Apologies if I understood your responses to be in support of this approach when they were not. Perhaps you could elaborate your use case in a little more detail. Are you ok with the constraint that a query can not be run across multiple data stores? If we can agree on that, then I'd say we are doing pretty well. For accessing another application's data store from your code, I would (and have) recommended exposing an API that you can access via HTTP. I believe this is what Google has suggested in this post http://groups.google.com/group/google-appengine/browse_thread/thread/12eb676e98a25293/f5cfaad4e0d79ac8 which is quoted in Issue 106. If you do have a use case where you do want/need to run queries across customer data stores, then I would have that customer data in the same data store - i.e. what do you need the partition for in the first place? Unfortunately the idea of a data partition and an application partition are the same thing at the moment with GAE, so perhaps you need the partition for quota and billing purposes, which forces you to have separate data stores when you don't want them. In that case I would raise a feature request for multiple applications to be able to share a single data store - would this satisfy what you are trying to achieve? On Dec 22, 3:03 am, Andy Freeman wrote: > I'm paraphrasing you. You've written repeatedly that a feature that > allows an application to choose the datastore on which it operates can > not be used for your purposes. 
The argument appears to be that an > application that uses such a feature can theoretically access multiple > datastores and is therefore unacceptable, even if that application is > written so it validates the user and then chooses which datastore to > access and only accesses one datastore after doing so. > > However, you're happy if a user's data can be accessed through a > google admin console or via an admin user. > > The reason that I find that distinction strained is that GAE > applications and the google admin console can be driven > programmatically. As a result, one can easily write code using those > facilities that simultaneously accesses multiple datastores, which is > your reason for rejecting the "choose which datastore to access" > feature. > > > You use the example of maintenance and fixes > > on behalf of customers - when would that require querying across two > > customer's data stores? > > I never said or implied that it did. > > On Dec 21, 4:13 pm, hawkett wrote: > > > Who are you quoting? > > > The Google admin console should not be capable of querying across > > multiple customer data stores. I repeat - application code can not > > execute a query across multiple customer data stores - did I offer a > > distinction somewhere? Admin console *would* allow you to run queries > > against each of your customer data stores in isolation. I expect it > > would use a common, non-public, platform API (i.e. making data > > security part of the platform) to access the logical partitions. > > > What is your use-case? You use the example of maintenance and fixes > > on behalf of customers - when would that require querying across two > > customer's data stores? It's a recipe for disaster. > > > On Dec 21, 11:53 pm, Andy Freeman wrote: > > > > The distinction between "application code that can access multiple > > > datastores" and "code that can access multiple datastores" seems > > > strained at best. 
> > > > If there's code that can get to a user's data (and both the admin > > > console and an admin user are code that can get to the user's data), > > > does it really matter what you call it? > > > > On Dec 21, 3:15 pm, hawkett wrote: > > > > > Via the admin console. Google provides this application code, and it > > > > is common - part of the platform offering. This is one possibility. > > > > Another is that an admin user for that customer is made available to > > > > you for administration purposes. You could initialise the customer > > > > data space with this user profile. It may depend how you map the > > > > authenticated entity to logical identities in your application. > > > > Whichever, you do not have application code capable of querying across > > > > customer data stores, because the platform does not allow it. > > > > > On Dec 21, 10:49 pm, Andy Freeman wrote:> As I > > > > promised, now I'm going to ask how you plan to do maintenance and > > > > > fixes on behalf of your customers if you can't get to their data. > > > > > > If you have access to the customer's data, they're trusting your code > > > > > and Googl
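The recommendation in this message, that one application reach another's data through an API exposed over HTTP rather than through cross-app db queries, could look roughly like the sketch below. It is only an illustration under stated assumptions: the `/records/<id>` path, the `_SHARED_RECORDS` dict, and the helper names are all hypothetical, and the WSGI app is called in-process instead of over a real network connection.

```python
import json
from wsgiref.util import setup_testing_defaults

# Hypothetical data owned by the one app that manages the shared datastore.
_SHARED_RECORDS = {"42": {"name": "example record"}}


def shared_store_app(environ, start_response):
    """WSGI app exposing read-only lookups at /records/<id> (assumed path)."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    is_get = environ.get("REQUEST_METHOD") == "GET"
    if is_get and len(parts) == 2 and parts[0] == "records":
        record = _SHARED_RECORDS.get(parts[1])
        if record is not None:
            body = json.dumps(record).encode("utf-8")
            start_response("200 OK", [("Content-Type", "application/json")])
            return [body]
    # Anything else is rejected: callers never get to run arbitrary queries.
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call_app(path):
    """Invoke the WSGI app in-process, standing in for an HTTP GET."""
    environ = {}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD=GET and friends
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(shared_store_app(environ, start_response))
    return captured["status"], body
```

The point of the design is that the owning application decides exactly which lookups are possible, so other applications never hold a handle that could run a query against its datastore directly.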
[google-appengine] Re: 1 application, multiple datastores
I'm paraphrasing you. You've written repeatedly that a feature that allows an application to choose the datastore on which it operates cannot be used for your purposes. The argument appears to be that an application that uses such a feature can theoretically access multiple datastores and is therefore unacceptable, even if that application is written so that it validates the user, then chooses which datastore to access, and only accesses that one datastore afterwards.

However, you're happy if a user's data can be accessed through a Google admin console or via an admin user.

The reason that I find that distinction strained is that GAE applications and the Google admin console can be driven programmatically. As a result, one can easily write code using those facilities that simultaneously accesses multiple datastores, which is your reason for rejecting the "choose which datastore to access" feature.

> You use the example of maintenance and fixes
> on behalf of customers - when would that require querying across two
> customer's data stores?

I never said or implied that it did.

On Dec 21, 4:13 pm, hawkett wrote: > Who are you quoting? > > The Google admin console should not be capable of querying across > multiple customer data stores. I repeat - application code can not > execute a query across multiple customer data stores - did I offer a > distinction somewhere? Admin console *would* allow you to run queries > against each of your customer data stores in isolation. I expect it > would use a common, non-public, platform API (i.e. making data > security part of the platform) to access the logical partitions. > > What is your use-case? You use the example of maintenance and fixes > on behalf of customers - when would that require querying across two > customer's data stores? It's a recipe for disaster. 
> > On Dec 21, 11:53 pm, Andy Freeman wrote: > > > > > The distinction between "application code that can access multiple > > datastores" and "code that can access multiple datastores" seems > > strained at best. > > > If there's code that can get to a user's data (and both the admin > > console and an admin user are code that can get to the user's data), > > does it really matter what you call it? > > > On Dec 21, 3:15 pm, hawkett wrote: > > > > Via the admin console. Google provides this application code, and it > > > is common - part of the platform offering. This is one possibility. > > > Another is that an admin user for that customer is made available to > > > you for administration purposes. You could initialise the customer > > > data space with this user profile. It may depend how you map the > > > authenticated entity to logical identities in your application. > > > Whichever, you do not have application code capable of querying across > > > customer data stores, because the platform does not allow it. > > > > On Dec 21, 10:49 pm, Andy Freeman wrote:> As I > > > promised, now I'm going to ask how you plan to do maintenance and > > > > fixes on behalf of your customers if you can't get to their data. > > > > > If you have access to the customer's data, they're trusting your code > > > > and Google is not protecting their data. > > > > > On Dec 21, 2:13 pm, hawkett wrote: > > > > > > > Yes, there is the issue that application code has to manage the > > > > > > customer-specific datastores, but if multiple customers are hosted > > > > > > on > > > > > > the same hardware, someone's code has to do that work and it's > > > > > > unclear > > > > > > why application code can't be part of that process. If the response > > > > > > is that application code isn't trusted by customers to maintain > > > > > > separation, I'm going to ask how you do maintenance and fixes on > > > > > > their > > > > > > behalf. 
> > > > > > If data segregation is a fundamental feature of the platform, then it > > > > > is inherently more trustable that N pieces of application code all > > > > > attempting the same thing. Me saying 'My code will keep your data > > > > > private' carries nothing like the weight that Google saying 'It is not > > > > > possible to run a query across two data stores' does. I would only > > > > > need to say 'Your data will be stored in a separate partition', and > > > > > that has tangible meaning to the customer from a data security > > > > > perspective. They are then placing their trust more in Google for > > > > > this feature than in my application. > > > > > > From a maintenance, reliability, trustability, transparency etc. > > > > > perspective, moving a common feature (especially a security feature) > > > > > from the application layer to platform layer is a major advantage, and > > > > > something a good architecture should always try to achieve. > > > > > > I want as little application code as possible to express my > > > > > application. This is already one of the key wins of the GAE platform, > > > > > and moving something as fundamental as data partitioning out of
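The pattern Andy describes in this message (validate the user, choose one datastore, then touch only that datastore) can be sketched in a few lines of plain Python. This is a rough illustration only: `RequestDatastore` and the dict-backed stores are made up for the example, and `change_to_application_userstore()` is the hypothetical call named elsewhere in the thread, not an actual App Engine API.

```python
class RequestDatastore:
    """Hypothetical request-scoped datastore handle (not a real GAE class)."""

    def __init__(self, app_store, user_stores, authenticated_user):
        self._app_store = app_store          # application-wide data
        self._user_stores = user_stores      # one dict per user, platform-held
        self._user = authenticated_user      # set by the platform, not the app
        self._current = app_store            # requests start on the app store

    def change_to_application_userstore(self):
        # No parameters: the target is derived from the authenticated user,
        # so application code cannot name some other user's store.
        self._current = self._user_stores.setdefault(self._user, {})

    def put(self, key, value):
        self._current[key] = value

    def get(self, key):
        return self._current.get(key)


# Example request on behalf of "alice" (all data here is invented).
app_store = {"motd": "hello"}
user_stores = {"alice": {"note": "a"}, "bob": {"note": "b"}}
ds = RequestDatastore(app_store, user_stores, "alice")
motd = ds.get("motd")                 # read from the application-wide store
ds.change_to_application_userstore()
ds.put("seen_motd", motd)             # lands only in alice's store
```

Because the switch takes no arguments and there is no call that accepts a store name, the remaining operations in the request can only reach the authenticated user's partition. Whether that is a strong enough guarantee compared with a platform-enforced partition is exactly what the two posters disagree about.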
[google-appengine] Re: 1 application, multiple datastores
Who are you quoting?

The Google admin console should not be capable of querying across multiple customer data stores. I repeat - application code cannot execute a query across multiple customer data stores - did I offer a distinction somewhere? The admin console *would* allow you to run queries against each of your customer data stores in isolation. I expect it would use a common, non-public, platform API (i.e. making data security part of the platform) to access the logical partitions.

What is your use-case? You use the example of maintenance and fixes on behalf of customers - when would that require querying across two customers' data stores? It's a recipe for disaster.

On Dec 21, 11:53 pm, Andy Freeman wrote: > The distinction between "application code that can access multiple > datastores" and "code that can access multiple datastores" seems > strained at best. > > If there's code that can get to a user's data (and both the admin > console and an admin user are code that can get to the user's data), > does it really matter what you call it? > > On Dec 21, 3:15 pm, hawkett wrote: > > > Via the admin console. Google provides this application code, and it > > is common - part of the platform offering. This is one possibility. > > Another is that an admin user for that customer is made available to > > you for administration purposes. You could initialise the customer > > data space with this user profile. It may depend how you map the > > authenticated entity to logical identities in your application. > > Whichever, you do not have application code capable of querying across > > customer data stores, because the platform does not allow it. > > > On Dec 21, 10:49 pm, Andy Freeman wrote:> As I > > promised, now I'm going to ask how you plan to do maintenance and > > > fixes on behalf of your customers if you can't get to their data. > > > > If you have access to the customer's data, they're trusting your code > > > and Google is not protecting their data. 
> > > > On Dec 21, 2:13 pm, hawkett wrote: > > > > > > Yes, there is the issue that application code has to manage the > > > > > customer-specific datastores, but if multiple customers are hosted on > > > > > the same hardware, someone's code has to do that work and it's unclear > > > > > why application code can't be part of that process. If the response > > > > > is that application code isn't trusted by customers to maintain > > > > > separation, I'm going to ask how you do maintenance and fixes on their > > > > > behalf. > > > > > If data segregation is a fundamental feature of the platform, then it > > > > is inherently more trustable that N pieces of application code all > > > > attempting the same thing. Me saying 'My code will keep your data > > > > private' carries nothing like the weight that Google saying 'It is not > > > > possible to run a query across two data stores' does. I would only > > > > need to say 'Your data will be stored in a separate partition', and > > > > that has tangible meaning to the customer from a data security > > > > perspective. They are then placing their trust more in Google for > > > > this feature than in my application. > > > > > From a maintenance, reliability, trustability, transparency etc. > > > > perspective, moving a common feature (especially a security feature) > > > > from the application layer to platform layer is a major advantage, and > > > > something a good architecture should always try to achieve. > > > > > I want as little application code as possible to express my > > > > application. This is already one of the key wins of the GAE platform, > > > > and moving something as fundamental as data partitioning out of the > > > > application platform will enhance this capability. 
> > > > > On Dec 21, 5:55 pm, Andy Freeman wrote: > > > > > > > One suggests it > > > > > > should be impossible for the same piece of code to access separate > > > > > > datastore instances, the other suggests that this is a desirable > > > > > > feature. I don't see how you consider them the same - are you > > > > > > saying > > > > > > that you can't see how the cited bug is caused by multiple customers > > > > > > sharing the same data space? > > > > > > Right now, separate applications have separate code and separate > > > > > datastores. If management issues are the only obstacle to using > > > > > separate applications for different users, that tells us that separate > > > > > datastores do not share the same data space for these purposes. > > > > > > Yes, there is the issue that application code has to manage the > > > > > customer-specific datastores, but if multiple customers are hosted on > > > > > the same hardware, someone's code has to do that work and it's unclear > > > > > why application code can't be part of that process. If the response > > > > > is that application code isn't trusted by customers to maintain > > > > > separation, I'm going to ask how you do maintenance and fixes on thei
[google-appengine] Re: 1 application, multiple datastores
The distinction between "application code that can access multiple datastores" and "code that can access multiple datastores" seems strained at best. If there's code that can get to a user's data (and both the admin console and an admin user are code that can get to the user's data), does it really matter what you call it? On Dec 21, 3:15 pm, hawkett wrote: > Via the admin console. Google provides this application code, and it > is common - part of the platform offering. This is one possibility. > Another is that an admin user for that customer is made available to > you for administration purposes. You could initialise the customer > data space with this user profile. It may depend how you map the > authenticated entity to logical identities in your application. > Whichever, you do not have application code capable of querying across > customer data stores, because the platform does not allow it. > > On Dec 21, 10:49 pm, Andy Freeman wrote:> As I > promised, now I'm going to ask how you plan to do maintenance and > > fixes on behalf of your customers if you can't get to their data. > > > If you have access to the customer's data, they're trusting your code > > and Google is not protecting their data. > > > On Dec 21, 2:13 pm, hawkett wrote: > > > > > Yes, there is the issue that application code has to manage the > > > > customer-specific datastores, but if multiple customers are hosted on > > > > the same hardware, someone's code has to do that work and it's unclear > > > > why application code can't be part of that process. If the response > > > > is that application code isn't trusted by customers to maintain > > > > separation, I'm going to ask how you do maintenance and fixes on their > > > > behalf. > > > > If data segregation is a fundamental feature of the platform, then it > > > is inherently more trustable that N pieces of application code all > > > attempting the same thing. 
Me saying 'My code will keep your data > > > private' carries nothing like the weight that Google saying 'It is not > > > possible to run a query across two data stores' does. I would only > > > need to say 'Your data will be stored in a separate partition', and > > > that has tangible meaning to the customer from a data security > > > perspective. They are then placing their trust more in Google for > > > this feature than in my application. > > > > From a maintenance, reliability, trustability, transparency etc. > > > perspective, moving a common feature (especially a security feature) > > > from the application layer to platform layer is a major advantage, and > > > something a good architecture should always try to achieve. > > > > I want as little application code as possible to express my > > > application. This is already one of the key wins of the GAE platform, > > > and moving something as fundamental as data partitioning out of the > > > application platform will enhance this capability. > > > > On Dec 21, 5:55 pm, Andy Freeman wrote: > > > > > > One suggests it > > > > > should be impossible for the same piece of code to access separate > > > > > datastore instances, the other suggests that this is a desirable > > > > > feature. I don't see how you consider them the same - are you saying > > > > > that you can't see how the cited bug is caused by multiple customers > > > > > sharing the same data space? > > > > > Right now, separate applications have separate code and separate > > > > datastores. If management issues are the only obstacle to using > > > > separate applications for different users, that tells us that separate > > > > datastores do not share the same data space for these purposes. 
> > > > > Yes, there is the issue that application code has to manage the > > > > customer-specific datastores, but if multiple customers are hosted on > > > > the same hardware, someone's code has to do that work and it's unclear > > > > why application code can't be part of that process. If the response > > > > is that application code isn't trusted by customers to maintain > > > > separation, I'm going to ask how you do maintenance and fixes on their > > > > behalf. > > > > > Note that customers don't write application code in this model, > > > > whether they use separate applications or one that uses customer- > > > > specific datastores. > > > > > Here's how it would work. Customer accesses system, system figures > > > > out which datastore to use, system acts upon datastore on customer's > > > > behalf using application code. > > > > > Note that this is exactly the same way that any scheme with shared > > > > hardware would accomplish the same separation. The only difference is > > > > whether the "figure out" is done by Google or by you. > > > > > On Dec 20, 7:30 pm, hawkett wrote: > > > > > > Andy - they are essentially mutually exclusive. One suggests it > > > > > should be impossible for the same piece of code to access separate > > > > > datastore instances, the other suggests that this is
[google-appengine] Re: 1 application, multiple datastores
Via the admin console. Google provides this application code, and it is common - part of the platform offering. This is one possibility. Another is that an admin user for that customer is made available to you for administration purposes. You could initialise the customer data space with this user profile. It may depend how you map the authenticated entity to logical identities in your application. Whichever, you do not have application code capable of querying across customer data stores, because the platform does not allow it. On Dec 21, 10:49 pm, Andy Freeman wrote: > As I promised, now I'm going to ask how you plan to do maintenance and > fixes on behalf of your customers if you can't get to their data. > > If you have access to the customer's data, they're trusting your code > and Google is not protecting their data. > > On Dec 21, 2:13 pm, hawkett wrote: > > > > Yes, there is the issue that application code has to manage the > > > customer-specific datastores, but if multiple customers are hosted on > > > the same hardware, someone's code has to do that work and it's unclear > > > why application code can't be part of that process. If the response > > > is that application code isn't trusted by customers to maintain > > > separation, I'm going to ask how you do maintenance and fixes on their > > > behalf. > > > If data segregation is a fundamental feature of the platform, then it > > is inherently more trustable that N pieces of application code all > > attempting the same thing. Me saying 'My code will keep your data > > private' carries nothing like the weight that Google saying 'It is not > > possible to run a query across two data stores' does. I would only > > need to say 'Your data will be stored in a separate partition', and > > that has tangible meaning to the customer from a data security > > perspective. They are then placing their trust more in Google for > > this feature than in my application. 
> > > From a maintenance, reliability, trustability, transparency etc. > > perspective, moving a common feature (especially a security feature) > > from the application layer to platform layer is a major advantage, and > > something a good architecture should always try to achieve. > > > I want as little application code as possible to express my > > application. This is already one of the key wins of the GAE platform, > > and moving something as fundamental as data partitioning out of the > > application platform will enhance this capability. > > > On Dec 21, 5:55 pm, Andy Freeman wrote: > > > > > One suggests it > > > > should be impossible for the same piece of code to access separate > > > > datastore instances, the other suggests that this is a desirable > > > > feature. I don't see how you consider them the same - are you saying > > > > that you can't see how the cited bug is caused by multiple customers > > > > sharing the same data space? > > > > Right now, separate applications have separate code and separate > > > datastores. If management issues are the only obstacle to using > > > separate applications for different users, that tells us that separate > > > datastores do not share the same data space for these purposes. > > > > Yes, there is the issue that application code has to manage the > > > customer-specific datastores, but if multiple customers are hosted on > > > the same hardware, someone's code has to do that work and it's unclear > > > why application code can't be part of that process. If the response > > > is that application code isn't trusted by customers to maintain > > > separation, I'm going to ask how you do maintenance and fixes on their > > > behalf. > > > > Note that customers don't write application code in this model, > > > whether they use separate applications or one that uses customer- > > > specific datastores. > > > > Here's how it would work. 
Customer accesses system, system figures > > > out which datastore to use, system acts upon datastore on customer's > > > behalf using application code. > > > > Note that this is exactly the same way that any scheme with shared > > > hardware would accomplish the same separation. The only difference is > > > whether the "figure out" is done by Google or by you. > > > > On Dec 20, 7:30 pm, hawkett wrote: > > > > > Andy - they are essentially mutually exclusive. One suggests it > > > > should be impossible for the same piece of code to access separate > > > > datastore instances, the other suggests that this is a desirable > > > > feature. I don't see how you consider them the same - are you saying > > > > that you can't see how the cited bug is caused by multiple customers > > > > sharing the same data space? I don't understand your perspective - > > > > the difference seems utterly obvious to me. > > > > > I *can* see that depending on the use case, one or the other would be > > > > good. In most cases I would say access between different customer > > > > data spaces is better modelled through an A
[google-appengine] Re: 1 application, multiple datastores
As I promised, now I'm going to ask how you plan to do maintenance and fixes on behalf of your customers if you can't get to their data. If you have access to the customer's data, they're trusting your code and Google is not protecting their data. On Dec 21, 2:13 pm, hawkett wrote: > > Yes, there is the issue that application code has to manage the > > customer-specific datastores, but if multiple customers are hosted on > > the same hardware, someone's code has to do that work and it's unclear > > why application code can't be part of that process. If the response > > is that application code isn't trusted by customers to maintain > > separation, I'm going to ask how you do maintenance and fixes on their > > behalf. > > If data segregation is a fundamental feature of the platform, then it > is inherently more trustable that N pieces of application code all > attempting the same thing. Me saying 'My code will keep your data > private' carries nothing like the weight that Google saying 'It is not > possible to run a query across two data stores' does. I would only > need to say 'Your data will be stored in a separate partition', and > that has tangible meaning to the customer from a data security > perspective. They are then placing their trust more in Google for > this feature than in my application. > > From a maintenance, reliability, trustability, transparency etc. > perspective, moving a common feature (especially a security feature) > from the application layer to platform layer is a major advantage, and > something a good architecture should always try to achieve. > > I want as little application code as possible to express my > application. This is already one of the key wins of the GAE platform, > and moving something as fundamental as data partitioning out of the > application platform will enhance this capability. 
> > On Dec 21, 5:55 pm, Andy Freeman wrote: > > > > > > One suggests it > > > should be impossible for the same piece of code to access separate > > > datastore instances, the other suggests that this is a desirable > > > feature. I don't see how you consider them the same - are you saying > > > that you can't see how the cited bug is caused by multiple customers > > > sharing the same data space? > > > Right now, separate applications have separate code and separate > > datastores. If management issues are the only obstacle to using > > separate applications for different users, that tells us that separate > > datastores do not share the same data space for these purposes. > > > Yes, there is the issue that application code has to manage the > > customer-specific datastores, but if multiple customers are hosted on > > the same hardware, someone's code has to do that work and it's unclear > > why application code can't be part of that process. If the response > > is that application code isn't trusted by customers to maintain > > separation, I'm going to ask how you do maintenance and fixes on their > > behalf. > > > Note that customers don't write application code in this model, > > whether they use separate applications or one that uses customer- > > specific datastores. > > > Here's how it would work. Customer accesses system, system figures > > out which datastore to use, system acts upon datastore on customer's > > behalf using application code. > > > Note that this is exactly the same way that any scheme with shared > > hardware would accomplish the same separation. The only difference is > > whether the "figure out" is done by Google or by you. > > > On Dec 20, 7:30 pm, hawkett wrote: > > > > Andy - they are essentially mutually exclusive. One suggests it > > > should be impossible for the same piece of code to access separate > > > datastore instances, the other suggests that this is a desirable > > > feature. 
I don't see how you consider them the same - are you saying > > > that you can't see how the cited bug is caused by multiple customers > > > sharing the same data space? I don't understand your perspective - > > > the difference seems utterly obvious to me. > > > > I *can* see that depending on the use case, one or the other would be > > > good. In most cases I would say access between different customer > > > data spaces is better modelled through an API accessible by HTTP. > > > > Perhaps you have a different use case where you have the same app > > > deployed multiple times and do not have the customer data segregation > > > issue, but that is not what the original poster is talking about. The > > > original poster is *clearly* and *unambiguously* talking about > > > avoiding bugs like the one cited, and doing so through a low level > > > data partition. > > > > On Dec 21, 12:30 am, Andy Freeman wrote: > > > > > Neither of the cited discussions nor your comments explain why it's > > > > different that Bill's "access to separate datastore" request. In > > > > fact, his request is essentially "at least allows mapping a single > > > > datastore par
[google-appengine] Re: 1 application, multiple datastores
Apologies - the following fragment

'...moving something as fundamental as data partitioning out of the application platform will enhance this capability.'

should read

'...moving something as fundamental as data partitioning out of the application layer will enhance this capability.'

On Dec 21, 10:13 pm, hawkett wrote:
> I want as little application code as possible to express my
> application. This is already one of the key wins of the GAE platform,
> and moving something as fundamental as data partitioning out of the
> application platform will enhance this capability.
[google-appengine] Re: 1 application, multiple datastores
On Dec 21, 5:55 pm, Andy Freeman wrote:
> Yes, there is the issue that application code has to manage the
> customer-specific datastores, but if multiple customers are hosted on
> the same hardware, someone's code has to do that work and it's unclear
> why application code can't be part of that process. If the response
> is that application code isn't trusted by customers to maintain
> separation, I'm going to ask how you do maintenance and fixes on their
> behalf.

If data segregation is a fundamental feature of the platform, then it is inherently more trustable than N pieces of application code all attempting the same thing. Me saying 'My code will keep your data private' carries nothing like the weight that Google saying 'It is not possible to run a query across two data stores' does. I would only need to say 'Your data will be stored in a separate partition', and that has tangible meaning to the customer from a data security perspective. They are then placing their trust more in Google for this feature than in my application.

From a maintenance, reliability, trustability, transparency etc. perspective, moving a common feature (especially a security feature) from the application layer to the platform layer is a major advantage, and something a good architecture should always try to achieve.

I want as little application code as possible to express my application. This is already one of the key wins of the GAE platform, and moving something as fundamental as data partitioning out of the application platform will enhance this capability.
[google-appengine] Re: 1 application, multiple datastores
On Dec 20, 7:30 pm, hawkett wrote:
> One suggests it
> should be impossible for the same piece of code to access separate
> datastore instances, the other suggests that this is a desirable
> feature. I don't see how you consider them the same - are you saying
> that you can't see how the cited bug is caused by multiple customers
> sharing the same data space?

Right now, separate applications have separate code and separate datastores. If management issues are the only obstacle to using separate applications for different users, that tells us that separate datastores do not share the same data space for these purposes.

Yes, there is the issue that application code has to manage the customer-specific datastores, but if multiple customers are hosted on the same hardware, someone's code has to do that work and it's unclear why application code can't be part of that process. If the response is that application code isn't trusted by customers to maintain separation, I'm going to ask how you do maintenance and fixes on their behalf.

Note that customers don't write application code in this model, whether they use separate applications or one that uses customer-specific datastores.

Here's how it would work. Customer accesses system, system figures out which datastore to use, system acts upon datastore on customer's behalf using application code.

Note that this is exactly the same way that any scheme with shared hardware would accomplish the same separation. The only difference is whether the "figure out" is done by Google or by you.
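The three steps described above (customer accesses the system, the system figures out which datastore to use, the system acts on that datastore on the customer's behalf) could be sketched roughly as follows. This is only an illustration in plain Python: `CustomerStores`, `store_for` and `handle_request` are made-up names, and in-memory dicts stand in for datastores. Nothing here is an App Engine API.

```python
# Hypothetical sketch: none of these names are App Engine APIs.


class CustomerStores:
    """Keeps one isolated dict per customer, standing in for separate datastores."""

    def __init__(self):
        self._stores = {}

    def store_for(self, customer_id):
        # Step 2: the system, not the customer, figures out which store to use.
        return self._stores.setdefault(customer_id, {})


def handle_request(stores, customer_id, key, value=None):
    """Step 3: act on the chosen store on the customer's behalf."""
    store = stores.store_for(customer_id)
    if value is not None:
        store[key] = value
    return store.get(key)


# Step 1: two customers access the system with the same key.
stores = CustomerStores()
handle_request(stores, "company_a", "invoice-1", "paid")
handle_request(stores, "company_b", "invoice-1", "overdue")
```

Whether `store_for` lives in application code or in the platform is exactly the point under debate in this thread. The sketch only shows that the lookup step exists either way.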
[google-appengine] Re: 1 application, multiple datastores
Andy - they are essentially mutually exclusive. One suggests it should be impossible for the same piece of code to access separate datastore instances, the other suggests that this is a desirable feature. I don't see how you consider them the same - are you saying that you can't see how the cited bug is caused by multiple customers sharing the same data space? I don't understand your perspective - the difference seems utterly obvious to me.

I *can* see that depending on the use case, one or the other would be good. In most cases I would say access between different customer data spaces is better modelled through an API accessible by HTTP.

Perhaps you have a different use case where you have the same app deployed multiple times and do not have the customer data segregation issue, but that is not what the original poster is talking about. The original poster is *clearly* and *unambiguously* talking about avoiding bugs like the one cited, and doing so through a low level data partition.

On Dec 21, 12:30 am, Andy Freeman wrote:
> Neither of the cited discussions nor your comments explain why it's
> different than Bill's "access to separate datastore" request. In
> fact, his request is essentially "at least allows mapping a single
> datastore partition to the authenticated entity".
>
> There are some issues with accounting, but if your app can do its
> accounting in the user's datastore, you get that too.
[google-appengine] Re: 1 application, multiple datastores
Neither of the cited discussions nor your comments explain why it's different than Bill's "access to separate datastore" request. In fact, his request is essentially "at least allows mapping a single datastore partition to the authenticated entity".

There are some issues with accounting, but if your app can do its accounting in the user's datastore, you get that too.

On Dec 20, 5:09 am, hawkett wrote:
> This is a required feature for a commercial SaaS/PaaS offering, and is
> not the same as Bill's issue in previous thread entry (Issue 106).
>
> You can't get a more 'core' use-case than that for a SaaS/PaaS
> platform. Notice there is no requirement to deploy a new version of
> the app for this customer. The system spawns a virtual instance of
> the app - or at least allows mapping a single datastore partition to
> the authenticated entity. You could extend it by allowing multiple
> datastores per authenticated entity and choosing the appropriate one
> at authentication time.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---
[google-appengine] Re: 1 application, multiple datastores
You can just use separate tables for each customer. I haven't tried it myself yet, so I don't know all the problems you will have to deal with. Of course, then the dynamic handling of class names for the models gets more complex (and I am not aware of any existing framework for that), and porting existing Python web apps to App Engine also gets much more complicated.

An easier approach is just to prepend keys with a customer identifier, but then you can still query across all customers and, if things go wrong, possibly break the segregation.

I do not know how Google has solved this problem at "Google Apps for your domain", but of course would love to hear how they have done it.

regards
Roberto

On Dec 20, 10:09 am, hawkett wrote:
> We need it to be as close to impossible for one customer's data to be
> made available to another customer, without having to deploy a new
> instance of the application.
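The key-prefix approach mentioned above, and the way it can still leak, might look something like this. It is a minimal sketch in plain Python under obvious assumptions: a single dict stands in for the shared datastore, and `put`, `query_customer` and `query_buggy` are hypothetical helpers, not datastore calls.

```python
# Hypothetical sketch: a plain dict stands in for one shared datastore.
shared_store = {}


def put(customer_id, key, value):
    # Prepend the customer identifier to every key.
    shared_store[f"{customer_id}:{key}"] = value


def query_customer(customer_id):
    # Correct query: only keys carrying this customer's prefix are returned.
    prefix = f"{customer_id}:"
    return {k: v for k, v in shared_store.items() if k.startswith(prefix)}


def query_buggy():
    # Bug: the prefix filter was forgotten, so every customer's rows come back.
    return dict(shared_store)


put("company_a", "invoice-1", "paid")
put("company_b", "invoice-1", "overdue")
```

Here the segregation holds only as long as every query remembers the prefix. `query_buggy` shows the kind of mistake that exposes one customer's data to another.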
[google-appengine] Re: 1 application, multiple datastores
This is a required feature for a commercial SaaS/PaaS offering, and is not the same as Bill's issue in previous thread entry (Issue 106). This discussion can help you understand why -

http://blogs.zdnet.com/service-oriented/?p=1236

as can bugs like this

http://forum.assembla.com/forums/3/topics/256

We need it to be as close to impossible for one customer's data to be made available to another customer, without having to deploy a new instance of the application.

Let's call it data segregation. A concept of 'virtual instances' would be a possible approach - so we can aggregate billing & quota stats across multiple instances, and also identify individual instance billing and quota.

Use Case:
1. Customer comes to my site
2. Clicks the 'Sign up now' button
3. Enters their details
4. Starts using the system

You can't get a more 'core' use-case than that for a SaaS/PaaS platform. Notice there is no requirement to deploy a new version of the app for this customer. The system spawns a virtual instance of the app - or at least allows mapping a single datastore partition to the authenticated entity. You could extend it by allowing multiple datastores per authenticated entity and choosing the appropriate one at authentication time.

The key requirement is that we can on-board a customer without manual intervention, and accurately understand a single customer's usage profile. Data corruption for one customer does not equal data corruption for another customer.

This feature is in some ways the *opposite* of the feature request identified by the previous poster - we *do not* want to be able to access data in another partition - even if we tried to, and especially via a bug in our code.

Here it is, please star it :)

http://code.google.com/p/googleappengine/issues/detail?id=945

Any chance someone at Google has something to say about it?

Thanks,

Colin

On Dec 20, 5:17 am, Ben Bishop wrote:
> One App Engine account can have 10 apps, each with its own datastore
> and quota. You could deploy a single app's codebase to multiple app
> slots, simply by changing the app name in the app.yaml for each
> instance.
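To make the request above a little more concrete, here is a rough sketch of what "mapping a single datastore partition to the authenticated entity" could mean from the application's point of view. Everything in it is an assumption for illustration: `Platform`, `PartitionHandle`, `sign_up` and `authenticate` are invented names, and plain dicts stand in for partitions. It is not how App Engine works.

```python
# Hypothetical sketch: `Platform` and `PartitionHandle` are illustrative names,
# not App Engine APIs.


class PartitionHandle:
    """Gives access to exactly one customer's data and nothing else."""

    def __init__(self, data):
        # Only this partition's dict is referenced here. (Python privacy is
        # by convention; a real platform would have to enforce the boundary.)
        self._data = data

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def keys(self):
        return sorted(self._data)


class Platform:
    """Stands in for the platform layer that owns all partitions."""

    def __init__(self):
        self._partitions = {}

    def sign_up(self, customer_id):
        # On-boarding needs no new deployment, just a new empty partition.
        if customer_id in self._partitions:
            raise ValueError("customer already exists")
        self._partitions[customer_id] = {}

    def authenticate(self, customer_id):
        # The partition is chosen once, at authentication time.
        return PartitionHandle(self._partitions[customer_id])


platform = Platform()
platform.sign_up("company_a")
platform.sign_up("company_b")
store_a = platform.authenticate("company_a")
store_b = platform.authenticate("company_b")
store_a.put("invoice-1", "paid")
```

The handle has no method that takes a customer identifier, so application code holding `store_a` has no way to name company B's partition in a query. That is the property being asked for, here enforced only by the shape of the interface.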
[google-appengine] Re: 1 application, multiple datastores
Not sure what you mean by "in case something happens" - your app and its datastore are served by the same network of servers that serve other apps, so separate accounts won't help (unless you're going against the Terms of Service, running the risk of having an account banned).

One App Engine account can have 10 apps, each with its own datastore and quota. You could deploy a single app's codebase to multiple app slots, simply by changing the app name in the app.yaml for each instance. That way you could test on a production "test" app or one of your client apps before rolling out updates to your other client apps.

You still maintain a single codebase, each client app has its own datastore, and you can control updates.

On Dec 20, 2:05 am, GTako wrote:
> Hi, is it possible to maintain under 1 application, multiple
> datastores that each datastore will be as if it is different app
> engine account?
> for example: i have a web application that should serve 2 companies, A
> and B. I would want to open a google app engine account for the web
> application files. the datastores for A and B could be 2 different
> deployments under the same app engine account or under separate
> accounts. now assume i have N companies. what should i do?
> the reason for separation is that i don't want the datastores to be
> dependent and under the same account in case something happens. please
> advise.
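As a rough illustration of the "change the app name in the app.yaml" suggestion above, the per-client config files could be generated from one base file. This is a sketch only: the base file is an assumed legacy-style fragment (handlers omitted), the app names are made up, and `app_yaml_for` is a hypothetical helper, not part of any SDK.

```python
import re

# Assumed legacy-style app.yaml fragment; the app names below are made up.
BASE_APP_YAML = """\
application: myapp-test
version: 1
runtime: python
api_version: 1
"""


def app_yaml_for(app_name, base=BASE_APP_YAML):
    """Return a copy of app.yaml with only the `application:` line changed."""
    new_text, count = re.subn(
        r"^application:.*$",
        lambda match: f"application: {app_name}",
        base,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("expected exactly one 'application:' line")
    return new_text


# One config per app slot, all derived from the same codebase.
configs = {
    name: app_yaml_for(name)
    for name in ["myapp-test", "myapp-client-a", "myapp-client-b"]
}
```

Each generated text would be written out as app.yaml before deploying to the corresponding app slot, so the test app can receive an update before the client apps do.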
[google-appengine] Re: 1 application, multiple datastores
It's currently not possible to address multiple datastores. Just looking at the API, it looks like addressing datastores should be possible because the keys include an app name, etc., but the App Engine team has said this feature is not coming any time soon. Cross-app datastore queries complicate the business model when you offer free apps. I think, though, that this is an important feature and should be supported under the pay-as-you-go option, i.e., if you want your datastore to be available cross-app, you elect to forfeit your free quota.

Feel free to star this enhancement request:

http://code.google.com/p/googleappengine/issues/detail?id=106

-Bill

On Dec 19, 10:05 am, GTako wrote:
> Hi, is it possible to maintain under 1 application, multiple
> datastores that each datastore will be as if it is different app
> engine account?
> for example: i have a web application that should serve 2 companies, A
> and B. I would want to open a google app engine account for the web
> application files. the datastores for A and B could be 2 different
> deployments under the same app engine account or under separate
> accounts. now assume i have N companies. what should i do?
> the reason for separation is that i don't want the datastores to be
> dependent and under the same account in case something happens. please
> advise.