Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 03/09/2015 4:02 PM, Dan Smith wrote:
>> - do we need to migrate the db to somehow handle a different set of
>> attributes and what happens for nosql dbs?
>
> No, Nova made no schema changes as a result of moving to objects.

so i don't really understand how this part works. if i have two collectors -- one collector writes v1 schema, one writes v2 schema -- how do they both write to the same db without anything changing in the db? presumably the db would be configured to know how to store only one version?

cheers,
-- gord

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 28/08/15 01:54 PM, Dan Smith wrote:
>> we store everything as primitives: floats, time, integer, etc... since
>> we need to query on attributes. it seems like versionedobjects might not
>> be useful to our db configuration currently.
>
> I don't think the former determines the latter -- we have lots of things
> stored as rows of column primitives and query them out as objects, but
> then you're not storing the object and version (unless you do it
> separately). So, if it doesn't buy you anything, then there's no reason
> to use it.

sorry, i misunderstood this. i thought you were saying ovo may not fit into Ceilometer.

i guess to give it more of a real context for us to understand, regarding the database layer, if we have an events model which consists of:

- id: uuid
- event_type: string
- generated: timestamp
- raw: dictionary value (not meant for querying, just for auditing purposes)
- traits: [list of tuples (key, value, type)]

given this model, each of our backend drivers takes this data and, using its connection to the db, stores the data accordingly:

- in mongodb, the attributes are all stored in documents similar to json, raw attr is stored as json
- in elasticsearch, they're stored in documents as well but traits are mapped differently from mongo
- in hbase, the attributes and traits are all mapped to columns
- in sql, the data is mapped to an Event table, traits are mapped to different traits tables depending on type, raw attribute stored as a string.

considering everything is stored differently depending on db, how does ovo work? is it normalising it into a specific format pre-storage? how does different data with different schemas co-exist on the same db?

- is there some version tag applied to each item and a version schema table created somewhere?
- do we need to migrate the db to somehow handle a different set of attributes and what happens for nosql dbs?

also, from api querying pov, if i want to query a db, how do you query/filter across different versions?

- does ovo tell the api what versions exist in db and then you can filter across any attribute from any schema version?
- are certain attributes effectively unqueryable because they may not exist across all versions?

apologies on not understanding how it all works or if the above has nothing to do with ovo (i wasn't joking about the 'explain it to me like i'm 5' request :-) ) ... i think part of the wariness is that the code seemingly does nothing now (or the logic is extremely hidden) but if we merge these x hundred/thousand lines of code, it will do something later if something changes.

cheers,
-- gord
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
>>> we store everything as primitives: floats, time, integer, etc... since
>>> we need to query on attributes. it seems like versionedobjects might not
>>> be useful to our db configuration currently.
>>
>> I don't think the former determines the latter -- we have lots of things
>> stored as rows of column primitives and query them out as objects, but
>> then you're not storing the object and version (unless you do it
>> separately). So, if it doesn't buy you anything, then there's no reason
>> to use it.
>
> sorry, i misunderstood this. i thought you were saying ovo may not fit
> into Ceilometer.

Nope, what I meant was: there's no reason to use the technique of
storing serialized objects as blobs in the database if you don't want to
store things like that.

> i guess to give it more of a real context for us to understand,
> regarding the database layer, if we have an events model which consists of:
>
> - id: uuid
> - event_type: string
> - generated: timestamp
> - raw: dictionary value (not meant for querying, just for auditing
>   purposes)
> - traits: [list of tuples (key, value, type)]
>
> given this model, each of our backend drivers takes this data and, using
> its connection to the db, stores the data accordingly:
> - in mongodb, the attributes are all stored in documents similar to
>   json, raw attr is stored as json

Right, so you could store the serialized version of the object in mongo
like this very easily. When you go to pull data out of the database
later, you have a strict format, and a version tied to it so that you
know exactly how it was stored. If you have storage drivers that handle
taking the generic thing and turning it into something appropriate for a
given store, then it's entirely possible that you are best suited to be
tolerant of old data there.

In Nova, we treat the object schema as the interface the rest of the
code uses and expects. Tolerance of the actual persistence schema moving
underneath and over time is hidden in this layer so that things above
don't have to know about it.

> - in sql, the data is mapped to an Event table, traits are mapped to
>   different traits tables depending on type, raw attribute stored as a
>   string.

Yep, so when storing in a SQL database, you'd (presumably) not store the
serialized blobs, and rather pick the object apart to store it as a row
(like most of the things in nova are stored).

> considering everything is stored differently depending on db, how does
> ovo work? is it normalising it into a specific format pre-storage? how
> does different data with different schemas co-exist on the same db?

This is completely up to your implementation. You could end up with a
top-level object like Event that doesn't implement .save(), and then
subclasses like SQLEvent and MongoEvent that do. All the structure could
be defined at the top, but the implementations of how to store/retrieve
them are separate. The mongo one might be very simple because it can
just use the object infrastructure to get the serialized blob and store
it. The SQL one would turn the object's fields into an INSERT statement
(or a SQLAlchemy thing).

> - is there some version tag applied to each item and a version schema
>   table created somewhere?

The object defines the schema as a list of tightly typed fields, a bunch
of methods, and a version. In this purely DB-specific case, all it does
is provide you a facade with which to hide things like storing to a
different version or format of schema. For projects that send things
over RPC and then dump them in the database, it's super convenient that
this is all one thing.

> - do we need to migrate the db to somehow handle a different set of
>   attributes and what happens for nosql dbs?

No, Nova made no schema changes as a result of moving to objects.

> also, from api querying pov, if i want to query a db, how do you
> query/filter across different versions?
> - does ovo tell the api what versions exist in db and then you can
>   filter across any attribute from any schema version?

Nope, o.vo doesn't do any of this for you magically. It merely sets up a
place for you to do that work. In nova, we use them for RPC and DB
storage, which means if we have an old node that receives a new object
over RPC (or the opposite) we have rules that define how we handle that.
Thus, we can apply the same rules to reading the DB, where some objects
might be older or newer.

> apologies on not understanding how it all works or if the above has
> nothing to do with ovo (i wasn't joking about the 'explain it to me like
> i'm 5' request :-) ) ... i think part of the wariness is that the code
> seemingly does nothing now (or the logic is extremely hidden) but if we
> merge these x hundred/thousand lines of code, it will do something later
> if something changes.

It really isn't magic and really doesn't do a huge amount of work for
you. It's a pattern as much as anything, and most of the benefit comes
from the serialization and version handling of things over RPC. Part
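A rough sketch of the "top-level Event plus SQLEvent/MongoEvent" layout described in this reply may help. It is plain standard-library Python, not the oslo.versionedobjects API; the `VERSION` value, the `to_primitive` helper, the in-memory `store` arguments and the sample event are all assumptions made up for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Event:
    """Backend-neutral event; field names follow the model in the thread."""

    VERSION = "1.0"  # hypothetical schema version, not taken from the thread

    id: str
    event_type: str
    generated: str
    raw: dict = field(default_factory=dict)
    traits: list = field(default_factory=list)  # [(key, value, type), ...]

    def to_primitive(self):
        # The version travels with the data, so a reader knows how it was stored.
        return {"name": "Event", "version": self.VERSION, "data": asdict(self)}

    def save(self, store):
        raise NotImplementedError("backend subclasses decide how to persist")


class MongoEvent(Event):
    def save(self, store):
        # Document store stand-in (a list): keep the whole versioned blob.
        store.append(json.dumps(self.to_primitive()))


class SQLEvent(Event):
    def save(self, store):
        # Relational stand-in (dict of tables): pick the object apart into rows.
        store.setdefault("event", []).append(
            (self.id, self.event_type, self.generated, json.dumps(self.raw))
        )
        for key, value, kind in self.traits:
            store.setdefault("trait_" + kind, []).append((self.id, key, value))


# Made-up sample data, saved through both hypothetical backends.
sample = dict(id="e1", event_type="compute.instance.create",
              generated="2015-08-28T12:00:00", raw={"k": "v"},
              traits=[("vcpus", 2, "int")])
docs, tables = [], {}
MongoEvent(**sample).save(docs)
SQLEvent(**sample).save(tables)
```

The structure is defined once on `Event`; only `save()` differs per backend, which is the facade idea the reply describes.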
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 9/1/15, 11:31 AM, "gord chung"wrote: > > >On 28/08/2015 5:18 PM, Alec Hothan (ahothan) wrote: >> >> >> >> >> On 8/28/15, 11:39 AM, "gord chung" wrote: >> >>> i should start by saying i re-read my subject line and it arguably comes >>> off aggressive -- i should probably have dropped 'explain' :) >>> >>> On 28/08/15 01:47 PM, Alec Hothan (ahothan) wrote: On 8/28/15, 10:07 AM, "gord chung" wrote: > On 28/08/15 12:18 PM, Roman Dobosz wrote: >> So imagine we have new versions of the schema for the events, alarms or >> samples in ceilometer introduced in Mitaka release while you have all >> your ceilo services on Liberty release. To upgrade ceilometer you'll >> have to stop all services to avoid data corruption. With >> versionedobjects you can do this one by one without disrupting >> telemetry jobs. > are versions checked for every single message? has anyone considered the > overhead to validating each message? since ceilometer is queue based, we > could technically just publish to a new queue when schema changes... and > the consuming services will listen to the queue it knows of. > > ie. our notification service changes schema so it will now publish to a > v2 queue, the existing collector service consumes the v1 queue until > done at which point you can upgrade it and it will listen to v2 queue. > > this way there is no need to validate/convert anything and you can still > take services down one at a time. this support doesn't exist currently > (i just randomly thought of it) but assuming there's no flaw in my idea > (which there may be) isn't this more efficient? If high performance is a concern for ceilometer (and it should) then maybe there might be better options than JSON? JSON is great for many applications but can be inappropriate for other demanding applications. There are other popular open source encoding options that yield much more compact wire payload, more efficient encoding/decoding and handle versioning to a reasonable extent. >>> i should clarify. 
we let oslo.messaging serialise our dictionary how it >>> does... i believe it's JSON. i'd be interested to switch it to something >>> more efficient. maybe it's time we revive the msgpacks patch[1] or are >>> there better alternatives? (hoping i didn't just unleash a storm of >>> 'this is better' replies) >> I'd be curious to know if there is any benchmark on the oslo serializer for >> msgpack and how it compares to JSON? >> More important is to make sure we're optimizing in the right area. >> Do we have a good understanding of where ceilometer needs to improve to >> scale or is it still not quite clear cut? > >re: serialisation, that probably isn't the biggest concern for >Ceilometer performance. the main items are storage -- to be addressed by >Gnocchi/tsdb, and polling load. i just thought i'd point out an existing >serialisation patch since we were on the topic :-) Is there any data measuring the polling load on large scale deployments? Was there a plan to reduce the polling load to an acceptable level? If yes could you provide any pointer if any? > >> Queue based versioning might be less runtime overhead per message but at the expense of a potentially complex queue version management (which can become tricky if you have more than 2 versions). I think Neutron was considering to use versioned queues as well for its rolling upgrade (along with versioned objects) and I already pointed out that managing the queues could be tricky. In general, trying to provide a versioning framework that allows to do arbitrary changes between versions is quite difficult (and often bound to fail). >>> yeah, so that's what a lot of the devs are debating about right now. >>> performance is our key driver so if we do something we think/know will >>> negatively impact performance, it better bring a whole lot more of >>> something else. if queue based versioning offers comparable >>> functionalities, i'd personally be more interested to explore that route >>> first. 
is there a thread/patch/log that we could read to see what >>> Neutron discovered when they looked into it? >> The versioning comments are buried in this mega patch if you are brave >> enough to dig in: >> >> https://review.openstack.org/#/c/190635 >> >> The (offline) conclusion was that this was WIP and deserved more discussion >> (need to check back with Miguel and Ihar from the Neutron team). >> One option considered in that discussion was to use oslo messaging topics to >> manage flows of messages that had different versions (and still use >> versionedobjects). So if you have 3 versions in your cloud you'd end up with >> 3 topics (and as many queues when it comes to Rabbit). What is complex is to >> manage the queues/topic names (how to name them), how to
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 02/09/2015 11:25 AM, Alec Hothan (ahothan) wrote:
> On 9/1/15, 11:31 AM, "gord chung" wrote:
>> re: serialisation, that probably isn't the biggest concern for
>> Ceilometer performance. the main items are storage -- to be addressed by
>> Gnocchi/tsdb, and polling load. i just thought i'd point out an existing
>> serialisation patch since we were on the topic :-)
>
> Is there any data measuring the polling load on large scale deployments?
> Was there a plan to reduce the polling load to an acceptable level? If
> yes, could you provide a pointer?

i'm not sure any user has provided numbers when raising the issue -- just that it's 'high'. this should probably be done in a separate thread as i don't want it to get lost in a completely unrelated subject. that said, an initial patch to minimise load was done in Liberty[1] and a secondary proposal for M*[2].

>> conceptually, i would think only the consumers need to know about all
>> the queues and even then, it should only really need to know about the
>> ones it understands. the producers (polling agents) can just fire off to
>> the correct versioned queue and be done... thanks for the above link
>> (it'll help with discussion/spec design).
>
> When everything goes according to plan, any solution can work but this
> is hardly the case in production, especially at scale.
> Here are a few questions that may help in the discussion:
> - how are versioned queues named?
> - who creates a versioned queue (producer or consumer?) and who deletes
>   it when no more entities of that version are running?
> - how to make sure a producer is not producing in a queue that has no
>   consumer (a messaging infra like rabbit is designed to decouple
>   producers from consumers)
> - all corner cases of entities (consumers or producers) popping up with
>   a newer or older version, and terminating (gracefully or not) during
>   the upgrade/downgrade, what happens to the queues...
>
> IMHO using a simple communication schema (1 topic/queue for all
> versions) with in-band message versioning is a much less complex
> proposition than juggling with versioned queues (not to say the former
> is simple to do).
> With versioned queues you're kind of trading off the per message
> versioning with per queue versioning but at the expense of:
> - a complex queue management (if you want to do it right)
> - a no less complex per queue message decoding (since the consumer needs
>   to know how to decode and interpret every message depending on the
>   version of the queue it comes from)
> - a more difficult debug environment (harder to debug multiple queues
>   than 1 queue)
> - and added stress on oslo messaging (due to the use of transient queues)

thanks, good items to think about when building spec. will be sure to add link when initial draft is ready.

[1] https://blueprints.launchpad.net/ceilometer/+spec/resource-metadata-caching
[2] https://review.openstack.org/#/c/209799/

-- gord
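To make the "1 topic/queue for all versions with in-band message versioning" option concrete, here is a minimal standard-library sketch. It does not use oslo.messaging or oslo.versionedobjects; the envelope layout, the `SUPPORTED` set and the assumed v1-to-v2 field rename are hypothetical and only there to show the shape of the consumer logic.

```python
from collections import deque

SUPPORTED = {1, 2}  # versions this hypothetical consumer understands


def upgrade_v1(payload):
    # Assumed change for illustration only: v2 renamed "ts" to "generated".
    data = dict(payload)
    data["generated"] = data.pop("ts")
    return data


def consume(queue):
    """Drain one shared queue, dispatching on the version carried in-band."""
    stored, rejected = [], []
    while queue:
        msg = queue.popleft()
        version = msg["version"]
        if version not in SUPPORTED:
            rejected.append(msg)  # newer than we know: flag it, don't guess
        elif version == 1:
            stored.append(upgrade_v1(msg["payload"]))
        else:
            stored.append(msg["payload"])
    return stored, rejected


# One queue carrying messages from producers at three different versions.
queue = deque([
    {"version": 1, "payload": {"id": "a", "ts": "t0"}},
    {"version": 2, "payload": {"id": "b", "generated": "t1"}},
    {"version": 3, "payload": {"id": "c", "generated": "t2", "extra": 1}},
])
stored, rejected = consume(queue)
```

The per-message cost is one version lookup, and there is a single queue to name, create and debug; the price is that every consumer has to carry the upgrade rules for the versions it accepts.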
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 28/08/2015 5:18 PM, Alec Hothan (ahothan) wrote: On 8/28/15, 11:39 AM, "gord chung"wrote: i should start by saying i re-read my subject line and it arguably comes off aggressive -- i should probably have dropped 'explain' :) On 28/08/15 01:47 PM, Alec Hothan (ahothan) wrote: On 8/28/15, 10:07 AM, "gord chung" wrote: On 28/08/15 12:18 PM, Roman Dobosz wrote: So imagine we have new versions of the schema for the events, alarms or samples in ceilometer introduced in Mitaka release while you have all your ceilo services on Liberty release. To upgrade ceilometer you'll have to stop all services to avoid data corruption. With versionedobjects you can do this one by one without disrupting telemetry jobs. are versions checked for every single message? has anyone considered the overhead to validating each message? since ceilometer is queue based, we could technically just publish to a new queue when schema changes... and the consuming services will listen to the queue it knows of. ie. our notification service changes schema so it will now publish to a v2 queue, the existing collector service consumes the v1 queue until done at which point you can upgrade it and it will listen to v2 queue. this way there is no need to validate/convert anything and you can still take services down one at a time. this support doesn't exist currently (i just randomly thought of it) but assuming there's no flaw in my idea (which there may be) isn't this more efficient? If high performance is a concern for ceilometer (and it should) then maybe there might be better options than JSON? JSON is great for many applications but can be inappropriate for other demanding applications. There are other popular open source encoding options that yield much more compact wire payload, more efficient encoding/decoding and handle versioning to a reasonable extent. i should clarify. we let oslo.messaging serialise our dictionary how it does... i believe it's JSON. 
i'd be interested to switch it to something more efficient. maybe it's time we revive the msgpacks patch[1] or are there better alternatives? (hoping i didn't just unleash a storm of 'this is better' replies) I'd be curious to know if there is any benchmark on the oslo serializer for msgpack and how it compares to JSON? More important is to make sure we're optimizing in the right area. Do we have a good understanding of where ceilometer needs to improve to scale or is it still not quite clear cut? re: serialisation, that probably isn't the biggest concern for Ceilometer performance. the main items are storage -- to be addressed by Gnocchi/tsdb, and polling load. i just thought i'd point out an existing serialisation patch since we were on the topic :-) Queue based versioning might be less runtime overhead per message but at the expense of a potentially complex queue version management (which can become tricky if you have more than 2 versions). I think Neutron was considering to use versioned queues as well for its rolling upgrade (along with versioned objects) and I already pointed out that managing the queues could be tricky. In general, trying to provide a versioning framework that allows to do arbitrary changes between versions is quite difficult (and often bound to fail). yeah, so that's what a lot of the devs are debating about right now. performance is our key driver so if we do something we think/know will negatively impact performance, it better bring a whole lot more of something else. if queue based versioning offers comparable functionalities, i'd personally be more interested to explore that route first. is there a thread/patch/log that we could read to see what Neutron discovered when they looked into it? 
The versioning comments are buried in this mega patch if you are brave enough to dig in: https://review.openstack.org/#/c/190635 The (offline) conclusion was that this was WIP and deserved more discussion (need to check back with Miguel and Ihar from the Neutron team). One option considered in that discussion was to use oslo messaging topics to manage flows of messages that had different versions (and still use versionedobjects). So if you have 3 versions in your cloud you'd end up with 3 topics (and as many queues when it comes to Rabbit). What is complex is to manage the queues/topic names (how to name them), how to discover them and how to deal with all the corner cases (like a new node coming in with an arbitrary version, nodes going away at any moment, downgrade cases). conceptually, i would think only the consumers need to know about all the queues and even then, it should only really need to know about the ones it understands. the producers (polling agents) can just fire off to the correct versioned queue and be done... thanks for the above link (it'll help with discussion/spec design). cheers, -- gord __
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On Thu, 27 Aug 2015 15:37:24 -0400 gord at live.ca (gord chung) wrote:
> polling agent --- topic queue --- notification agent --- topic queue
> --- collector (direct connection to db)
>
> or
>
> OpenStack service --- topic queue --- notification agent --- topic
> queue --- collector (direct connection to db)
>
> or from Aodh/alarming pov:
>
> ceilometer-api (direct connection to db) --- http --- alarm evaluator
> --- rpc --- alarm notifier --- http --- [Heat/other]
>
> based on the above workflows, is there a good place for adoption of
> versionedobjects? and if so, what is the benefit? most of us are keen
> on adopting consistent design practices but none of us can honestly
> determine why versionedobjects would be beneficial to Ceilometer. if
> someone could explain it to us like we are 5 -- it's probably best to
> explain everything/anything like i'm 5 -- that would help immensely on
> moving this work forward.

Hi Gordon,

The first thing that comes to my mind is database schema changes - this is the area that OVO is aiming at. Even though you don't need to change the schema today, it might happen in the future. So imagine we have new versions of the schema for the events, alarms or samples in ceilometer introduced in the Mitaka release while you have all your ceilo services on the Liberty release. To upgrade ceilometer you'll have to stop all services to avoid data corruption. With versionedobjects you can do this one by one without disrupting telemetry jobs.

The other thing, maybe not so obvious, is to put a versionedobjects layer between the application and the MongoDB driver, so that all of the schema changes will be handled automatically by ovo, and serialization might also be done in such a layer.

Hope that clears your doubts.

--
Cheers,
Roman Dobosz
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
> there was a little skepticism because it was originally sold as magic,
> but reading the slides from Vancouver[1], it is not magic.

I think I specifically said they're not magic in my slides. Not sure who sold you them as magic, but you should leave them a less-than-five-stars review.

> Ceilometer functions mainly on queue-based IPC. most of the
> communication is async transferring of json payloads where callback is
> not required. the basic workflows are:

This is specifically something versionedobjects should help with. The remotable RPC method calls on an object are something that nova uses heavily, but other projects don't use at all.

> polling agent --- topic queue --- notification agent --- topic queue
> --- collector (direct connection to db)

What happens if any of these components are running different versions of the ceilometer code at one point? During an upgrade, you presumably don't want to have to take all of these things down at once, and so the notification agent might get an object from the polling agent that is older or newer than it expects.

More specifically, maybe the collector is writing to an older schema and gets a newer object from the front of the queue with data it can't store. If you're getting versionedobjects instead of raw json, you at least have an indication that this is happening. If you get an older object, you might choose to do something specific for the fields that are now in the DB schema, but aren't in the object you received.

> OpenStack service --- topic queue --- notification agent --- topic
> queue --- collector (direct connection to db)

This is a good one. If Nova was sending notifications as objects, then the notification agent would get a version with each notification, knowing specifically when the notification is newer than it supports, instead of us just changing things (on purpose or by accident) and you breaking.

From the storage in the DB perspective, I'm not sure what your persistence looks like. However, we've been storing _some_ things in our DB as serialized objects. That means that if we pull something out in a year, after which time things in the actual object implementation have changed, then we have an indication of what version it was stored in, and presumably can apply a process to update it (or handle the differences) at load time. I'm not sure if that's useful for ceilometer, but it is definitely useful for nova, where we can avoid converting everything in the database every time we add/change a field in something -- a process that is very critical to avoid in our goals for improving the upgrade experience for operators.

So, I dunno if ceilometer needs to adopt versionedobjects for anything. It seems like it would apply to the cases you describe above, but if not, there's no reason to use it just because others are. Nova will (when I stop putting it off) start sending notifications as serialized and versioned objects at some point, but you may choose to just unwrap it and treat it as a json blob beyond the handler, if that's what is determined as the best course.

--Dan
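The "indication of what version it was stored in" and "update it at load time" idea can be sketched as a lazy upgrade on read. This is an illustration in plain Python under stated assumptions, not how Nova or oslo.versionedobjects implement it: the blob layout, `CURRENT_VERSION`, and the two upgrade steps (adding `unit` and `traits`) are invented for the example.

```python
import json

CURRENT_VERSION = (1, 2)  # hypothetical version the running code expects


def _to_1_1(data):
    return {**data, "unit": "unknown"}  # assumed: 1.1 added a "unit" field


def _to_1_2(data):
    return {**data, "traits": []}  # assumed: 1.2 added a "traits" field


# Maps a stored version to (next version, function that upgrades the data).
UPGRADES = {(1, 0): ((1, 1), _to_1_1), (1, 1): ((1, 2), _to_1_2)}


def load(blob):
    """Deserialize a stored blob and upgrade it to CURRENT_VERSION on read."""
    doc = json.loads(blob)
    version = tuple(int(part) for part in doc["version"].split("."))
    if version > CURRENT_VERSION:
        # Written by newer code: report it rather than silently misread it.
        raise ValueError("stored version %s is newer than this code" % doc["version"])
    data = doc["data"]
    while version != CURRENT_VERSION:
        version, step = UPGRADES[version]
        data = step(data)
    return data


# Rows written by old and current code sit side by side, unmigrated.
old_row = json.dumps({"version": "1.0", "data": {"id": "a"}})
new_row = json.dumps({"version": "1.2",
                      "data": {"id": "b", "unit": "B", "traits": []}})
upgraded = load(old_row)
unchanged = load(new_row)
```

Nothing in the store is rewritten when a field is added; old rows are brought up to date only when something reads them, which is the upgrade-cost saving the message points at.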
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 28/08/15 12:18 PM, Roman Dobosz wrote:
> So imagine we have new versions of the schema for the events, alarms or
> samples in ceilometer introduced in the Mitaka release while you have
> all your ceilo services on the Liberty release. To upgrade ceilometer
> you'll have to stop all services to avoid data corruption. With
> versionedobjects you can do this one by one without disrupting
> telemetry jobs.

are versions checked for every single message? has anyone considered the overhead of validating each message? since ceilometer is queue based, we could technically just publish to a new queue when the schema changes... and the consuming services will listen to the queue they know of.

ie. our notification service changes schema so it will now publish to a v2 queue, the existing collector service consumes the v1 queue until done, at which point you can upgrade it and it will listen to the v2 queue.

this way there is no need to validate/convert anything and you can still take services down one at a time. this support doesn't exist currently (i just randomly thought of it) but assuming there's no flaw in my idea (which there may be) isn't this more efficient?

> The other thing, maybe not so obvious, is to put a versionedobjects
> layer between the application and the MongoDB driver, so that all of
> the schema changes will be handled automatically by ovo, and
> serialization might also be done in such a layer.

i don't quite understand this, is this a mongodb specific solution? admittedly i can imagine the schemaless design of mongo causing issues, but currently we're trying to avoid wasting resources on the existing mongodb solution as we attempt to move to a new api. if it's just a generic db solution, i'd be interested to apply it to future designs.

cheers,
-- gord
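The versioned-queue idea floated in this message could look roughly like the following. The message itself notes that this support does not exist, so everything here is a hypothetical sketch: the `Broker` class is an in-memory stand-in rather than oslo.messaging, and the `metering.vN` naming scheme is invented.

```python
from collections import defaultdict

BASE_TOPIC = "metering"  # hypothetical topic name


def topic_for(version):
    # Assumed naming scheme: one topic per schema version.
    return "%s.v%d" % (BASE_TOPIC, version)


class Broker:
    """In-memory stand-in for the message bus; not oslo.messaging."""

    def __init__(self):
        self.queues = defaultdict(list)

    def publish(self, version, payload):
        # Producers only need to know the version they emit.
        self.queues[topic_for(version)].append(payload)

    def drain(self, version):
        # Consumers read only the versioned queue they understand.
        topic = topic_for(version)
        messages, self.queues[topic] = self.queues[topic], []
        return messages


broker = Broker()
broker.publish(1, {"id": "a"})  # notification agent still on the old schema
broker.publish(2, {"id": "b"})  # upgraded notification agent
v1_batch = broker.drain(1)      # old collector finishes the v1 queue
# ... the collector would be upgraded here ...
v2_batch = broker.drain(2)      # upgraded collector picks up the v2 queue
```

No message is validated or converted, but the sketch leaves open who creates and deletes each queue and what happens to messages published to a version that no running consumer reads.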
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 28/08/15 12:49 PM, Dan Smith wrote:
>> there was a little skepticism because it was originally sold as magic,
>> but reading the slides from Vancouver[1], it is not magic.
>
> I think I specifically said they're not magic in my slides. Not sure who
> sold you them as magic, but you should leave them a
> less-than-five-stars review.

i like how your slides leveled it for us :)

>> Ceilometer functions mainly on queue-based IPC. most of the
>> communication is async transferring of json payloads where callback is
>> not required. the basic workflows are:
>
> This is specifically something versionedobjects should help with. The
> remotable RPC method calls on an object are something that nova uses
> heavily, but other projects don't use at all.
>
>> polling agent --- topic queue --- notification agent --- topic queue
>> --- collector (direct connection to db)
>
> What happens if any of these components are running different versions
> of the ceilometer code at one point? During an upgrade, you presumably
> don't want to have to take all of these things down at once, and so the
> notification agent might get an object from the polling agent that is
> older or newer than it expects.
>
> More specifically, maybe the collector is writing to an older schema and
> gets a newer object from the front of the queue with data it can't
> store. If you're getting versionedobjects instead of raw json, you at
> least have an indication that this is happening. If you get an older
> object, you might choose to do something specific for the fields that
> are now in the DB schema, but aren't in the object you received.
>
>> OpenStack service --- topic queue --- notification agent --- topic
>> queue --- collector (direct connection to db)
>
> This is a good one. If Nova was sending notifications as objects, then
> the notification agent would get a version with each notification,
> knowing specifically when the notification is newer than it supports,
> instead of us just changing things (on purpose or by accident) and you
> breaking.
>
> From the storage in the DB perspective, I'm not sure what your
> persistence looks like. However, we've been storing _some_ things in
> our DB as serialized objects. That means that if we pull something out
> in a year, after which time things in the actual object implementation
> have changed, then we have an indication of what version it was stored
> in, and presumably can apply a process to update it (or handle the
> differences) at load time. I'm not sure if that's useful for
> ceilometer, but it is definitely useful for nova, where we can avoid
> converting everything in the database every time we add/change a field
> in something -- a process that is very critical to avoid in our goals
> for improving the upgrade experience for operators.

we store everything as primitives: floats, time, integer, etc... since we need to query on attributes. it seems like versionedobjects might not be useful to our db configuration currently.

> So, I dunno if ceilometer needs to adopt versionedobjects for anything.
> It seems like it would apply to the cases you describe above, but if
> not, there's no reason to use it just because others are. Nova will
> (when I stop putting it off) start sending notifications as serialized
> and versioned objects at some point, but you may choose to just unwrap
> it and treat it as a json blob beyond the handler, if that's what is
> determined as the best course.

i'm really looking forward to this. i think the entire Ceilometer team is waiting for someone to contractualise messages. right now it's a crap shoot when we listen to messages from other services.

cheers,
-- gord
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
i should start by saying i re-read my subject line and it arguably comes off aggressive -- i should probably have dropped 'explain' :)

On 28/08/15 01:47 PM, Alec Hothan (ahothan) wrote:
> On 8/28/15, 10:07 AM, gord chung g...@live.ca wrote:
>> On 28/08/15 12:18 PM, Roman Dobosz wrote:
>>> So imagine we have new versions of the schema for the events, alarms or samples in ceilometer introduced in Mitaka release while you have all your ceilo services on Liberty release. To upgrade ceilometer you'll have to stop all services to avoid data corruption. With versionedobjects you can do this one by one without disrupting telemetry jobs.
>>
>> are versions checked for every single message? has anyone considered the overhead to validating each message? since ceilometer is queue based, we could technically just publish to a new queue when schema changes... and the consuming services will listen to the queue it knows of.
>>
>> ie. our notification service changes schema so it will now publish to a v2 queue, the existing collector service consumes the v1 queue until done at which point you can upgrade it and it will listen to v2 queue.
>>
>> this way there is no need to validate/convert anything and you can still take services down one at a time. this support doesn't exist currently (i just randomly thought of it) but assuming there's no flaw in my idea (which there may be) isn't this more efficient?
>
> If high performance is a concern for ceilometer (and it should be) then maybe there might be better options than JSON? JSON is great for many applications but can be inappropriate for other demanding applications. There are other popular open source encoding options that yield much more compact wire payload, more efficient encoding/decoding and handle versioning to a reasonable extent.

i should clarify. we let oslo.messaging serialise our dictionary how it does... i believe it's JSON. i'd be interested to switch it to something more efficient. maybe it's time we revive the msgpacks patch[1] or are there better alternatives? (hoping i didn't just unleash a storm of 'this is better' replies)

> Queue based versioning might be less runtime overhead per message but at the expense of a potentially complex queue version management (which can become tricky if you have more than 2 versions). I think Neutron was considering using versioned queues as well for its rolling upgrade (along with versioned objects) and I already pointed out that managing the queues could be tricky.
>
> In general, trying to provide a versioning framework that allows arbitrary changes between versions is quite difficult (and often bound to fail).

yeah, so that's what a lot of the devs are debating about right now. performance is our key driver so if we do something we think/know will negatively impact performance, it better bring a whole lot more of something else. if queue based versioning offers comparable functionality, i'd personally be more interested to explore that route first. is there a thread/patch/log that we could read to see what Neutron discovered when they looked into it?

[1] https://review.openstack.org/#/c/151301/

--
gord
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 8/28/15, 11:39 AM, gord chung g...@live.ca wrote:
> i should start by saying i re-read my subject line and it arguably comes off aggressive -- i should probably have dropped 'explain' :)
>
> On 28/08/15 01:47 PM, Alec Hothan (ahothan) wrote:
>> On 8/28/15, 10:07 AM, gord chung g...@live.ca wrote:
>>> On 28/08/15 12:18 PM, Roman Dobosz wrote:
>>>> So imagine we have new versions of the schema for the events, alarms or samples in ceilometer introduced in Mitaka release while you have all your ceilo services on Liberty release. To upgrade ceilometer you'll have to stop all services to avoid data corruption. With versionedobjects you can do this one by one without disrupting telemetry jobs.
>>>
>>> are versions checked for every single message? has anyone considered the overhead to validating each message? since ceilometer is queue based, we could technically just publish to a new queue when schema changes... and the consuming services will listen to the queue it knows of.
>>>
>>> ie. our notification service changes schema so it will now publish to a v2 queue, the existing collector service consumes the v1 queue until done at which point you can upgrade it and it will listen to v2 queue.
>>>
>>> this way there is no need to validate/convert anything and you can still take services down one at a time. this support doesn't exist currently (i just randomly thought of it) but assuming there's no flaw in my idea (which there may be) isn't this more efficient?
>>
>> If high performance is a concern for ceilometer (and it should be) then maybe there might be better options than JSON? JSON is great for many applications but can be inappropriate for other demanding applications. There are other popular open source encoding options that yield much more compact wire payload, more efficient encoding/decoding and handle versioning to a reasonable extent.
>
> i should clarify. we let oslo.messaging serialise our dictionary how it does... i believe it's JSON. i'd be interested to switch it to something more efficient. maybe it's time we revive the msgpacks patch[1] or are there better alternatives? (hoping i didn't just unleash a storm of 'this is better' replies)

I'd be curious to know if there is any benchmark on the oslo serializer for msgpack and how it compares to JSON? More important is to make sure we're optimizing in the right area. Do we have a good understanding of where ceilometer needs to improve to scale or is it still not quite clear cut?

>> Queue based versioning might be less runtime overhead per message but at the expense of a potentially complex queue version management (which can become tricky if you have more than 2 versions). I think Neutron was considering using versioned queues as well for its rolling upgrade (along with versioned objects) and I already pointed out that managing the queues could be tricky.
>>
>> In general, trying to provide a versioning framework that allows arbitrary changes between versions is quite difficult (and often bound to fail).
>
> yeah, so that's what a lot of the devs are debating about right now. performance is our key driver so if we do something we think/know will negatively impact performance, it better bring a whole lot more of something else. if queue based versioning offers comparable functionality, i'd personally be more interested to explore that route first. is there a thread/patch/log that we could read to see what Neutron discovered when they looked into it?

The versioning comments are buried in this mega patch if you are brave enough to dig in:

https://review.openstack.org/#/c/190635

The (offline) conclusion was that this was WIP and deserved more discussion (need to check back with Miguel and Ihar from the Neutron team).

One option considered in that discussion was to use oslo messaging topics to manage flows of messages that had different versions (and still use versionedobjects). So if you have 3 versions in your cloud you'd end up with 3 topics (and as many queues when it comes to Rabbit).

What is complex is to manage the queue/topic names (how to name them), how to discover them and how to deal with all the corner cases (like a new node coming in with an arbitrary version, nodes going away at any moment, downgrade cases).

Regards,
Alec

> [1] https://review.openstack.org/#/c/151301/
>
> --
> gord
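The topic-per-version idea can be sketched roughly as follows. This is only an illustration of the naming and discovery problem, not what Neutron or oslo.messaging actually do: the `<base>.v<N>` naming scheme and the helper names `topic_for`, `topics_to_consume` and `undrained_topics` are assumptions.

```python
def topic_for(base, version):
    """Build a topic name for one schema version (naming scheme is an assumption)."""
    return "%s.v%d" % (base, version)


def topics_to_consume(base, supported_versions):
    """Topics a consumer should listen on, oldest schema first."""
    return [topic_for(base, v) for v in sorted(supported_versions)]


def undrained_topics(base, queue_depths, supported_versions):
    """Topics that still hold messages but that this consumer cannot read.

    queue_depths maps topic name -> number of pending messages, however
    the deployment happens to discover that.
    """
    readable = set(topics_to_consume(base, supported_versions))
    return sorted(
        topic for topic, depth in queue_depths.items()
        if topic.startswith(base + ".v") and depth > 0 and topic not in readable
    )
```

Even this toy version shows where the corner cases come from: something has to know which versions exist, and a node arriving with an unexpected version produces a topic that nobody running older code is draining.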
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
On 8/28/15, 10:07 AM, gord chung g...@live.ca wrote:
> On 28/08/15 12:18 PM, Roman Dobosz wrote:
>> So imagine we have new versions of the schema for the events, alarms or samples in ceilometer introduced in Mitaka release while you have all your ceilo services on Liberty release. To upgrade ceilometer you'll have to stop all services to avoid data corruption. With versionedobjects you can do this one by one without disrupting telemetry jobs.
>
> are versions checked for every single message? has anyone considered the overhead to validating each message? since ceilometer is queue based, we could technically just publish to a new queue when schema changes... and the consuming services will listen to the queue it knows of.
>
> ie. our notification service changes schema so it will now publish to a v2 queue, the existing collector service consumes the v1 queue until done at which point you can upgrade it and it will listen to v2 queue.
>
> this way there is no need to validate/convert anything and you can still take services down one at a time. this support doesn't exist currently (i just randomly thought of it) but assuming there's no flaw in my idea (which there may be) isn't this more efficient?

If high performance is a concern for ceilometer (and it should be) then maybe there might be better options than JSON? JSON is great for many applications but can be inappropriate for other demanding applications. There are other popular open source encoding options that yield much more compact wire payload, more efficient encoding/decoding and handle versioning to a reasonable extent.

Queue based versioning might be less runtime overhead per message but at the expense of a potentially complex queue version management (which can become tricky if you have more than 2 versions). I think Neutron was considering using versioned queues as well for its rolling upgrade (along with versioned objects) and I already pointed out that managing the queues could be tricky.

In general, trying to provide a versioning framework that allows arbitrary changes between versions is quite difficult (and often bound to fail).

>> The other thing, maybe not so obvious, is to put a versionedobjects layer between the application and the MongoDB driver, so that all of the schema changes will be automatically handled by ovo, and serialization might also be done in such a layer.
>
> i don't quite understand this, is this a mongodb specific solution? admittedly the schemaless design of mongo i can imagine causing issues but currently we're trying to avoid wasting resources on existing mongodb solution as we attempt to move to new api. if it's just a generic db solution, i'd be interested to apply it to future designs.

I'd suggest doing some benchmarking to make sure the cost of serializing/deserializing with versioned objects is not prohibitive.

Alec

> cheers,
>
> --
> gord
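A benchmark of the kind suggested here could start from something like the sketch below, which uses only the standard library. The `SAMPLE` payload and the envelope built by `dump_versioned` are hypothetical stand-ins: they are not the oslo.versionedobjects wire format, so the numbers say something about the cost of an extra envelope and version check, not about the library itself.

```python
import json
import timeit

# Hypothetical sample payload; real ceilometer samples carry more fields.
SAMPLE = {"name": "cpu", "volume": 0.42, "unit": "ns", "resource_id": "abc"}


def dump_raw(sample):
    return json.dumps(sample)


def dump_versioned(sample):
    # Stand-in envelope, not the oslo.versionedobjects wire format.
    return json.dumps({"name": "Sample", "version": "1.0", "data": sample})


def load_versioned(text):
    envelope = json.loads(text)
    if envelope["version"] != "1.0":
        raise ValueError("unsupported version %s" % envelope["version"])
    return envelope["data"]


def bench(number=10000):
    """Return seconds spent on `number` round trips for each approach."""
    raw = timeit.timeit(lambda: json.loads(dump_raw(SAMPLE)), number=number)
    versioned = timeit.timeit(
        lambda: load_versioned(dump_versioned(SAMPLE)), number=number)
    return {"raw": raw, "versioned": versioned}
```

Calling `bench()` returns the two timings side by side; a real comparison would swap in the actual serializers under discussion and a representative message mix.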
Re: [openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
> we store everything as primitives: floats, time, integer, etc... since we need to query on attributes. it seems like versionedobjects might not be useful to our db configuration currently.

I don't think the former determines the latter -- we have lots of things stored as rows of column primitives and query them out as objects, but then you're not storing the object and version (unless you do it separately).

So, if it doesn't buy you anything, then there's no reason to use it.

--Dan
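As a rough illustration of "rows of column primitives queried out as objects", here is a sketch using the standard-library `sqlite3` module. The `event` table, the `Event` class and its `VERSION` attribute are hypothetical; the column names follow the event model described earlier in the thread.

```python
import sqlite3


class Event(object):
    # Hypothetical version of the in-code object. It lives in the code
    # only; nothing about it is stored in the row.
    VERSION = "1.0"

    def __init__(self, id, event_type, generated):
        self.id = id
        self.event_type = event_type
        self.generated = generated

    @classmethod
    def from_row(cls, row):
        return cls(id=row[0], event_type=row[1], generated=row[2])


def query_events(conn, event_type):
    """Filter on a primitive column, then hand back objects."""
    rows = conn.execute(
        "SELECT id, event_type, generated FROM event WHERE event_type = ?",
        (event_type,))
    return [Event.from_row(row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id TEXT, event_type TEXT, generated TEXT)")
conn.execute(
    "INSERT INTO event VALUES ('e1', 'compute.instance.create', '2015-08-28T12:00:00')")
conn.execute(
    "INSERT INTO event VALUES ('e2', 'image.upload', '2015-08-28T12:05:00')")
```

Queries still run against plain columns, so filtering is unaffected; the trade-off noted above is that the row carries no record of which object version wrote it unless a version column is added separately.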
[openstack-dev] [oslo][versionedobjects][ceilometer] explain the benefits of ceilometer+versionedobjects
hi,

there has been a lot of work done across the community and Ceilometer relating to versionedobjects. in Ceilometer particularly, this effort has somewhat stalled as contributors are unsure of the benefits of versionedobjects and how it relates to Ceilometer.

there was a little skepticism because it was originally sold as magic, but reading the slides from Vancouver[1], it is not magic. rather it seems the main purpose is to handle the evolution of schemas specifically over RPC which seems neat but conceptually doesn't seem to fit into how Ceilometer functions.

looking at the patches, Chris brought up a good question in a review[2] which to summarise:

    If the ceilometer/aodh tools have direct connections to their data storage level (they do) and do not use storable distributed objects (on which rpc calls are made) in what sense are versioned objects useful to the service? My understanding is that in Nova (for example) versioned objects are useful because rpc calls are made on storable objects that can be in flight at any time across the distributed service and thus for there to be smooth rolling upgrades those in flight objects need to be able to be of different versions.

Ceilometer functions mainly on queue-based IPC. most of the communication is async transferring of json payloads where callback is not required. the basic workflows are:

    polling agent --- topic queue --- notification agent --- topic queue --- collector (direct connection to db)

or

    OpenStack service --- topic queue --- notification agent --- topic queue --- collector (direct connection to db)

or from Aodh/alarming pov:

    ceilometer-api (direct connection to db) --- http --- alarm evaluator --- rpc --- alarm notifier --- http --- [Heat/other]

based on the above workflows, is there a good place for adoption of versionedobjects? and if so, what is the benefit? most of us are keen on adopting consistent design practices but none of us can honestly determine why versionedobjects would be beneficial to Ceilometer. if someone could explain it to us like we are 5 -- it's probably best to explain everything/anything like i'm 5 -- that would help immensely on moving this work forward.

cheers,

--
gord