On Tue, May 1, 2012 at 11:49 AM, Doug Hellmann <doug.hellm...@dreamhost.com> wrote:
>
> On Tue, May 1, 2012 at 10:38 AM, Nick Barcet <nick.bar...@canonical.com> wrote:
>
>> On 05/01/2012 02:23 AM, Loic Dachary wrote:
>> > On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>> >>
>> >> On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary <l...@enovance.com> wrote:
>> >>
>> >> On 04/30/2012 08:03 PM, Doug Hellmann wrote:
>> >>>
>> >>> On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <l...@enovance.com> wrote:
>> >>>
>> >>> On 04/30/2012 03:49 PM, Doug Hellmann wrote:
>> >>>>
>> >>>> On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary <l...@enovance.com> wrote:
>> >>>>
>> >>>> On 04/30/2012 12:15 PM, Loic Dachary wrote:
>> >>>> > We could start a discussion from the content of the
>> >>>> > following sections:
>> >>>> >
>> >>>> > http://wiki.openstack.org/EfficientMetering#Counters
>> >>>> I think the rationale of the counter aggregation needs
>> >>>> to be explained. My understanding is that the metering
>> >>>> system will be able to deliver the following
>> >>>> information: 10 floating IPv4 addresses were allocated
>> >>>> to the tenant during three months and were leased from
>> >>>> provider NNN. From this, the billing system could add a
>> >>>> line to the invoice: 10 IPv4, $N each = 10 x $N, because
>> >>>> it has been configured to invoice each IPv4 leased from
>> >>>> provider NNN for $N.
>> >>>>
>> >>>> It is not the purpose of the metering system to display
>> >>>> each IPv4 used, therefore it only exposes the aggregated
>> >>>> information. The counters define how the information
>> >>>> should be aggregated. If the idea was to expose each
>> >>>> resource usage individually, defining counters would be
>> >>>> meaningless as they would duplicate the activity log
>> >>>> from each OpenStack component.
>> >>>>
>> >>>> What do you think?
>> >>>>
>> >>>> At DreamHost we are going to want to show each individual
>> >>>> resource (the IPv4 address, the instance, etc.) along with
>> >>>> the charge information. Having the metering system aggregate
>> >>>> that data will make it difficult/impossible to present the
>> >>>> bill summary and detail views that we want. It would be much
>> >>>> more useful for us if it tracked the usage details for each
>> >>>> resource, and let us aggregate the data ourselves.
>> >>>>
>> >>>> If other vendors want to show the data differently, perhaps
>> >>>> we should provide separate APIs for retrieving the detailed
>> >>>> and aggregate data.
>> >>>>
>> >>>> Doug
>> >>>>
>> >>> Hi,
>> >>>
>> >>> For the record, here is the unfinished conversation we had on IRC
>> >>>
>> >>> (04:29:06 PM) dhellmann: dachary, did you see my reply about
>> >>> counter definitions on the list today?
>> >>> (04:39:05 PM) dachary: It means some counters must not be
>> >>> aggregated. Only the amount associated with it is but there
>> >>> is one counter per IP.
>> >>> (04:55:01 PM) dachary: dhellmann: what about this: the id of
>> >>> the resource controls the aggregation of all counters: if it
>> >>> is missing, all resources of the same kind and their measures
>> >>> are aggregated. Otherwise only the measures are aggregated.
>> >>> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
>> >>> (04:55:58 PM) dachary: it makes me a little uncomfortable to
>> >>> define such an "ad-hoc" grouping
>> >>> (04:56:53 PM) dachary: i.e. you actuall control the
>> >>> aggregation by choosing which value to put in the id column
>> >>> (04:58:43 PM) dachary: s/actuall/actually/
>> >>> (05:05:38 PM) ***dachary reading
>> >>> http://www.ogf.org/documents/GFD.98.pdf
>> >>> (05:05:54 PM) dachary: I feel like we're trying to resolve a
>> >>> non problem here
>> >>> (05:08:42 PM) dachary: values need to be aggregated. The raw
>> >>> input is a full description of the resource and a value
>> >>> (gauge). The question is how to control the aggregation in a
>> >>> reasonably flexible way.
>> >>> (05:11:34 PM) dachary: The definition of a counter could
>> >>> probably be described as: the id of a resource and code to
>> >>> fill each column associated with it.
>> >>>
>> >>> I tried to append the following, but the wiki kept failing.
>> >>>
>> >>> Propose that the counters are defined by a function instead
>> >>> of being fixed. That helps address the issue of
>> >>> aggregating the bandwidth associated with a given IP into a
>> >>> single counter.
>> >>>
>> >>> Alternate idea:
>> >>> * a counter is defined by
>> >>>   * a name (o1, n2, etc.) that uniquely identifies the
>> >>>     nature of the measure (outbound internet transit, amount of
>> >>>     RAM, etc.)
>> >>>   * the component in which it can be found (nova, swift, etc.)
>> >>>   * and by columns, each one is set with the result of
>> >>>     aggregate(find(record),record) where
>> >>>     * find() looks for the existing column as found by
>> >>>       selecting with the unique key (maybe the name and the
>> >>>       resource id)
>> >>>     * record is a detailed description of the metering event to
>> >>>       be aggregated
>> >>>       (http://wiki.openstack.org/SystemUsageData#compute.instance.exists:)
>> >>>     * the aggregate() function returns the updated row. By
>> >>>       default it just += the counter value with the old row
>> >>>       returned by find()
>> >>>
>> >>> Would we want aggregation to occur within the database where we
>> >>> are collecting events, or should that move somewhere else?
>> >> I assume the events collected by the metering agents will all be
>> >> archived for auditing (or re-building the database)
>> >> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44
>> >>
>> >> Therefore the aggregation should occur when the database is
>> >> updated to account for a new event.
>> >>
>> >> Does this make sense? I may have misunderstood part of your question.
>> >>
>> >> I guess what I don't understand is why the aggregated data is written
>> >> back to the metering database at all. If it's in the same database, it
>> >> seems like it should be in a different "table" (or equivalent) so the
>> >> original data is left alone.
>> > In my view the events are not stored in a database, they are merely
>> > appended to a log file. The database is built from the events with
>> > aggregated data. I now understand that you (and Joshua Harlow) think
>> > it's better to not aggregate the data and let the billing system do this
>> > job.
>>
>> My intent when writing the blueprint was that each event would be
>> recorded atomically in the database, as it is the only way to control
>> that we have not missed any. Aggregation should be done at the external
>> API level if the request is to get the sum of a given counter.
>>
>
> That matches what I was thinking. The "log file" that Loic mentioned would
> in fact be a database that can handle a lot of writes. We could use some
> sort of simple file format, but since we're going to have to read and parse
> the log anyway, we might as well use a tool that makes that easy.
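
To make the "record each event atomically, aggregate at the API level"
idea a bit more concrete, here is a rough sketch in Python. Everything
in it is an assumption made for illustration: the MeterEvent fields, the
EventStore class, and the counter and resource names are hypothetical
and do not come from the blueprint or from any existing OpenStack code.
The in-memory list just stands in for whatever write-friendly database
ends up holding the raw events.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MeterEvent:
    """One raw metering record (field names are illustrative only)."""
    tenant: str       # who the usage belongs to
    counter: str      # what was measured, e.g. "outbound_bytes"
    resource_id: str  # the individual resource the measurement is about
    value: float      # measured amount


class EventStore:
    """Append-only store; raw events are never modified after recording."""

    def __init__(self) -> None:
        self._events: List[MeterEvent] = []

    def record(self, event: MeterEvent) -> None:
        self._events.append(event)

    def total(self, tenant: str, counter: str,
              resource_id: Optional[str] = None) -> float:
        """Sum a counter at query time, optionally for a single resource."""
        return sum(
            e.value for e in self._events
            if e.tenant == tenant and e.counter == counter
            and (resource_id is None or e.resource_id == resource_id)
        )

    def by_resource(self, tenant: str, counter: str) -> Dict[str, float]:
        """Per-resource detail view built from the same raw events."""
        totals: Dict[str, float] = defaultdict(float)
        for e in self._events:
            if e.tenant == tenant and e.counter == counter:
                totals[e.resource_id] += e.value
        return dict(totals)


store = EventStore()
store.record(MeterEvent("tenant-a", "outbound_bytes", "ip-1", 100.0))
store.record(MeterEvent("tenant-a", "outbound_bytes", "ip-2", 50.0))
store.record(MeterEvent("tenant-a", "outbound_bytes", "ip-1", 25.0))

print(store.total("tenant-a", "outbound_bytes"))        # summary view
print(store.by_resource("tenant-a", "outbound_bytes"))  # detail view
```

Because the raw events are left alone, the summary and the per-resource
detail both come from the same data, and a provider that wants a
different grouping can do its own aggregation over the same records.
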
>
> Aggregation could happen either in a metering API based on the query, or
> an external app could retrieve a large dataset and manage the aggregation
> itself.
>
>> What I missed in the blueprint and seems to be appearing clearly now, is
>> that an event needs to be able to carry the "object-reference" for which
>> it was collected, and this would seem highly necessary looking at the
>> messages in this thread. A metering event would essentially be defined
>> by (who, what, which) instead of a simple (who, what). As a consequence
>> we would need to extend the DB schema to add this [which/object
>> reference], and make sure that we carry it as well when we will work on
>> the message API format definition.
>>
>> How does this sound?
>>
>
> I think so. A lot of these sorts of issues can probably be fixed by being
> careful about how we define the measurements. For example, I may want to be
> able to show a customer the network bandwidth used per server, not just per
> network. If we measure the bandwidth consumed by each VIF, the aggregation
> code can take care of summarizing by network (because we know where the VIF
> is) and/or server (because we know which server has the VIF).
>
> We may need to record more detail than a simple "which," though, because
> it may be possible to change some information relevant for calculating the
> billing rate later. For example, a tenant can resize an instance, which
> would usually cause a change in the billing rate. Some of the relationships
> might change, too (Is it possible to move a VIF between networks?).
>
> At first I thought this might require separate table definitions per
> resource type (instance, network, etc.) but re-reading the table of
> counters in EfficientMetering I guess this is handled by measuring things
> like CPU, RAM, and block storage as separate counters?
> So a single event for creating a new instance might result in several
> records being written to the database, with the "which" set to the
> instance identifier. The data could then be presented as a unified
> "resource usage" report for that server.
>
> I think that works, but it may make the job of calculating the bill
> harder. We are planning to follow the model of specifying rates per size,
> so we would have to figure out which combination of CPU, RAM, and root
> volume storage matches up with a given size to determine the rate.
>
> Another piece I've been thinking about is handling boundary conditions
> when resource create and delete events don't both fall inside a billing
> cycle (or within the granularity of the metering system). That shouldn't be
> part of logging the events, necessarily, but it could be a reusable
> component that feeds into producing the aggregated data (either through the
> API, or as a way of processing the results returned by the API).
>
>
>> Maybe it's time to start focusing these discussions on user stories?
>>
>> > I agree. Would you like to go first?
>> >
>
> These are "things that might happen" use cases rather than "user stories,"
> but let's see where they take us:
>
> 1. User creates an instance, waits some period of time, then terminates it.
>    - Vary the period of time to allow the events to both fall within the
>      metering granularity window, to overlap an entire window, to start in
>      one window and end in another.
>    - The same variations for "billing cycle" instead of "metering
>      granularity window."
> 2. User creates an instance, waits some period of time, then resizes it.
>    - Vary the period of time as above.
>    - Do we need variations for resizing up and down?
> 3. User creates an instance but it fails to create properly (provider
>    issue).
> 4. User creates an instance but it fails to boot after creation (bad
>    image).
> 5. User creates volume storage, adds it to an existing instance, waits a
>    period of time, then deletes the volume.
>    - Vary the period of time as above.
> 6. User creates volume storage, adds it to an existing instance, waits a
>    period of time, then terminates the instance (I'm not sure what happens
>    to the volume in that case, maybe it still exists?)
>
> A provider-related story might be:
>
> 1. As a provider, I can query the metering API to determine the activity
>    for a tenant within a given period of time.
>
> Although that's pretty vague. :-)
>

I thought of another provider story:

2. As a provider, I can install a metering plugin to start collecting data
   about events not handled by the core metering app.
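
For the window variations in use case 1 (events inside one window,
covering a whole window, or starting in one window and ending in the
next), the reusable boundary-handling component could be as small as an
interval overlap. The sketch below is only an illustration under my own
assumptions: the function name usage_in_window, the half-open
[start, end) window convention, and treating a missing delete event as
"still running" are all hypothetical choices, not anything we have
agreed on.

```python
from datetime import datetime, timedelta
from typing import Optional


def usage_in_window(created: datetime, deleted: Optional[datetime],
                    window_start: datetime,
                    window_end: datetime) -> timedelta:
    """Return how long a resource existed inside [window_start, window_end).

    deleted=None means no delete event has been seen, so the resource is
    treated as still running through the end of the window.
    """
    start = max(created, window_start)
    end = window_end if deleted is None else min(deleted, window_end)
    # A resource that lived entirely outside the window yields a negative
    # span; clamp it to zero.
    return max(end - start, timedelta(0))


may = (datetime(2012, 5, 1), datetime(2012, 6, 1))
june = (datetime(2012, 6, 1), datetime(2012, 7, 1))

# Create and delete both fall inside the window.
inside = usage_in_window(datetime(2012, 5, 10, 12),
                         datetime(2012, 5, 10, 14), *may)
# Lifetime covers the entire window.
spanning = usage_in_window(datetime(2012, 4, 20), datetime(2012, 6, 5), *may)
# Starts in one window and ends in the next.
straddle_may = usage_in_window(datetime(2012, 5, 31, 23),
                               datetime(2012, 6, 1, 1), *may)
straddle_june = usage_in_window(datetime(2012, 5, 31, 23),
                                datetime(2012, 6, 1, 1), *june)
# No delete event yet.
running = usage_in_window(datetime(2012, 5, 31), None, *may)

print(inside, spanning, straddle_may, straddle_june, running)
```

The same function would serve for both the "metering granularity window"
and the "billing cycle" variations, since only the window bounds change.
It does not address resizes, which would need the lifetime split into
segments with different rates.
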
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp