Jay,

> The session object will be commit()'d after the session.begin() context
> manager exits, which will cause the aforementioned BEGIN; INSERT; COMMIT;
> transaction to be executed against the server for each event record.

That is just half of the problem: we should be using SQL bulk INSERTs as well. And we should mention that declarative_base won't make this work transparently for us; even if we put everything in one transaction, there will still be N separate INSERT statements.
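For what it's worth, here is a minimal sketch of the kind of bulk INSERT I mean, done through SQLAlchemy Core rather than the ORM. The table definition, column names, and engine URL below are illustrative stand-ins, not Ceilometer's actual schema:

    from sqlalchemy import (Column, Integer, MetaData, String, Table,
                            create_engine)

    # Illustrative stand-in for the Event table; the real schema lives
    # under ceilometer/storage/sqlalchemy/.
    metadata = MetaData()
    event = Table(
        'event', metadata,
        Column('id', Integer, primary_key=True),
        Column('message_id', String(64), unique=True),
        Column('event_type', String(255)),
    )

    engine = create_engine('sqlite://')
    metadata.create_all(engine)

    rows = [
        {'message_id': 'msg-1', 'event_type': 'compute.instance.create.end'},
        {'message_id': 'msg-2', 'event_type': 'compute.instance.delete.end'},
    ]

    # Passing a list of dicts lets SQLAlchemy hand the whole batch to the
    # DBAPI's executemany(), inside a single transaction, instead of
    # issuing N separate BEGIN; INSERT; COMMIT round trips.
    with engine.begin() as conn:
        conn.execute(event.insert(), rows)

Nothing about declarative models gets us this for free; the batching has to be done deliberately.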
Best regards,
Boris Pavlovic

On Sat, Dec 21, 2013 at 10:57 AM, Jay Pipes <[email protected]> wrote:

> On 12/20/2013 04:43 PM, Julien Danjou wrote:
>
>> On Fri, Dec 20 2013, Herndon, John Luke wrote:
>>
>>> I think there is probably a tolerance for duplicates but you’re right,
>>> missing a notification is unacceptable. Can anyone weigh in on how big
>>> of a deal duplicates are for meters? Duplicates aren’t really unique to
>>> the batching approach, though. If a consumer dies after it’s inserted a
>>> message into the data store but before the message is acked, the
>>> message will be requeued and handled by another consumer, resulting in
>>> a duplicate.
>>
>> Duplicates can be a problem for metering: if you see the same event
>> twice, it's possible you will think it happened twice.
>>
>> As for event storage, it won't be a problem if you use a good storage
>> driver that can have a unique constraint; you'll just drop the duplicate
>> and log the fact that this should not have happened, or something like
>> that.
>
> The above brings up a point related to the implementation of the existing
> SQL driver code that will need to be re-thought with the introduction of
> batch notification processing.
>
> Currently, the SQL driver's record_events() method [1] is written in a
> way that forces a new INSERT transaction for every record supplied to the
> method. If record_events() is called with 10K events, then 10K
> BEGIN; INSERT ...; COMMIT; transactions are executed against the server.
>
> Suffice it to say, this isn't efficient. :)
>
> Ostensibly, from looking at the code, the reason this approach was taken
> was to allow for the collection of duplicate event IDs, and to return
> those duplicate event IDs to the caller.
>
> Because of this code:
>
>     for event_model in event_models:
>         event = None
>         try:
>             with session.begin():
>                 event = self._record_event(session, event_model)
>         except dbexc.DBDuplicateEntry:
>             problem_events.append((api_models.Event.DUPLICATE,
>                                    event_model))
>
> the session object will be commit()'d after the session.begin() context
> manager exits, which will cause the aforementioned BEGIN; INSERT; COMMIT;
> transaction to be executed against the server for each event record.
>
> If we want to actually take advantage of the performance benefits of
> batching notification messages, the above code will need to be rewritten
> so that a single transaction is executed against the database for the
> entire batch of events.
>
> Best,
> -jay
>
> [1] https://github.com/openstack/ceilometer/blob/master/ceilometer/storage/impl_sqlalchemy.py#L932
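To illustrate Jay's point in the quoted message above, here is one way record_events() could be restructured so that the common case is a single transaction per batch. This is a sketch, not an actual patch: _record_event(), dbexc.DBDuplicateEntry, and api_models.Event.DUPLICATE are taken from the quoted snippet, while _get_session() is a placeholder for however the driver really obtains its session:

    def record_events(self, event_models):
        """Sketch of a batched rewrite.

        The happy path issues one BEGIN; INSERT; ...; COMMIT for the
        whole batch. Only if the batch trips a unique constraint do we
        fall back to row-at-a-time inserts, so the caller still learns
        exactly which records were duplicates.
        """
        problem_events = []
        session = self._get_session()  # placeholder for the real session setup
        try:
            with session.begin():
                for event_model in event_models:
                    self._record_event(session, event_model)
        except dbexc.DBDuplicateEntry:
            # The batched transaction was rolled back; redo it row by
            # row, collecting duplicates the way the current code does.
            for event_model in event_models:
                try:
                    with session.begin():
                        self._record_event(session, event_model)
                except dbexc.DBDuplicateEntry:
                    problem_events.append((api_models.Event.DUPLICATE,
                                           event_model))
        return problem_events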
