Re: [Spacewalk-devel] pltcl question
> Make sure you check latest code - I submitted a number of changes in the last > day or so to address *exactly* the "need to reset for logging after a commit" > in the testcases. I'm sure there are still a few places, but it's better > now... Thanks! I noticed your latest patches while developing actually. I am reviewing my own patches today, and will be back with news soon. Regards, -- Silvio Moioli SUSE LINUX Products GmbH Maxfeldstraße 5, 90409 Nürnberg Germany ___ Spacewalk-devel mailing list Spacewalk-devel@redhat.com https://www.redhat.com/mailman/listinfo/spacewalk-devel
Re: [Spacewalk-devel] pltcl question
On Wed, Sep 11, 2013 at 11:20:15AM -0400, Stephen Herr wrote: > > Normally the case yes, but there are at least a couple of instances > where we commit a single request as multiple transactions to avoid > overrunning Oracle's rollback buffer. One example that jumps to mind > is generating yum repodata for channels. I'm not sure if that's > relevant to this discussion or not, I just saw that comment and > thought I'd chime in. These situations really should be checked. We probably don't see problems now because the tables manipulated by the yum repodata generation do not have logging enabled, but eventually there could be a problem. -- Jan Pazdziora Principal Software Engineer, Identity Management Engineering, Red Hat ___ Spacewalk-devel mailing list Spacewalk-devel@redhat.com https://www.redhat.com/mailman/listinfo/spacewalk-devel
Re: [Spacewalk-devel] pltcl question
- Original Message - > On 09/11/2013 02:28 PM, Jan Pazdziora wrote: > > Which for typical web application request handling is what you want, > > isn't it? The whole request processing is one transaction and > > releasing it gives you nice cleanup. > > It is, except for a very few cases of "commit-in-the-middle". > > > Does it happen in our code base? > > I confirm that in most cases, and in particular all cases we observed > until now, this only happens in test code. Make sure you check latest code - I submitted a number of changes in the last day or so to address *exactly* the "need to reset for logging after a commit" in the testcases. I'm sure there are still a few places, but it's better now... G ___ Spacewalk-devel mailing list Spacewalk-devel@redhat.com https://www.redhat.com/mailman/listinfo/spacewalk-devel
Re: [Spacewalk-devel] pltcl question
On 09/11/2013 02:28 PM, Jan Pazdziora wrote: > Which for typical web application request handling is what you want, > isn't it? The whole request processing is one transaction and > releasing it gives you nice cleanup. It is, except for a very few cases of "commit-in-the-middle". > Does it happen in our code base? I confirm that in most cases, and in particular all cases we observed until now, this only happens in test code. > They need to be fixed to properly initialize the auth info in the logging > infrastructure. Indeed. I will send some patches to fix remaining cases I stumbled upon recently. In one case, for example, a ROLLBACK was issued after a test failed. This led to a subsequent code failing because the global Tcl variable would retain the log row id, but the corresponding row in the log table was gone! > No to the first part (we could use some other way), "I did not > find a better way" to the second part. Okay, I see the problem now. > The fact that the global variables are not initialized means that the > user of the database (the applications) broke the contract which > states that they should provide the auth logging information before > doing anything with the database session because *any* database table > can have logging enabled and need that information. That's the critical piece of information I was missing, as the LoggingFactory.clearLogId() method name and documentation were not terribly useful in understanding why it needs to be called after any operation on a new connection, COMMIT or ROLLBACK. Maybe I simply missed some documentation somewhere, I apologize in that case :) Thanks for the detailed explanation, -- Silvio Moioli SUSE LINUX Products GmbH Maxfeldstraße 5, 90409 Nürnberg Germany ___ Spacewalk-devel mailing list Spacewalk-devel@redhat.com https://www.redhat.com/mailman/listinfo/spacewalk-devel
Re: [Spacewalk-devel] pltcl question
On 09/11/2013 08:28 AM, Jan Pazdziora wrote: On Wed, Sep 04, 2013 at 09:41:15AM +0200, Silvio Moioli wrote: We recently stumbled in some failing tests downstream and we were wondering why it was chosen to use pltcl in the new logging schema I've decided it was the optimal solution for the task. Faster than temporary tables, less invasive than custom_variable_classes. Problems arise because: - pltcl global variables are used [1]; That's the goal. - Postgres allocates a new Tcl interpreter for each database connection [2]; - thus, each connection will have its independent set of Tcl global variables; That's the hope. We use it for the logging purpose and wouldn't want the sessions to influence each other. - Hibernate will by default release a database connection after a transaction has ended [3]; Which for typical web application request handling is what you want, isn't it? The whole request processing is one transaction and releasing it gives you nice cleanup. Of course, you could change the mode to ON_CLOSE if your really wanted. Normally the case yes, but there are at least a couple of instances where we commit a single request as multiple transactions to avoid overrunning Oracle's rollback buffer. One example that jumps to mind is generating yum repodata for channels. I'm not sure if that's relevant to this discussion or not, I just saw that comment and thought I'd chime in. -Stephen - thus, any database operation occurring after, for example, a call to ConnectionManager.commitTransaction(), could get a different connection from the one used before the COMMIT; Does it happen in our code base? We tried to catch and fix all such occurences. - possibly, this different connection could be completely new - in that case no global variables are set; - the above could also not happen if, in lucky cases, the same connection is reused but this is (at least) Hibernate, c3p0, context, JVM and load dependent. Actually, there is not a lot of places in the Spacewalk codebase where database operations are carried out after a COMMIT, nevertheless in those cases trouble can happen. For example we found out that DELETE queries on web_contact will fail on a new connection, because web_contact_log_trig will call logging.get_log_id() that will in turn call logging._get_log_id() which would raise an exception due to the global variable the_log_id not being initialized. In our case, this after-commit user deletion happens in BaseTestCaseWithUser.tearDown() -> OrgFactory.deleteOrg() -> rhn_org.delete_org() -> rhn_org.delete_user() which is called because ChannelManagerTest.testListCompatibleBaseChannels() commits in the middle. This is in tests, right? They need to be fixed to properly initialize the auth info in the logging infrastructure. Questions: - is it really necessary to track variables per _database connection_ in the logging package? Is there any other way to get the id of the authenticated user or of the logging event in say an AFTER DELETE TRIGGER? Since connections should be handled in a completely transparent way by Hibernate sessions, I believe this should not happen; Our hope, supported by testing, is that even Hibernate is not so bold to run part of its session using one database session and the other part using other session. In other words, underneath Hibernate session, there is still a RDBMS with its own notion of sessions and Hibernate respects that. - is it really necessary to add the pltcl language and use globals? No to the first part (we could use some other way), "I did not find a better way" to the second part. Wouldn't a simple database sequence or even plain table suffice? The problem I faced with database sequences was, I did not find an easy way to invalidate the currval. With the existing logging infrastructure, we clear the log_id after we are done with it so if the database session gets reused, the application logic needs to reinitialize it or things will fail (basically, what you see in the failing tests). That is a feature -- we do not want the log_id from the previous HTTP request which happened to be handled by application that used our database sessions previously to leak to the subsequent operations. As for plain tables -- I don't think it would work. How would you know in that AFTER DELETE TRIGGER which record in the table is yours? And if you did not COMMIT and depended on there being only one record (the one that your session inserted), how would it work if the application decided to ROLLBACK? And if the application COMMITed by mistake, the assumption of there only being one record would immediately break. I see the following possible solutions: - adopting another way to keep the log counter, possibly at transaction or whole database level; It's not about the log counter. There is a sequence there alright. It's about being able to access in reliable way value which was instantiated in the same d
Re: [Spacewalk-devel] pltcl question
On Wed, Sep 04, 2013 at 09:41:15AM +0200, Silvio Moioli wrote: > We recently stumbled in some failing tests downstream and we were > wondering why it was chosen to use pltcl in the new logging schema I've decided it was the optimal solution for the task. Faster than temporary tables, less invasive than custom_variable_classes. > Problems arise because: > - pltcl global variables are used [1]; That's the goal. > - Postgres allocates a new Tcl interpreter for each database connection > [2]; > - thus, each connection will have its independent set of Tcl global > variables; That's the hope. We use it for the logging purpose and wouldn't want the sessions to influence each other. > - Hibernate will by default release a database connection after a > transaction has ended [3]; Which for typical web application request handling is what you want, isn't it? The whole request processing is one transaction and releasing it gives you nice cleanup. Of course, you could change the mode to ON_CLOSE if your really wanted. > - thus, any database operation occurring after, for example, a call to > ConnectionManager.commitTransaction(), could get a different connection > from the one used before the COMMIT; Does it happen in our code base? We tried to catch and fix all such occurences. > - possibly, this different connection could be completely new - in that > case no global variables are set; > - the above could also not happen if, in lucky cases, the same > connection is reused but this is (at least) Hibernate, c3p0, context, > JVM and load dependent. > > Actually, there is not a lot of places in the Spacewalk codebase where > database operations are carried out after a COMMIT, nevertheless in > those cases trouble can happen. For example we found out that DELETE > queries on web_contact will fail on a new connection, because > web_contact_log_trig will call logging.get_log_id() that will in turn > call logging._get_log_id() which would raise an exception due to the > global variable the_log_id not being initialized. > > In our case, this after-commit user deletion happens in > BaseTestCaseWithUser.tearDown() -> OrgFactory.deleteOrg() -> > rhn_org.delete_org() -> rhn_org.delete_user() which is called because > ChannelManagerTest.testListCompatibleBaseChannels() commits in the middle. This is in tests, right? They need to be fixed to properly initialize the auth info in the logging infrastructure. > Questions: > - is it really necessary to track variables per _database connection_ > in the logging package? Is there any other way to get the id of the authenticated user or of the logging event in say an AFTER DELETE TRIGGER? > Since connections should be handled in a > completely transparent way by Hibernate sessions, I believe this should > not happen; Our hope, supported by testing, is that even Hibernate is not so bold to run part of its session using one database session and the other part using other session. In other words, underneath Hibernate session, there is still a RDBMS with its own notion of sessions and Hibernate respects that. > - is it really necessary to add the pltcl language and use globals? No to the first part (we could use some other way), "I did not find a better way" to the second part. > Wouldn't a simple database sequence or even plain table suffice? The problem I faced with database sequences was, I did not find an easy way to invalidate the currval. With the existing logging infrastructure, we clear the log_id after we are done with it so if the database session gets reused, the application logic needs to reinitialize it or things will fail (basically, what you see in the failing tests). That is a feature -- we do not want the log_id from the previous HTTP request which happened to be handled by application that used our database sessions previously to leak to the subsequent operations. As for plain tables -- I don't think it would work. How would you know in that AFTER DELETE TRIGGER which record in the table is yours? And if you did not COMMIT and depended on there being only one record (the one that your session inserted), how would it work if the application decided to ROLLBACK? And if the application COMMITed by mistake, the assumption of there only being one record would immediately break. > I see the following possible solutions: > - adopting another way to keep the log counter, possibly at transaction > or whole database level; It's not about the log counter. There is a sequence there alright. It's about being able to access in reliable way value which was instantiated in the same database session in one of the previous database operations, and being able to invalidate that information. > - calling LoggingFactory.clearLogId() after COMMITs and ROLLBACKs, or I hope we are doing it in the tomcat context at least. > anyway be sure that it is called before any operation that could involve > logging; Before the first operation that c