Re: [Spacewalk-devel] pltcl question

2013-09-11 Thread Silvio Moioli
> Make sure you check latest code - I submitted a number of changes in the last 
> day or so to address *exactly* the "need to reset for logging after a commit" 
> in the testcases.  I'm sure there are still a few places, but it's better 
> now...

Thanks! I noticed your latest patches while developing actually. I am
reviewing my own patches today, and will be back with news soon.

Regards,
-- 
Silvio Moioli
SUSE LINUX Products GmbH
Maxfeldstraße 5, 90409 Nürnberg Germany

___
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel


Re: [Spacewalk-devel] pltcl question

2013-09-11 Thread Jan Pazdziora
On Wed, Sep 11, 2013 at 11:20:15AM -0400, Stephen Herr wrote:
> 
> Normally the case yes, but there are at least a couple of instances
> where we commit a single request as multiple transactions to avoid
> overrunning Oracle's rollback buffer. One example that jumps to mind
> is generating yum repodata for channels. I'm not sure if that's
> relevant to this discussion or not, I just saw that comment and
> thought I'd chime in.

These situations really should be checked. We probably don't see
problems now because the tables manipulated by the yum repodata
generation do not have logging enabled, but eventually there could be
a problem.

-- 
Jan Pazdziora
Principal Software Engineer, Identity Management Engineering, Red Hat

___
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel


Re: [Spacewalk-devel] pltcl question

2013-09-11 Thread Grant Gainey


- Original Message -
> On 09/11/2013 02:28 PM, Jan Pazdziora wrote:
> > Which for typical web application request handling is what you want,
> > isn't it? The whole request processing is one transaction and
> > releasing it gives you nice cleanup.
> 
> It is, except for a very few cases of "commit-in-the-middle".
> 
> > Does it happen in our code base?
> 
> I confirm that in most cases, and in particular all cases we observed
> until now, this only happens in test code.

Make sure you check latest code - I submitted a number of changes in the last 
day or so to address *exactly* the "need to reset for logging after a commit" 
in the testcases.  I'm sure there are still a few places, but it's better now...

G

___
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel


Re: [Spacewalk-devel] pltcl question

2013-09-11 Thread Silvio Moioli
On 09/11/2013 02:28 PM, Jan Pazdziora wrote:
> Which for typical web application request handling is what you want,
> isn't it? The whole request processing is one transaction and
> releasing it gives you nice cleanup.

It is, except for a very few cases of "commit-in-the-middle".

> Does it happen in our code base? 

I confirm that in most cases, and in particular all cases we observed
until now, this only happens in test code.

> They need to be fixed to properly initialize the auth info in the logging 
> infrastructure.

Indeed. I will send some patches to fix remaining cases I stumbled upon
recently.

In one case, for example, a ROLLBACK was issued after a test failed.
This led to a subsequent code failing because the global Tcl variable
would retain the log row id, but the corresponding row in the log table
was gone!

> No to the first part (we could use some other way), "I did not
> find a better way" to the second part.

Okay, I see the problem now.

> The fact that the global variables are not initialized means that the
> user of the database (the applications) broke the contract which
> states that they should provide the auth logging information before
> doing anything with the database session because *any* database table
> can have logging enabled and need that information.

That's the critical piece of information I was missing, as the
LoggingFactory.clearLogId() method name and documentation were not
terribly useful in understanding why it needs to be called after any
operation on a new connection, COMMIT or ROLLBACK. Maybe I simply missed
some documentation somewhere, I apologize in that case :)

Thanks for the detailed explanation,
-- 
Silvio Moioli
SUSE LINUX Products GmbH
Maxfeldstraße 5, 90409 Nürnberg Germany

___
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel


Re: [Spacewalk-devel] pltcl question

2013-09-11 Thread Stephen Herr

On 09/11/2013 08:28 AM, Jan Pazdziora wrote:

On Wed, Sep 04, 2013 at 09:41:15AM +0200, Silvio Moioli wrote:

We recently stumbled in some failing tests downstream and we were
wondering why it was chosen to use pltcl in the new logging schema

I've decided it was the optimal solution for the task. Faster than
temporary tables, less invasive than custom_variable_classes.


Problems arise because:
  - pltcl global variables are used [1];

That's the goal.


  - Postgres allocates a new Tcl interpreter for each database connection
[2];
  - thus, each connection will have its independent set of Tcl global
variables;

That's the hope. We use it for the logging purpose and wouldn't want
the sessions to influence each other.


  - Hibernate will by default release a database connection after a
transaction has ended [3];

Which for typical web application request handling is what you want,
isn't it? The whole request processing is one transaction and
releasing it gives you nice cleanup.

Of course, you could change the mode to ON_CLOSE if your really
wanted.


Normally the case yes, but there are at least a couple of instances 
where we commit a single request as multiple transactions to avoid 
overrunning Oracle's rollback buffer. One example that jumps to mind is 
generating yum repodata for channels. I'm not sure if that's relevant to 
this discussion or not, I just saw that comment and thought I'd chime in.


-Stephen



  - thus, any database operation occurring after, for example, a call to
ConnectionManager.commitTransaction(), could get a different connection
from the one used before the COMMIT;

Does it happen in our code base? We tried to catch and fix all such
occurences.


  - possibly, this different connection could be completely new - in that
case no global variables are set;
  - the above could also not happen if, in lucky cases, the same
connection is reused but this is (at least) Hibernate, c3p0, context,
JVM and load dependent.

Actually, there is not a lot of places in the Spacewalk codebase where
database operations are carried out after a COMMIT, nevertheless in
those cases trouble can happen. For example we found out that DELETE
queries on web_contact will fail on a new connection, because
web_contact_log_trig will call logging.get_log_id() that will in turn
call logging._get_log_id() which would raise an exception due to the
global variable the_log_id not being initialized.

In our case, this after-commit user deletion happens in
BaseTestCaseWithUser.tearDown() ->  OrgFactory.deleteOrg() ->
rhn_org.delete_org() -> rhn_org.delete_user() which is called because
ChannelManagerTest.testListCompatibleBaseChannels() commits in the middle.

This is in tests, right? They need to be fixed to properly initialize
the auth info in the logging infrastructure.


Questions:
  - is it really necessary to track variables per _database connection_
in the logging package?

Is there any other way to get the id of the authenticated user or of
the logging event in say an AFTER DELETE TRIGGER?


Since connections should be handled in a
completely transparent way by Hibernate sessions, I believe this should
not happen;

Our hope, supported by testing, is that even Hibernate is not so bold
to run part of its session using one database session and the other
part using other session. In other words, underneath Hibernate
session, there is still a RDBMS with its own notion of sessions and
Hibernate respects that.


  - is it really necessary to add the pltcl language and use globals?

No to the first part (we could use some other way), "I did not
find a better way" to the second part.


Wouldn't a simple database sequence or even plain table suffice?

The problem I faced with database sequences was, I did not find an
easy way to invalidate the currval. With the existing logging
infrastructure, we clear the log_id after we are done with it so if
the database session gets reused, the application logic needs to
reinitialize it or things will fail (basically, what you see in the
failing tests). That is a feature -- we do not want the log_id from
the previous HTTP request which happened to be handled by application
that used our database sessions previously to leak to the subsequent
operations.

As for plain tables -- I don't think it would work. How would you know
in that AFTER DELETE TRIGGER which record in the table is yours? And
if you did not COMMIT and depended on there being only one record (the
one that your session inserted), how would it work if the application
decided to ROLLBACK? And if the application COMMITed by mistake, the
assumption of there only being one record would immediately break.


I see the following possible solutions:
  - adopting another way to keep the log counter, possibly at transaction
or whole database level;

It's not about the log counter. There is a sequence there alright.

It's about being able to access in reliable way value which was
instantiated in the same d

Re: [Spacewalk-devel] pltcl question

2013-09-11 Thread Jan Pazdziora
On Wed, Sep 04, 2013 at 09:41:15AM +0200, Silvio Moioli wrote:
> We recently stumbled in some failing tests downstream and we were
> wondering why it was chosen to use pltcl in the new logging schema

I've decided it was the optimal solution for the task. Faster than
temporary tables, less invasive than custom_variable_classes.

> Problems arise because:
>  - pltcl global variables are used [1];

That's the goal.

>  - Postgres allocates a new Tcl interpreter for each database connection
> [2];
>  - thus, each connection will have its independent set of Tcl global
> variables;

That's the hope. We use it for the logging purpose and wouldn't want
the sessions to influence each other.

>  - Hibernate will by default release a database connection after a
> transaction has ended [3];

Which for typical web application request handling is what you want,
isn't it? The whole request processing is one transaction and
releasing it gives you nice cleanup.

Of course, you could change the mode to ON_CLOSE if your really
wanted.

>  - thus, any database operation occurring after, for example, a call to
> ConnectionManager.commitTransaction(), could get a different connection
> from the one used before the COMMIT;

Does it happen in our code base? We tried to catch and fix all such
occurences.

>  - possibly, this different connection could be completely new - in that
> case no global variables are set;
>  - the above could also not happen if, in lucky cases, the same
> connection is reused but this is (at least) Hibernate, c3p0, context,
> JVM and load dependent.
> 
> Actually, there is not a lot of places in the Spacewalk codebase where
> database operations are carried out after a COMMIT, nevertheless in
> those cases trouble can happen. For example we found out that DELETE
> queries on web_contact will fail on a new connection, because
> web_contact_log_trig will call logging.get_log_id() that will in turn
> call logging._get_log_id() which would raise an exception due to the
> global variable the_log_id not being initialized.
> 
> In our case, this after-commit user deletion happens in
> BaseTestCaseWithUser.tearDown() ->  OrgFactory.deleteOrg() ->
> rhn_org.delete_org() -> rhn_org.delete_user() which is called because
> ChannelManagerTest.testListCompatibleBaseChannels() commits in the middle.

This is in tests, right? They need to be fixed to properly initialize
the auth info in the logging infrastructure.

> Questions:
>  - is it really necessary to track variables per _database connection_
> in the logging package?

Is there any other way to get the id of the authenticated user or of
the logging event in say an AFTER DELETE TRIGGER?

>   Since connections should be handled in a
> completely transparent way by Hibernate sessions, I believe this should
> not happen;

Our hope, supported by testing, is that even Hibernate is not so bold
to run part of its session using one database session and the other
part using other session. In other words, underneath Hibernate
session, there is still a RDBMS with its own notion of sessions and
Hibernate respects that.

>  - is it really necessary to add the pltcl language and use globals?

No to the first part (we could use some other way), "I did not
find a better way" to the second part.

> Wouldn't a simple database sequence or even plain table suffice?

The problem I faced with database sequences was, I did not find an
easy way to invalidate the currval. With the existing logging
infrastructure, we clear the log_id after we are done with it so if
the database session gets reused, the application logic needs to
reinitialize it or things will fail (basically, what you see in the
failing tests). That is a feature -- we do not want the log_id from
the previous HTTP request which happened to be handled by application
that used our database sessions previously to leak to the subsequent
operations.

As for plain tables -- I don't think it would work. How would you know
in that AFTER DELETE TRIGGER which record in the table is yours? And
if you did not COMMIT and depended on there being only one record (the
one that your session inserted), how would it work if the application
decided to ROLLBACK? And if the application COMMITed by mistake, the
assumption of there only being one record would immediately break.

> I see the following possible solutions:
>  - adopting another way to keep the log counter, possibly at transaction
> or whole database level;

It's not about the log counter. There is a sequence there alright.

It's about being able to access in reliable way value which was
instantiated in the same database session in one of the previous
database operations, and being able to invalidate that information.

>  - calling LoggingFactory.clearLogId() after COMMITs and ROLLBACKs, or

I hope we are doing it in the tomcat context at least.

> anyway be sure that it is called before any operation that could involve
> logging;

Before the first operation that c