Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
Matthias Nagel wrote: This is correct, if one has some kind of key to identify a session that could be used as a database index. But unfortunately there are a lot of authenticators out there, that do not correctly generate radius accounting session ids. Basicly I see three different types (despite the correct one): 1) Authenticators that do no send a session id at all (Acct-Session-Id is empty) 2) Those that always return the same session id (even if the user name differs) 3) Those that always return a new session id even if the requests (start/update/stop) belong to the same session Arran has a more technical response here. Mine is this: You were ripped off. Any reputable vendor would put a minimum of effort into creating a system that works. I can see your point. Do you have any other suggestions to solve the issues? (Changing the hardware is not going to happen.) Change the hardware. NOTHING else will work. Deciding to not change the hardware is ridiculous. The cost/benefit tradeoff just isn't there. (a) buy hardware that works for probably not too much money. Or (b) spend huge amounts of time and effort working around bugs in the crappy hardware. FreeRADIUS already has a number of hacks to work around partially broken systems. They should generally work, and they may even work for you. But you really need to toss your hardware in the garbage. It's crap. Alan DeKok. - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
On 08/28/2012 05:26 PM, Matthias Nagel wrote: is incomplete, i.e. it only has as start time. The latter never will be completed, because the stop message has already been processed and acknowledged to the authenticator. Aside from anything else - you need to handle this case. Accounting is UDP-based and might be dropped, even with retries. We use an SQL query to expire accounting sessions which haven't been updated in 2 hours. Since our maximum interim-accounting interval is 30 minutes, that covers most cases. e.g. update radacct set acctstoptime=now() where now() - acctstarttime + (coalesce(acctsessiontime,0)||' seconds')::interval '2 hours' If your NAS generates truly unique acctsessionid values, even across reboots (some do) maybe you can modify the SQL to handle this case, for example by setting a unique index on that column. - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
Hello everybody, if two accounting messages for the same session are sent by the authenticator very quickly, the messages may be processed by the radius server in the wrong order. This results into two sessions being accounted instead of one. The second phantom session stays open for ever, because it never receives any update and/or stop message. Example: If a supplicant authenticates and immediately disconnects again, the following steps are executed: 1) The authenticator sends an accounting start message 2) The authenticator sends an accounting stop message immediately 3) The RADIUS server receives the start message and assigns it to thread #1 4) The RADIUS server receives the stop message and assigns it to thread #2 5) Thread #2 terminates first and the accounting stop message is written to the PostgreSQL database. The SQL UPDATE statement fails, because there is no entry for this session that could be updated, as the start message has not been processed yet. Hence, an INSERT INTO statement is executed as a fail-over measure. 6) Thread #1 terminates and an SQL INSERT statement is performed in order to log the start message. The result is, that the same session is accounted with two entries in the database. The first entry is complete, this is to say it has a start and stop time. This is the result of step 5. The second entry is incomplete, i.e. it only has as start time. The latter never will be completed, because the stop message has already been processed and acknowledged to the authenticator. At the moment my work around is to run FreeRADIUS in debug mode to keep it single threaded. But I would like to propose the following solution. Instead of assigning incoming requests to the thread pool randomly, first preprocess the request and assign requests that have identical user names (or some other senseful attribute) to the same thread. This way requests that might belong to the same session are processed by the same thread and cannot outperform each other. Requests that never can belong to the same session are still processed concurrently. Best regards, Matthias Nagel -- Matthias Nagel Willy-Andreas-Allee 1, Zimmer 506 76131 Karlsruhe Telefon: +49-721-8695-1506 Mobil: +49-151-15998774 e-Mail: matthias.h.na...@gmail.com ICQ: 499797758 Skype: nagmat84 - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
Matthias Nagel wrote: if two accounting messages for the same session are sent by the authenticator very quickly, the messages may be processed by the radius server in the wrong order. This results into two sessions being accounted instead of one. The second phantom session stays open for ever, because it never receives any update and/or stop message. This is a well-known issue with RADIUS. Packets may appear in any order. Example: If a supplicant authenticates and immediately disconnects again, the following steps are executed: 1) The authenticator sends an accounting start message 2) The authenticator sends an accounting stop message immediately 3) The RADIUS server receives the start message and assigns it to thread #1 4) The RADIUS server receives the stop message and assigns it to thread #2 5) Thread #2 terminates first and the accounting stop message is written to the PostgreSQL database. The SQL UPDATE statement fails, because there is no entry for this session that could be updated, as the start message has not been processed yet. Hence, an INSERT INTO statement is executed as a fail-over measure. 6) Thread #1 terminates and an SQL INSERT statement is performed in order to log the start message. That doesn't make sense. If the table indexes are set up correctly, the SQL insert will fail at step (6). The module will then try the update query, which should succeed. The result is, that the same session is accounted with two entries in the database. The first entry is complete, this is to say it has a start and stop time. This is the result of step 5. The second entry is incomplete, i.e. it only has as start time. The latter never will be completed, because the stop message has already been processed and acknowledged to the authenticator. That is a database consistency issue. You can't have two rows using the same keys. But I would like to propose the following solution. Instead of assigning incoming requests to the thread pool randomly, first preprocess the request and assign requests that have identical user names (or some other senseful attribute) to the same thread. This way requests that might belong to the same session are processed by the same thread and cannot outperform each other. Requests that never can belong to the same session are still processed concurrently. That is not going to happen. It's a bad fix. The correct fix is to use the SQL indexes. Alan DeKok. - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
On Wed, Aug 29, 2012 at 4:11 AM, Alan DeKok al...@deployingradius.com wrote: Matthias Nagel wrote: 5) Thread #2 terminates first and the accounting stop message is written to the PostgreSQL database. The SQL UPDATE statement fails, because there is no entry for this session that could be updated, as the start message has not been processed yet. Hence, an INSERT INTO statement is executed as a fail-over measure. 6) Thread #1 terminates and an SQL INSERT statement is performed in order to log the start message. That doesn't make sense. If the table indexes are set up correctly, the SQL insert will fail at step (6). The module will then try the update query, which should succeed. IIRC the default schema for postgresql (as was also the case for mysql untill a month ago) does not have unique index for acctuniqueid. So yes, by default what the OP experienced will happen, and yes, it's an easy fix. -- Fajar - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
Hello, Am Dienstag 28 August 2012, 23:11:57 schrieb Alan DeKok: Matthias Nagel wrote: if two accounting messages for the same session are sent by the authenticator very quickly, the messages may be processed by the radius server in the wrong order. This results into two sessions being accounted instead of one. The second phantom session stays open for ever, because it never receives any update and/or stop message. This is a well-known issue with RADIUS. Packets may appear in any order. Example: If a supplicant authenticates and immediately disconnects again, the following steps are executed: 1) The authenticator sends an accounting start message 2) The authenticator sends an accounting stop message immediately 3) The RADIUS server receives the start message and assigns it to thread #1 4) The RADIUS server receives the stop message and assigns it to thread #2 5) Thread #2 terminates first and the accounting stop message is written to the PostgreSQL database. The SQL UPDATE statement fails, because there is no entry for this session that could be updated, as the start message has not been processed yet. Hence, an INSERT INTO statement is executed as a fail-over measure. 6) Thread #1 terminates and an SQL INSERT statement is performed in order to log the start message. That doesn't make sense. If the table indexes are set up correctly, the SQL insert will fail at step (6). The module will then try the update query, which should succeed. This is correct, if one has some kind of key to identify a session that could be used as a database index. But unfortunately there are a lot of authenticators out there, that do not correctly generate radius accounting session ids. Basicly I see three different types (despite the correct one): 1) Authenticators that do no send a session id at all (Acct-Session-Id is empty) 2) Those that always return the same session id (even if the user name differs) 3) Those that always return a new session id even if the requests (start/update/stop) belong to the same session Of course one can try to use other attributes to construct an own artificial identifier which is actually done by the unique module. But the question is, what attributes should be used? If one uses user-name, nas-ip-address, calling-station-id and friends, then the same session id might be used for different sessions. For example, one session by the same user from the same supplicant in the morning and a new session in the afternoon. So the result is the same as in case 2) If one includes some timestamp related information, the session identifier is always changing. This means we are faced with case 3). Hence, at the moment my SQL query does not rely on identifiers at all, but does the following: 1) If a start request comes in, just create a new row in the database 2a) If a update/stop message comes in, select all rows that have the same user name, nas ip address and some other identical columns and that do not have a stop time. This is to say, select all possibly matching and running sessions. Then order these session by update time and take the most recent one. This row is then updated with the new information 2b) If 2a) fails, because there is no row in the selection, create a new row as in 1) Of course this procedure relies on the correct order of the radius messages. The result is, that the same session is accounted with two entries in the database. The first entry is complete, this is to say it has a start and stop time. This is the result of step 5. The second entry is incomplete, i.e. it only has as start time. The latter never will be completed, because the stop message has already been processed and acknowledged to the authenticator. That is a database consistency issue. You can't have two rows using the same keys. But I would like to propose the following solution. Instead of assigning incoming requests to the thread pool randomly, first preprocess the request and assign requests that have identical user names (or some other senseful attribute) to the same thread. This way requests that might belong to the same session are processed by the same thread and cannot outperform each other. Requests that never can belong to the same session are still processed concurrently. That is not going to happen. It's a bad fix. The correct fix is to use the SQL indexes. I can see your point. Do you have any other suggestions to solve the issues? (Changing the hardware is not going to happen.) Any ideas for a more sophistcated SQL query that does not rely on identifiers (see above)? Alan DeKok. Best regards, Matthias Nagel - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html -- Matthias Nagel Willy-Andreas-Allee 1, Zimmer 506 76131 Karlsruhe Telefon:
Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
Yes yes RADIUS vendors should go die in a big fiery pit somewhere. 1) Verify your NAS supports the Class attribute correctly (http://www.ietf.org/rfc/rfc2865.txt 5.25) 2) Implement the policies in raddb/policy.d/accounting (master:HEAD) 3) Submit patch to add unique index constraint on acctuniqueid in the postgresql schema -Arran- List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)
On 28 Aug 2012, at 23:05, Matthias Nagel matthias.h.na...@gmail.com wrote: Hello, Am Dienstag 28 August 2012, 23:11:57 schrieb Alan DeKok: Matthias Nagel wrote: if two accounting messages for the same session are sent by the authenticator very quickly, the messages may be processed by the radius server in the wrong order. This results into two sessions being accounted instead of one. The second phantom session stays open for ever, because it never receives any update and/or stop message. This is a well-known issue with RADIUS. Packets may appear in any order. Example: If a supplicant authenticates and immediately disconnects again, the following steps are executed: 1) The authenticator sends an accounting start message 2) The authenticator sends an accounting stop message immediately 3) The RADIUS server receives the start message and assigns it to thread #1 4) The RADIUS server receives the stop message and assigns it to thread #2 5) Thread #2 terminates first and the accounting stop message is written to the PostgreSQL database. The SQL UPDATE statement fails, because there is no entry for this session that could be updated, as the start message has not been processed yet. Hence, an INSERT INTO statement is executed as a fail-over measure. 6) Thread #1 terminates and an SQL INSERT statement is performed in order to log the start message. That doesn't make sense. If the table indexes are set up correctly, the SQL insert will fail at step (6). The module will then try the update query, which should succeed. This is correct, if one has some kind of key to identify a session that could be used as a database index. But unfortunately there are a lot of authenticators out there, that do not correctly generate radius accounting session ids. Basicly I see three different types (despite the correct one): 1) Authenticators that do no send a session id at all (Acct-Session-Id is empty) Then they're breaking RFC 2866 and cannot claim to support RADIUS Accounting. Users should log a defect case, demand a refund, or sue them. 5.13. Table of Attributes The following table provides a guide to which attributes may be found in Accounting-Request packets. No attributes should be found in Accounting-Response packets except Proxy-State and possibly Vendor- Specific. # Attribute . . . 1 Acct-Session-Id 2) Those that always return the same session id (even if the user name differs) Then they're breaking RFC 2866 and cannot claim to support RADIUS Accounting. Users should log a defect case, demand a refund or sue them. (see below) 3) Those that always return a new session id even if the requests (start/update/stop) belong to the same session Then they're breaking RFC 2866 and cannot claim to support RADIUS Accounting. Users should log a defect case, demand a refund or sue them. 1.2. Terminology session Each service provided by the NAS to a dial-in user constitutes a session, with the beginning of the session defined as the point where service is first provided and the end of the session defined as the point where service is ended. A user may have multiple sessions in parallel or series if the NAS supports that, with each session generating a separate start and stop accounting record with its own Acct-Session-Id. If instead or returning: HTTP/1.1 200 OK as the status line a HTTP server started returning HTTP/6.9 RAINBOWS It could not claim to implement RFC2616. By working around compliance issues, you're encouraging your equipment vendors to be lazy, you have to work with them to get them to fix their issues, else it'll be just as crappy for the next person who comes along. -Arran - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html