Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-29 Thread Alan DeKok
Matthias Nagel wrote:
 This is correct, if one has some kind of key to identify a session that could 
 be used as a database index. But unfortunately there are a lot of 
 authenticators out there, that do not correctly generate radius accounting 
 session ids. Basicly I see three different types (despite the correct one):
 
 1) Authenticators that do no send a session id at all (Acct-Session-Id is 
 empty)
 2) Those that always return the same session id (even if the user name 
 differs)
 3) Those that always return a new session id even if the requests 
 (start/update/stop) belong to the same session

  Arran has a more technical response here.  Mine is this:

You were ripped off.

  Any reputable vendor would put a minimum of effort into creating a
system that works.

 I can see your point. Do you have any other suggestions to solve the issues? 
 (Changing the hardware is not going to happen.) 

  Change the hardware.

  NOTHING else will work.

  Deciding to not change the hardware is ridiculous.  The cost/benefit
tradeoff just isn't there.  (a) buy hardware that works for probably not
too much money.  Or (b) spend huge amounts of time and effort working
around bugs in the crappy hardware.

  FreeRADIUS already has a number of hacks to work around partially
broken systems.  They should generally work, and they may even work for you.

  But you really need to toss your hardware in the garbage.  It's crap.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-29 Thread Phil Mayers

On 08/28/2012 05:26 PM, Matthias Nagel wrote:


is incomplete, i.e. it only has as start time. The latter never will
be completed, because the stop message has already been processed and
acknowledged to the authenticator.


Aside from anything else - you need to handle this case. Accounting is 
UDP-based and might be dropped, even with retries.


We use an SQL query to expire accounting sessions which haven't been 
updated in 2 hours. Since our maximum interim-accounting interval is 30 
minutes, that covers most cases.


e.g.

update radacct set acctstoptime=now()
where
now() - acctstarttime + (coalesce(acctsessiontime,0)||' 
seconds')::interval  '2 hours'


If your NAS generates truly unique acctsessionid values, even across 
reboots (some do) maybe you can modify the SQL to handle this case, for 
example by setting a unique index on that column.

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-28 Thread Matthias Nagel
Hello everybody,

if two accounting messages for the same session are sent by the authenticator 
very quickly, the messages may be processed by the radius server in the wrong 
order. This results into two sessions being accounted instead of one. The 
second phantom session stays open for ever, because it never receives any 
update and/or stop message.

Example:

If a supplicant authenticates and immediately disconnects again, the following 
steps are executed:

1) The authenticator sends an accounting start message
2) The authenticator sends an accounting stop message immediately
3) The RADIUS server receives the start message and assigns it to thread #1
4) The RADIUS server receives the stop message and assigns it to thread #2
5) Thread #2 terminates first and the accounting stop message is written to the 
PostgreSQL database. The SQL UPDATE statement fails, because there is no entry 
for this session that could be updated, as the start message has not been 
processed yet. Hence, an INSERT INTO statement is executed as a fail-over 
measure.
6) Thread #1 terminates and an SQL INSERT statement is performed in order to 
log the start message.

The result is, that the same session is accounted with two entries in the 
database. The first entry is complete, this is to say it has a start and stop 
time. This is the result of step 5. The second entry is incomplete, i.e. it 
only has as start time. The latter never will be completed, because the stop 
message has already been processed and acknowledged to the authenticator.

At the moment my work around is to run FreeRADIUS in debug mode to keep it 
single threaded.

But I would like to propose the following solution. Instead of assigning 
incoming requests to the thread pool randomly, first preprocess the request and 
assign requests that have identical user names (or some other senseful 
attribute) to the same thread. This way requests that might belong to the same 
session are processed by the same thread and cannot outperform each other. 
Requests that never can belong to the same session are still processed 
concurrently.

Best regards, Matthias Nagel

--
Matthias Nagel
Willy-Andreas-Allee 1, Zimmer 506
76131 Karlsruhe

Telefon: +49-721-8695-1506
Mobil: +49-151-15998774
e-Mail: matthias.h.na...@gmail.com
ICQ: 499797758
Skype: nagmat84

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-28 Thread Alan DeKok
Matthias Nagel wrote:
 if two accounting messages for the same session are sent by the authenticator 
 very quickly, the messages may be processed by the radius server in the wrong 
 order. This results into two sessions being accounted instead of one. The 
 second phantom session stays open for ever, because it never receives any 
 update and/or stop message.

  This is a well-known issue with RADIUS.  Packets may appear in any order.

 Example:
 
 If a supplicant authenticates and immediately disconnects again, the 
 following steps are executed:
 
 1) The authenticator sends an accounting start message
 2) The authenticator sends an accounting stop message immediately
 3) The RADIUS server receives the start message and assigns it to thread #1
 4) The RADIUS server receives the stop message and assigns it to thread #2
 5) Thread #2 terminates first and the accounting stop message is written to 
 the PostgreSQL database. The SQL UPDATE statement fails, because there is no 
 entry for this session that could be updated, as the start message has not 
 been processed yet. Hence, an INSERT INTO statement is executed as a 
 fail-over measure.
 6) Thread #1 terminates and an SQL INSERT statement is performed in order to 
 log the start message.

  That doesn't make sense.  If the table indexes are set up correctly,
the SQL insert will fail at step (6).  The module will then try the
update query, which should succeed.

 The result is, that the same session is accounted with two entries in the 
 database. The first entry is complete, this is to say it has a start and stop 
 time. This is the result of step 5. The second entry is incomplete, i.e. it 
 only has as start time. The latter never will be completed, because the stop 
 message has already been processed and acknowledged to the authenticator.

  That is a database consistency issue.  You can't have two rows using
the same keys.

 But I would like to propose the following solution. Instead of assigning 
 incoming requests to the thread pool randomly, first preprocess the request 
 and assign requests that have identical user names (or some other senseful 
 attribute) to the same thread. This way requests that might belong to the 
 same session are processed by the same thread and cannot outperform each 
 other. Requests that never can belong to the same session are still processed 
 concurrently.

  That is not going to happen.  It's a bad fix.

  The correct fix is to use the SQL indexes.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-28 Thread Fajar A. Nugraha
On Wed, Aug 29, 2012 at 4:11 AM, Alan DeKok al...@deployingradius.com wrote:
 Matthias Nagel wrote:

 5) Thread #2 terminates first and the accounting stop message is written to 
 the PostgreSQL database. The SQL UPDATE statement fails, because there is no 
 entry for this session that could be updated, as the start message has not 
 been processed yet. Hence, an INSERT INTO statement is executed as a 
 fail-over measure.
 6) Thread #1 terminates and an SQL INSERT statement is performed in order to 
 log the start message.

   That doesn't make sense.  If the table indexes are set up correctly,
 the SQL insert will fail at step (6).  The module will then try the
 update query, which should succeed.

IIRC the default schema for postgresql (as was also the case for mysql
untill a month ago) does not have unique index for acctuniqueid. So
yes, by default what the OP experienced will happen, and yes, it's an
easy fix.

-- 
Fajar
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-28 Thread Matthias Nagel
Hello,

Am Dienstag 28 August 2012, 23:11:57 schrieb Alan DeKok:
 Matthias Nagel wrote:
  if two accounting messages for the same session are sent by the 
  authenticator very quickly, the messages may be processed by the radius 
  server in the wrong order. This results into two sessions being accounted 
  instead of one. The second phantom session stays open for ever, because 
  it never receives any update and/or stop message.
 
   This is a well-known issue with RADIUS.  Packets may appear in any order.
 
  Example:
  
  If a supplicant authenticates and immediately disconnects again, the 
  following steps are executed:
  
  1) The authenticator sends an accounting start message
  2) The authenticator sends an accounting stop message immediately
  3) The RADIUS server receives the start message and assigns it to thread #1
  4) The RADIUS server receives the stop message and assigns it to thread #2
  5) Thread #2 terminates first and the accounting stop message is written to 
  the PostgreSQL database. The SQL UPDATE statement fails, because there is 
  no entry for this session that could be updated, as the start message has 
  not been processed yet. Hence, an INSERT INTO statement is executed as a 
  fail-over measure.
  6) Thread #1 terminates and an SQL INSERT statement is performed in order 
  to log the start message.
 
   That doesn't make sense.  If the table indexes are set up correctly,
 the SQL insert will fail at step (6).  The module will then try the
 update query, which should succeed.
 

This is correct, if one has some kind of key to identify a session that could 
be used as a database index. But unfortunately there are a lot of 
authenticators out there, that do not correctly generate radius accounting 
session ids. Basicly I see three different types (despite the correct one):

1) Authenticators that do no send a session id at all (Acct-Session-Id is empty)
2) Those that always return the same session id (even if the user name differs)
3) Those that always return a new session id even if the requests 
(start/update/stop) belong to the same session

Of course one can try to use other attributes to construct an own artificial 
identifier which is actually done by the unique module. But the question is, 
what attributes should be used?

If one uses user-name, nas-ip-address, calling-station-id and friends, 
then the same session id might be used for different sessions. For example, one 
session by the same user from the same supplicant in the morning and a new 
session in the afternoon. So the result is the same as in case 2)

If one includes some timestamp related information, the session identifier is 
always changing. This means we are faced with case 3).

Hence, at the moment my SQL query does not rely on identifiers at all, but does 
the following:

1) If a start request comes in, just create a new row in the database

2a) If a update/stop message comes in, select all rows that have the same user 
name, nas ip address and some other identical columns and that do not have a 
stop time. This is to say, select all possibly matching and running sessions. 
Then order these session by update time and take the most recent one. This row 
is then updated with the new information

2b) If 2a) fails, because there is no row in the selection, create a new row as 
in 1)

Of course this procedure relies on the correct order of the radius messages. 


  The result is, that the same session is accounted with two entries in the 
  database. The first entry is complete, this is to say it has a start and 
  stop time. This is the result of step 5. The second entry is incomplete, 
  i.e. it only has as start time. The latter never will be completed, because 
  the stop message has already been processed and acknowledged to the 
  authenticator.
 
   That is a database consistency issue.  You can't have two rows using
 the same keys.
 
  But I would like to propose the following solution. Instead of assigning 
  incoming requests to the thread pool randomly, first preprocess the request 
  and assign requests that have identical user names (or some other senseful 
  attribute) to the same thread. This way requests that might belong to the 
  same session are processed by the same thread and cannot outperform each 
  other. Requests that never can belong to the same session are still 
  processed concurrently.
 
   That is not going to happen.  It's a bad fix.
 
   The correct fix is to use the SQL indexes.

I can see your point. Do you have any other suggestions to solve the issues? 
(Changing the hardware is not going to happen.) Any ideas for a more 
sophistcated SQL query that does not rely on identifiers (see above)?

   Alan DeKok.

Best regards, Matthias Nagel


 -
 List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
--
Matthias Nagel
Willy-Andreas-Allee 1, Zimmer 506
76131 Karlsruhe

Telefon: 

Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-28 Thread Arran Cudbard-Bell
Yes yes RADIUS vendors should go die in a big fiery pit somewhere.

1) Verify your NAS supports the Class attribute correctly 
(http://www.ietf.org/rfc/rfc2865.txt 5.25)
2) Implement the policies in raddb/policy.d/accounting (master:HEAD)
3) Submit patch to add unique index constraint on acctuniqueid in the 
postgresql schema

-Arran-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html

Re: Bug/Enhancement request: Race condition with short-term accounting (FreeRadius 2.1.10)

2012-08-28 Thread Arran Cudbard-Bell

On 28 Aug 2012, at 23:05, Matthias Nagel matthias.h.na...@gmail.com wrote:

 Hello,
 
 Am Dienstag 28 August 2012, 23:11:57 schrieb Alan DeKok:
 Matthias Nagel wrote:
 if two accounting messages for the same session are sent by the 
 authenticator very quickly, the messages may be processed by the radius 
 server in the wrong order. This results into two sessions being accounted 
 instead of one. The second phantom session stays open for ever, because 
 it never receives any update and/or stop message.
 
  This is a well-known issue with RADIUS.  Packets may appear in any order.
 
 Example:
 
 If a supplicant authenticates and immediately disconnects again, the 
 following steps are executed:
 
 1) The authenticator sends an accounting start message
 2) The authenticator sends an accounting stop message immediately
 3) The RADIUS server receives the start message and assigns it to thread #1
 4) The RADIUS server receives the stop message and assigns it to thread #2
 5) Thread #2 terminates first and the accounting stop message is written to 
 the PostgreSQL database. The SQL UPDATE statement fails, because there is 
 no entry for this session that could be updated, as the start message has 
 not been processed yet. Hence, an INSERT INTO statement is executed as a 
 fail-over measure.
 6) Thread #1 terminates and an SQL INSERT statement is performed in order 
 to log the start message.
 
  That doesn't make sense.  If the table indexes are set up correctly,
 the SQL insert will fail at step (6).  The module will then try the
 update query, which should succeed.
 
 
 This is correct, if one has some kind of key to identify a session that could 
 be used as a database index. But unfortunately there are a lot of 
 authenticators out there, that do not correctly generate radius accounting 
 session ids. Basicly I see three different types (despite the correct one):
 
 1) Authenticators that do no send a session id at all (Acct-Session-Id is 
 empty)

Then they're breaking RFC 2866 and cannot claim to support RADIUS Accounting. 
Users should log a defect case, demand a refund, or sue them.

5.13.  Table of Attributes

   The following table provides a guide to which attributes may be found
   in Accounting-Request packets.  No attributes should be found in
   Accounting-Response packets except Proxy-State and possibly Vendor-
   Specific.

  # Attribute
.
.
.
  1 Acct-Session-Id


 2) Those that always return the same session id (even if the user name 
 differs)

Then they're breaking RFC 2866 and cannot claim to support RADIUS Accounting. 
Users should log a defect case, demand a refund or sue them.

(see below)

 3) Those that always return a new session id even if the requests 
 (start/update/stop) belong to the same session

Then they're breaking RFC 2866 and cannot claim to support RADIUS Accounting. 
Users should log a defect case, demand a refund or sue them.


1.2.  Terminology

   session   Each service provided by the NAS to a dial-in user
 constitutes a session, with the beginning of the session
 defined as the point where service is first provided and
 the end of the session defined as the point where service
 is ended.  A user may have multiple sessions in parallel or
 series if the NAS supports that, with each session
 generating a separate start and stop accounting record with
 its own Acct-Session-Id.

If instead or returning:

HTTP/1.1 200 OK

as the status line a HTTP server started returning

HTTP/6.9 RAINBOWS

It could not claim to implement RFC2616.

By working around compliance issues, you're encouraging your equipment vendors 
to be lazy, you have to work with them to get them to fix their issues, else 
it'll be just as crappy for the next person who comes along.

-Arran

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html