Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread Johan Meiring

muffin sk wrote:


- show full processlist
- show innodb status



Unfortunately your datbase is doing nothing in the logs attaced.

Please re-run and re-send "show processlist" while radius is claiming the 
SQL pools to be unavailable, i.e. while your'e having the problem.


--


Johan Meiring
Cape PC Services CC
Tel: (021) 883-8271
Fax: (021) 886-7782

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread muffin sk
Hello Fajar,

On Fri, Feb 12, 2010 at 8:48 AM, Fajar A. Nugraha  wrote:
> On Fri, Feb 12, 2010 at 6:22 AM, muffin sk  wrote:
>> Slow queries: 166
>
> Start with that. Activate slow query log (see Mysql docs) to see which
> queries took a long time, and when they happen.
> Combine that with freeradius log about db handle, and you should be
> able to decide whether the problem is something worth investigating,
> or something you can safely ignore.

Noted. I will activate it now. Thank you once again for your advice.

Regards,

Muffin

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread Fajar A. Nugraha
On Fri, Feb 12, 2010 at 6:22 AM, muffin sk  wrote:
> Slow queries: 166

Start with that. Activate slow query log (see Mysql docs) to see which
queries took a long time, and when they happen.
Combine that with freeradius log about db handle, and you should be
able to decide whether the problem is something worth investigating,
or something you can safely ignore.

-- 
Fajar

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread Johan Meiring


I intentionally didn't include the output of "show full processlist"
and "show innodb status" because they are very large to include them
here. If you need them, please tell me and I will send it to you
personally.



"show full processlist" would show how long some of the queries have been 
running, and what state they're in.


If its really to large, use www.pastebin.com and send a link.



--


Johan Meiring
Cape PC Services CC
Tel: (021) 883-8271
Fax: (021) 886-7782

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread muffin sk
Hello Fajar,

On Fri, Feb 12, 2010 at 12:42 AM, Fajar A. Nugraha  wrote:
>
> When the NAS doesn't receive response (for whatever reason) in a
> certain time, it should sends the request again. The number of retries
> and timeout vary depending on NAS (tunable settings on some). If after
> all retries it still doesn't receive a response, the user won't be
> able to login.

Just for your information, below are the snippets of my configurations
which is similar to my 4 x FreeRADIUS servers.

[ /etc/raddb/radiusd.conf ]

max_request_time = 10
delete_blocked_requests = no
cleanup_delay = 5
max_requests = 1024

[ /etc/raddb/sql.conf ]

sql {

num_sql_socks = 30
connect_failure_retry_delay = 10

}

> If the database is dead, FR should reconnect automatically. Usually
> those logs are early sign of problems, and I'd check my database
> first. Delete old radacct records, tune some settings, convert them to
> innodb, get faster disk/more RAM, etc.

The MySQL service is always up except when I manually restart the service.

Below is the hardware specifications of my database server:

- Dell PowerEdge 1950
- Intel(R) Xeon(R) CPU E5405  @ 2.00GHz (2 CPU @ 4 cores each)
- 4GB ECC DDR2 667 MHz RAM
- MegaRAID SAS 1078 RAID 1

Below is the software specifications of my database server:

- CentOS 5.x x86_64
- MySQL-server-community-5.0.67-0.rhel5

Below is my /etc/my.cnf file although I am already using InnoDB engine:

[mysqld]
wait_timeout=3600
connect_timeout=10
interactive_timeout=120
join_buffer_size=1M
query_cache_size=128M
query_cache_limit=2M
max_allowed_packet=16M
table_cache=1024
sort_buffer_size=2M
read_buffer_size=2M
read_rnd_buffer_size=4M
skip-name-resolve
max_connections=500
sort_buffer_size=8M
key_buffer = 256M
key_buffer_size=384M
thread_cache=256
thread_concurrency=16
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
old_passwords=1

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

on which, I am planning to add the following parameters specific for InnoDB:

innodb_buffer_pool_size = 2048M
innodb_additional_mem_pool_size = 20M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 0
innodb_log_file_size = 1024M
innodb_lock_wait_timeout = 300
innodb_thread_concurrency = 16

And lastly, below are outputs from the MySQL shell:

mysql> status
--
mysql  Ver 14.12 Distrib 5.0.67, for redhat-linux-gnu (x86_64) using
readline 5.1

Connection id:  63586
Current database:   
Current user:   r...@localhost
SSL:Not in use
Current pager:  stdout
Using outfile:  ''
Using delimiter:;
Server version: 5.0.67-community MySQL Community Edition (GPL)
Protocol version:   10
Connection: Localhost via UNIX socket
Server characterset:latin1
Db characterset:latin1
Client characterset:latin1
Conn.  characterset:latin1
UNIX socket:/var/lib/mysql/mysql.sock
Uptime: 3 days 4 hours 35 min 14 sec

Threads: 71  Questions: 2547942  Slow queries: 166  Opens: 482  Flush
tables: 1  Open tables: 184  Queries per second avg: 9.241
--

mysql> show variables like 'max_connections';
+-+---+
| Variable_name   | Value |
+-+---+
| max_connections | 500   |
+-+---+
1 row in set (0.00 sec)

mysql> show status like 'max_used_connections';
+--+---+
| Variable_name| Value |
+--+---+
| Max_used_connections | 249   |
+--+---+
1 row in set (0.00 sec)

I intentionally didn't include the output of "show full processlist"
and "show innodb status" because they are very large to include them
here. If you need them, please tell me and I will send it to you
personally.

I have attached the system usage logs for your reference.

> best-case scenario, user will get some delay before they're able to login.
> worst-case scenario, they won't be able to login at all.

Thank you once again for giving your time to explain my questions to this list.

Regards,

Muffin
Script started on Thu Feb 11 23:15:01 2010

procs ---memory-- ---swap-- -io --system-- 
-cpu--
 r  b   swpd   free  inact active   si   sobibo   in   cs us sy id wa st
 0  0180 344044 925156 260499200 01300  3  0 97  0  0


 [ PROCESSOR ] 


[Processor Statistics Information]

23:15:01 CPU   %user   %nice%sys %iowait%irq   %soft  %steal   
%idleintr/s
23:15:01 all3.050.000.210.020.000.010.00   
96.71 83.79
23:15:01   07.580.000.240.050.000.020.00   
92.10 46.99
23:15:01   11.890.000.140.020.000.000.00   
97.95  1.51
23:15:01   21.850.000.150.050.010.010.00   
97.93

Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread Fajar A. Nugraha
On Thu, Feb 11, 2010 at 11:08 PM, muffin sk  wrote:
>>> Thu Feb 11 05:53:24 2010 : Info: rlm_sql (sql): There are no DB
>>> handles to use! skipped 0, tried to connect 0
>>
>> usually this means the db is dead, or unresponsive, such that radiusd
>> can't find a DB handle it can use
>
> What will happen to the requests and/or the attempt to connect to the
> database? Will it timeout and retry again? Will it cause the user to
> fail to login? What are the possible side effects on this log?
> Honestly, even though it is an INFO log, the mere fact that somehow
> the request failed, I am very much concerned on it.

When the NAS doesn't receive response (for whatever reason) in a
certain time, it should sends the request again. The number of retries
and timeout vary depending on NAS (tunable settings on some). If after
all retries it still doesn't receive a response, the user won't be
able to login.

If the database is dead, FR should reconnect automatically. Usually
those logs are early sign of problems, and I'd check my database
first. Delete old radacct records, tune some settings, convert them to
innodb, get faster disk/more RAM, etc.

>
>>> Thu Feb 11 05:53:25 2010 : Error: WARNING: Unresponsive child (id
>>> 1241577792) for request 123582 (in component accounting module
>>> rlm_sql)
>>
>> ... which can happen if (for example) you have millions of records in
>> radacct, with non-optimum DB settings, and insert/updating a record
>> takes a lng time.
>
> Then, what will happen? It's an ERROR type of log. What are the
> possible side effects for this?

best-case scenario, user will get some delay before they're able to login.
worst-case scenario, they won't be able to login at all.

-- 
Fajar
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread muffin sk
Hello Fajar,

First of all, thank you for your time answering my questions to this list.

On Thu, Feb 11, 2010 at 4:47 PM, Fajar A. Nugraha  wrote:
>
>> Thu Feb 11 05:53:24 2010 : Info: rlm_sql (sql): There are no DB
>> handles to use! skipped 0, tried to connect 0
>
> usually this means the db is dead, or unresponsive, such that radiusd
> can't find a DB handle it can use

What will happen to the requests and/or the attempt to connect to the
database? Will it timeout and retry again? Will it cause the user to
fail to login? What are the possible side effects on this log?
Honestly, even though it is an INFO log, the mere fact that somehow
the request failed, I am very much concerned on it.

>> Thu Feb 11 05:53:25 2010 : Error: WARNING: Unresponsive child (id
>> 1241577792) for request 123582 (in component accounting module
>> rlm_sql)
>
> ... which can happen if (for example) you have millions of records in
> radacct, with non-optimum DB settings, and insert/updating a record
> takes a lng time.

Then, what will happen? It's an ERROR type of log. What are the
possible side effects for this?

Thank you for your time again.

Regards,

Muffin
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Logs from /var/log/radius/radius.log

2010-02-11 Thread Fajar A. Nugraha
On Thu, Feb 11, 2010 at 10:23 AM, muffin sk  wrote:
> Can you briefly explain the
> meaning for each line below if that doesn't cost much of your time? I

Which part are you having trouble with?


> Thu Feb 11 05:53:24 2010 : Info: rlm_sql (sql): There are no DB
> handles to use! skipped 0, tried to connect 0

usually this means the db is dead, or unresponsive, such that radiusd
can't find a DB handle it can use

> Thu Feb 11 05:53:25 2010 : Error: WARNING: Unresponsive child (id
> 1241577792) for request 123582 (in component accounting module
> rlm_sql)

... which can happen if (for example) you have millions of records in
radacct, with non-optimum DB settings, and insert/updating a record
takes a lng time.

> ---
> Thu Feb 11 06:26:21 2010 : Error: rlm_sql_mysql: Cannot store result
> Thu Feb 11 06:26:21 2010 : Error: rlm_sql_mysql: MySQL error 'Column
> 'UserName' cannot be null'
> Thu Feb 11 09:03:25 2010 : Error: rlm_sql (sql): Couldn't update SQL
> accounting STOP record - Column 'UserName' cannot be null

As the message said : "Column 'UserName' cannot be null". What does
your acct stop query look like?

-- 
Fajar
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html