Good afternoon list members,

I'm wondering if anyone can help me with an issue that has been
affecting us since our upgrade to NGX 2 months ago. Here is the
background on the affected environment.

We are running FireWall-1(R) NGX (R60) HFA_03, Hotfix 603 - Build 015
kernel: NGX (R60) HFA_03, Hotfix 603 - Build 015 with ClusterXL HA New
mode. The OS we run is RHEL 3.0 and the hardware is 2 DL360s connected
via crossover cable for the sync network. We are running kernel
2.4.21-40.ELsmp with all the latest updates.

Both cluster members are dual-processor Xeon 2.8 machines. One cluster
member has 512MB of memory and the other has 1GB. They both have more
than adequate disk space.

We also have another firewall cluster for our DMZs running on DL380s:

fw ver -k
This is Check Point VPN-1(TM) & FireWall-1(R) NG with Application
Intelligence (R55) - Build 817
kernel: NG with Application Intelligence (R55) - Build 817

We have had no problems on this R55 cluster. Originally the 2 clusters
shared the same internal network link, but this has been rectified (see
below).

Our Management server is on the same OS and firewall version.

The symptoms are as follows. Every 2 days or so, one of the cluster
members inexplicably becomes unresponsive. When I jump onto the
cluster member via iLO, one of 2 sets of messages shows on the console,
depending on the crash.

Either:

VPN-1 Log buffer is full

VPN-1 : Lost xxxx trap message

The other message we have seen is:

Net : xxxx messages suppressed

In the case of this last message, I can still see it populating the
screen but cannot use the keyboard through iLO.

In both cases the only way to recover is to power cycle the box.

About half of the time, failover does not happen: the other cluster
member does not recognize that the active member has gone down. Until
we power cycle the crashed (or assumed to be crashed) box, internet
access and VPN access are down at our site.

We've had our vendor open a ticket with Check Point and have taken the
following steps to try to resolve the issue.

1. We've moved both of our clusters to separate firewall subnets so
that they can't see each other's sync traffic.

2. Completely reinstalled from OS up and a fresh install of the firewall
software.

3. Upgraded to HFA_03 and flushed the state tables on both the
SmartCenter and the firewall cluster members.

4. Changed our CCP transport from multicast to broadcast and made sure
that our switch and network settings comply with the ClusterXL
specifications.

We are now considering replacing the hardware on at least one of the
firewalls. This issue did occur on R55 a few times over a 3 or 4 month
period, but after the upgrade it's happening as often as every 12 hours.

For comparison, we have upgraded approximately 18 other firewalls to NGX
HFA_02, all running the same OS, kernel, and firewall software version
(HFA_02, which is what we had originally upgraded our cluster to). Only
one of those firewalls has crashed, and it has happened twice in a month
and a half. The rest of the upgrades have been flawless. The affected
cluster does, however, have far more traffic and connections going
through it than any of our other firewalls.

If anyone needs further information, or if I have not covered
everything, feel free to let me know. I would really appreciate any
info, suggestions, or similar experiences.


Jeremy Lieb CCSE-NG CCSE+NG
Firewall Administrator
Open Text Corporation
100 Tri-State Int'l Pkwy
Lincolnshire, IL 60069

 
