Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-16 Thread Sihan Goi
I mean things like firewall settings, as well as services like pcsd,
pacemaker and corosync not starting up automatically sometimes.
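
On the services not starting automatically: the "Daemon Status" section of the pcs status output posted earlier in this thread lists corosync, pacemaker and pcsd as active/disabled, i.e. running now but not enabled at boot. A minimal sketch for picking those out of pcs status output follows; disabled_daemons is a made-up helper name, and the commented usage lines assume a systemd-based node and root access.

```shell
# List daemons that "pcs status" reports as disabled at boot.
# Reads pcs status text on stdin and prints one daemon name per line.
# disabled_daemons is a hypothetical helper name, not part of pcs.
disabled_daemons() {
    awk '
        /^Daemon Status:/ { in_section = 1; next }
        in_section && NF == 0 { in_section = 0 }
        in_section && $2 ~ /\/disabled$/ { sub(/:$/, "", $1); print $1 }
    '
}

# Possible usage on a cluster node (as root), left commented out:
#   pcs status | disabled_daemons
#   systemctl enable pcsd
#   pcs cluster enable --all   # enables corosync and pacemaker at boot
```

Whether the cluster services should start at boot is a design choice; some admins prefer to start them by hand after a node failure so they can inspect the node first.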

On Tue, Sep 16, 2014 at 5:10 PM, Nikita Michalko 
wrote:

>  On 16.09.2014 10:31, Sihan Goi wrote:
>
> Figured out the problem - the firewall rules are somehow not persistent.
> After running the following commands:
>
> iptables -I INPUT -m state --state NEW -p udp -m multiport --dports
> 5404,5405 -j ACCEPT
> iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
> iptables -I INPUT -p igmp -j ACCEPT
> iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
> service iptables save
>
> Both nodes are able to communicate with each other.
>
> Seems like several things aren't persistent upon reboots, and need to be
> restarted/reconfigured. Is this the intended behavior?
>
>  What do you mean by "several things"? Firewall/iptables on CentOS 7? Or
> Pacemaker/Corosync/pcs?
>
>
> Nikita

Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-16 Thread Nikita Michalko

On 16.09.2014 10:31, Sihan Goi wrote:

> Figured out the problem - the firewall rules are somehow not persistent.
> After running the following commands:
>
> iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
> iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
> iptables -I INPUT -p igmp -j ACCEPT
> iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
> service iptables save
>
> Both nodes are able to communicate with each other.
>
> Seems like several things aren't persistent upon reboots, and need to be
> restarted/reconfigured. Is this the intended behavior?


What do you mean by "several things"? Firewall/iptables on CentOS 7? Or
Pacemaker/Corosync/pcs?


Nikita



Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-16 Thread Sihan Goi
Figured out the problem - the firewall rules are somehow not persistent.
After running the following commands:

iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
iptables -I INPUT -p igmp -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
service iptables save

Both nodes are able to communicate with each other.

Seems like several things aren't persistent upon reboots, and need to be
restarted/reconfigured. Is this the intended behavior?
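
One possible reason the rules do not survive a reboot, assuming a default CentOS 7 install: firewalld is the active firewall there, and rules inserted directly with iptables are not stored by it. The sketch below prints the firewalld equivalents for the ports used above (5404-5405/udp and 2224/tcp) and only runs them when RUN=1 is set. open_cluster_ports and RUN are made-up names for illustration.

```shell
# Print (or, with RUN=1, execute) firewalld commands that open the
# cluster ports permanently. Assumes firewalld is the active firewall.
# open_cluster_ports and RUN are hypothetical names, not part of firewalld.
open_cluster_ports() {
    for rule in "--add-port=5404-5405/udp" "--add-port=2224/tcp"; do
        if [ "${RUN:-0}" = "1" ]; then
            firewall-cmd --permanent "$rule" || return 1
        else
            echo "firewall-cmd --permanent $rule"
        fi
    done
    if [ "${RUN:-0}" = "1" ]; then
        firewall-cmd --reload
    else
        echo "firewall-cmd --reload"
    fi
}

# Dry run (prints the commands only):
#   open_cluster_ports
# Apply on a node, as root:
#   RUN=1 open_cluster_ports
```

On systems whose firewalld ships a high-availability service definition, adding that service permanently may be a shorter alternative to listing the ports one by one.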


Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-01 Thread Nikita Michalko

Hi,

Maybe the following is helpful:

https://www.google.at/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=0CDEQFjAB&url=http%3A%2F%2Fhttpd.apache.org%2Fdocs%2Ftrunk%2Fbind.html&ei=QV0FVK2YBYHO0QXPxYHQDw&usg=AFQjCNGCErofEEVtclS_x6ZXA3bXvJiaww&sig2=hR8kUWRcpmN4PE1V42t9kg&bvm=bv.74115972,d.bGE

https://www.google.at/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CC0QrAIwAA&url=http%3A%2F%2Fubuntuforums.org%2Fshowthread.php%3Ft%3D1636667&ei=QV0FVK2YBYHO0QXPxYHQDw&usg=AFQjCNHcs7alJ_RwBc4tWq2X7ew4ynEmzg&sig2=ra1qjZ8nly8opwawrACidw&bvm=bv.74115972,d.bGE


HTH

Nikita



Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-01 Thread Sihan Goi
Hi,

After some investigation, it seems that my Apache is having trouble
starting on both nodes. I get the following error message when I try to
restart the service:

Job for httpd.service failed. See 'systemctl status httpd.service' and
'journalctl -xn' for details.

"systemctl status httpd.service" shows the following output:

httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
   Active: failed (Result: exit-code) since Tue 2014-09-02 13:45:52 SGT; 8s ago
  Process: 26095 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
  Process: 26093 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 26093 (code=exited, status=1/FAILURE)

Sep 02 13:45:52 node02 httpd[26093]: AH00558: httpd: Could not reliably det...ge
Sep 02 13:45:52 node02 httpd[26093]: (98)Address already in use: AH00072: m...80
Sep 02 13:45:52 node02 httpd[26093]: no listening sockets available, shutti...wn
Sep 02 13:45:52 node02 httpd[26093]: AH00015: Unable to open logs
Sep 02 13:45:52 node02 systemd[1]: httpd.service: main process exited, code...RE
Sep 02 13:45:52 node02 systemd[1]: Failed to start The Apache HTTP Server.
Sep 02 13:45:52 node02 systemd[1]: Unit httpd.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

/var/log/messages also shows similar messages:

Sep  2 13:41:12 node02 systemd: Starting The Apache HTTP Server...
Sep  2 13:41:12 node02 httpd: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.0.112. Set the 'ServerName' directive globally to suppress this message
Sep  2 13:41:12 node02 httpd: (98)Address already in use: AH00072: make_sock: could not bind to address 127.0.0.1:80
Sep  2 13:41:12 node02 httpd: no listening sockets available, shutting down
Sep  2 13:41:12 node02 httpd: AH00015: Unable to open logs
Sep  2 13:41:12 node02 systemd: httpd.service: main process exited, code=exited, status=1/FAILURE
Sep  2 13:41:12 node02 systemd: Failed to start The Apache HTTP Server.
Sep  2 13:41:12 node02 systemd: Unit httpd.service entered failed state.

Is this related to the problem?
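
The "(98)Address already in use" line means something is already listening on the address httpd tried to bind. One possibility, and only an assumption here, is that the cluster-managed webserver resource already holds port 80 when httpd is restarted by hand through systemd. A small sketch for pulling the failing address out of the log so it can be checked follows; bind_failures is a made-up helper name.

```shell
# Print each address that httpd reported it could not bind (AH00072),
# one per line. Reads log text on stdin.
# bind_failures is a hypothetical helper name, not an httpd tool.
bind_failures() {
    sed -n 's/.*could not bind to address \([^ ]*\).*/\1/p'
}

# Possible usage on a node, left commented out:
#   journalctl -u httpd --no-pager | bind_failures
#   ss -ltnp    # then look for the process listening on that port
```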




Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-01 Thread Teerapatr Kittiratanachai
Try setting cidr_netmask=32 for the resource only, and leave the physical
interface's netmask at /24.
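
A rough sketch of what that change could look like with pcs, building the command as text rather than running it. As noted elsewhere in this thread, a /32 on the resource means the nic has to be specified, so the helper takes an optional interface. vip_update_cmd is a made-up name, and wlan0 in the usage comment is only a placeholder for whatever interface the nodes actually use.

```shell
# Build a "pcs resource update" command line for an IPaddr2 resource.
# $1 = resource name, $2 = prefix length, $3 = interface name (optional).
# vip_update_cmd is a hypothetical helper; it only prints the command.
vip_update_cmd() {
    resource=$1
    prefix=$2
    nic=${3:-}
    cmd="pcs resource update $resource cidr_netmask=$prefix"
    if [ -n "$nic" ]; then
        cmd="$cmd nic=$nic"
    fi
    echo "$cmd"
}

# Example (interface name is an assumption):
#   vip_update_cmd virtual_ip 32 wlan0
```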


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-01 Thread Sihan Goi
Got it. Changed the netmask for both PCs to 255.255.255.0 and changed
cidr_netmask to 24 and it works...sort of.
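
For reference, 255.255.255.0 and a cidr_netmask of 24 describe the same mask. The mapping can be checked with a small shell helper. This is only a sketch; mask_to_prefix is a made-up name and not part of pcs or the IPaddr2 agent.

```shell
# Convert a dotted-quad netmask (e.g. 255.255.255.0) to a CIDR prefix length.
# Returns non-zero for anything that is not a valid contiguous netmask.
# mask_to_prefix is a hypothetical helper name.
mask_to_prefix() {
    prefix=0
    seen_partial=0
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the argument on dots into $1..$4
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        case $octet in
            255) bits=8 ;;
            254) bits=7 ;;
            252) bits=6 ;;
            248) bits=5 ;;
            240) bits=4 ;;
            224) bits=3 ;;
            192) bits=2 ;;
            128) bits=1 ;;
            0)   bits=0 ;;
            *)   return 1 ;;   # not a valid netmask octet
        esac
        # After the first octet that is not 255, every later octet must be 0.
        if [ "$seen_partial" -eq 1 ] && [ "$bits" -ne 0 ]; then
            return 1
        fi
        [ "$bits" -eq 8 ] || seen_partial=1
        prefix=$((prefix + bits))
    done
    echo "$prefix"
}

# Example:
#   mask_to_prefix 255.255.255.0
```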

It was working for a while, and then I rebooted both PCs, and now each
thinks it's online and the other is offline.

"pcs status" on my node01 gives the following output:
Cluster name: cluster_web
Last updated: Tue Sep  2 12:21:25 2014
Last change: Tue Sep  2 12:13:27 2014 via cibadmin on node02
Stack: corosync
Current DC: node01 (1) - partition WITHOUT quorum
Version: 1.1.10-32.el7_0-368c726
2 Nodes configured
2 Resources configured


Online: [ node01 ]
OFFLINE: [ node02 ]

Full list of resources:

 virtual_ip (ocf::heartbeat:IPaddr2): Started node01
 webserver (ocf::heartbeat:apache): Started node01

PCSD Status:
  node01: Offline
  node02: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

However, "pcs status" on node02 shows the following output:
Cluster name: cluster_web
Last updated: Tue Sep  2 12:20:41 2014
Last change: Tue Sep  2 11:59:03 2014 via cibadmin on node02
Stack: corosync
Current DC: node02 (2) - partition WITHOUT quorum
Version: 1.1.10-32.el7_0-368c726
2 Nodes configured
2 Resources configured


Online: [ node02 ]
OFFLINE: [ node01 ]

Full list of resources:

 virtual_ip (ocf::heartbeat:IPaddr2): Started node02
 webserver (ocf::heartbeat:apache): Started node02

PCSD Status:
  node01: Offline
  node02: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Seems like each node thinks it's online and the other is not. I'm running
HA on apache webserver, and if I access the webpage on node01, I get
node01's index.html. If I access it on node02, I get node02's index.html.
If I access it via another PC connected to the same AP, the webpage is
unavailable.

What could be wrong?


On Mon, Sep 1, 2014 at 9:09 PM, John Lauro 
wrote:

> ip=192.168.0.110 cidr_netmask=32
> /32 leaves no room for any other IP addresses on that interface and so you
> have to specify the nic.  Are you certain 192.168.0.111 and 192.168.0.112
> do not have a different netmask from 255.255.255.255, like 255.255.255.0
> for /24 or 255.255.0.0 for /16?  If they do have 255.255.255.255 too, then
> they are probably not set up correctly...
>
> PS: cidr_netmask is optional.  Assuming a proper netmask (not
> 255.255.255.255) is on 192.168.0.111 and 192.168.0.112 it should work
> without specifying cidr_netmask.
>
>
> --
>
> From: "Sihan Goi"
> To: pacemaker@oss.clusterlabs.org
> Sent: Monday, September 1, 2014 4:17:20 AM
> Subject: [Pacemaker] ERROR: Unable to find nic or netmask.
>
>
> Hi,
>
> I'm trying to create an HA cluster with 2 CentOS 7 PCs connected to a
> wireless AP. The PCs have the static IP addresses 192.168.0.111 and
> 192.168.0.112 respectively and hostnames node01 and node02 respectively.
>
> I've tried to create a virtual IP address of 192.168.0.110 using the
> following command:
>
> pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.110 cidr_netmask=32 op monitor interval=30s
>
> However, when I do a "pcs status resources" I get the following output:
>
>  virtual_ip (ocf::heartbeat:IPaddr2): Stopped
>
> The virtual IP is stopped rather than started. I looked into
> /var/log/messages and /var/log/pacemaker.log and found the following
> error messages:
>
> node02 IPaddr2(virtual_ip)[25451]: ERROR: Unable to find nic or netmask.
> node02 IPaddr2(virtual_ip)[25451]: ERROR: [findif] failed
>
> It seems that it's unable to find my nic. How can I fix this?
>
> Thanks.
>


-- 
- Goi Sihan
gois...@gmail.com