[jira] Updated: (ZOOKEEPER-888) c-client / zkpython: Double free corruption on node watcher

2010-10-18 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-888:
---

Attachment: ZOOKEEPER-888-3.3.patch

Patch based on the 3.3 branch attached (ZOOKEEPER-888-3.3.patch). Verified that 
unit tests pass with the changes, including the new watcher_test.

> c-client / zkpython: Double free corruption on node watcher
> ---
>
> Key: ZOOKEEPER-888
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-888
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, contrib-bindings
>Affects Versions: 3.3.1
>Reporter: Lukas
>Assignee: Lukas
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: resume-segfault.py, ZOOKEEPER-888-3.3.patch, 
> ZOOKEEPER-888.patch
>
>
> The c-client / zkpython wrapper invokes an already-freed watcher callback.
> Steps to reproduce:
>   0. start a ZooKeeper server on your machine
>   1. run the attached python script
>   2. suspend the ZooKeeper server process (e.g. using `pkill -STOP -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
>   3. wait until the connection and the node observer have fired with a session 
> event
>   4. resume the ZooKeeper server process (e.g. using `pkill -CONT -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
> -> the client tries to dispatch the node observer function again, but it was 
> already freed -> double free corruption
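The failure mode above can be sketched in plain Python. This is an illustrative simulation of the lifecycle bug, not the actual zkpython C code; `WatcherSlot` is a hypothetical stand-in for the heap-allocated per-watch object.

```python
class WatcherSlot:
    """Hypothetical stand-in for zkpython's heap-allocated watcher struct."""

    def __init__(self, callback):
        self.callback = callback
        self.freed = False

    def dispatch(self, event):
        # In C, dispatching a freed watcher corrupts the heap; here we can
        # at least detect the use-after-free cleanly.
        if self.freed:
            raise RuntimeError("watcher dispatched after free")
        self.callback(event)
        # Buggy behavior: free unconditionally after the first delivery,
        # even though a session event is not the watcher's final callback.
        self.freed = True


events = []
slot = WatcherSlot(events.append)
slot.dispatch(("SESSION_EVENT", "CONNECTING"))     # server suspended
try:
    slot.dispatch(("SESSION_EVENT", "CONNECTED"))  # server resumed
    error = None
except RuntimeError as exc:
    error = str(exc)
```

In the real binding the second dispatch lands on freed memory instead of raising, which is the double-free corruption reported here.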

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-888) c-client / zkpython: Double free corruption on node watcher

2010-10-14 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-888:
---

Attachment: ZOOKEEPER-888.patch

Updated ZOOKEEPER-888.patch with the following changes:

- Fixed zookeeper.is_unrecoverable to return the correct value; it was 
returning false in all cases.

- Added watcher_test.py to cover the issue this patch fixes. Verified that it 
crashes before patching and succeeds afterward.
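The is_unrecoverable fix can be illustrated with a pure-Python sketch of the intended semantics. The state constants below mirror the C client's values but should be treated as assumptions; the real binding simply wraps the C client's is_unrecoverable().

```python
# Session state constants as exposed by the ZooKeeper C client (assumed values).
EXPIRED_SESSION_STATE = -112
AUTH_FAILED_STATE = -113
CONNECTING_STATE = 1
CONNECTED_STATE = 3


def is_unrecoverable(state):
    """True when the session can never return to CONNECTED, so the client
    must open a fresh handle. Before the patch the binding reported False
    for every state."""
    return state in (EXPIRED_SESSION_STATE, AUTH_FAILED_STATE)
```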


> c-client / zkpython: Double free corruption on node watcher
> ---
>
> Key: ZOOKEEPER-888
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-888
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, contrib-bindings
>Affects Versions: 3.3.1
>Reporter: Lukas
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: resume-segfault.py, ZOOKEEPER-888.patch
>
>
> The c-client / zkpython wrapper invokes an already-freed watcher callback.
> Steps to reproduce:
>   0. start a ZooKeeper server on your machine
>   1. run the attached python script
>   2. suspend the ZooKeeper server process (e.g. using `pkill -STOP -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
>   3. wait until the connection and the node observer have fired with a session 
> event
>   4. resume the ZooKeeper server process (e.g. using `pkill -CONT -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
> -> the client tries to dispatch the node observer function again, but it was 
> already freed -> double free corruption




[jira] Updated: (ZOOKEEPER-888) c-client / zkpython: Double free corruption on node watcher

2010-10-14 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-888:
---

Status: Open  (was: Patch Available)

> c-client / zkpython: Double free corruption on node watcher
> ---
>
> Key: ZOOKEEPER-888
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-888
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, contrib-bindings
>Affects Versions: 3.3.1
>Reporter: Lukas
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: resume-segfault.py
>
>
> The c-client / zkpython wrapper invokes an already-freed watcher callback.
> Steps to reproduce:
>   0. start a ZooKeeper server on your machine
>   1. run the attached python script
>   2. suspend the ZooKeeper server process (e.g. using `pkill -STOP -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
>   3. wait until the connection and the node observer have fired with a session 
> event
>   4. resume the ZooKeeper server process (e.g. using `pkill -CONT -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
> -> the client tries to dispatch the node observer function again, but it was 
> already freed -> double free corruption




[jira] Updated: (ZOOKEEPER-888) c-client / zkpython: Double free corruption on node watcher

2010-10-14 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-888:
---

Attachment: (was: ZOOKEEPER-888.patch)

> c-client / zkpython: Double free corruption on node watcher
> ---
>
> Key: ZOOKEEPER-888
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-888
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, contrib-bindings
>Affects Versions: 3.3.1
>Reporter: Lukas
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: resume-segfault.py
>
>
> The c-client / zkpython wrapper invokes an already-freed watcher callback.
> Steps to reproduce:
>   0. start a ZooKeeper server on your machine
>   1. run the attached python script
>   2. suspend the ZooKeeper server process (e.g. using `pkill -STOP -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
>   3. wait until the connection and the node observer have fired with a session 
> event
>   4. resume the ZooKeeper server process (e.g. using `pkill -CONT -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
> -> the client tries to dispatch the node observer function again, but it was 
> already freed -> double free corruption




[jira] Resolved: (ZOOKEEPER-890) C client invokes watcher callbacks multiple times

2010-10-13 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker resolved ZOOKEEPER-890.


Resolution: Not A Problem

Closing; the C client works as intended. Submitted a patch in ZOOKEEPER-888 to 
handle this properly in zkpython.

> C client invokes watcher callbacks multiple times
> -
>
> Key: ZOOKEEPER-890
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-890
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.1
> Environment: Mac OS X 10.6.5
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: watcher_twice.c, ZOOKEEPER-890.patch
>
>
> Code using the C client assumes that watcher callbacks are called exactly 
> once. If the watcher is called more than once, the process will likely 
> overwrite freed memory and/or crash.
> collect_session_watchers (zk_hashtable.c) gathers watchers from 
> active_node_watchers, active_exist_watchers, and active_child_watchers 
> without removing them. This results in watchers being invoked more than once.
> Test code is attached that reproduces the bug, along with a proposed patch.
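The double-invocation mechanism, and the fix of clearing the tables while collecting, can be sketched as a small simulation (Python dicts as stand-ins for the C hash tables; the names follow the report, the behavior is illustrative):

```python
active_node_watchers = {"/node": ["w1"]}
active_exist_watchers = {}
active_child_watchers = {}


def collect_session_watchers(clear):
    """Gather every registered watcher for a session-event broadcast."""
    gathered = []
    for table in (active_node_watchers, active_exist_watchers,
                  active_child_watchers):
        for path in list(table):
            gathered.extend(table[path])
            if clear:  # the proposed fix: remove entries while collecting
                del table[path]
    return gathered


# Buggy path (clear=False): two session events deliver the same watcher twice.
buggy = collect_session_watchers(clear=False) + \
        collect_session_watchers(clear=False)

# Fixed path (clear=True): the second broadcast finds nothing left to invoke.
fixed = collect_session_watchers(clear=True) + \
        collect_session_watchers(clear=True)
```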




[jira] Commented: (ZOOKEEPER-740) zkpython leading to segfault on zookeeper

2010-10-09 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919553#action_12919553
 ] 

Austin Shoemaker commented on ZOOKEEPER-740:


ZOOKEEPER-740.patch fixes the crash, though it looks like the pywatcher_t will 
be leaked on an unrecoverable session state change (EXPIRED_SESSION_STATE or 
AUTH_FAILED_STATE). Attached a proposed revision to ZOOKEEPER-888 for your 
review.

> zkpython leading to segfault on zookeeper
> -
>
> Key: ZOOKEEPER-740
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-740
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Federico
>Assignee: Henry Robinson
>Priority: Critical
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-740.patch
>
>
> The program that we are implementing uses the python binding for zookeeper, 
> but sometimes it crashes with a segfault; here is the backtrace from gdb:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xad244b70 (LWP 28216)]
> 0x080611d5 in PyObject_Call (func=0x862fab0, arg=0x8837194, kw=0x0)
> at ../Objects/abstract.c:2488
> 2488../Objects/abstract.c: No such file or directory.
> in ../Objects/abstract.c
> (gdb) bt
> #0  0x080611d5 in PyObject_Call (func=0x862fab0, arg=0x8837194, kw=0x0)
> at ../Objects/abstract.c:2488
> #1  0x080d6ef2 in PyEval_CallObjectWithKeywords (func=0x862fab0,
> arg=0x8837194, kw=0x0) at ../Python/ceval.c:3575
> #2  0x080612a0 in PyObject_CallObject (o=0x862fab0, a=0x8837194)
> at ../Objects/abstract.c:2480
> #3  0x0047af42 in watcher_dispatch (zzh=0x86174e0, type=-1, state=1,
> path=0x86337c8 "", context=0x8588660) at src/c/zookeeper.c:314
> #4  0x00496559 in do_foreach_watcher (zh=0x86174e0, type=-1, state=1,
> path=0x86337c8 "", list=0xa5354140) at src/zk_hashtable.c:275
> #5  deliverWatchers (zh=0x86174e0, type=-1, state=1, path=0x86337c8 "",
> list=0xa5354140) at src/zk_hashtable.c:317
> #6  0x0048ae3c in process_completions (zh=0x86174e0) at src/zookeeper.c:1766
> #7  0x0049706b in do_completion (v=0x86174e0) at src/mt_adaptor.c:333
> #8  0x0013380e in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> #9  0x002578de in clone () from /lib/tls/i686/cmov/libc.so.6




[jira] Updated: (ZOOKEEPER-888) c-client / zkpython: Double free corruption on node watcher

2010-10-09 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-888:
---

Attachment: ZOOKEEPER-888.patch

Improved patch attached. Before, watcher_dispatch would unconditionally free 
non-global watcher objects.

Any number of recoverable session state change events may be sent to the 
watcher. This change frees the watcher only on the last callback: a data change 
event or an unrecoverable session state change.
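The revised policy can be summarized as a decision function. This is a sketch with assumed event/state constants, not the actual C code in watcher_dispatch:

```python
SESSION_EVENT = -1           # assumed value of ZOO_SESSION_EVENT
EXPIRED_SESSION_STATE = -112
AUTH_FAILED_STATE = -113


def should_free_watcher(event_type, state, is_global):
    """Free a zkpython watcher object only on its final delivery."""
    if is_global:
        return False  # the global watcher is never a one-shot
    if event_type != SESSION_EVENT:
        return True   # data/child/exists event: the watch is consumed
    # Recoverable session events may repeat; keep the watcher alive unless
    # the session can never come back.
    return state in (EXPIRED_SESSION_STATE, AUTH_FAILED_STATE)
```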


> c-client / zkpython: Double free corruption on node watcher
> ---
>
> Key: ZOOKEEPER-888
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-888
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, contrib-bindings
>Affects Versions: 3.3.1
>Reporter: Lukas
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: resume-segfault.py, ZOOKEEPER-888.patch
>
>
> The c-client / zkpython wrapper invokes an already-freed watcher callback.
> Steps to reproduce:
>   0. start a ZooKeeper server on your machine
>   1. run the attached python script
>   2. suspend the ZooKeeper server process (e.g. using `pkill -STOP -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
>   3. wait until the connection and the node observer have fired with a session 
> event
>   4. resume the ZooKeeper server process (e.g. using `pkill -CONT -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
> -> the client tries to dispatch the node observer function again, but it was 
> already freed -> double free corruption




[jira] Updated: (ZOOKEEPER-888) c-client / zkpython: Double free corruption on node watcher

2010-10-09 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-888:
---

Attachment: (was: ZOOKEEPER-888.patch)

> c-client / zkpython: Double free corruption on node watcher
> ---
>
> Key: ZOOKEEPER-888
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-888
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, contrib-bindings
>Affects Versions: 3.3.1
>Reporter: Lukas
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: resume-segfault.py
>
>
> The c-client / zkpython wrapper invokes an already-freed watcher callback.
> Steps to reproduce:
>   0. start a ZooKeeper server on your machine
>   1. run the attached python script
>   2. suspend the ZooKeeper server process (e.g. using `pkill -STOP -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
>   3. wait until the connection and the node observer have fired with a session 
> event
>   4. resume the ZooKeeper server process (e.g. using `pkill -CONT -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
> -> the client tries to dispatch the node observer function again, but it was 
> already freed -> double free corruption




[jira] Updated: (ZOOKEEPER-888) c-client / zkpython: Double free corruption on node watcher

2010-10-07 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-888:
---

Attachment: ZOOKEEPER-888.patch

Patch that prevents freeing a watcher in response to a session event, per the 
feedback in ZOOKEEPER-890.

> c-client / zkpython: Double free corruption on node watcher
> ---
>
> Key: ZOOKEEPER-888
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-888
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, contrib-bindings
>Affects Versions: 3.3.1
>Reporter: Lukas
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: resume-segfault.py, ZOOKEEPER-888.patch
>
>
> The c-client / zkpython wrapper invokes an already-freed watcher callback.
> Steps to reproduce:
>   0. start a ZooKeeper server on your machine
>   1. run the attached python script
>   2. suspend the ZooKeeper server process (e.g. using `pkill -STOP -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
>   3. wait until the connection and the node observer have fired with a session 
> event
>   4. resume the ZooKeeper server process (e.g. using `pkill -CONT -f 
> org.apache.zookeeper.server.quorum.QuorumPeerMain`)
> -> the client tries to dispatch the node observer function again, but it was 
> already freed -> double free corruption




[jira] Commented: (ZOOKEEPER-890) C client invokes watcher callbacks multiple times

2010-10-07 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919094#action_12919094
 ] 

Austin Shoemaker commented on ZOOKEEPER-890:


That sounds like a good design. Perhaps it could be clarified in the 
documentation?
http://hadoop.apache.org/zookeeper/docs/r3.3.1/zookeeperProgrammers.html#ch_zkWatches

If this is correct behavior then the Python client needs to be fixed to not 
delete the watcher on session events. Will file a separate bug on that.

> C client invokes watcher callbacks multiple times
> -
>
> Key: ZOOKEEPER-890
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-890
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.1
> Environment: Mac OS X 10.6.5
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: watcher_twice.c, ZOOKEEPER-890.patch
>
>
> Code using the C client assumes that watcher callbacks are called exactly 
> once. If the watcher is called more than once, the process will likely 
> overwrite freed memory and/or crash.
> collect_session_watchers (zk_hashtable.c) gathers watchers from 
> active_node_watchers, active_exist_watchers, and active_child_watchers 
> without removing them. This results in watchers being invoked more than once.
> Test code is attached that reproduces the bug, along with a proposed patch.




[jira] Updated: (ZOOKEEPER-890) C client invokes watcher callbacks multiple times

2010-10-07 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-890:
---

Description: 
Code using the C client assumes that watcher callbacks are called exactly once. 
If the watcher is called more than once, the process will likely overwrite 
freed memory and/or crash.

collect_session_watchers (zk_hashtable.c) gathers watchers from 
active_node_watchers, active_exist_watchers, and active_child_watchers without 
removing them. This results in watchers being invoked more than once.

Test code is attached that reproduces the bug, along with a proposed patch.

  was:
The collect_session_watchers function in zk_hashtable.c gathers watchers from 
active_node_watchers, active_exist_watchers, and active_child_watchers without 
removing the watchers from the table.

Please see attached repro case and patch.


> C client invokes watcher callbacks multiple times
> -
>
> Key: ZOOKEEPER-890
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-890
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.1
> Environment: Mac OS X 10.6.5
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: watcher_twice.c, ZOOKEEPER-890.patch
>
>
> Code using the C client assumes that watcher callbacks are called exactly 
> once. If the watcher is called more than once, the process will likely 
> overwrite freed memory and/or crash.
> collect_session_watchers (zk_hashtable.c) gathers watchers from 
> active_node_watchers, active_exist_watchers, and active_child_watchers 
> without removing them. This results in watchers being invoked more than once.
> Test code is attached that reproduces the bug, along with a proposed patch.




[jira] Updated: (ZOOKEEPER-890) C client invokes watcher callbacks multiple times

2010-10-07 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-890:
---

Attachment: ZOOKEEPER-890.patch

Patch that clears active watcher sets when broadcasting a session event to all 
watchers.

> C client invokes watcher callbacks multiple times
> -
>
> Key: ZOOKEEPER-890
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-890
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.1
> Environment: Mac OS X 10.6.5
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: watcher_twice.c, ZOOKEEPER-890.patch
>
>
> The collect_session_watchers function in zk_hashtable.c gathers watchers from 
> active_node_watchers, active_exist_watchers, and active_child_watchers 
> without removing the watchers from the table.
> Please see attached repro case and patch.




[jira] Updated: (ZOOKEEPER-890) C client invokes watcher callbacks multiple times

2010-10-07 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-890:
---

Attachment: watcher_twice.c

> C client invokes watcher callbacks multiple times
> -
>
> Key: ZOOKEEPER-890
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-890
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.1
> Environment: Mac OS X 10.6.5
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: watcher_twice.c
>
>
> The collect_session_watchers function in zk_hashtable.c gathers watchers from 
> active_node_watchers, active_exist_watchers, and active_child_watchers 
> without removing the watchers from the table.
> Please see attached repro case and patch.




[jira] Created: (ZOOKEEPER-890) C client invokes watcher callbacks multiple times

2010-10-07 Thread Austin Shoemaker (JIRA)
C client invokes watcher callbacks multiple times
-

 Key: ZOOKEEPER-890
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-890
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.3.1
 Environment: Mac OS X 10.6.5
Reporter: Austin Shoemaker
Priority: Critical
 Attachments: watcher_twice.c

The collect_session_watchers function in zk_hashtable.c gathers watchers from 
active_node_watchers, active_exist_watchers, and active_child_watchers without 
removing the watchers from the table.

Please see attached repro case and patch.




[jira] Resolved: (ZOOKEEPER-889) pyzoo_aget_children crashes due to incorrect watcher context

2010-10-06 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker resolved ZOOKEEPER-889.


Resolution: Fixed

Just noticed that the fix is already in trunk; closing the issue.

> pyzoo_aget_children crashes due to incorrect watcher context
> 
>
> Key: ZOOKEEPER-889
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-889
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bindings
>Affects Versions: 3.3.1
> Environment: OS X 10.6.5, Python 2.6.1
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: repro.py
>
>
> The pyzoo_aget_children function passes the completion callback ("pyw") in 
> place of the watcher callback ("get_pyw"). Since it is a one-shot callback, 
> it is deallocated after the completion callback fires, causing a crash when 
> the watcher callback should be invoked.




[jira] Updated: (ZOOKEEPER-889) pyzoo_aget_children crashes due to incorrect watcher context

2010-10-06 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-889:
---

Attachment: repro.py

Minimal repro script

> pyzoo_aget_children crashes due to incorrect watcher context
> 
>
> Key: ZOOKEEPER-889
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-889
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bindings
>Affects Versions: 3.3.1
> Environment: OS X 10.6.5, Python 2.6.1
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: repro.py
>
>
> The pyzoo_aget_children function passes the completion callback ("pyw") in 
> place of the watcher callback ("get_pyw"). Since it is a one-shot callback, 
> it is deallocated after the completion callback fires, causing a crash when 
> the watcher callback should be invoked.




[jira] Created: (ZOOKEEPER-889) pyzoo_aget_children crashes due to incorrect watcher context

2010-10-06 Thread Austin Shoemaker (JIRA)
pyzoo_aget_children crashes due to incorrect watcher context


 Key: ZOOKEEPER-889
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-889
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.1
 Environment: OS X 10.6.5, Python 2.6.1
Reporter: Austin Shoemaker
Priority: Critical


The pyzoo_aget_children function passes the completion callback ("pyw") in 
place of the watcher callback ("get_pyw"). Since it is a one-shot callback, it 
is deallocated after the completion callback fires, causing a crash when the 
watcher callback should be invoked.
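The mix-up can be modeled with two callback contexts. This is a hypothetical Python stand-in for the C structs; "pyw" and "get_pyw" are the names from the report:

```python
class CallbackContext:
    """Stand-in for a heap-allocated callback context passed through the C API."""

    def __init__(self, one_shot):
        self.one_shot = one_shot
        self.released = False

    def invoke(self):
        if self.released:
            raise RuntimeError("context used after release")
        if self.one_shot:
            self.released = True  # completion contexts free themselves


pyw = CallbackContext(one_shot=True)       # completion callback context
get_pyw = CallbackContext(one_shot=False)  # watcher callback context

# Bug: the watch registration receives pyw instead of get_pyw.
registered_watch_ctx = pyw

pyw.invoke()  # the completion fires first and releases its own context
try:
    registered_watch_ctx.invoke()  # the watch fires later: use-after-free
    error = None
except RuntimeError as exc:
    error = str(exc)

get_pyw.invoke()  # the correct context would still be alive at this point
```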





[jira] Commented: (ZOOKEEPER-208) Zookeeper C client uses API that are not thread safe, causing crashes when multiple instances are active

2008-11-18 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648756#action_12648756
 ] 

Austin Shoemaker commented on ZOOKEEPER-208:


Chris, thanks for modifying my patch to comply with the project. I reattached 
it granting the license; let me know if I can help with anything else.

> Zookeeper C client uses API that are not thread safe, causing crashes when 
> multiple instances are active
> 
>
> Key: ZOOKEEPER-208
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-208
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.0.0
> Environment: Linux
>Reporter: Austin Shoemaker
>Assignee: Austin Shoemaker
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: zookeeper-strtok_getaddrinfo-trunk.patch, 
> zookeeper-strtok_getaddrinfo-trunk.patch
>
>
> The Zookeeper C client library uses gethostbyname and strtok, both of which 
> are not safe to use from multiple threads.
> The problem is resolved by using getaddrinfo and strtok_r in place of the 
> older API.




[jira] Updated: (ZOOKEEPER-208) Zookeeper C client uses API that are not thread safe, causing crashes when multiple instances are active

2008-11-18 Thread Austin Shoemaker (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Shoemaker updated ZOOKEEPER-208:
---

Attachment: zookeeper-strtok_getaddrinfo-trunk.patch

Reattaching patch with license granted.

> Zookeeper C client uses API that are not thread safe, causing crashes when 
> multiple instances are active
> 
>
> Key: ZOOKEEPER-208
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-208
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.0.0
> Environment: Linux
>Reporter: Austin Shoemaker
>Assignee: Austin Shoemaker
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: zookeeper-strtok_getaddrinfo-trunk.patch, 
> zookeeper-strtok_getaddrinfo-trunk.patch
>
>
> The Zookeeper C client library uses gethostbyname and strtok, both of which 
> are not safe to use from multiple threads.
> The problem is resolved by using getaddrinfo and strtok_r in place of the 
> older API.




[jira] Created: (ZOOKEEPER-208) Zookeeper C client uses API that are not thread safe, causing crashes when multiple instances are active

2008-10-26 Thread Austin Shoemaker (JIRA)
Zookeeper C client uses API that are not thread safe, causing crashes when 
multiple instances are active


 Key: ZOOKEEPER-208
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-208
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.0.0
 Environment: Linux
Reporter: Austin Shoemaker
Priority: Critical


The Zookeeper C client library uses gethostbyname and strtok, both of which are 
not safe to use from multiple threads. Below is the original patch we made 
which fixes the problem.

The problem is resolved by using getaddrinfo and strtok_r in place of the older 
API.
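The same pattern reads naturally in Python, where socket.getaddrinfo is the reentrant resolver and str.split replaces strtok. This is a sketch of the parse-and-resolve step only; `resolve_host_list` is an illustrative name, and 2181 is simply the conventional ZooKeeper client port.

```python
import socket


def resolve_host_list(hosts):
    """Parse a 'host:port,host:port' connect string and resolve each entry.

    getaddrinfo is reentrant and handles IPv4/IPv6 uniformly, unlike
    gethostbyname; str.split carries no hidden state, unlike strtok.
    """
    addrs = []
    for entry in hosts.split(","):
        host, _, port = entry.rpartition(":")
        for family, _, _, _, sockaddr in socket.getaddrinfo(
                host, int(port), type=socket.SOCK_STREAM):
            addrs.append((family, sockaddr))
    return addrs
```

For a numeric address such as "127.0.0.1:2181" this yields one AF_INET entry without consulting DNS.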

Patch for zookeeper-c-client-2.2.1/src/zookeeper.c (2008-06-09 on SF.net)

241c241
< struct hostent *he;
---
>   struct addrinfo hints, *res, *res0;
243,245d242
< struct sockaddr_in *addr4;
< struct sockaddr_in6 *addr6;
< char **ptr;
247a245
>   char *strtok_last;
263c261
< host=strtok(hosts, ",");
---
> host=strtok_r(hosts, ",", &strtok_last);
283,294c281,297
< he = gethostbyname(host);
< if (!he) {
< LOG_ERROR(("could not resolve %s", host));
< errno=EINVAL;
< rc=ZBADARGUMENTS;
< goto fail;
< }
<
< /* Setup the address array */
< for(ptr = he->h_addr_list;*ptr != 0; ptr++) {
< if (zh->addrs_count == alen) {
< void *tmpaddr;
---
>   
>   memset(&hints, 0, sizeof(hints));
>   hints.ai_flags = AI_ADDRCONFIG;
>   hints.ai_family = AF_UNSPEC;
>   hints.ai_socktype = SOCK_STREAM;
>   hints.ai_protocol = IPPROTO_TCP;
>
>   if (getaddrinfo(host, port_spec, &hints, &res0) != 0) {
>   LOG_ERROR(("getaddrinfo: %s\n", strerror(errno)));
>   rc=ZSYSTEMERROR;
>   goto fail;
>   }
>   
>   for (res = res0; res; res = res->ai_next) {
>   // Expand address list if needed
>   if (zh->addrs_count == alen) {
>   void *tmpaddr;
304,313c307,312
< }
< addr = &zh->addrs[zh->addrs_count];
< addr4 = (struct sockaddr_in*)addr;
< addr6 = (struct sockaddr_in6*)addr;
< addr->sa_family = he->h_addrtype;
< if (addr->sa_family == AF_INET) {
< addr4->sin_port = htons(port);
< memset(&addr4->sin_zero, 0, sizeof(addr4->sin_zero));
< memcpy(&addr4->sin_addr, *ptr, he->h_length);
< zh->addrs_count++;
---
>   }
>   
>   // Copy addrinfo into address list
>   addr = &zh->addrs[zh->addrs_count];
>   switch (res->ai_family) {
>   case AF_INET:
315,320c314
< } else if (addr->sa_family == AF_INET6) {
< addr6->sin6_port = htons(port);
< addr6->sin6_scope_id = 0;
< addr6->sin6_flowinfo = 0;
< memcpy(&addr6->sin6_addr, *ptr, he->h_length);
< zh->addrs_count++;
---
>   case AF_INET6:
322,327c316,328
< } else {
< LOG_WARN(("skipping unknown address family %x for %s",
< addr->sa_family, zh->hostname));
< }
< }
< host = strtok(0, ",");
---
>   memcpy(addr, res->ai_addr, res->ai_addrlen);
>   ++zh->addrs_count;
>   break;
>   default:
>   LOG_WARN(("skipping unknown address family %x 
> for %s",
>   res->ai_family, zh->hostname));
>   break;
>   }
>   }
>   
>   freeaddrinfo(res0);
>
>   host = strtok_r(0, ",", &strtok_last);
329a331
>





[jira] Commented: (ZOOKEEPER-17) zookeeper_init doc needs clarification

2008-10-01 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635964#action_12635964
 ] 

Austin Shoemaker commented on ZOOKEEPER-17:
---

The documentation states that if the client_id given to zookeeper_init is 
expired or invalid, a new session will be automatically generated, implying 
that the handle will proceed to the CONNECTED state.

In the implementation, an expired or invalid client_id leads to the 
unrecoverable SESSION_EXPIRED_STATE, which requires closing the handle and 
reopening a new connection with no client_id specified in order to continue.

Since the server has already assigned a replacement client_id, it seems logical 
to follow the header documentation and proceed with the new value, which 
appears to be possible by removing the if-block that triggers the expired state 
in check_events (zookeeper.c).

If the client application needs to know if the session was replaced, it can 
simply compare the client_id it provided with the client_id upon entering 
CONNECTED_STATE.
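That comparison could be as simple as the following sketch. The function name and the zero-means-no-session convention are illustrative assumptions, not the real c-client API:

```c
#include <assert.h>

/* Hypothetical sketch of the check proposed above: after entering
 * CONNECTED_STATE, compare the session id the application supplied to
 * zookeeper_init with the id the server actually granted. A requested id
 * of 0 means "no prior session", so nothing can have been replaced. */
static int session_was_replaced(long long requested_id, long long granted_id)
{
    return requested_id != 0 && requested_id != granted_id;
}
```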

What do you think?

> zookeeper_init doc needs clarification
> --
>
> Key: ZOOKEEPER-17
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-17
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client, documentation
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
> Fix For: 3.0.0
>
> Attachments: ZOOKEEPER-17.patch
>
>
> Moved from SourceForge to Apache.
> http://sourceforge.net/tracker/index.php?func=detail&aid=1967467&group_id=209147&atid=1008544

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-131) Old leader election can elect a dead leader over and over again

2008-09-18 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632120#action_12632120
 ] 

Austin Shoemaker commented on ZOOKEEPER-131:


This patch appears to solve the problem for algorithm 0: our unit test 
completed successfully 16 times.

> Old leader election can elect a dead leader over and over again
> ---
>
> Key: ZOOKEEPER-131
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-131
> Project: Zookeeper
>  Issue Type: Bug
>  Components: leaderElection
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Attachments: ZOOKEEPER-131.patch
>
>
> I think there is a race condition that is probably easy to get into with the 
> old leader election and a large number of servers:
> 1) Leader dies
> 2) Followers start looking for a new leader before all Followers have 
> abandoned the Leader
> 3) The Followers looking for a new leader see votes of Followers still 
> following the (now dead) Leader and start voting for the dead Leader
> 4) The dead Leader gets reelected.
> For the old leader election a server should not vote for another server that 
> is not nominating himself.
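The rule proposed above can be sketched as a vote-counting filter: a candidate's votes only count if the candidate's own vote nominates itself. This is an illustrative model, not the actual leader-election code:

```c
#include <assert.h>

/* Hypothetical sketch of the ZOOKEEPER-131 rule: ignore votes for a
 * candidate that is not nominating itself (e.g. a dead leader that can no
 * longer vote at all). */
struct vote { int voter; int candidate; };

static int count_valid_votes(const struct vote *votes, int n, int candidate)
{
    int self_nominated = 0, tally = 0;
    /* the candidate must appear as a voter for itself */
    for (int i = 0; i < n; i++)
        if (votes[i].voter == candidate && votes[i].candidate == candidate)
            self_nominated = 1;
    if (!self_nominated)
        return 0;  /* dead leaders never self-nominate, so they get zero */
    for (int i = 0; i < n; i++)
        if (votes[i].candidate == candidate)
            tally++;
    return tally;
}
```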

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

2008-09-17 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632111#action_12632111
 ] 

austin edited comment on ZOOKEEPER-127 at 9/17/08 11:36 PM:
--

After several more runs of our unit test using the patched algorithm 3, the 
test hangs as the service repeatedly tries to reelect the killed leader. This 
behavior is similar to ZOOKEEPER-131 which we had experienced using algorithms 
0 and 1.

Server 10 is 10.50.65.40 and has been explicitly killed. The following log is 
from server 5, which mirrors logs on all the other servers.

Any idea what's happening here?

2008-09-18 00:28:20,029 - INFO  [QuorumPeer:[EMAIL PROTECTED] - LOOKING
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:[EMAIL PROTECTED] - unable to parse 
zxid string into long: txt
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:[EMAIL PROTECTED] - New election: 
8589935405
2008-09-18 00:28:20,031 - WARN  [WorkerSender Thread:[EMAIL PROTECTED] - Cannot 
open channel to 10( java.net.ConnectException: Connection refused)
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:[EMAIL PROTECTED] - FOLLOWING
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:[EMAIL PROTECTED] - Created server 
with dataDir:/zookeeper_data/5_data dataLogDir:/zookeeper_data/5_data tickTime:2000
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:[EMAIL PROTECTED] - Following 
/10.50.65.40:2888

[[[ exception below repeats 5 times ]]]

2008-09-18 00:28:20,032 - WARN  [QuorumPeer:[EMAIL PROTECTED] - Unexpected 
exception
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:519)
at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:137)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:405)

[[[ then the follower is restarted ]]]

2008-09-18 00:28:24,049 - ERROR [QuorumPeer:[EMAIL PROTECTED] - FIXMSG
java.lang.Exception: shutdown Follower
at 
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:370)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:409)

[[[ at this point the log repeats from the beginning ]]]


  was (Author: austin):
After about 6 runs of our unit test the test hangs as the service 
repeatedly tries to reelect the killed leader (similar to ZOOKEEPER-131 with 
algorithms 0 and 1). 



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

2008-09-17 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632111#action_12632111
 ] 

Austin Shoemaker commented on ZOOKEEPER-127:


After about 6 runs of our unit test the test hangs as the service repeatedly 
tries to reelect the killed leader (similar to ZOOKEEPER-131 with algorithms 0 
and 1). 



> Use of non-standard election ports in config breaks services
> 
>
> Key: ZOOKEEPER-127
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.0.0
>Reporter: Mark Harwood
>Assignee: Flavio Paiva Junqueira
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, 
> ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
> channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the 
> electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 
> 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not 
> manifest itself.
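The fix amounts to keeping each server's host and election port together so the connection is opened on the remote peer's own port. A minimal sketch of parsing one zoo.cfg server line; the struct and function names are illustrative (the real QuorumCnxManager code is Java):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch for ZOOKEEPER-127: parse "server.N=host:port" so the
 * election port stays paired with its own host, rather than reusing this
 * server's port with a remote address. */
struct peer {
    int id;
    char host[64];
    int election_port;
};

static int parse_server_line(const char *line, struct peer *p)
{
    /* %63[^:] reads the host up to the colon, bounded to the buffer size */
    return sscanf(line, "server.%d=%63[^:]:%d",
                  &p->id, p->host, &p->election_port) == 3;
}
```

With this pairing, server 3 in the example above would correctly dial 10.20.9.9 on 2882, the port server 2 actually listens on.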

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

2008-09-17 Thread Austin Shoemaker (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632099#action_12632099
 ] 

Austin Shoemaker commented on ZOOKEEPER-127:


Applying the patch (from 9/17) to the latest trunk (r696563) now passes our 
leader election unit tests using algorithm 3. This is great.

Two minor issues I noticed:

1. The default constructor for QuorumPeer should call setStatsProvider, rather 
than the attribute-passing constructor. Since QuorumPeerMain calls the default 
constructor, echo stat | nc ... requests are returning invalid data because no 
provider is set.

2. In QuorumPeerConfig.java:105 where parts.length is checked the operator 
should be && instead of ||.


> Use of non-standard election ports in config breaks services
> 
>
> Key: ZOOKEEPER-127
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.0.0
>Reporter: Mark Harwood
>Assignee: Flavio Paiva Junqueira
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, 
> ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
> channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the 
> electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 
> 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not 
> manifest itself.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.