[jira] Updated: (ZOOKEEPER-855) clientPortBindAddress should be clientPortAddress

2010-08-26 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-855:
-

Priority: Trivial  (was: Major)

> clientPortBindAddress should be clientPortAddress
> -
>
> Key: ZOOKEEPER-855
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-855
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.0, 3.3.1
>Reporter: Jared Cantwell
>Priority: Trivial
>
> The server documentation states that the configuration parameter for binding 
> to a specific IP address is clientPortBindAddress.  The code, however, reads 
> clientPortAddress.  The documentation for 3.3.X versions needs to be changed 
> to reflect the correct parameter.  This parameter was added in 
> ZOOKEEPER-635.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (ZOOKEEPER-855) clientPortBindAddress should be clientPortAddress

2010-08-26 Thread Jared Cantwell (JIRA)
clientPortBindAddress should be clientPortAddress
-

 Key: ZOOKEEPER-855
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-855
 Project: Zookeeper
  Issue Type: Bug
  Components: documentation
Affects Versions: 3.3.1, 3.3.0
Reporter: Jared Cantwell


The server documentation states that the configuration parameter for binding to 
a specific IP address is clientPortBindAddress.  The code, however, reads 
clientPortAddress.  The documentation for 3.3.X versions needs to be changed 
to reflect the correct parameter.  This parameter was added in 
ZOOKEEPER-635.
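
For illustration, a zoo.cfg fragment using the parameter name that the code 
reads.  The port and address values are placeholders, not taken from this 
report:

```
# Port that clients connect to (placeholder value).
clientPort=2181
# Bind the client port to one specific address (placeholder value).
clientPortAddress=192.168.1.10
```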




[jira] Created: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twise

2010-09-28 Thread Jared Cantwell (JIRA)
ZooKeeperServer.loadData loads database twise
-

 Key: ZOOKEEPER-881
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Reporter: Jared Cantwell
Priority: Trivial


zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
shouldn't have any negative effects, but it is unnecessary.  A patch should be 
trivial.




[jira] Updated: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twice

2010-09-28 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-881:
-

Summary: ZooKeeperServer.loadData loads database twice  (was: 
ZooKeeperServer.loadData loads database twise)

> ZooKeeperServer.loadData loads database twice
> -
>
> Key: ZOOKEEPER-881
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Trivial
>
> zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
> shouldn't have any negative effects, but it is unnecessary.  A patch should be 
> trivial.




[jira] Updated: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-28 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-882:
-

Attachment: 882.diff

A simple patch for consideration.

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Created: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-28 Thread Jared Cantwell (JIRA)
Startup loads last transaction from snapshot


 Key: ZOOKEEPER-882
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Reporter: Jared Cantwell
Priority: Minor
 Attachments: 882.diff

On startup, the server first loads the latest snapshot, and then loads from the 
log starting at the last transaction in the snapshot.  It should begin from one 
past that last transaction in the log.  I will attach a possible patch.
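
The off-by-one can be sketched as follows.  This is written in C purely for 
illustration (the server code itself is Java), and replay_from() and the txn 
type are hypothetical names, not ZooKeeper APIs.  The sketch assumes the log 
reader starts inclusively at the zxid it is given:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one transaction log entry. */
typedef struct {
    long zxid;
} txn;

/*
 * Counts the entries that would be applied when replay starts at
 * start_zxid, inclusive.  The entries are assumed to be sorted by zxid.
 */
size_t replay_from(const txn *entries, size_t n, long start_zxid)
{
    size_t applied = 0;
    for (size_t i = 0; i < n; i++) {
        if (entries[i].zxid >= start_zxid) {
            applied++;  /* this entry would be applied to the database */
        }
    }
    return applied;
}
```

Under that assumption, starting at the snapshot's last zxid re-applies one 
transaction that the snapshot already contains, while starting at that zxid 
plus one does not.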




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-29 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916122#action_12916122
 ] 

Jared Cantwell commented on ZOOKEEPER-882:
--

Maybe I am misunderstanding the FileTxnLog.next() call, but I interpret it as 
follows:

- Based on its usage, the next() call should prepare the next hdr and record, 
and return true if it did this successfully, and false otherwise.
- if the catch() executes, it means that we couldn't prepare from the current 
file, so we need to move to the next
- goToNextLog() simply swaps the log file, but does not prepare hdr and record, 
so the current ones will remain current and get processed a second time

If any of this is wrong, then my patch probably doesn't make sense, so please 
let me know.

As for the second point, I agree that init() is inclusive.  So you want to 
pass in as a parameter something that has not yet been processed.  So, if we 
pass in lastProcessedZxid (which is already in the snapshot), then that will be 
read from the log also since init() starts at that transaction, not one past 
it.  Is that right?

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Updated: (ZOOKEEPER-762) Allow dynamic addition/removal of server nodes in the client API

2010-09-29 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-762:
-

Attachment: 762.diff

We had this same issue.  I have submitted a possible patch for the c-client.  I 
didn't add in any unit tests yet, but if the code looks good I can put those 
together.  A java client fix would be similar, and I can probably get that 
together too if there's interest.  

> Allow dynamic addition/removal of server nodes in the client API
> 
>
> Key: ZOOKEEPER-762
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-762
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: c client, java client
>Reporter: Dave Wright
>Priority: Minor
> Attachments: 762.diff
>
>
> Currently the list of zookeeper servers needs to be provided to the client 
> APIs at construction time, and cannot be changed without a complete 
> shutdown/restart of the client API. However, there are scenarios that require 
> the server list to be updated, such as removal or addition of a ZK cluster 
> node, and it would be nice if the list could be updated via a simple API call.
> The general approach (in the Java client) would be to add 
> "RemoveServer()/AddServer()" functions to Zookeeper that call down to 
> ClientCnxn, where the servers are just maintained in a list. Of course if
> the server being removed is the one currently connected, we'd need to 
> disconnect, but a simple call to disconnect() seems like it would resolve 
> that and trigger the automatic re-connection logic.
> An equivalent change could be made in the C code. 
> This change would also make dynamic cluster membership in ZOOKEEPER-107 
> easier to implement.
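
A rough sketch of the list handling described in the quoted text, in C.  The 
server_list type and the add_server()/remove_server() functions are 
hypothetical and are not taken from the attached patch or from the client 
API.  Removing the currently connected server just marks the client as 
disconnected, on the assumption that the existing reconnect logic then picks 
another entry:

```c
#include <assert.h>
#include <string.h>

#define MAX_SERVERS 8

/* Hypothetical client-side server list. */
typedef struct {
    const char *servers[MAX_SERVERS];
    int count;
    int connected;  /* index of the current server, or -1 if disconnected */
} server_list;

/* Appends a server; returns 0 on success, -1 if the list is full. */
int add_server(server_list *l, const char *addr)
{
    if (l->count == MAX_SERVERS) {
        return -1;
    }
    l->servers[l->count++] = addr;
    return 0;
}

/* Removes a server; returns 0 on success, -1 if it is not in the list. */
int remove_server(server_list *l, const char *addr)
{
    for (int i = 0; i < l->count; i++) {
        if (strcmp(l->servers[i], addr) != 0) {
            continue;
        }
        if (l->connected == i) {
            l->connected = -1;  /* disconnect; reconnect logic takes over */
        } else if (l->connected > i) {
            l->connected--;     /* keep pointing at the same server */
        }
        /* Close the gap left by the removed entry. */
        memmove(&l->servers[i], &l->servers[i + 1],
                (size_t)(l->count - i - 1) * sizeof l->servers[0]);
        l->count--;
        return 0;
    }
    return -1;
}
```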




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-29 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916150#action_12916150
 ] 

Jared Cantwell commented on ZOOKEEPER-882:
--

On second look, I agree with the way that works.  I think the bug in fact is 
later in FileSnapLog.restore() because next() is never called before the loop 
starts executing, so the first transaction processed is the one that TxnLog was 
initialized with.  Do you see that also?

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Updated: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-29 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-882:
-

Attachment: restore

Maybe restore() should look like the attached (not a diff) instead.  

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff, restore
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Commented: (ZOOKEEPER-890) C client invokes watcher callbacks multiple times

2010-10-07 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919087#action_12919087
 ] 

Jared Cantwell commented on ZOOKEEPER-890:
--

I don't believe the C-client makes the guarantee that "watcher callbacks are 
called exactly once."  Callbacks are called for different reasons, including:

- connection lost event
- connection reestablished event
- session lost event
- data changed event

Only the last two events make the guarantee about being called exactly once, 
but the first two connection events can be called numerous times until either 
one of the last two events happens.  I may be missing some events, but that's 
the general idea.  Bottom line is the callback can receive events of type 
ZOO_SESSION_EVENT multiple times.  I believe this was by design.
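
As a sketch of what that means for application code: a watcher can count 
session events without treating them as the end of the watch, and release its 
per-watch context only on a terminal event.  The constants below are local 
stand-ins with assumed values (the real ones are declared in zookeeper.h), and 
watch_ctx and on_event() are hypothetical names:

```c
#include <assert.h>

/* Illustrative stand-ins; the values here are assumptions for the sketch. */
enum {
    SKETCH_SESSION_EVENT = -1,
    SKETCH_CHANGED_EVENT = 3,
    SKETCH_CONNECTING_STATE = 1,
    SKETCH_CONNECTED_STATE = 3,
    SKETCH_EXPIRED_STATE = -112
};

/* Hypothetical per-watch context owned by the application. */
typedef struct {
    int session_events;  /* may be delivered any number of times */
    int node_events;     /* expected at most once per watch */
    int released;        /* set once the context may be freed */
} watch_ctx;

/* Watcher body: only a terminal event ends the lifetime of the context. */
void on_event(int type, int state, watch_ctx *ctx)
{
    if (type == SKETCH_SESSION_EVENT) {
        ctx->session_events++;
        if (state == SKETCH_EXPIRED_STATE) {
            ctx->released = 1;  /* session lost: no further events */
        }
        return;                 /* connection changes keep the ctx alive */
    }
    ctx->node_events++;
    ctx->released = 1;          /* node event: the watch has fired */
}
```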

> C client invokes watcher callbacks multiple times
> -
>
> Key: ZOOKEEPER-890
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-890
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.1
> Environment: Mac OS X 10.6.5
>Reporter: Austin Shoemaker
>Priority: Critical
> Attachments: watcher_twice.c, ZOOKEEPER-890.patch
>
>
> Code using the C client assumes that watcher callbacks are called exactly 
> once. If the watcher is called more than once, the process will likely 
> overwrite freed memory and/or crash.
> collect_session_watchers (zk_hashtable.c) gathers watchers from 
> active_node_watchers, active_exist_watchers, and active_child_watchers 
> without removing them. This results in watchers being invoked more than once.
> Test code is attached that reproduces the bug, along with a proposed patch.




[jira] Commented: (ZOOKEEPER-804) c unit tests failing due to "assertion cptr failed"

2010-10-12 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920307#action_12920307
 ] 

Jared Cantwell commented on ZOOKEEPER-804:
--

Should the return statement of this patch be:

return api_epilog(zh,ZINVALIDSTATE);

Otherwise, it seems possible that the reference count could get out of sync in 
this case.

> c unit tests failing due to "assertion cptr failed"
> ---
>
> Key: ZOOKEEPER-804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-804
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.0
> Environment: gcc 4.4.3, ubuntu lucid lynx, dual core laptop (intel)
>Reporter: Patrick Hunt
>Assignee: Michi Mutsuzaki
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-804.patch
>
>
> I'm seeing this frequently:
>  [exec] Zookeeper_simpleSystem::testPing : elapsed 18006 : OK
>  [exec] Zookeeper_simpleSystem::testAcl : elapsed 1022 : OK
>  [exec] Zookeeper_simpleSystem::testChroot : elapsed 3145 : OK
>  [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started : 
> elapsed 25687 : OK
>  [exec] zktest-mt: 
> /home/phunt/dev/workspace/gitzk/src/c/src/zookeeper.c:1952: 
> zookeeper_process: Assertion `cptr' failed.
>  [exec] make: *** [run-check] Aborted
>  [exec] Zookeeper_simpleSystem::testHangingClient
> Mahadev can you take a look?




[jira] Commented: (ZOOKEEPER-804) c unit tests failing due to "assertion cptr failed"

2010-10-13 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920689#action_12920689
 ] 

Jared Cantwell commented on ZOOKEEPER-804:
--

It seems like zookeeper_process unnecessarily calls api_prolog() and 
api_epilog() to begin with.  But given that api_prolog() is called at the 
beginning, if this new code path executes while a close is requested, then the 
threads will successfully exit, but the final part of zookeeper_close that 
releases the memory on the last reference will not execute (since the last 
reference will never be reached).  Unless I'm missing something, this will leak 
the zkhandle.
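
To make the concern concrete, here is a simplified model of the reference 
counting.  The handle type and the prolog()/epilog() functions are 
hypothetical stand-ins rather than the real zhandle_t, api_prolog() and 
api_epilog(); the sketch only assumes that the last reference dropped after a 
close request is what releases the handle:

```c
#include <assert.h>

/* Hypothetical, simplified handle; not the real zhandle_t. */
typedef struct {
    int ref_counter;
    int close_requested;
    int freed;
} handle;

/* Take a reference on entry to an API call. */
void prolog(handle *h)
{
    h->ref_counter++;
}

/* Drop the reference; the last one out after a close frees the handle. */
int epilog(handle *h, int rc)
{
    h->ref_counter--;
    if (h->ref_counter == 0 && h->close_requested) {
        h->freed = 1;
    }
    return rc;
}

/* Early exit that skips the epilog: the reference is never dropped. */
int process_leaky(handle *h)
{
    prolog(h);
    if (h->close_requested) {
        return -1;
    }
    return epilog(h, 0);
}

/* Early exit that returns through the epilog. */
int process_balanced(handle *h)
{
    prolog(h);
    if (h->close_requested) {
        return epilog(h, -1);
    }
    return epilog(h, 0);
}
```

In this model the leaky path leaves ref_counter at 1 forever, so the handle 
is never freed, while the balanced path lets the count return to zero.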

> c unit tests failing due to "assertion cptr failed"
> ---
>
> Key: ZOOKEEPER-804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-804
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.0
> Environment: gcc 4.4.3, ubuntu lucid lynx, dual core laptop (intel)
>Reporter: Patrick Hunt
>Assignee: Michi Mutsuzaki
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-804.patch
>
>
> I'm seeing this frequently:
>  [exec] Zookeeper_simpleSystem::testPing : elapsed 18006 : OK
>  [exec] Zookeeper_simpleSystem::testAcl : elapsed 1022 : OK
>  [exec] Zookeeper_simpleSystem::testChroot : elapsed 3145 : OK
>  [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started : 
> elapsed 25687 : OK
>  [exec] zktest-mt: 
> /home/phunt/dev/workspace/gitzk/src/c/src/zookeeper.c:1952: 
> zookeeper_process: Assertion `cptr' failed.
>  [exec] make: *** [run-check] Aborted
>  [exec] Zookeeper_simpleSystem::testHangingClient
> Mahadev can you take a look?




[jira] Created: (ZOOKEEPER-897) C Client seg faults during close

2010-10-14 Thread Jared Cantwell (JIRA)
C Client seg faults during close


 Key: ZOOKEEPER-897
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-897
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Reporter: Jared Cantwell


We observed a crash while closing our C client.  It was in the do_io() thread, 
which was processing during the close() call.

#0  queue_buffer (list=0x6bd4f8, b=0x0, add_to_front=0) at src/zookeeper.c:969
#1  0x0046234e in check_events (zh=0x6bd480, events=<value optimized out>) at src/zookeeper.c:1687
#2  0x00462d74 in zookeeper_process (zh=0x6bd480, events=2) at 
src/zookeeper.c:1971
#3  0x00469c34 in do_io (v=0x6bd480) at src/mt_adaptor.c:311
#4  0x77bc59ca in start_thread () from /lib/libpthread.so.0
#5  0x76f706fd in clone () from /lib/libc.so.6
#6  0x in ?? ()

We tracked down the sequence of events, and the cause is that input_buffer is 
being freed from a thread other than the do_io thread that relies on it:

1. do_io() calls check_events()
2. if(events&ZOOKEEPER_READ) branch executes
3. if (rc > 0) branch executes
4. if (zh->input_buffer != &zh->primer_buffer) branch executes
... in the meantime ...
 5. zookeeper_close() called
 6. if (inc_ref_counter(zh,0)!=0) branch executes
 7. cleanup_bufs() is called
 8. input_buffer is freed at the end
... back to check_events() ...
9. queue_buffer() is called on a NULL buffer.

I believe the fix is to call only free_completions() in zookeeper_close() and 
not cleanup_bufs().  The original reason cleanup_bufs() was added was to call 
any outstanding synchronous completions, so only free_completions() (which is 
guarded) is needed.  I will submit a patch for review with this change.
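
A single-threaded model of the two close behaviours, to show why leaving the 
buffer alone avoids the NULL dereference.  All names here are hypothetical 
stand-ins, not the functions in zookeeper.c, and the interleaving is played 
out by calling the functions in order rather than with real threads:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified handle for the sketch. */
typedef struct {
    char *input_buffer;       /* used by the I/O thread */
    int pending_completions;
} io_handle;

/* Close path before the change: frees the buffer the I/O thread relies on. */
void close_with_cleanup_bufs(io_handle *h)
{
    h->pending_completions = 0;
    free(h->input_buffer);
    h->input_buffer = NULL;
}

/* Proposed close path: only the completions are released here. */
void close_with_free_completions(io_handle *h)
{
    h->pending_completions = 0;
}

/* Stand-in for the queueing step; returns -1 where the crash occurred. */
int queue_input(io_handle *h)
{
    return h->input_buffer != NULL ? 0 : -1;
}
```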




[jira] Updated: (ZOOKEEPER-897) C Client seg faults during close

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-897:
-

Attachment: ZOOKEEEPER-897.diff

Patch that only calls free_completions in zookeeper_close() instead of 
cleanup_bufs().

> C Client seg faults during close
> 
>
> Key: ZOOKEEPER-897
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-897
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Jared Cantwell
> Attachments: ZOOKEEEPER-897.diff
>
>
> We observed a crash while closing our C client.  It was in the do_io() thread, 
> which was processing during the close() call.
> #0  queue_buffer (list=0x6bd4f8, b=0x0, add_to_front=0) at src/zookeeper.c:969
> #1  0x0046234e in check_events (zh=0x6bd480, events=<value optimized out>) at src/zookeeper.c:1687
> #2  0x00462d74 in zookeeper_process (zh=0x6bd480, events=2) at 
> src/zookeeper.c:1971
> #3  0x00469c34 in do_io (v=0x6bd480) at src/mt_adaptor.c:311
> #4  0x77bc59ca in start_thread () from /lib/libpthread.so.0
> #5  0x76f706fd in clone () from /lib/libc.so.6
> #6  0x in ?? ()
> We tracked down the sequence of events, and the cause is that input_buffer is 
> being freed from a thread other than the do_io thread that relies on it:
> 1. do_io() calls check_events()
> 2. if(events&ZOOKEEPER_READ) branch executes
> 3. if (rc > 0) branch executes
> 4. if (zh->input_buffer != &zh->primer_buffer) branch executes
> ... in the meantime ...
>  5. zookeeper_close() called
>  6. if (inc_ref_counter(zh,0)!=0) branch executes
>  7. cleanup_bufs() is called
>  8. input_buffer is freed at the end
> ... back to check_events() ...
> 9. queue_buffer() is called on a NULL buffer.
> I believe the fix is to call only free_completions() in zookeeper_close() 
> and not cleanup_bufs().  The original reason cleanup_bufs() was added was to 
> call any outstanding synchronous completions, so only free_completions() (which 
> is guarded) is needed.  I will submit a patch for review with this change.




[jira] Created: (ZOOKEEPER-898) C Client might not cleanup correctly during close

2010-10-14 Thread Jared Cantwell (JIRA)
C Client might not cleanup correctly during close
-

 Key: ZOOKEEPER-898
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-898
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Reporter: Jared Cantwell
Priority: Trivial
 Attachments: ZOOKEEEPER-898.diff

I was looking through the c-client code and noticed a situation where a counter 
can be incorrectly incremented and a small memory leak can occur.

In zookeeper.c : add_completion(), if close_requested is true, then the 
completion will not be queued.  But at the end, outstanding_sync is still 
incremented and free() is never called on the newly allocated completion_list_t.  

I will submit for review a diff that I believe corrects this issue.
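
The intended behaviour can be sketched like this.  The types and the 
add_completion_sketch() function are hypothetical simplifications, not the 
code in zookeeper.c or the attached diff.  When a close has been requested, 
the new node is freed and the counter is left untouched:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified completion node. */
typedef struct completion {
    struct completion *next;
    int is_sync;
} completion;

/* Hypothetical, simplified handle. */
typedef struct {
    completion *head;
    int outstanding_sync;
    int close_requested;
} cq_handle;

/* Returns 0 if queued, -1 if rejected or on allocation failure. */
int add_completion_sketch(cq_handle *h, int is_sync)
{
    completion *c = calloc(1, sizeof *c);
    if (c == NULL) {
        return -1;
    }
    c->is_sync = is_sync;
    if (h->close_requested) {
        free(c);    /* not queued, so release it here */
        return -1;  /* and leave outstanding_sync untouched */
    }
    c->next = h->head;
    h->head = c;
    if (is_sync) {
        h->outstanding_sync++;  /* count only completions actually queued */
    }
    return 0;
}
```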




[jira] Updated: (ZOOKEEPER-898) C Client might not cleanup correctly during close

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-898:
-

Attachment: ZOOKEEEPER-898.diff

Suggested correction.

> C Client might not cleanup correctly during close
> -
>
> Key: ZOOKEEPER-898
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-898
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Jared Cantwell
>Priority: Trivial
> Attachments: ZOOKEEEPER-898.diff
>
>
> I was looking through the c-client code and noticed a situation where a 
> counter can be incorrectly incremented and a small memory leak can occur.
> In zookeeper.c : add_completion(), if close_requested is true, then the 
> completion will not be queued.  But at the end, outstanding_sync is still 
> incremented and free() is never called on the newly allocated completion_list_t. 
>  
> I will submit for review a diff that I believe corrects this issue.




[jira] Updated: (ZOOKEEPER-855) clientPortBindAddress should be clientPortAddress

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-855:
-

Status: Patch Available  (was: Open)

Simple patch.

> clientPortBindAddress should be clientPortAddress
> -
>
> Key: ZOOKEEPER-855
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-855
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.1, 3.3.0
>Reporter: Jared Cantwell
>Priority: Trivial
> Attachments: ZOOKEEPER-855.patch
>
>
> The server documentation states that the configuration parameter for binding 
> to a specific IP address is clientPortBindAddress.  The code, however, reads 
> clientPortAddress.  The documentation for 3.3.X versions needs to be changed 
> to reflect the correct parameter.  This parameter was added in 
> ZOOKEEPER-635.




[jira] Updated: (ZOOKEEPER-855) clientPortBindAddress should be clientPortAddress

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-855:
-

Attachment: ZOOKEEPER-855.patch

Documentation patch.

> clientPortBindAddress should be clientPortAddress
> -
>
> Key: ZOOKEEPER-855
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-855
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.0, 3.3.1
>Reporter: Jared Cantwell
>Priority: Trivial
> Attachments: ZOOKEEPER-855.patch
>
>
> The server documentation states that the configuration parameter for binding 
> to a specific IP address is clientPortBindAddress.  The code, however, reads 
> clientPortAddress.  The documentation for 3.3.X versions needs to be changed 
> to reflect the correct parameter.  This parameter was added in 
> ZOOKEEPER-635.




[jira] Updated: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-882:
-

Attachment: ZOOKEEPER-882.patch

Patch to incorporate recommended changes.

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff, restore, ZOOKEEPER-882.patch
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Updated: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-882:
-

Assignee: Jared Cantwell
  Status: Patch Available  (was: Open)

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff, restore, ZOOKEEPER-882.patch
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Updated: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twice

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-881:
-

Attachment: ZOOKEEPER-881.patch

> ZooKeeperServer.loadData loads database twice
> -
>
> Key: ZOOKEEPER-881
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Trivial
> Attachments: ZOOKEEPER-881.patch
>
>
> zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
> shouldn't have any negative effects, but it is unnecessary.  A patch should be 
> trivial.




[jira] Updated: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twice

2010-10-14 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-881:
-

Assignee: Jared Cantwell
  Status: Patch Available  (was: Open)

> ZooKeeperServer.loadData loads database twice
> -
>
> Key: ZOOKEEPER-881
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Attachments: ZOOKEEPER-881.patch
>
>
> zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
> shouldn't have any negative effects, but it is unnecessary.  A patch should be 
> trivial.




[jira] Updated: (ZOOKEEPER-804) c unit tests failing due to "assertion cptr failed"

2010-10-15 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-804:
-

Attachment: ZOOKEEPER-804-1.patch

Not sure what the protocol is here, but I'm gonna go ahead and attach a revised 
patch.

> c unit tests failing due to "assertion cptr failed"
> ---
>
> Key: ZOOKEEPER-804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-804
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.0
> Environment: gcc 4.4.3, ubuntu lucid lynx, dual core laptop (intel)
>Reporter: Patrick Hunt
>Assignee: Michi Mutsuzaki
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-804-1.patch, ZOOKEEPER-804.patch
>
>
> I'm seeing this frequently:
>  [exec] Zookeeper_simpleSystem::testPing : elapsed 18006 : OK
>  [exec] Zookeeper_simpleSystem::testAcl : elapsed 1022 : OK
>  [exec] Zookeeper_simpleSystem::testChroot : elapsed 3145 : OK
>  [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started : 
> elapsed 25687 : OK
>  [exec] zktest-mt: 
> /home/phunt/dev/workspace/gitzk/src/c/src/zookeeper.c:1952: 
> zookeeper_process: Assertion `cptr' failed.
>  [exec] make: *** [run-check] Aborted
>  [exec] Zookeeper_simpleSystem::testHangingClient
> Mahadev can you take a look?




[jira] Commented: (ZOOKEEPER-804) c unit tests failing due to "assertion cptr failed"

2010-10-16 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921690#action_12921690
 ] 

Jared Cantwell commented on ZOOKEEPER-804:
--

That probably failed because the first patch is already applied, so this patch 
doesn't exactly apply to your trunk.  I can open a new bug and submit a patch 
that way if it's preferred.

> c unit tests failing due to "assertion cptr failed"
> ---
>
> Key: ZOOKEEPER-804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-804
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.0
> Environment: gcc 4.4.3, ubuntu lucid lynx, dual core laptop (intel)
>Reporter: Patrick Hunt
>Assignee: Michi Mutsuzaki
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-804-1.patch, ZOOKEEPER-804.patch
>
>
> I'm seeing this frequently:
>  [exec] Zookeeper_simpleSystem::testPing : elapsed 18006 : OK
>  [exec] Zookeeper_simpleSystem::testAcl : elapsed 1022 : OK
>  [exec] Zookeeper_simpleSystem::testChroot : elapsed 3145 : OK
>  [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started : 
> elapsed 25687 : OK
>  [exec] zktest-mt: 
> /home/phunt/dev/workspace/gitzk/src/c/src/zookeeper.c:1952: 
> zookeeper_process: Assertion `cptr' failed.
>  [exec] make: *** [run-check] Aborted
>  [exec] Zookeeper_simpleSystem::testHangingClient
> Mahadev can you take a look?




[jira] Updated: (ZOOKEEPER-897) C Client seg faults during close

2010-10-18 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-897:
-

Attachment: ZOOKEEPER-897.patch

Updated patch format and spelling.

> C Client seg faults during close
> 
>
> Key: ZOOKEEPER-897
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-897
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEEPER-897.diff, ZOOKEEPER-897.patch
>
>
> We observed a crash while closing our c client.  It was in the do_io() thread, 
> which was still processing during the close() call.
> #0  queue_buffer (list=0x6bd4f8, b=0x0, add_to_front=0) at src/zookeeper.c:969
> #1  0x0046234e in check_events (zh=0x6bd480, events=<value optimized out>) at src/zookeeper.c:1687
> #2  0x00462d74 in zookeeper_process (zh=0x6bd480, events=2) at src/zookeeper.c:1971
> #3  0x00469c34 in do_io (v=0x6bd480) at src/mt_adaptor.c:311
> #4  0x77bc59ca in start_thread () from /lib/libpthread.so.0
> #5  0x76f706fd in clone () from /lib/libc.so.6
> #6  0x in ?? ()
> We tracked down the sequence of events, and the cause is that input_buffer is 
> being freed from a thread other than the do_io thread that relies on it:
> 1. do_io() calls check_events()
> 2. if(events&ZOOKEEPER_READ) branch executes
> 3. if (rc > 0) branch executes
> 4. if (zh->input_buffer != &zh->primer_buffer) branch executes
> ...in the meantime...
>  5. zookeeper_close() called
>  6. if (inc_ref_counter(zh,0)!=0) branch executes
>  7. cleanup_bufs() is called
>  8. input_buffer is freed at the end
> ...back to check_events()...
> 9. queue_buffer() is called on a NULL buffer.
> I believe the fix is to only call free_completions() in zookeeper_close() 
> and not cleanup_bufs().  The original reason cleanup_bufs() was added was to 
> call any outstanding synchronous completions, so only free_completions (which 
> is guarded) is needed.  I will submit a patch for review with this change.
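
The ownership rule behind the proposed change can be sketched in a few lines of C. This is a simplified, single-threaded model, not the client's code: handle_t, handle_close(), handle_release() and the integer completion count are hypothetical stand-ins, and freeing the buffers on the last release is an assumption standing in for the client's real teardown path.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the client handle. */
typedef struct {
    int   ref_count;        /* threads currently working inside the handle */
    int   close_requested;  /* set once close has been called */
    char *input_buffer;     /* used by the I/O thread while it runs */
    int   completions;      /* number of pending completions */
} handle_t;

/* Stand-in for the guarded completion cleanup: just drops the count. */
static void free_completions(handle_t *h) { h->completions = 0; }

/* Frees the I/O buffer; only safe when no other thread still uses it. */
static void cleanup_bufs(handle_t *h) {
    free(h->input_buffer);
    h->input_buffer = NULL;
}

/* Close path in the spirit of the proposal: while another thread still
 * holds a reference, release completions only and leave buffers alone. */
static void handle_close(handle_t *h) {
    h->close_requested = 1;
    if (h->ref_count > 0) {
        free_completions(h);
        return;              /* buffers are freed by the last user */
    }
    cleanup_bufs(h);
}

/* Called by the I/O thread when it leaves the handle (assumed hook). */
static void handle_release(handle_t *h) {
    if (--h->ref_count == 0 && h->close_requested)
        cleanup_bufs(h);
}

/* Replays the interleaving from the description: close arrives while the
 * I/O thread is still active.  Returns 1 if the buffer stayed valid until
 * the I/O thread released the handle and was freed afterwards. */
static int demo_close_while_io_active(void) {
    handle_t h = {1, 0, malloc(16), 3};
    handle_close(&h);
    int buffer_still_valid = (h.input_buffer != NULL);
    handle_release(&h);
    return buffer_still_valid && h.input_buffer == NULL && h.completions == 0;
}
```

In this model the buffer is never freed underneath the thread that is reading from it, which is the property the backtrace above shows being violated.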




[jira] Updated: (ZOOKEEPER-898) C Client might not cleanup correctly during close

2010-10-18 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-898:
-

Attachment: ZOOKEEPER-898.patch

Updated patch format and spelling.

> C Client might not cleanup correctly during close
> -
>
> Key: ZOOKEEPER-898
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-898
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEEPER-898.diff, ZOOKEEPER-898.patch
>
>
> I was looking through the c-client code and noticed a situation where a 
> counter can be incorrectly incremented and a small memory leak can occur.
> In zookeeper.c : add_completion(), if close_requested is true, then the 
> completion will not be queued.  But at the end, outstanding_sync is still 
> incremented and free() is never called on the newly allocated completion_list_t. 
>  
> I will submit for review a diff that I believe corrects this issue.
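
A minimal sketch of the corrected flow, under stated assumptions: client_t, completion_t and add_completion_sketch() are hypothetical names, not the client's types, and the is_sync flag stands in for however the real code decides that a completion is synchronous.

```c
#include <stdlib.h>

/* Hypothetical, simplified completion list node. */
typedef struct completion {
    struct completion *next;
} completion_t;

/* Hypothetical, simplified client state. */
typedef struct {
    int close_requested;
    int outstanding_sync;
    completion_t *head;
} client_t;

/* Returns 0 if the completion was queued, -1 if it was rejected.
 * On rejection the node is freed and the counter is left untouched,
 * which addresses both symptoms in the description. */
static int add_completion_sketch(client_t *c, int is_sync) {
    completion_t *item = calloc(1, sizeof *item);
    if (item == NULL)
        return -1;
    if (c->close_requested) {
        free(item);          /* never queued, so release it here */
        return -1;           /* and do not bump outstanding_sync */
    }
    item->next = c->head;    /* queue the completion */
    c->head = item;
    if (is_sync)
        c->outstanding_sync++;
    return 0;
}
```

The point of the sketch is only the ordering: the counter update and the queueing happen together, and the rejected path cleans up after itself.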




[jira] Commented: (ZOOKEEPER-906) Improve C client connection reliability by making it sleep between reconnect attempts as in Java Client

2010-10-20 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922954#action_12922954
 ] 

Jared Cantwell commented on ZOOKEEPER-906:
--

I like this idea-- we ran into an issue with this recently too.  I was looking 
at your patch and don't understand how the last_connect_index works.  It seems 
like its intention is to store the last successful connection that was 
established.  However, it seems to only be assigned a value before a connection 
is established.  Based on what I can tell:

1. it will get set to the very first connect_index
2. connections will loop around, but it won't get reset since 
last_connect_index != -1
3. once reconnections make it around to the beginning, it will get reset to -1 
again
4. then it will get assigned the very next connect_index
...loops...

Please let me know what I am missing.
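
To make the four steps concrete, here is a small model of the behaviour as I read it. It is reconstructed from the steps above and is not the patch's code: reconnect_state and reconnect_step() are hypothetical, and the wrap-around check is an assumption about where the reset happens.

```c
/* Hypothetical model of the two indices discussed above. */
typedef struct {
    int connect_index;       /* next host to try */
    int last_connect_index;  /* -1 means "unset" */
    int addrs_count;         /* number of hosts in the list */
} reconnect_state;

/* One reconnect attempt under the behaviour described in steps 1-4. */
static void reconnect_step(reconnect_state *s) {
    if (s->last_connect_index == -1)
        s->last_connect_index = s->connect_index;  /* steps 1 and 4 */
    s->connect_index++;                            /* step 2: move on */
    if (s->connect_index == s->addrs_count) {
        s->connect_index = 0;                      /* wrapped around */
        s->last_connect_index = -1;                /* step 3: reset */
    }
}
```

With three hosts and a start at index 0, last_connect_index only ever holds 0 or -1 in this model, so it never records which connection last succeeded.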

> Improve C client connection reliability by making it sleep between reconnect 
> attempts as in Java Client
> ---
>
> Key: ZOOKEEPER-906
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-906
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: c client
>Affects Versions: 3.3.1
>Reporter: Radu Marin
>Assignee: Radu Marin
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, when a C client gets disconnected, it retries a couple of hosts 
> (not all) with no delay between attempts and then, if it doesn't succeed, it 
> sleeps for 1/3 of the session expiration timeout before trying again.
> In the worst case the disconnect event can occur after 2/3 of the session 
> expiration timeout has passed, and sleeping for a further 1/3 of the session 
> timeout will cause a session loss most of the time.
> A better approach is to check all hosts but with a random delay between 
> reconnect attempts. Also the delay must be independent of the session timeout, 
> so that increasing the session timeout also increases the number of available 
> attempts.
> This improvement covers the case when the C client experiences network 
> problems for a short period of time and is not able to reach any zookeeper 
> hosts.
> The Java client already uses this logic and it works very well.
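
The arithmetic in the description can be checked with a short sketch. Everything here is illustrative: the 1000 ms upper bound and the function names are assumptions, not values taken from the patch or from either client, and the time spent on the connect attempts themselves is ignored.

```c
#include <stdlib.h>

/* Assumed upper bound on the pause between attempts, in milliseconds. */
#define MAX_RECONNECT_DELAY_MS 1000

/* Random pause in [0, MAX_RECONNECT_DELAY_MS), independent of the
 * session timeout. */
static int reconnect_delay_ms(void) {
    return rand() % MAX_RECONNECT_DELAY_MS;
}

/* Old behaviour: time left in the session after sleeping for 1/3 of the
 * timeout.  Zero or less means the session is lost. */
static int remaining_after_fixed_sleep(int session_timeout_ms, int elapsed_ms) {
    return session_timeout_ms - elapsed_ms - session_timeout_ms / 3;
}

/* New behaviour: lower bound on the number of attempts that fit in the
 * time left, given that each pause is at most MAX_RECONNECT_DELAY_MS. */
static int min_attempts_before_expiry(int session_timeout_ms, int elapsed_ms) {
    int remaining = session_timeout_ms - elapsed_ms;
    return remaining > 0 ? remaining / MAX_RECONNECT_DELAY_MS : 0;
}
```

For a 30 s session that loses its connection after 20 s, the fixed sleep leaves nothing, while the bounded delay leaves room for at least 10 attempts. Doubling the timeout to 60 s (disconnect at 40 s) doubles that to 20, which is the scaling the description asks for.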




[jira] Commented: (ZOOKEEPER-906) Improve C client connection reliability by making it sleep between reconnect attempts as in Java Client

2010-10-20 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923146#action_12923146
 ] 

Jared Cantwell commented on ZOOKEEPER-906:
--

That's more like what I imagined.  Do you still need to set it to -1 and then 
reset to connect_index in the !must_sleep branch? Or can that all be removed 
now (with proper initialization of last_connect_index)?

Also, I'm not too familiar with POLLERR... what is that addition for?

> Improve C client connection reliability by making it sleep between reconnect 
> attempts as in Java Client
> ---
>
> Key: ZOOKEEPER-906
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-906
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: c client
>Affects Versions: 3.3.1
>Reporter: Radu Marin
>Assignee: Radu Marin
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-906.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, when a C client gets disconnected, it retries a couple of hosts 
> (not all) with no delay between attempts and then, if it doesn't succeed, it 
> sleeps for 1/3 of the session expiration timeout before trying again.
> In the worst case the disconnect event can occur after 2/3 of the session 
> expiration timeout has passed, and sleeping for a further 1/3 of the session 
> timeout will cause a session loss most of the time.
> A better approach is to check all hosts but with a random delay between 
> reconnect attempts. Also the delay must be independent of the session timeout, 
> so that increasing the session timeout also increases the number of available 
> attempts.
> This improvement covers the case when the C client experiences network 
> problems for a short period of time and is not able to reach any zookeeper 
> hosts.
> The Java client already uses this logic and it works very well.




[jira] Commented: (ZOOKEEPER-906) Improve C client connection reliability by making it sleep between reconnect attempts as in Java Client

2010-10-21 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923431#action_12923431
 ] 

Jared Cantwell commented on ZOOKEEPER-906:
--

Some other small comments based on what I've seen so far with this project:

1. Are the changes to mt_adaptor still needed?
2. I think it can be restructured to remove the must_sleep bool and make it more 
obvious what's going on.  The check that resets the index to 0 can happen all 
the time.  And then the two sleep conditions can be an if/else if, with a final 
else that does what the !must_sleep branch does now.
3. Changing 'zh->connect_index == zh->addrs_count' to 'zh->connect_index >= 
zh->addrs_count' may seem subtle, but there is a whole bug devoted to it, so 
this patch probably shouldn't change that.  See ZOOKEEPER-458.
4. Before this gets committed, I imagine it's going to need a unit test or two.
5. Indentation should be 4 spaces.

> Improve C client connection reliability by making it sleep between reconnect 
> attempts as in Java Client
> ---
>
> Key: ZOOKEEPER-906
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-906
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: c client
>Affects Versions: 3.3.1
>Reporter: Radu Marin
>Assignee: Radu Marin
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-906.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, when a C client gets disconnected, it retries a couple of hosts 
> (not all) with no delay between attempts and then, if it doesn't succeed, it 
> sleeps for 1/3 of the session expiration timeout before trying again.
> In the worst case the disconnect event can occur after 2/3 of the session 
> expiration timeout has passed, and sleeping for a further 1/3 of the session 
> timeout will cause a session loss most of the time.
> A better approach is to check all hosts but with a random delay between 
> reconnect attempts. Also the delay must be independent of the session timeout, 
> so that increasing the session timeout also increases the number of available 
> attempts.
> This improvement covers the case when the C client experiences network 
> problems for a short period of time and is not able to reach any zookeeper 
> hosts.
> The Java client already uses this logic and it works very well.




[jira] Commented: (ZOOKEEPER-897) C Client seg faults during close

2010-10-25 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924748#action_12924748
 ] 

Jared Cantwell commented on ZOOKEEPER-897:
--

We are using the 3.3.2 release.  I don't think the patch leaks memory because 
destroy() will eventually get called (by the reentrant call to 
zookeeper_close()), which calls cleanup_bufs() and frees those buffers, right?  
Also, I had a test that reproduced this error, but it was easiest to reproduce 
if I injected artificial sleeps into the zookeeper.c file.  If that's ok, then 
I can submit that.  Otherwise, I'll see if I can devise a test that reproduces 
it some other way.

> C Client seg faults during close
> 
>
> Key: ZOOKEEPER-897
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-897
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEEPER-897.diff, ZOOKEEPER-897.patch
>
>
> We observed a crash while closing our c client.  It was in the do_io() thread, 
> which was still processing during the close() call.
> #0  queue_buffer (list=0x6bd4f8, b=0x0, add_to_front=0) at src/zookeeper.c:969
> #1  0x0046234e in check_events (zh=0x6bd480, events=<value optimized out>) at src/zookeeper.c:1687
> #2  0x00462d74 in zookeeper_process (zh=0x6bd480, events=2) at src/zookeeper.c:1971
> #3  0x00469c34 in do_io (v=0x6bd480) at src/mt_adaptor.c:311
> #4  0x77bc59ca in start_thread () from /lib/libpthread.so.0
> #5  0x76f706fd in clone () from /lib/libc.so.6
> #6  0x in ?? ()
> We tracked down the sequence of events, and the cause is that input_buffer is 
> being freed from a thread other than the do_io thread that relies on it:
> 1. do_io() calls check_events()
> 2. if(events&ZOOKEEPER_READ) branch executes
> 3. if (rc > 0) branch executes
> 4. if (zh->input_buffer != &zh->primer_buffer) branch executes
> ...in the meantime...
>  5. zookeeper_close() called
>  6. if (inc_ref_counter(zh,0)!=0) branch executes
>  7. cleanup_bufs() is called
>  8. input_buffer is freed at the end
> ...back to check_events()...
> 9. queue_buffer() is called on a NULL buffer.
> I believe the fix is to only call free_completions() in zookeeper_close() 
> and not cleanup_bufs().  The original reason cleanup_bufs() was added was to 
> call any outstanding synchronous completions, so only free_completions (which 
> is guarded) is needed.  I will submit a patch for review with this change.




[jira] Commented: (ZOOKEEPER-897) C Client seg faults during close

2010-10-28 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925835#action_12925835
 ] 

Jared Cantwell commented on ZOOKEEPER-897:
--

I wasn't able to figure out how to write a test for this scenario without 
injecting some helper code into zookeeper.c.  I ran the cpp unit tests ~20 times 
locally with no issues.  Also, I've been using this temporarily in our test 
environment for ~2 weeks with no issues there either.

> C Client seg faults during close
> 
>
> Key: ZOOKEEPER-897
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-897
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEEPER-897.diff, ZOOKEEPER-897.patch
>
>
> We observed a crash while closing our c client.  It was in the do_io() thread, 
> which was still processing during the close() call.
> #0  queue_buffer (list=0x6bd4f8, b=0x0, add_to_front=0) at src/zookeeper.c:969
> #1  0x0046234e in check_events (zh=0x6bd480, events=<value optimized out>) at src/zookeeper.c:1687
> #2  0x00462d74 in zookeeper_process (zh=0x6bd480, events=2) at src/zookeeper.c:1971
> #3  0x00469c34 in do_io (v=0x6bd480) at src/mt_adaptor.c:311
> #4  0x77bc59ca in start_thread () from /lib/libpthread.so.0
> #5  0x76f706fd in clone () from /lib/libc.so.6
> #6  0x in ?? ()
> We tracked down the sequence of events, and the cause is that input_buffer is 
> being freed from a thread other than the do_io thread that relies on it:
> 1. do_io() calls check_events()
> 2. if(events&ZOOKEEPER_READ) branch executes
> 3. if (rc > 0) branch executes
> 4. if (zh->input_buffer != &zh->primer_buffer) branch executes
> ...in the meantime...
>  5. zookeeper_close() called
>  6. if (inc_ref_counter(zh,0)!=0) branch executes
>  7. cleanup_bufs() is called
>  8. input_buffer is freed at the end
> ...back to check_events()...
> 9. queue_buffer() is called on a NULL buffer.
> I believe the fix is to only call free_completions() in zookeeper_close() 
> and not cleanup_bufs().  The original reason cleanup_bufs() was added was to 
> call any outstanding synchronous completions, so only free_completions (which 
> is guarded) is needed.  I will submit a patch for review with this change.




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-11-03 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927904#action_12927904
 ] 

Jared Cantwell commented on ZOOKEEPER-882:
--

Sure Flavio.  I have a test ready, but I have a question for you.  In order to 
make the test illustrate the bug in the current version, it needs to be changed 
slightly (~3 lines).  Otherwise, it will still fail with the current version, 
but for the wrong reason.  It does, however, verify that transactions are 
loaded correctly, and only once, with the patch.  How should I handle this?  I was 
thinking about posting a version of the test for people to verify that this 
truly is a bug, and then submitting the real test with the patch.  How does 
that sound?

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: 882.diff, restore, ZOOKEEPER-882.patch
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.
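
The intended replay rule can be sketched as follows. This is an illustrative model in C and not the server's Java code: replay_after_snapshot() and the flat array of zxids are hypothetical simplifications of the transaction log.

```c
#include <stddef.h>

/* Copies into `applied` the log entries that still need to be replayed on
 * top of a snapshot: only zxids strictly greater than the last zxid the
 * snapshot already contains.  Returns the number of entries copied.
 * `applied` must have room for at least n entries. */
static size_t replay_after_snapshot(const long long *log_zxids, size_t n,
                                    long long snapshot_last_zxid,
                                    long long *applied) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (log_zxids[i] <= snapshot_last_zxid)
            continue;                 /* already reflected in the snapshot */
        applied[count++] = log_zxids[i];
    }
    return count;
}
```

With a log holding zxids 3 through 7 and a snapshot that ends at 5, only 6 and 7 are applied. Starting at 5 instead would apply that transaction a second time, which is the behaviour reported here.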




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-11-03 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927912#action_12927912
 ] 

Jared Cantwell commented on ZOOKEEPER-882:
--

Sorry, I don't think I was clear.  That makes sense, but I was saying that if 
you only apply the test then it will fail (as expected), but for the wrong 
reason.  Due to the loop restructuring, and the semantics of the change, a test 
to illustrate the current bug would need to be slightly different (~3 lines) 
than the test I would submit to illustrate correct functionality after the 
patch.  

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: 882.diff, restore, ZOOKEEPER-882.patch
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Updated: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-11-04 Thread Jared Cantwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jared Cantwell updated ZOOKEEPER-882:
-

Attachment: FailureTest-882.patch

Submitting a unit test that illustrates the bug and confirms that transactions 
are loaded multiple times when moving to the next log file.  This test ignores 
FileTxnSnapLog and focuses on the behavior of FileTxnLog for the moment.  

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: 882.diff, FailureTest-882.patch, restore, 
> ZOOKEEPER-882.patch
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.
