IIRC, that was a full paste of all the bitcask.write.locks.  Riak fails pretty 
much immediately while running my test suite, maybe before a lock is opened for 
each partition?  

My ulimit was set to 256, which is obviously no good.  After boosting it to 
9000 and running my test suite, I have the locks shown below.  Riak is still 
running.  I guess that makes it an issue with max open files rather than a 
write lock issue?

$ lsof -p 53113 | awk '{print $9}'| uniq -c  | grep lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/1438665674247607560106752257205091097473808596992/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/22835963083295358096932575511191922182123945984/bitcask.write.lock
   1 /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/0/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/776422744832042175295707567380525354192214163456/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/730750818665451459101842416358141509827966271488/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/753586781748746817198774991869333432010090217472/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/1164634117248063262943561351070788031288321245184/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/1187470080331358621040493926581979953470445191168/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/1210306043414653979137426502093171875652569137152/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/45671926166590716193865151022383844364247891968/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/91343852333181432387730302044767688728495783936/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/137015778499772148581595453067151533092743675904/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/114179815416476790484662877555959610910619729920/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/182687704666362864775460604089535377456991567872/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/228359630832953580969325755111919221821239459840/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/205523667749658222872393179600727299639115513856/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/593735040165679310520246963290989976735222595584/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/639406966332270026714112114313373821099470487552/bitcask.write.lock
   1 
/usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/616571003248974668617179538802181898917346541568/bitcask.write.lock


On Jul 30, 2010, at 8:34 PM, David Smith wrote:

> That's only a partial paste, correct? How many partitions 
> ({ring_creation_size, 64} in your etc/app.config) do you have defined? There 
> should be a write lock file open for each partition. Also, what is your 
> ulimit -n set to?
> 
> Thanks,
> 
> D.
> 
> On Fri, Jul 30, 2010 at 5:09 PM, Alex Wolfe <[email protected]> wrote:
> $ lsof -p 16129 | awk '{print $9}'| uniq -c | grep lock
>   1 
> /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/913438523331814323877303020447676887284957839360/bitcask.write.lock
>   1 
> /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/959110449498405040071168171470060731649205731328/bitcask.write.lock
>   1 
> /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/936274486415109681974235595958868809467081785344/bitcask.write.lock
>   1 
> /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/411047335499316445744786359201454599278231027712/bitcask.write.lock
>   1 
> /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/456719261665907161938651510223838443642478919680/bitcask.write.lock
>   1 
> /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/433883298582611803841718934712646521460354973696/bitcask.write.lock
>   1 
> /usr/local/Cellar/riak/0.12.0/libexec/data/bitcask/388211372416021087647853783690262677096107081728/bitcask.write.lock
> 
> 
> On Jul 30, 2010, at 6:03 PM, David Smith wrote:
> 
> > Yup, that looks like the file handle leak. You can verify by using
> > lsof on the server and looking for multiple handles to
> > bitcask.write.lock. Something like:
> >
> > lsof -p pid | awk '{print $9}'| uniq -c
> >
> > D.
> >
> > On Friday, July 30, 2010, Alex Wolfe <[email protected]> wrote:
> >> Hey David.
> >> Does the below log output look like it could be caused by the issue you 
> >> fixed?
> >> Alex
> >>
> >> ==== Fri Jul 30 14:22:34 CDT 2010
> >> =ERROR REPORT==== 30-Jul-2010::14:22:34 ===** State machine <0.176.0> 
> >> terminating ** Last event in was {riak_vnode_req_v1,                     
> >> 593735040165679310520246963290989976735222595584,                     
> >> {fsm,undefined,<0.12466.0>},                     {riak_kv_put_req_v1,      
> >>                 {<<"test.groups">>,<<"EghzXywWrGGtp2fCcSLoatIdjML">>},     
> >>                  {r_object,<<"test.groups">>,                       
> >> <<"EghzXywWrGGtp2fCcSLoatIdjML">>,                       [{r_content,      
> >>                    {dict,5,16,16,8,80,48,                          
> >> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},                         
> >>  {{[],[],                            [[<<"Links">>]],                      
> >>       [],[],[],[],[],[],[],                            
> >> [[<<"content-type">>,97,112,112,108,105,99,97,                             
> >>  116,105,111,110,47,106,115,111,110],                             
> >> [<<"X-Riak-VTag">>,89,69,78,55,55,111,66,121,73,                           
> >>    69,78,53,122,101,85,105,117,68,89,80,52]],                            
> >> [],[],                            [[<<"X-Riak-Last-Modified">>|            
> >>                   {1280,517754,951062}]],                            [],   
> >>                          [[<<"X-Riak-Meta">>]]}}},                         
> >> <<"{\"name\":\"foo\",\"created_at\":\"2010-07-30T19:22:34.947Z\",\"type\":\"group\",\"version\":1}">>}],
> >>                        [{<<0,55,119,231>>,{1,63447736954}}],               
> >>         {dict,1,16,16,8,80,48,                        
> >> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},                        
> >> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],                          
> >> [[clean|true]],                          []}}},                       
> >> undefined},                      33218311,63447736954,                     
> >>  [{returnbody,true}]}}** When State == active**      Data  == 
> >> {state,593735040165679310520246963290989976735222595584,                   
> >>     riak_kv_vnode,                       
> >> {state,593735040165679310520246963290989976735222595584,                   
> >>             riak_kv_bitcask_backend,                              
> >> {#Ref<0.0.0.611>,                               
> >> "data/bitcask/593735040165679310520246963290989976735222595584"},          
> >>                     [],false},                       undefined,none}** 
> >> Reason for termination = ** {{badmatch,{error,emfile}},   
> >> [{bitcask_fileops,create_file_loop,3},    {bitcask,put,3},    
> >> {riak_kv_bitcask_backend,put,3},    {riak_kv_vnode,perform_put,3},    
> >> {riak_kv_vnode,do_put,7},    {riak_kv_vnode,handle_command,3},    
> >> {riak_core_vnode,vnode_command,3},    {gen_fsm,handle_msg,7}]}
> >> =ERROR REPORT==== 30-Jul-2010::14:22:35 ===Failed to open lock file 
> >> data/bitcask/593735040165679310520246963290989976735222595584/bitcask.write.lock:
> >>  emfile
> >> =ERROR REPORT==== 30-Jul-2010::14:22:35 ===** State machine <0.12471.0> 
> >> terminating ** Last event in was {riak_vnode_req_v1,                     
> >> 593735040165679310520246963290989976735222595584,                     
> >> {fsm,undefined,<0.12470.0>},                     {riak_kv_put_req_v1,      
> >>                 {<<"test.users">>,<<"ZrAxzFghd51VG902GuCcJ2gYOMJ">>},      
> >>                 {r_object,<<"test.users">>,                       
> >> <<"ZrAxzFghd51VG902GuCcJ2gYOMJ">>,                       [{r_content,      
> >>                    {dict,5,16,16,8,80,48,                          
> >> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},                         
> >>  {{[],[],                            [[<<"Links">>,                        
> >>       {{<<"test.groups">>,                                
> >> <<"EghzXywWrGGtp2fCcSLoatIdjML">>},                               
> >> <<"groups">>}]],                            [],[],[],[],[],[],[],          
> >>                   [[<<"content-type">>,97,112,112,108,105,99,97,           
> >>                    116,105,111,110,47,106,115,111,110],                    
> >>          [<<"X-Riak-VTag">>,50,88,97,105,85,49,70,49,55,                   
> >>            54,48,71,113,115,75,103,54,102,84,56,118,84]],                  
> >>           [],[],                            
> >> [[<<"X-Riak-Last-Modified">>|{1280,517755,638}]],                          
> >>   [],                            [[<<"X-Riak-Meta">>]]}}},                 
> >>         
> >> <<"{\"name\":\"bar\",\"created_at\":\"2010-07-30T19:22:34.998Z\",\"type\":\"user\",\"version\":1}">>}],
> >>                        [{<<3,30,180,15>>,{1,63447736955}}],                
> >>        {dict,1,16,16,8,80,48,                        
> >> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},                        
> >> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],                          
> >> [[clean|true]],                          []}}},                       
> >> undefined},                      122945524,63447736955,                    
> >>   [{returnbody,true}]}}** When State == active**      Data  == 
> >> {state,593735040165679310520246963290989976735222595584,                   
> >>     riak_kv_vnode,                       
> >> {state,593735040165679310520246963290989976735222595584,                   
> >>             riak_kv_bitcask_backend,                              
> >> {#Ref<0.0.0.16357>,                               
> >> "data/bitcask/593735040165679310520246963290989976735222595584"},          
> >>                     [],false},                       undefined,none}** 
> >> Reason for termination = ** 
> >> {bad_return_value,{error,{write_locked,emfile}}}
> >> ==> sasl-error.log <==
> >> =CRASH REPORT==== 30-Jul-2010::14:22:35 === crasher:   initial call: 
> >> riak_core_vnode:init/1   pid: <0.176.0>   registered_name: []   exception 
> >> exit: {{badmatch,{error,emfile}},                    
> >> [{bitcask_fileops,create_file_loop,3},                     
> >> {bitcask,put,3},                     {riak_kv_bitcask_backend,put,3},      
> >>                {riak_kv_vnode,perform_put,3},                     
> >> {riak_kv_vnode,do_put,7},                     
> >> {riak_kv_vnode,handle_command,3},                     
> >> {riak_core_vnode,vnode_command,3},                     
> >> {gen_fsm,handle_msg,7}]}     in function  gen_fsm:terminate/7   ancestors: 
> >> [riak_core_vnode_sup,riak_core_sup,<0.98.0>]   messages: []   links: 
> >> [#Port<0.3507>,<0.100.0>]   dictionary: []   trap_exit: true   status: 
> >> running   heap_size: 1597   stack_size: 24   reductions: 11185 neighbours:
> >> =SUPERVISOR REPORT==== 30-Jul-2010::14:22:35 ===    Supervisor: 
> >> {local,riak_core_vnode_sup}    Context:    child_terminated    Reason:     
> >> {{badmatch,{error,emfile}},                 
> >> [{bitcask_fileops,create_file_loop,3},                  {bitcask,put,3},   
> >>                {riak_kv_bitcask_backend,put,3},                  
> >> {riak_kv_vnode,perform_put,3},                  {riak_kv_vnode,do_put,7},  
> >>                 {riak_kv_vnode,handle_command,3},                  
> >> {riak_core_vnode,vnode_command,3},                  
> >> {gen_fsm,handle_msg,7}]}    Offender:   [{pid,<0.176.0>},                 
> >> {name,undefined},                 {mfa,                     
> >> {riak_core_vnode,start_link,                         [riak_kv_vnode,       
> >>                    593735040165679310520246963290989976735222595584]}},    
> >>              {restart_type,temporary},                 
> >> {shutdown,brutal_kill},                 {child_type,worker}]
> >>
> >> =CRASH REPORT==== 30-Jul-2010::14:22:35 === crasher:   initial call: 
> >> riak_core_vnode:init/1   pid: <0.12471.0>   registered_name: []   
> >> exception exit: {bad_return_value,{error,{write_locked,emfile}}}     in 
> >> function  gen_fsm:terminate/7   ancestors: 
> >> [riak_core_vnode_sup,riak_core_sup,<0.98.0>]   messages: []   links: 
> >> [<0.100.0>]   dictionary: []   trap_exit: true   status: running   
> >> heap_size: 2584   stack_size: 24   reductions: 1955 neighbours:
> >> =SUPERVISOR REPORT==== 30-Jul-2010::14:22:35 ===    Supervisor: 
> >> {local,riak_core_vnode_sup}    Context:    child_terminated    Reason:     
> >> {bad_return_value,{error,{write_locked,emfile}}}    Offender:   
> >> [{pid,<0.12471.0>},                 {name,undefined},                 
> >> {mfa,                     {riak_core_vnode,start_link,                     
> >>     [riak_kv_vnode,                          
> >> 593735040165679310520246963290989976735222595584]}},                 
> >> {restart_type,temporary},                 {shutdown,brutal_kill},          
> >>        {child_type,worker}]
> >>
> >> ==> erlang.log.4 <==
> >> =ERROR REPORT==== 30-Jul-2010::14:23:27 ===** Generic server memsup 
> >> terminating ** Last message in was {'EXIT',<0.21020.78>,                   
> >>        {emfile,                              [{erlang,open_port,           
> >>                         [{spawn,"/bin/sh -s unix:cmd 2>&1"},               
> >>                      [stream]]},                               
> >> {os,start_port_srv_loop,2}]}}** When Server state == {state,{unix,darwin}, 
> >>                              false,                              
> >> {1897500000,7776508000},                              {<0.81.0>,972008},   
> >>                            false,60000,30000,0.8,0.05,<0.21020.78>,        
> >>                       #Ref<0.0.2.58535>,undefined,                         
> >>      [reg],                              []}** Reason for termination == 
> >> ** {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 
> >> 2>&1"},[stream]]},           {os,start_port_srv_loop,2}]}
> >> =ERROR REPORT==== 30-Jul-2010::14:23:27 ===** Generic server memsup 
> >> terminating ** Last message in was {'EXIT',<0.21186.78>,                   
> >>        {emfile,                              [{erlang,open_port,           
> >>                         [{spawn,"/bin/sh -s unix:cmd 2>&1"},               
> >>                      [stream]]},                               
> >> {os,start_port_srv_loop,2}]}}** When Server state == {state,{unix,darwin}, 
> >>                              false,undefined,undefined,false,60000,30000,  
> >>                             0.8,0.05,<0.21186.78>,#Ref<0.0.2.58551>,       
> >>                        undefined,                              [reg],      
> >>                         []}** Reason for termination == ** 
> >> {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]}, 
> >>           {os,start_port_srv_loop,2}]}
> >> =ERROR REPORT==== 30-Jul-2010::14:23:27 ===** Generic server memsup 
> >> terminating ** Last message in was {'EXIT',<0.21252.78>,                   
> >>        {emfile,                              [{erlang,open_port,           
> >>                         [{spawn,"/bin/sh -s unix:cmd 2>&1"},               
> >>                      [stream]]},                               
> >> {os,start_port_srv_loop,2}]}}** When Server state == {state,{unix,darwin}, 
> >>                              false,undefined,undefined,false,60000,30000,  
> >>                             0.8,0.05,<0.21252.78>,#Ref<0.0.2.58559>,       
> >>                        undefined,                              [reg],      
> >>                         []}** Reason for termination == ** 
> >> {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]}, 
> >>           {os,start_port_srv_loop,2}]}
> >> =ERROR REPORT==== 30-Jul-2010::14:23:27 ===** Generic server memsup 
> >> terminating ** Last message in was {'EXIT',<0.21374.78>,                   
> >>        {emfile,                              [{erlang,open_port,           
> >>                         [{spawn,"/bin/sh -s unix:cmd 2>&1"},               
> >>                      [stream]]},                               
> >> {os,start_port_srv_loop,2}]}}** When Server state == {state,{unix,darwin}, 
> >>                              false,undefined,undefined,false,60000,30000,  
> >>                             0.8,0.05,<0.21374.78>,#Ref<0.0.2.58569>,       
> >>                        undefined,                              [reg],      
> >>                         []}** Reason for termination == ** 
> >> {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]}, 
> >>           {os,start_port_srv_loop,2}]}
> >> =ERROR REPORT==== 30-Jul-2010::14:23:27 ===** Generic server memsup 
> >> terminating ** Last message in was {'EXIT',<0.21396.78>,                   
> >>        {emfile,                              [{erlang,open_port,           
> >>                         [{spawn,"/bin/sh -s unix:cmd 2>&1"},               
> >>                      [stream]]},                               
> >> {os,start_port_srv_loop,2}]}}** When Server state == {state,{unix,darwin}, 
> >>                              false,undefined,undefined,false,60000,30000,  
> >>                             0.8,0.05,<0.21396.78>,#Ref<0.0.2.58575>,       
> >>                        undefined,                              [reg],      
> >>                         []}** Reason for termination == ** 
> >> {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]}, 
> >>           {os,start_port_srv_loop,2}]}
> >> =ERROR REPORT==== 30-Jul-2010::14:23:27 ===** State machine <0.175.0> 
> >> terminating ** Last event in was {riak_vnode_req_v1,                     
> >> 570899077082383952423314387779798054553098649600,                     
> >> {fsm,undefined,<0.12470.0>},                     {riak_kv_put_req_v1,      
> >>                 {<<"test.users">>,<<"ZrAxzFghd51VG902GuCcJ2gYOMJ">>},      
> >>                 {r_object,<<"test.users">>,                       
> >> <<"ZrAxzFghd51VG902GuCcJ2gYOMJ">>,                       [{r_content,      
> >>                    {dict,5,16,16,8,80,48,                          
> >> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},                         
> >>  {{[],[],                            [[<<"Links">>,                        
> >>       {{<<"test.groups">>,                                
> >> <<"EghzXywWrGGtp2fCcSLoatIdjML">>},                               
> >> <<"groups">>}]],                            [],[],[],[],[],[],[],          
> >>                   [[<<"content-type">>,97,112,112,108,105,99,97,           
> >>                    116,105,111,110,47,106,115,111,110],                    
> >>          [<<"X-Riak-VTag">>,50,88,97,105,85,49,70,49,55,                   
> >>            54,48,71,113,115,75,103,54,102,84,56,118,84]],                  
> >>           [],[],                            
> >> [[<<"X-Riak-Last-Modified">>|{1280,517755,638}]],                          
> >>   [],                            [[<<"X-Riak-Meta">>]]}}},                 
> >>         
> >> <<"{\"name\":\"bar\",\"created_at\":\"2010-07-30T19:22:34.998Z\",\"type\":\"user\",\"version\":1}">>}],
> >>                        [{<<3,30,180,15>>,{1,63447736955}}],                
> >>        {dict,1,16,16,8,80,48,                        
> >> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},                        
> >> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],                          
> >> [[clean|true]],                          []}}},                       
> >> undefined},                      122945524,63447736955,                    
> >>   [{returnbody,true}]}}** When State == active**      Data  == 
> >> {state,570899077082383952423314387779798054553098649600,                   
> >>     riak_kv_vnode,                       
> >> {state,570899077082383952423314387779798054553098649600,
> 
> 

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Reply via email to