Hi Robert,

Following your comments I've run some tests and came up with these
different scenarios, based on the replication technique (internal replicator
or rsync) and the maintenance_mode flag:

*I can:*
* Rsync the shard and then modify the shards map straight away, with no
errors
* Enable maintenance mode on the receiver node, then modify the shards map
and let the internal replicator sync the whole shard (both options are
sketched below)
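
For reference, this is roughly the HTTP side of both options (the rsync itself
happens outside CouchDB). It is only a minimal sketch: it assumes the
node-local ("backdoor") port 5986, admin credentials, Python with the requests
library, and the by_node / by_range / changelog layout of the _dbs documents;
the exact location of the maintenance_mode config endpoint may differ between
CouchDB versions.

```
import json
import requests

AUTH = ("admin", "secret")           # hypothetical admin credentials
NODE_LOCAL = "http://couch-3:5986"   # node-local port of the receiver node


def set_maintenance(enabled):
    # Toggle [couchdb] maintenance_mode on the receiver node. The classic
    # /_config endpoint on the node-local port is assumed here; newer
    # releases also expose it as /_node/{name}/_config on the clustered port.
    url = NODE_LOCAL + "/_config/couchdb/maintenance_mode"
    r = requests.put(url, auth=AUTH,
                     data=json.dumps("true" if enabled else "false"))
    r.raise_for_status()


def add_shard_copy(db, shard_range, node):
    # Add `node` as a holder of `shard_range` in the _dbs shard map document.
    url = NODE_LOCAL + "/_dbs/" + db
    doc = requests.get(url, auth=AUTH).json()
    doc["by_range"].setdefault(shard_range, [])
    if node not in doc["by_range"][shard_range]:
        doc["by_range"][shard_range].append(node)
    doc["by_node"].setdefault(node, [])
    if shard_range not in doc["by_node"][node]:
        doc["by_node"][node].append(shard_range)
    doc["changelog"].append(["add", shard_range, node])
    requests.put(url, auth=AUTH, json=doc).raise_for_status()


# Second option above: protect the receiver first, publish the new map, and
# let the internal replicator fill the shard in.
# set_maintenance(True)
# add_shard_copy("my_db", "c0000000-dfffffff", "couchdb@couch-3")
```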

*I cannot:*
* Rsync the shard, set maintenance mode on the receiver node and modify the
shards map, because these errors appear all the time:

[info]  -------- db shards/c0000000-dfffffff/my_db.1501146045 died with
reason
{{badmatch,{error,eacces}},[{couch_file,init,1,[{file,"src/couch_file.erl"},{line,381}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,306}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}
[error]  -------- CRASH REPORT Process  (<0.12185.4>) with 1 neighbors
exited with reason: no match of right hand value {error,eacces} at
couch_file:init/1(line:381) <= gen_server:init_it/6(line:306) <=
proc_lib:init_p_do_apply/3(line:237) at gen_server:init_it/6(line:330) <=
proc_lib:init_p_do_apply/3(line:237); initial_call:
{couch_file,init,['Argument__1']}, ancestors: [<0.12184.4>], messages: [],
links: [#Port<0.31027>,<0.12184.4>], dictionary: [], trap_exit: true,
status: running, heap_size: 610, stack_size: 27, reductions: 605

*I should not:*
* Just modify the shards map and let the internal replicator sync the whole
shard, because there may be consistency issues (since nodes reply to view
requests using local data)

Aside from that, when a shard has been completely synced and I'm ready to
delete the original one to reclaim disk space, I have to delete it and
restart the corresponding CouchDB instance straight away to avoid strange
behaviour caused by references to the file still being held in memory.
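
Concretely, on the old node that last step is just something like the
following (the shard path and the service name are illustrative and depend on
the installation):

```
import os
import subprocess

# hypothetical data directory and shard file; adjust to the actual install
shard_file = "/opt/couchdb/data/shards/c0000000-dfffffff/my_db.1501146045.couch"

os.remove(shard_file)  # reclaim the disk space
# restart straight away so nothing keeps an open handle to the deleted file
# ("couchdb" is assumed to be the service name here)
subprocess.run(["systemctl", "restart", "couchdb"], check=True)
```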

Does this list make sense?

Regards

On Thu, Jul 27, 2017 at 10:45 AM Robert Samuel Newson <rnew...@apache.org>
wrote:

> Hi Carlos,
>
> The SO post is more geared towards a larger rebalancing effort, which is
> why it copies the files around directly (it's a lot faster).
>
> If you want to rely on the internal replicator, that's fine, but it's
> noisy. It would be a bit of work to describe how to do that with a
> minimum of noise while taking care of correctness.
>
> At Cloudant we obviously do this kind of operation from time to time and
> have our own tools and procedures. I will mention this again internally, as
> it really should be easier to do this in CouchDB.
>
> To your other question: the shard map stored in the '_dbs' database is
> considered definitive. All nodes have a copy of that (unsharded) database
> and all nodes cache those maps in memory, as they are consulted very
> frequently. Most actions that occur in the mem3 tier (where knowledge of
> all shards lives and where the internal replicator lives) will create files
> that are missing automatically, as you've seen.
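>
> For illustration, consulting that map from the outside is just a GET of the
> corresponding '_dbs' document on the node-local port. This is only a minimal
> sketch, assuming port 5986, admin credentials and Python with the requests
> library:
>
> ```
> import requests
>
> # any node's node-local ("backdoor") port will do; credentials are hypothetical
> doc = requests.get("http://couch-0:5986/_dbs/my_db",
>                    auth=("admin", "secret")).json()
>
> # by_range maps each shard range to the nodes that hold a copy of it;
> # by_node is the same information keyed the other way around
> for shard_range, nodes in doc["by_range"].items():
>     print(shard_range, "->", nodes)
> ```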
>
> As a general point, you shouldn't be creating or deleting files in the
> database directory while CouchDB is running. My procedure copies the file
> to its new location _before_ telling CouchDB about it, and that's
> important. CouchDB will keep an open file descriptor for performance
> reasons, so copying data to those files via external processes, or even
> deleting the files, can have unexpected consequences and data loss is
> certainly possible.
>
> B.
>
>
> > On 27 Jul 2017, at 09:17, Carlos Alonso <carlos.alo...@cabify.com>
> wrote:
> >
> > Hi Robert, thanks for your reply. Very good SO response indeed.
> >
> > Just a couple of questions remain.
> >
> > When you say "Perform a replication, again taking care to do this on port
> > 5986", what exactly do you mean? Could you please paste an example
> > replication document? Isn't a replication started as soon as the shards
> > map is modified?
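> >
> > My guess is that it means a shard-level call to _replicate on the
> > node-local port, pushing the shard database itself, something like this
> > (port 5986, admin credentials and the requests library assumed; paths are
> > only illustrative):
> >
> > ```
> > import requests
> >
> > # push the shard from the node that has it (couch-3) to the one that
> > # needs it (couch-0)
> > requests.post(
> >     "http://couch-3:5986/_replicate",
> >     auth=("admin", "secret"),
> >     json={
> >         "source": "shards/0fffffff-15555553/my_db.1500994155",
> >         "target": "http://couch-0:5986/shards/0fffffff-15555553/my_db.1500994155",
> >     },
> > )
> > ```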
> >
> > Also, related to the issue I've experienced: when I manually delete the
> > shard from the first location, is there anything internal to CouchDB that
> > still references it? When I tried to move it back, the logs showed an error
> > complaining that the file already existed (but in reality it didn't).
> > Please see those errors pasted below:
> >
> > 1. It tries to create the shard and somehow detects that it existed before
> > (probably something I forgot to delete in step 6)
> >
> > `mem3_shards tried to create shards/0fffffff-15555553/my-db.1500994155,
> > got file_exists` (edited)
> >
> > 2. gen_server crashes
> >
> > `CRASH REPORT Process  (<0.27288.4>) with 0 neighbors exited with reason:
> > no match of right hand value {error,enoent} at
> couch_file:sync/1(line:211)
> > <= couch_db_updater:sync_header/2(line:987) <=
> > couch_db_updater:update_docs_int/5(line:906) <=
> > couch_db_updater:handle_info/2(line:289) <=
> > gen_server:handle_msg/5(line:599) <= proc_lib:wake_up/3(line:247) at
> > gen_server:terminate/6(line:737) <= proc_lib:wake_up/3(line:247);
> > initial_call: {couch_db_updater,init,['Argument__1']}, ancestors:
> > [<0.27267.4>], messages: [], links: [<0.210.0>], dictionary:
> > [{io_priority,{db_update,<<"shards/0fffffff-15555553/my_db...">>}},...],
> > trap_exit: false, status: running, heap_size: 6772, stack_size: 27,
> > reductions: 300961927`
> >
> > 3. Seems to somehow recover and try to open the file again
> >
> > ```
> > Could not open file
> ./data/shards/0fffffff-15555553/my_db.1500994155.couch:
> > no such file or directory
> >
> > open_result error {not_found,no_db_file} for
> > shards/0fffffff-15555553/my_db.1500994155
> > ```
> >
> > 4. Tries to create the file
> >
> > `creating missing database: shards/0fffffff-15555553/my_db.1500994155`
> >
> >
> > Thank you very much.
> >
> > On Thu, Jul 27, 2017 at 9:59 AM Robert Samuel Newson <rnew...@apache.org>
> > wrote:
> >
> >> Not sure if you saw my write-up from the BigCouch era, still valid for
> >> CouchDB 2.0;
> >>
> >>
> >> https://stackoverflow.com/questions/6676972/moving-a-shard-from-one-bigcouch-server-to-another-for-balancing
> >>
> >> Shard moving / database rebalancing is definitely a bit tricky and we
> >> could use better tools for it.
> >>
> >>
> >>> On 26 Jul 2017, at 18:10, Carlos Alonso <carlos.alo...@cabify.com>
> >> wrote:
> >>>
> >>> Hi!
> >>>
> >>> I have had a few log errors when moving a shard under particular
> >>> circumstances and I'd like to share them here and get your input on
> >>> whether this should be reported or not.
> >>>
> >>> So let me describe the steps I took:
> >>>
> >>> 1. A 3-node cluster (couch-0, couch-1 and couch-2) and 1 database (my_db)
> >>> with 48 shards and 1 replica
> >>> 2. A 4th node (couch-3) is added to the cluster.
> >>> 3. Change the shards map so that the new node gets one of the shards from
> >>> couch-0 (at this moment both couch-0 and couch-3 contain the shard)
> >>> 4. Synchronisation happens and the new node gets its shard
> >>> 5. Change the shards map again so that couch-0 is no longer an owner of
> >>> that shard
> >>> 6. I go into the couch-0 node and manually delete the shard's .couch
> >>> file, to reclaim disk space
> >>> 7. All fine here
> >>>
> >>> 8. Now I want to put the shard back into the original node, where it was
> >>> before
> >>> 9. I put couch-0 into maintenance mode (I always do this before adding a
> >>> shard to a node, to avoid it responding to reads before it is synced)
> >>> 10. Modify the shards map, adding the shard to couch-0
> >>> 11. All nodes' logs fill up with errors (details below)
> >>> 12. I remove couch-0's maintenance mode and things seem to flow again
> >>>
> >>> So this is the process; now please let me describe what I spotted in
> >>> the logs:
> >>>
> >>> Couch-0 seems to go through a few stages:
> >>>
> >>> 1. It tries to create the shard and somehow detects that it existed
> >>> before (probably something I forgot to delete in step 6)
> >>>
> >>> `mem3_shards tried to create shards/0fffffff-15555553/my-db.1500994155,
> >>> got file_exists` (edited)
> >>>
> >>> 2. gen_server crashes
> >>>
> >>> `CRASH REPORT Process  (<0.27288.4>) with 0 neighbors exited with
> reason:
> >>> no match of right hand value {error,enoent} at
> >> couch_file:sync/1(line:211)
> >>> <= couch_db_updater:sync_header/2(line:987) <=
> >>> couch_db_updater:update_docs_int/5(line:906) <=
> >>> couch_db_updater:handle_info/2(line:289) <=
> >>> gen_server:handle_msg/5(line:599) <= proc_lib:wake_up/3(line:247) at
> >>> gen_server:terminate/6(line:737) <= proc_lib:wake_up/3(line:247);
> >>> initial_call: {couch_db_updater,init,['Argument__1']}, ancestors:
> >>> [<0.27267.4>], messages: [], links: [<0.210.0>], dictionary:
> >>>
> [{io_priority,{db_update,<<"shards/0fffffff-15555553/my_db...">>}},...],
> >>> trap_exit: false, status: running, heap_size: 6772, stack_size: 27,
> >>> reductions: 300961927`
> >>>
> >>> 3. Seems to somehow recover and try to open the file again
> >>>
> >>> ```
> >>> Could not open file
> >> ./data/shards/0fffffff-15555553/my_db.1500994155.couch:
> >>> no such file or directory
> >>>
> >>> open_result error {not_found,no_db_file} for
> >>> shards/0fffffff-15555553/my_db.1500994155
> >>> ```
> >>>
> >>> 4. Tries to create the file
> >>>
> >>> `creating missing database: shards/0fffffff-15555553/my_db.1500994155`
> >>>
> >>> 5. It continuously fails because it cannot load validation funs, possibly
> >>> because of the maintenance mode?
> >>>
> >>> ```
> >>> Error in process <0.2126.141> on node 'couchdb@couch-0' with exit
> value:
> >>> {{badmatch,{error,{maintenance_mode,nil,'couchdb@couch-0
> >>>
> >>
> '}}},[{ddoc_cache_opener,recover_validation_funs,1,[{file,"src/ddoc_cache_opener.erl"},{line,127}]},{ddoc_cache_opener,fetch_doc_data,1,[...
> >>>
> >>> Error in process <0.1970.141> on node 'couchdb@couch-0' with exit
> value:
> >>>
> >>
> {{case_clause,{error,{{badmatch,{error,{maintenance_mode,nil,'couchdb@couch-0
> >>>
> >>
> '}}},[{ddoc_cache_opener,recover_validation_funs,1,[{file,"src/ddoc_cache_opener.erl"},{line,127}]},{ddoc_cache_opener...
> >>>
> >>> could not load validation funs
> >>>
> >>
> {{case_clause,{error,{{badmatch,{error,{maintenance_mode,nil,'couchdb@couch-0
> >>>
> >>
> '}}},[{ddoc_cache_opener,recover_validation_funs,1,[{file,"src/ddoc_cache_opener.erl"},{line,127}]},{ddoc_cache_opener,fetch_doc_data,1,[{file,"src/ddoc_cache_opener.erl"},{line,240}]}]}}},[{ddoc_cache_opener,handle_open_response,1,[{file,"src/ddoc_cache_opener.erl"},{line,282}]},{couch_db,'-load_validation_funs/1-fun-0-',1,[{file,"src/couch_db.erl"},{line,659}]}]}
> >>> ```
> >>>
> >>> Couch-3 shows a warning and an error
> >>>
> >>> ```
> >>> [warning] ... -------- mem3_sync
> >> shards/0fffffff-15555553/my_db.1500994155
> >>> couchdb@couch-0
> >>>
> >>
> {internal_server_error,[{mem3_rpc,rexi_call,2,[{file,[115,114,99,47,109,101,109,51,95,114,112,99,46,101,114,108]},{line,267}]},{mem3_rep,save_on_target,3,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,286}]},{mem3_rep,replicate_batch,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,256}]},{mem3_rep,repl,2,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,178}]},{mem3_rep,go,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,81}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,[115,114,99,47,109,101,109,51,95,115,121,110,99,46,101,114,108]},{line,208}]}]}
> >>> ```
> >>> ```
> >>> [error] ... -------- Error in process <0.13658.13> on node
> >> 'couchdb@couch-3'
> >>> with exit value:
> >>>
> >>
> {internal_server_error,[{mem3_rpc,rexi_call,2,[{file,"src/mem3_rpc.erl"},{line,267}]},{mem3_rep,save_on_target,3,[{file,"src/mem3_rep.erl"},{line,286}]},{mem3_rep,replicate_batch,1,[{file,"src/mem3_rep.erl"},{line,256}]},{mem3_rep...
> >>> ```
> >>>
> >>> Which, to me, means that couch-0 is responding with internal server
> >>> errors to its requests.
> >>>
> >>> Couch-1, which is, by the way, the owner of the task for replicating
> >>> my_db from a remote server, seems to go through two stages:
> >>>
> >>> First it seems unable to continue with the replication process because it
> >>> receives a 500 error (maybe from couch-0?):
> >>> ```
> >>> [error] ... req_err(4096501418) unknown_error : badarg
> >>>   [<<"dict:fetch/2
> L130">>,<<"couch_util:-reorder_results/2-lc$^1/1-1-/2
> >>> L424">>,<<"couch_util:-reorder_results/2-lc$^1/1-1-/2
> >>> L424">>,<<"fabric_doc_update:go/3 L41">>,<<"fabric:update_docs/3
> >>> L259">>,<<"chttpd_db:db_req/2 L445">>,<<"chttpd:process_request/1
> >>> L293">>,<<"chttpd:handle_request_int/1 L229">>]
> >>>
> >>>
> >>> [notice] ... 127.0.0.1:5984 127.0.0.1 undefined POST /my_db/_bulk_docs
> >> 500
> >>> ok 114
> >>>
> >>> [notice] ... Retrying POST request to
> >> http://127.0.0.1:5984/my_db/_bulk_docs
> >>> in 0.25 seconds due to error {code,500}
> >>> ```
> >>>
> >>> After disabling maintenance mode on couch-0 the replication process seems
> >>> to work again, but a few seconds later lots of new errors appear:
> >>>
> >>> ```
> >>> [error]  -------- rexi_server exit:timeout
> >>>
> >>
> [{rexi,init_stream,1,[{file,"src/rexi.erl"},{line,256}]},{rexi,stream2,3,[{file,"src/rexi.erl"},{line,204}]},{fabric_rpc,view_cb,2,[{file,"src/fabric_rpc.erl"},{line,286}]},{couch_mrview,finish_fold,2,[{file,"src/couch_mrview.erl"},{line,632}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,139}]}]
> >>> ```
> >>>
> >>> And a few seconds later (I have not been able to correlate it to
> anything
> >>> so far) they stop.
> >>>
> >>> Finally, couch-2 just shows one error, the same as the last one from
> >>> couch-1:
> >>>
> >>> ```
> >>> [error]  -------- rexi_server exit:timeout
> >>>
> >>
> [{rexi,init_stream,1,[{file,"src/rexi.erl"},{line,256}]},{rexi,stream2,3,[{file,"src/rexi.erl"},{line,204}]},{fabric_rpc,view_cb,2,[{file,"src/fabric_rpc.erl"},{line,286}]},{couch_mrview,finish_fold,2,[{file,"src/couch_mrview.erl"},{line,632}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,139}]}]
> >>> ```
> >>>
> >>> *In conclusion:*
> >>>
> >>> To me it looks like two things are involved here:
> >>>
> >>> 1. The fact that I deleted the file from disk and something else still
> >>> knows that it should be there
> >>> 2. The fact that the node is under maintenance mode, which seems to
> >>> prevent new shards from being created
> >>>
> >>> Sorry for such a wall of text. I hope it is detailed enough to get
> >>> someone's input that can help me confirm or refute my theories and
> >>> decide whether or not it makes sense to open a GH issue to make the
> >>> process more robust at this stage.
> >>>
> >>> Regards
> >>>
> >>>
>