Hmm, this is definitely narrowing the race (probably enough to never hit it), 
but it's not actually eliminating it (if the restart happens after 4 billion 
requests…). More importantly this kind of symptom makes me worry that we might 
be papering over more serious issues with colliding states in the Table on 
restart.
I don't have the MDSTable semantics in my head so I'll need to look into this 
later unless somebody else volunteers to do so…
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sunday, March 17, 2013 at 7:51 AM, Yan, Zheng wrote:

> From: "Yan, Zheng" <zheng.z....@intel.com>
>  
> When a MDS becomes active, the table server re-sends 'agree' messages
> for old prepared request. If the recoverd MDS starts a new table request
> at the same time, The new request's ID can happen to be the same as old
> prepared request's ID, because current table client assigns request ID
> from zero after MDS restarts.
>  
> Signed-off-by: Yan, Zheng <zheng.z....@intel.com 
> (mailto:zheng.z....@intel.com)>
> ---
> src/mds/MDS.cc (http://MDS.cc) | 3 +++
> src/mds/MDSTableClient.cc (http://MDSTableClient.cc) | 5 +++++
> src/mds/MDSTableClient.h | 2 ++
> 3 files changed, 10 insertions(+)
>  
> diff --git a/src/mds/MDS.cc (http://MDS.cc) b/src/mds/MDS.cc (http://MDS.cc)
> index bb1c833..859782a 100644
> --- a/src/mds/MDS.cc (http://MDS.cc)
> +++ b/src/mds/MDS.cc (http://MDS.cc)
> @@ -1212,6 +1212,9 @@ void MDS::boot_start(int step, int r)
> dout(2) << "boot_start " << step << ": opening snap table" << dendl;  
> snapserver->load(gather.new_sub());
> }
> +
> + anchorclient->init();
> + snapclient->init();
>  
> dout(2) << "boot_start " << step << ": opening mds log" << dendl;
> mdlog->open(gather.new_sub());
> diff --git a/src/mds/MDSTableClient.cc (http://MDSTableClient.cc) 
> b/src/mds/MDSTableClient.cc (http://MDSTableClient.cc)
> index ea021f5..beba0a3 100644
> --- a/src/mds/MDSTableClient.cc (http://MDSTableClient.cc)
> +++ b/src/mds/MDSTableClient.cc (http://MDSTableClient.cc)
> @@ -34,6 +34,11 @@
> #undef dout_prefix
> #define dout_prefix *_dout << "mds." << mds->get_nodeid() << ".tableclient(" 
> << get_mdstable_name(table) << ") "
>  
> +void MDSTableClient::init()
> +{
> + // make reqid unique between MDS restarts
> + last_reqid = (uint64_t)mds->mdsmap->get_epoch() << 32;
> +}
>  
> void MDSTableClient::handle_request(class MMDSTableRequest *m)
> {
> diff --git a/src/mds/MDSTableClient.h b/src/mds/MDSTableClient.h
> index e15837f..78035db 100644
> --- a/src/mds/MDSTableClient.h
> +++ b/src/mds/MDSTableClient.h
> @@ -63,6 +63,8 @@ public:
> MDSTableClient(MDS *m, int tab) : mds(m), table(tab), last_reqid(0) {}
> virtual ~MDSTableClient() {}
>  
> + void init();
> +
> void handle_request(MMDSTableRequest *m);
>  
> void _prepare(bufferlist& mutation, version_t *ptid, bufferlist *pbl, Context 
> *onfinish);
> --  
> 1.7.11.7



--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to