The responses are collected by node, so subsequent responses from the same
node overwrite previous ones. Definitely a bug. Please open an issue.
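To illustrate what I mean, here is a minimal sketch of the keying pattern (not
the actual Collections API code): if per-shard status responses are stored in
a map keyed by node name, a node that hosts more than one shard keeps only the
last shard's response.

import java.util.HashMap;
import java.util.Map;

public class StatusKeyedByNode {
    public static void main(String[] args) {
        // Hypothetical aggregation of per-shard responses, keyed by node name.
        Map<String, String> byNode = new HashMap<>();

        // Two shards hosted on the same node report back...
        byNode.put("node1:8983_solr", "shard1: backup completed");
        byNode.put("node1:8983_solr", "shard7: backup completed"); // clobbers shard1's entry

        // ...but only one response per node survives, so the status can look
        // "complete" after as many responses as there are nodes, not shards.
        System.out.println(byNode); // {node1:8983_solr=shard7: backup completed}
    }
}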
On Mon, Oct 15, 2018 at 6:24 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 10/14/2018 6:25 PM, dami...@gmail.com wrote:
> > I had an issue with async backup on solr 6.5.1 reporting that the backup
> > was complete when clearly it was not. I was using 12 shards across 6
> > nodes. I only noticed this issue when one shard was much larger than the
> > others. There were no answers here
> > http://lucene.472066.n3.nabble.com/async-backup-td4342776.html
>
> One detail I thought I had written but isn't there: The backup did
> fully complete -- all 30 shards were in the backup location. Not a lot
> in each shard backup -- the collection was empty. It would be easy
> enough to add a few thousand documents to the collection before doing
> the backup.
>
> If the backup process reports that it's done before it's ACTUALLY done,
> that's a bad thing. It's hard to say whether that problem is related to
> the problem I described. Since I haven't dived into the code, I cannot
> say for sure, but it honestly would not surprise me to find they are
> connected. Every time I try to understand Collections API code, I find
> it extremely difficult to follow.
>
> I'm sorry that you never got resolution on your problem. Do you know
> whether that is still a problem in 7.x? Setting up a reproduction where
> one shard is significantly larger than the others will take a little
> bit of work.
>
> > I was focusing on the STATUS returned from the REQUESTSTATUS command,
> > but looking again now I can see a response from only 6 shards, and each
> > shard is from a different node. So this fits with what you're seeing. I
> > assume your shards 1, 7, 9 are all on different nodes.
>
> I did not actually check, and the cloud example I was using isn't around
> any more, but each of the shards in the status response were PROBABLY on
> separate nodes. The cloud example was 3 nodes. It's an easy enough
> scenario to replicate, and I provided enough details for anyone to do it.
>
> The person on IRC that reported this problem had a cluster of 15 nodes,
> and the status response had ten shards (out of 30) mentioned. It was
> shards 1-9 and shard 20. The suspicion is that there's something
> hard-coded that limits it to 10 responses ... because without that, I
> would expect the number of shards in the response to match the number of
> nodes.
>
> Thanks,
> Shawn

--
Regards,
Shalin Shekhar Mangar.
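P.S. For anyone who wants to try reproducing this on 7.x, a rough SolrJ sketch
along these lines should do it (the ZK address, collection name, backup name
and location below are placeholders, and it assumes the 6.x/7.x SolrJ API):
create a collection with more shards than nodes, index some documents, run an
async BACKUP, then poll REQUESTSTATUS and compare the shards listed in its
response against the real shard count.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.RequestStatusState;

public class AsyncBackupRepro {
    public static void main(String[] args) throws Exception {
        // ZK address, collection, backup name and location are all placeholders.
        try (CloudSolrClient client =
                 new CloudSolrClient.Builder().withZkHost("localhost:9983").build()) {

            // Kick off the backup asynchronously; "backup-1" is the async request id.
            CollectionAdminRequest.backupCollection("gettingstarted", "mybackup")
                .setLocation("/tmp/solr-backups")
                .processAsync("backup-1", client);

            // Poll REQUESTSTATUS until the overseer reports a terminal state.
            RequestStatusState state;
            do {
                Thread.sleep(2000);
                state = CollectionAdminRequest.requestStatus("backup-1")
                        .process(client)
                        .getRequestStatus();
                System.out.println("REQUESTSTATUS state: " + state);
            } while (state == RequestStatusState.RUNNING
                    || state == RequestStatusState.SUBMITTED);

            // Once it reports COMPLETED, look at the raw REQUESTSTATUS response
            // (e.g. in a browser) and count how many shards are actually listed.
        }
    }
}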