hkvision commented on issue #17651: Distributed training with kvstore crashes if worker has different number of data batches
URL: https://github.com/apache/incubator-mxnet/issues/17651#issuecomment-591834890

@eric-haibin-lin Hi, I'm using NDArrayIter. I checked ResizeIter: so that all of my data gets trained on, I need to set the size to the largest number of batches among all workers. Workers with fewer batches will, after finishing all their data, iterate again from the very beginning until the target size is reached. Please point out if I'm wrong. :) Thanks so much, this can be a workaround for me.
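The wrap-around behaviour described above can be illustrated with a minimal pure-Python sketch. It does not call MXNet; it only mimics the semantics the comment attributes to ResizeIter, so that the resulting batch sequence is easy to inspect. The helper `resized_batches`, the worker names, and the batch labels are all hypothetical, and the sketch assumes each worker already knows the largest batch count (how that number is shared between workers is not covered here).

```python
from itertools import cycle, islice


def resized_batches(batches, size):
    """Return exactly `size` batches, restarting from the first batch
    whenever the underlying list is exhausted (hypothetical helper)."""
    if not batches:
        raise ValueError("cannot resize an empty batch list")
    return list(islice(cycle(batches), size))


# Hypothetical per-worker batch lists with different lengths.
worker_batches = {
    "worker0": ["a0", "a1", "a2", "a3", "a4"],
    "worker1": ["b0", "b1", "b2"],
}

# Target size: the largest number of batches held by any worker.
target_size = max(len(b) for b in worker_batches.values())

# Every worker now runs the same number of batches per epoch.
epochs = {
    name: resized_batches(batches, target_size)
    for name, batches in worker_batches.items()
}

for name, seq in epochs.items():
    print(name, seq)

# With MXNet itself, the corresponding wrapping step would be along the
# lines of: mx.io.ResizeIter(train_iter, size=target_size)
```

Here `worker1` replays its first two batches to reach the target of five, so every worker issues the same number of kvstore updates per epoch. The cost is that some samples on the smaller workers are seen more than once per epoch.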