KellenSunderland opened a new issue #15690: Deadlock when using trivial Gluon Dataset
URL: https://github.com/apache/incubator-mxnet/issues/15690

## Description

I'm not very familiar with Gluon Datasets, but I'm seeing a deadlock when using them with multiple workers. Am I doing something wrong here? This repro script deadlocks for me in most (but not all) environments.

## Environment info (Required)

Tested with many versions of MXNet; for example, a 1.5 build with CUDA deadlocks for me.

## Minimum reproducible example

```python
import logging

import mxnet as mx
from mxnet.gluon.data import dataset

_log = logging.getLogger()
handler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s %(name)-12s %(levelname)-8s %(message)s')
handler.setFormatter(formatter)
_log.addHandler(handler)
_log.setLevel(logging.DEBUG)


class MyMxNetDataset(dataset.Dataset):
    def __getitem__(self, idx):
        # Each sample is one large video-shaped tensor plus two small label arrays.
        out = mx.nd.zeros((6, 8, 32, 120, 120)), mx.nd.zeros((15,)), mx.nd.zeros((1,))
        return out

    def __len__(self):
        return 200


def main():
    _log.info('Create the Dataset')
    # Renamed from `dataset` to avoid shadowing the imported `dataset` module.
    train_dataset = MyMxNetDataset()
    _log.info('Create the DataLoader')
    train_loader = mx.gluon.data.DataLoader(train_dataset, batch_size=64, num_workers=2)
    _log.info('Loop over all the training minibatches')
    idx = 0
    for data0, data1, data2 in train_loader:
        _log.info('Example {} correctly retrieved. Shapes: {} {} {}'.format(
            idx, data0.shape, data1.shape, data2.shape))
        idx += 1
        if idx > 10:
            break
    _log.info('Exit main')


if __name__ == '__main__':
    main()
```
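For context, one classic way multi-process data pipelines hang (independent of MXNet) is a parent joining a worker process before draining a queue the worker is still writing large payloads into; the OS pipe buffer fills and both sides block. This is not a claim about the actual MXNet code path in this issue, just a minimal pure-Python sketch of the general pattern, with a payload size standing in for the large `(6, 8, 32, 120, 120)` arrays in the repro:

```python
import multiprocessing as mp

PAYLOAD_SIZE = 8 * 1024 * 1024  # 8 MiB, far larger than a pipe buffer


def worker(q):
    # Stand-in for a data-loading worker: push one large sample onto the queue.
    # If the parent called join() before q.get(), this put() could block
    # forever once the underlying pipe buffer fills -> deadlock.
    q.put(b"x" * PAYLOAD_SIZE)


def fetch_without_deadlock():
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q,))
    p.start()
    data = q.get()  # drain the queue BEFORE join(), so the worker can finish
    p.join()
    return data


if __name__ == '__main__':
    print(len(fetch_without_deadlock()))
```

Swapping the `q.get()` and `p.join()` lines reproduces the hang on most platforms, which is why the ordering matters in any worker-pool design that ships big tensors over queues.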