[GitHub] javelinjs commented on issue #7417: Update mxnet in maven timely?
javelinjs commented on issue #7417: Update mxnet in maven timely? URL: https://github.com/apache/incubator-mxnet/issues/7417#issuecomment-322661354 @szha Thanks for the invitation to the deployment project. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] szha commented on issue #7417: Update mxnet in maven timely?
szha commented on issue #7417: Update mxnet in maven timely? URL: https://github.com/apache/incubator-mxnet/issues/7417#issuecomment-322658768 @javelinjs let me know if you need any help on this.
[GitHub] javelinjs commented on a change in pull request #7411: [scala-package][spark] fix example script
javelinjs commented on a change in pull request #7411: [scala-package][spark] fix example script
URL: https://github.com/apache/incubator-mxnet/pull/7411#discussion_r133352986
## File path: scala-package/spark/bin/run-mnist-example.sh ##
@@ -18,47 +18,62 @@
 # under the License.
 CURR_DIR=$(cd `dirname $0`; pwd)
-MODULE_DIR=$(cd $CURR_DIR/../; pwd)
-ROOT_DIR=$(cd $CURR_DIR/../../; pwd)
+SPARK_MODULE_DIR=$(cd $CURR_DIR/../; pwd)
+SCALA_PKG_DIR=$(cd $CURR_DIR/../../; pwd)
+OS=""
-LIB_DIR=${MODULE_DIR}/target/classes/lib
-JAR=${MODULE_DIR}/target/mxnet-spark_2.10-0.1.2-SNAPSHOT.jar
+if [ "$(uname)" == "Darwin" ]; then
+ # Do something under Mac OS X platform
+ OS='osx-x86_64-cpu'
Review comment: Could you change all indentation to 2 spaces?
[GitHub] javelinjs commented on issue #7417: Update mxnet in maven timely?
javelinjs commented on issue #7417: Update mxnet in maven timely? URL: https://github.com/apache/incubator-mxnet/issues/7417#issuecomment-322657600 Sure. I'll work on this. BTW, are we going to change the package name from `ml.dmlc` to `org.apache`? cc @mli
[GitHub] starimpact commented on issue #7445: Using cuDNN for CTC Loss
starimpact commented on issue #7445: Using cuDNN for CTC Loss URL: https://github.com/apache/incubator-mxnet/issues/7445#issuecomment-322657512 So the CTC in cuDNN 7 supports neither variable-length inputs nor label lengths longer than 256.
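As a rough illustration of the two constraints named in this comment, a caller could check a batch before choosing a cuDNN-backed CTC path. This is only a sketch: `can_use_cudnn_ctc` and `MAX_LABEL_LENGTH` are hypothetical names, the limit of 256 is taken from the comment rather than from the cuDNN documentation, and MXNet's actual dispatch logic may differ.

```python
# Hypothetical limit, taken from the comment above (not checked against cuDNN docs).
MAX_LABEL_LENGTH = 256


def can_use_cudnn_ctc(input_lengths, label_lengths):
    """Return True if a batch satisfies both constraints named in the comment.

    input_lengths: sequence length of each input in the batch
    label_lengths: label length of each sample in the batch
    """
    if not input_lengths or len(input_lengths) != len(label_lengths):
        return False
    # Constraint 1: no variable-length inputs, so every length must match.
    same_length = len(set(input_lengths)) == 1
    # Constraint 2: no label longer than the stated limit.
    labels_ok = all(n <= MAX_LABEL_LENGTH for n in label_lengths)
    return same_length and labels_ok
```

A batch that fails the check would have to fall back to another CTC implementation.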
[GitHub] szha commented on issue #7488: Fixes scaling issue identified in #7455
szha commented on issue #7488: Fixes scaling issue identified in #7455 URL: https://github.com/apache/incubator-mxnet/pull/7488#issuecomment-322656625 Thanks for bringing this up @solin319
[GitHub] jeremiedb commented on issue #7476: R-package RNN refactor
jeremiedb commented on issue #7476: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7476#issuecomment-322632234 @thirdwing `source()` and `library()` calls removed. Functions `mx.model.train.rnn.buckets` and `mx.rnn.buckets` merged into `model.rnn.R` in order to better align with `model.R`. Sorry for multiple commits - I struggled a bit with rebasing the submodules.
[GitHub] piiswrong closed pull request #7484: add gluon resnet18_v2, resnet34_v2 models
piiswrong closed pull request #7484: add gluon resnet18_v2, resnet34_v2 models URL: https://github.com/apache/incubator-mxnet/pull/7484
[incubator-mxnet] branch master updated: add gluon resnet18_v2, resnet34_v2 models (#7484)
This is an automated email from the ASF dual-hosted git repository.
jxie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git
The following commit(s) were added to refs/heads/master by this push:
     new bca9c4c add gluon resnet18_v2, resnet34_v2 models (#7484)
bca9c4c is described below
commit bca9c4cf0b7c90374557170eec088a2b30b8bb72
Author: Joshua Z. Zhang
AuthorDate: Tue Aug 15 19:22:06 2017 -0700
    add gluon resnet18_v2, resnet34_v2 models (#7484)
---
 python/mxnet/gluon/model_zoo/model_store.py | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/python/mxnet/gluon/model_zoo/model_store.py b/python/mxnet/gluon/model_zoo/model_store.py
index 67ba572..e524f21 100644
--- a/python/mxnet/gluon/model_zoo/model_store.py
+++ b/python/mxnet/gluon/model_zoo/model_store.py
@@ -36,6 +36,8 @@ _model_sha1 = {name: checksum for checksum, name in [
 ('38d6d423c22828718ec3397924b8e116a03e6ac0', 'resnet18_v1'),
 ('4dc2c2390a7c7990e0ca1e53aeebb1d1a08592d1', 'resnet34_v1'),
 ('2a903ab21260c85673a78fe65037819a843a1f43', 'resnet50_v1'),
+('8aacf80ff4014c1efa2362a963ac5ec82cf92d5b', 'resnet18_v2'),
+('0ed3cd06da41932c03dea1de7bc2506ef3fb97b3', 'resnet34_v2'),
 ('264ba4970a0cc87a4f15c96e25246a1307caf523', 'squeezenet1.0'),
 ('33ba0f93753c83d86e1eb397f38a667eaf2e9376', 'squeezenet1.1'),
 ('dd221b160977f36a53f464cb54648d227c707a05', 'vgg11'),
-- To stop receiving notification emails like this one, please contact ['"comm...@mxnet.apache.org" '].
[GitHub] kevinthesun closed pull request #7419: Add resnet50_v2, resnet101_V2 and resnet152_v2 gluon pre-trained model
kevinthesun closed pull request #7419: Add resnet50_v2, resnet101_V2 and resnet152_v2 gluon pre-trained model URL: https://github.com/apache/incubator-mxnet/pull/7419
[GitHub] madjam opened a new pull request #7488: Fixes scaling issue identified in #7455
madjam opened a new pull request #7488: Fixes scaling issue identified in #7455 URL: https://github.com/apache/incubator-mxnet/pull/7488 @mli @ptrendx @szha
[GitHub] starimpact commented on issue #7455: Distributed training is slow
starimpact commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322638762 In mxnet 0.8.0 there is no `send_buf.WaitToRead()`. Lucky for me. ^_^ https://github.com/starimpact/mxnet_v0.8.0/blob/bProxy_Weight/src/kvstore/kvstore_dist.h#L412 My mxnet supports partial parameter updates. You are welcome to use it. haha
[GitHub] starimpact commented on issue #7455: Distributed training is slow
starimpact commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322638762 In mxnet 0.8.0 there is no `send_buf.WaitToRead()`. Lucky for me. ^_^
[GitHub] jeremiedb commented on issue #7476: R-package RNN refactor
jeremiedb commented on issue #7476: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7476#issuecomment-322632234 @thirdwing `source()` and `library()` calls removed. Functions `mx.model.train.rnn.buckets` and `mx.rnn.buckets` merged into `model.rnn.R` in order to better align with `model.R`.
[GitHub] szha commented on issue #7486: Quick question about LSTM parameters
szha commented on issue #7486: Quick question about LSTM parameters URL: https://github.com/apache/incubator-mxnet/issues/7486#issuecomment-322618950 No problem. The reason you see i2h_f_bias and h2h_f_bias being the same could be that they were initialized with the same value. Since they are added together with the same weight of 1, the gradients they receive will always be the same.
[GitHub] szha commented on issue #7486: Quick question about LSTM parameters
szha commented on issue #7486: Quick question about LSTM parameters URL: https://github.com/apache/incubator-mxnet/issues/7486#issuecomment-322618950 No problem. The reason you see i2h_f_bias and h2h_f_bias being the same could be that they were initialized the same way. Since they are added together with the same weight of 1, the gradients they receive will always be the same.
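The argument above can be checked numerically with a scalar toy model. The sketch below is not MXNet's LSTM: the inputs, weights, loss, and learning rate are made up, and only the names `i2h_f_bias` and `h2h_f_bias` come from the thread. It estimates the gradient with respect to each bias by central differences and runs plain SGD on both.

```python
import math


def forget_gate(x, h, w_x, w_h, i2h_f_bias, h2h_f_bias):
    """Scalar forget gate: the two biases enter only through their sum."""
    z = w_x * x + w_h * h + i2h_f_bias + h2h_f_bias
    return 1.0 / (1.0 + math.exp(-z))


def loss(i2h_f_bias, h2h_f_bias):
    # Toy squared-error loss on fixed, made-up inputs and weights.
    f = forget_gate(0.3, -0.2, 0.7, 0.1, i2h_f_bias, h2h_f_bias)
    return (f - 1.0) ** 2


def numeric_grads(b_i2h, b_h2h, eps=1e-6):
    """Central-difference gradients of the loss with respect to each bias."""
    g_i2h = (loss(b_i2h + eps, b_h2h) - loss(b_i2h - eps, b_h2h)) / (2 * eps)
    g_h2h = (loss(b_i2h, b_h2h + eps) - loss(b_i2h, b_h2h - eps)) / (2 * eps)
    return g_i2h, g_h2h


def train(b_i2h, b_h2h, steps=20, lr=0.5):
    """Plain SGD on both biases; returns their final values."""
    for _ in range(steps):
        g_i2h, g_h2h = numeric_grads(b_i2h, b_h2h)
        b_i2h -= lr * g_i2h
        b_h2h -= lr * g_h2h
    return b_i2h, b_h2h
```

Under these assumptions the two gradients agree at every step, so biases that start equal stay equal, and biases that start apart keep a constant gap. Only the sum affects the gate, which matches reading the single `b_f` of the textbook equation as `i2h_f_bias + h2h_f_bias`.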
[GitHub] aspzest commented on issue #7486: Quick question about LSTM parameters
aspzest commented on issue #7486: Quick question about LSTM parameters URL: https://github.com/apache/incubator-mxnet/issues/7486#issuecomment-322617307 Okay. Thanks a lot!
[GitHub] mli commented on issue #7455: Distributed training is slow
mli commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322612600 https://github.com/apache/incubator-mxnet/issues/6975
[GitHub] aspzest commented on issue #7486: Quick question about LSTM parameters
aspzest commented on issue #7486: Quick question about LSTM parameters URL: https://github.com/apache/incubator-mxnet/issues/7486#issuecomment-322611872 @szha So, b_f = i2h_f_bias + h2h_f_bias?
[GitHub] mli commented on issue #7455: Distributed training is slow
mli commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322609681 @madjam's test case is that `send_buf` may not be ready when `data()` is accessed. I agree with @ptrendx that we should remove this WaitToRead. One solution is moving https://github.com/madjam/mxnet/blob/0012f7722d97238a84c33f1bee8cd2926707a7e9/src/kvstore/kvstore_dist.h#L221 into the captured function. Can someone help contribute a PR for it?
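The difference between waiting in the caller and deferring the buffer access into the captured function can be sketched with Python threads. This is only an analogy, not MXNet code: `FakeBuffer`, `push_blocking`, and `push_deferred` are hypothetical names, the explicit wait inside the closure stands in for the engine's dependency tracking, and it assumes the linked line is the one that reads the buffer's data, which the comment does not state.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeBuffer:
    """Stand-in for an array whose contents are written asynchronously."""

    def __init__(self):
        self._ready = threading.Event()
        self._data = None

    def write(self, data):
        self._data = data
        self._ready.set()

    def wait_to_read(self):
        self._ready.wait()

    def data(self):
        return self._data


def push_blocking(executor, buf, send):
    # The caller stalls here until the buffer is ready, then reads it eagerly.
    buf.wait_to_read()
    payload = buf.data()
    return executor.submit(send, payload)


def push_deferred(executor, buf, send):
    # The read happens inside the captured function, so the caller returns at
    # once and data() is still only touched after the buffer is ready.
    def captured():
        buf.wait_to_read()  # stand-in for the engine's dependency ordering
        return send(buf.data())
    return executor.submit(captured)
```

With `push_deferred` the calling thread can keep issuing work while the buffer is still being filled, which is the scaling concern raised in this thread.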
[GitHub] leoxiaobin opened a new issue #7455: Distributed training is slow
leoxiaobin opened a new issue #7455: Distributed training is slow
URL: https://github.com/apache/incubator-mxnet/issues/7455
## Environment info
Operating System: Ubuntu 16.04
Compiler: gcc 5.4
Package used (Python/R/Scala/Julia): Python
MXNet version: Latest code
Or if installed from source: installed from source
MXNet commit hash (`git rev-parse HEAD`): 1a3faa
If you are using python package, please provide Python version and distribution: Python 2.7.13 :: Anaconda custom (64-bit)
I tried to train an image classification model using two servers with InfiniBand cards, but the speed is a little slow, about the same as using one server. I used the code in example/image-classification. When training on one server, the training command is
```
python train_imagenet.py --benchmark 1 --gpus 0,1,2,3,4,5,6,7 --kv-store device --network inception-v3 --batch-size 256 --image-shape 3,299,299
```
The speed is
```
INFO:root:start with arguments Namespace(batch_size=256, benchmark=1, data_nthreads=4, data_train=None, data_val=None, disp_batches=20, dtype='float32', gpus='0,1,2,3,4,5,6,7', image_shape='3,299,299', kv_store='device', load_epoch=None, lr=0.1, lr_factor=0.1, lr_step_epochs='30,60', max_random_aspect_ratio=0.25, max_random_h=36, max_random_l=50, max_random_rotate_angle=10, max_random_s=50, max_random_scale=1, max_random_shear_ratio=0.1, min_random_scale=1, model_prefix=None, mom=0.9, monitor=0, network='inception-v3', num_classes=1000, num_epochs=80, num_examples=1281167, num_layers=50, optimizer='sgd', pad_size=0, random_crop=1, random_mirror=1, rgb_mean='123.68,116.779,103.939', test_io=0, top_k=0, wd=0.0001)
[22:35:19] src/operator/././cudnn_algoreg-inl.h:112: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
[22:35:40] src/kvstore/././comm.h:327: only 24 out of 56 GPU pairs are enabled direct access. It may affect the performance. You can set MXNET_ENABLE_GPU_P2P=0 to turn it off
[22:35:40] src/kvstore/././comm.h:336: .vvv
[22:35:40] src/kvstore/././comm.h:336: v.vv
[22:35:40] src/kvstore/././comm.h:336: vv.v
[22:35:40] src/kvstore/././comm.h:336: vvv.
[22:35:40] src/kvstore/././comm.h:336: .vvv
[22:35:40] src/kvstore/././comm.h:336: v.vv
[22:35:40] src/kvstore/././comm.h:336: vv.v
[22:35:40] src/kvstore/././comm.h:336: vvv.
INFO:root:Epoch[0] Batch [20] Speed: 1065.93 samples/sec accuracy=0.165365
INFO:root:Epoch[0] Batch [40] Speed: 1033.22 samples/sec accuracy=0.989648
INFO:root:Epoch[0] Batch [60] Speed: 1029.90 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [80] Speed: 1029.80 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [100] Speed: 1028.05 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [120] Speed: 1019.75 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [140] Speed: 1025.79 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [160] Speed: 1027.82 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [180] Speed: 1021.11 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [200] Speed: 1025.14 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [220] Speed: 1017.72 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [240] Speed: 1021.09 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [260] Speed: 1024.25 samples/sec accuracy=1.00
```
When training with 2 servers, the command is
```
python ../../tools/launch.py -n 2 --launcher ssh -H hosts python train_imagenet.py --benchmark 1 --gpus 0,1,2,3,4,5,6,7 --kv-store dist_sync --network inception-v3 --num-layers 50 --batch-size 256 --sync-dst-dir /tmp/mxnet --image-shape 3,299,299
```
And the speed is
```
INFO:root:Epoch[0] Batch [20] Speed: 609.31 samples/sec accuracy=0.056920
INFO:root:Epoch[0] Batch [20] Speed: 610.12 samples/sec accuracy=0.050967
INFO:root:Epoch[0] Batch [40] Speed: 608.68 samples/sec accuracy=0.854883
INFO:root:Epoch[0] Batch [40] Speed: 608.19 samples/sec accuracy=0.868164
INFO:root:Epoch[0] Batch [60] Speed: 602.48 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [60] Speed: 603.86 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [80] Speed: 603.11 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [80] Speed: 603.87 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [100] Speed: 607.30 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [100] Speed: 606.54 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [120] Speed: 604.53 samples/sec accuracy=1.00
INFO:root:Epoch[0] Batch [120]
```
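To put a number on "slow", the reported speeds can be turned into a speedup and a scaling efficiency. The sketch below copies the first five readings from each log in this issue. It assumes that each pair of lines with the same batch index in the two-server log comes from the two workers, so cluster throughput is the sum of the pair; the issue does not state this explicitly. The function names are illustrative.

```python
# Speeds copied from the logs in this issue (samples/sec), first five reports each.
single_server = [1065.93, 1033.22, 1029.90, 1029.80, 1028.05]
# Assumption: each pair is the two workers reporting at the same batch index.
two_server_pairs = [
    (609.31, 610.12),
    (608.68, 608.19),
    (602.48, 603.86),
    (603.11, 603.87),
    (607.30, 606.54),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def scaling(single, pairs, num_servers=2):
    """Return (aggregate throughput, speedup, scaling efficiency)."""
    baseline = mean(single)
    aggregate = mean(a + b for a, b in pairs)
    speedup = aggregate / baseline
    return aggregate, speedup, speedup / num_servers


aggregate, speedup, efficiency = scaling(single_server, two_server_pairs)
print("aggregate: %.1f samples/sec, speedup: %.2fx, efficiency: %.0f%%"
      % (aggregate, speedup, 100 * efficiency))
```

Under that assumption the two servers together deliver roughly 1.17 times the single-server throughput, which is under 60% scaling efficiency.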
[GitHub] aspzest opened a new issue #7486: Quick question about LSTM parameters
aspzest opened a new issue #7486: Quick question about LSTM parameters URL: https://github.com/apache/incubator-mxnet/issues/7486 I am using the LSTM from mxnet and was able to get the parameters of the LSTM block. I have a question about the biases. According to the equation below, taken from [here](http://colah.github.io/posts/2015-08-Understanding-LSTMs/), there is a single bias b_f used to compute f_t. But mxnet's LSTM parameters contain two biases for this equation: 1. i2h_f_bias and 2. h2h_f_bias. Is b_f here simply i2h_f_bias + h2h_f_bias? Or is there some other relation? ![screen shot 2017-08-15 at 3 13 07 pm](https://user-images.githubusercontent.com/29802784/29338879-4f8cadbc-81cc-11e7-992c-8dd4ce63896d.png) I am also seeing that i2h_f_bias and h2h_f_bias are sometimes the same. Thank you
[GitHub] lxn2 closed pull request #10: Fix more links
lxn2 closed pull request #10: Fix more links URL: https://github.com/apache/incubator-mxnet-site/pull/10
[GitHub] madjam commented on issue #7455: Distributed training is slow
madjam commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322593478 For context, that barrier was added because an operation such as
```
kv.init(2, mx.nd.zeros((50, 50)))
```
would otherwise access memory that is not fully initialized and cause a segfault.
[GitHub] kevinthesun opened a new pull request #7485: Fix more links
kevinthesun opened a new pull request #7485: Fix more links URL: https://github.com/apache/incubator-mxnet/pull/7485
[GitHub] zhreshold commented on issue #7419: Add resnet50_v2, resnet101_V2 and resnet152_v2 gluon pre-trained model
zhreshold commented on issue #7419: Add resnet50_v2, resnet101_V2 and resnet152_v2 gluon pre-trained model URL: https://github.com/apache/incubator-mxnet/pull/7419#issuecomment-322586548 @kevinthesun @szha Validation results on these three models are bad, basically around 0.001 accuracy, so I guess these weights are not handled correctly. The iterator I used:
```
val_iter = mx.image.ImageIter(args.batch_size, data_shape, path_imgrec=args.val_rec, shuffle=False, mean=True, std=True, resize=256)
```
I've used the same iterator to test the v1 models, which are good.
[GitHub] zhreshold opened a new pull request #7484: add gluon resnet18_v2, resnet34_v2 models
zhreshold opened a new pull request #7484: add gluon resnet18_v2, resnet34_v2 models URL: https://github.com/apache/incubator-mxnet/pull/7484
resnet18v2: validation: accuracy=0.696827, top_k_accuracy_5=0.888473
resnet34v2: validation: accuracy=0.732103, top_k_accuracy_5=0.910415
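The two numbers reported per model are top-1 accuracy and top-5 accuracy. As a reminder of how top-k accuracy is commonly defined, here is a small pure-Python sketch. `top_k_accuracy` is an illustrative function, not MXNet's metric implementation, and the scores in the usage note are made up.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one per sample
    labels: list of true class indices, one per sample
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)
```

For example, with scores `[[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]` and labels `[1, 1]`, only the first sample counts at `k=1`, while both count at `k=2`.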
[incubator-mxnet] branch master updated (a21d3e0 -> 7d6385a)
jxie pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.
    from a21d3e0 Fix more broken links (#7480)
     add 7d6385a fix autograd memory cost (#7478)
No new revisions were added by this update.
Summary of changes:
 nnvm                                    |  2 +-
 python/mxnet/gluon/data/dataloader.py   | 10 +++-
 python/mxnet/gluon/data/dataset.py      |  7 ++-
 python/mxnet/gluon/nn/basic_layers.py   | 12 +++--
 python/mxnet/gluon/parameter.py         | 75
 src/executor/attach_op_execs_pass.cc    | 40 +--
 src/ndarray/autograd.cc                 | 87 +++--
 src/ndarray/autograd.h                  |  2 +-
 tests/python/gpu/test_operator_gpu.py   | 11 +
 tests/python/unittest/test_gluon.py     | 13 -
 tests/python/unittest/test_gluon_rnn.py |  1 +
 11 files changed, 211 insertions(+), 49 deletions(-)
[GitHub] piiswrong closed pull request #7478: fix autograd memory cost
piiswrong closed pull request #7478: fix autograd memory cost URL: https://github.com/apache/incubator-mxnet/pull/7478
[GitHub] piiswrong opened a new pull request #7478: fix autograd memory cost
piiswrong opened a new pull request #7478: fix autograd memory cost URL: https://github.com/apache/incubator-mxnet/pull/7478
[GitHub] piiswrong commented on issue #7434: fix a formula typo in doc
piiswrong commented on issue #7434: fix a formula typo in doc URL: https://github.com/apache/incubator-mxnet/pull/7434#issuecomment-322556923 Looks like it should be `channel` instead of `num_channel`.
[GitHub] zjjxsjh opened a new pull request #7434: fix a formula typo in doc
zjjxsjh opened a new pull request #7434: fix a formula typo in doc URL: https://github.com/apache/incubator-mxnet/pull/7434
[GitHub] thirdwing commented on a change in pull request #7476: R-package RNN refactor
thirdwing commented on a change in pull request #7476: R-package RNN refactor
URL: https://github.com/apache/incubator-mxnet/pull/7476#discussion_r133259264
## File path: R-package/R/rnn.graph.R ##
@@ -0,0 +1,123 @@
+library(mxnet)
+
+# RNN graph design
+rnn.graph <- function(num.rnn.layer,
+                      input.size,
+                      num.embed,
+                      num.hidden,
+                      num.label,
+                      dropout = 0,
+                      ignore_label = 0,
+                      init.state = NULL,
+                      config,
+                      cell.type="gru",
+                      masking = T,
+                      output_last_state = F) {
+
+  # define input arguments
+  label <- mx.symbol.Variable("label")
+  data <- mx.symbol.Variable("data")
+  seq.mask <- mx.symbol.Variable("seq.mask")
+
+  embed.weight <- mx.symbol.Variable("embed.weight")
+  rnn.params.weight <- mx.symbol.Variable("rnn.params.weight")
+  rnn.state.weight <- mx.symbol.Variable("rnn.state.weight")
+  if (cell.type == "lstm") rnn.state.cell.weight <- mx.symbol.Variable("rnn.state.cell.weight")
+  cls.weight <- mx.symbol.Variable("cls.weight")
+  cls.bias <- mx.symbol.Variable("cls.bias")
+
+  data <- mx.symbol.transpose(data=data)
+  # seq.mask <- mx.symbol.stop_gradient(seq.mask, name="seq.mask")
+
+  embed <- mx.symbol.Embedding(data=data, input_dim=input.size,
+                               weight=embed.weight, output_dim=num.embed, name="embed")
+
+  if (cell.type == "lstm") {
+    rnn <- mx.symbol.RNN(data=embed, state=rnn.state.weight, state_cell = rnn.state.cell.weight, parameters=rnn.params.weight, state.size=num.hidden, num.layers=num.rnn.layer, bidirectional=F, mode=cell.type, state.outputs=F, p=dropout, name=paste(cell.type, num.rnn.layer, "layer", sep="_"))
+
+  } else {
+    rnn <- mx.symbol.RNN(data=embed, state=rnn.state.weight, parameters=rnn.params.weight, state.size=num.hidden, num.layers=num.rnn.layer, bidirectional=F, mode=cell.type, state.outputs=F, p=dropout, name=paste(cell.type, num.rnn.layer, "layer", sep="_"))
+  }
+
+  if (config=="seq-to-one") {
+
+    if (masking) mask <- mx.symbol.SequenceLast(data=rnn[[1]], use.sequence.length = T, sequence_length = seq.mask, name = "mask") else
+      mask <- mx.symbol.identity(data = rnn[[1]], name = "mask")
+
+    fc <- mx.symbol.FullyConnected(data=mask,
+                                   weight=cls.weight,
+                                   bias=cls.bias,
+                                   num.hidden=num.label,
+                                   name = "decode")
+
+    loss <- mx.symbol.SoftmaxOutput(data=fc, name="sm", label=label, ignore_label=ignore_label)
+
+  } else if (config=="one-to-one"){
+
+    if (masking) mask <- mx.symbol.SequenceMask(data = rnn[[1]], use.sequence.length = T, sequence_length = seq.mask, name = "mask") else
+      mask <- mx.symbol.identity(data = rnn[[1]], name = "mask")
+
+    reshape = mx.symbol.reshape(mask, shape=c(num.hidden, -1))
+
+    fc <- mx.symbol.FullyConnected(data=reshape,
+                                   weight=cls.weight,
+                                   bias=cls.bias,
+                                   num.hidden=num.label,
+                                   name = "decode")
+
+    label <- mx.symbol.reshape(data=label, shape=c(-1))
+    loss <- mx.symbol.SoftmaxOutput(data=fc, name="sm", label=label, ignore_label=ignore_label)
+
+  }
+
+  if (output_last_state){
+    # group <- mx.symbol.Group(c(unlist(last.states), loss))
+    # return(group)
+    return(loss)
+  } else return(loss)
+}
+
+
+
+# data <- mx.symbol.Variable("data")
Review comment: Can we remove the testing code or move it into a unit test?
[incubator-mxnet-site] branch asf-site updated: Fix broken links
lxn2 pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new d777c9c Fix broken links
d777c9c is described below
commit d777c9c8cb629d0e93ba61c7fbbaf73a5f5fc6ec
Author: Wang
AuthorDate: Tue Aug 15 11:06:05 2017 -0700
    Fix broken links
---
 get_started/windows_setup.html                 |  2 +-
 model_zoo/index.html                           | 10 +-
 versions/master/get_started/windows_setup.html |  2 +-
 versions/master/model_zoo/index.html           | 10 +-
 4 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/get_started/windows_setup.html b/get_started/windows_setup.html
index 7645a4e..ff5d687 100644
--- a/get_started/windows_setup.html
+++ b/get_started/windows_setup.html
@@ -259,7 +259,7 @@ This produces a library called
 To build and install MXNet yourself, you need the following dependencies. Install the required dependencies:
 If https://www.visualstudio.com/downloads/;>Microsoft Visual Studio 2013 is not already installed, download and install it. You can download and install the free community edition.
-Install https://www.microsoft.com/en-us/download/details.aspx?id=41151;>Visual C++ Compiler Nov 2013 CTP.
+Install http://landinghub.visualstudio.com/visual-cpp-build-tools;>Visual C++ Compiler.
 Back up all of the files in the C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC folder to a different location.
 Copy all of the files in the C:\Program Files (x86)\Microsoft Visual C++ Compiler Nov 2013 CTP folder (or the folder where you extracted the zip archive) to the C:\ProgramDownload and install http://sourceforge.net/projects/opencvlibrary/files/opencv-win/3.0.0/opencv-3.0.0.exe/download;>OpenCV.
diff --git a/model_zoo/index.html b/model_zoo/index.html
index 5b72005..7b69d56 100644
--- a/model_zoo/index.html
+++ b/model_zoo/index.html
@@ -269,7 +269,7 @@ ongoing project to collect complete models, with python scripts, pre-trained wei
 http://places2.csail.mit.edu/download.html;>Places2: There are 1.6 million train images from 365 scene categories in the Places365-Standard, which are used to train the Places365 CNNs. There are 50 images per category in the validation set and 900 images per category in the testing set. Compared to the train set of Places365-Standard, the train set of Places365-Challenge has 6.2 million extra images, leading to totally 8 million train images fo [...]
 https://aws.amazon.com/public-datasets/multimedia-commons/;>Multimedia Commons: YFCC100M (99.2 million images and 0.8 million videos from Flickr) and supplemental material (pre-extracted features, additional annotations).
-For instructions on using these models, see https://mxnet.incubator.apache.org/tutorials/python/predict_imagenet.html;>the python tutorial on using pre-trained ImageNet models.
+For instructions on using these models, see https://mxnet.incubator.apache.org/tutorials/python/predict_image.html;>the python tutorial on using pre-trained ImageNet models.
@@ -364,12 +364,12 @@ ongoing project to collect complete models, with python scripts, pre-trained wei
 Recurrent Neural Networks (RNNs) including LSTMs¶
-MXNet supports many types of recurrent neural networks (RNNs), including Long Short-Term Memory (http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf;>LSTM)
+MXNet supports many types of recurrent neural networks (RNNs), including Long Short-Term Memory (http://www.bioinf.jku.at/publications/older/2604.pdf;>LSTM)
 and Gated Recurrent Units (GRU) networks. Some available datasets include:
-https://www.cis.upenn.edu/~treebank/;>Penn Treebank (PTB): Text corpus with ~1 million words. Vocabulary is limited to 10,000 words. The task is predicting downstream words/characters.
+https://catalog.ldc.upenn.edu/LDC95T7;>Penn Treebank (PTB): Text corpus with ~1 million words. Vocabulary is limited to 10,000 words. The task is predicting downstream words/characters.
 http://cs.stanford.edu/people/karpathy/char-rnn/;>Shakespeare: Complete text from Shakespeare’s works.
-https://s3.amazonaws.com/text-datasets;>IMDB reviews: 25,000 movie reviews, labeled as positive or negative
+https://getsatisfaction.com/imdb/topics/imdb-data-now-available-in-amazon-s3;>IMDB reviews: 25,000 movie reviews, labeled as positive or negative
 https://research.facebook.com/researchers/1543934539189348;>Facebook bAbI: As a set of 20 question answer tasks, each with 1,000 training examples.
 http://mscoco.org/;>Flickr8k, COCO: Images with associated caption (sentences). Flickr8k consists of 8,092 images captioned by AmazonTurkers with ~40,000 captions. COCO has 328,000 images, each with 5 captions. The COCO images also come with labeled objects using segmentation algorithms.
@@ -393,7 +393,7 @@ and Gated
[GitHub] piiswrong commented on issue #7393: add depthwise convolution's gpu version optimization
piiswrong commented on issue #7393: add depthwise convolution's gpu version optimization URL: https://github.com/apache/incubator-mxnet/pull/7393#issuecomment-322552001 Could you rebase to master and push again? Somehow test is failing This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] cheshirecats commented on issue #7481: [v0.11.0] Amalgamation for Javascript JS has unresolved symbol: __cxa_thread_atexit
cheshirecats commented on issue #7481: [v0.11.0] Amalgamation for Javascript JS has unresolved symbol: __cxa_thread_atexit URL: https://github.com/apache/incubator-mxnet/issues/7481#issuecomment-322547610 For now I am using #define DMLC_CXX11_THREAD_LOCAL 0 in amalgamation.py to solve it.
[GitHub] statist-bhfz opened a new issue #7483: MXNet - R API broken link
statist-bhfz opened a new issue #7483: MXNet - R API broken link URL: https://github.com/apache/incubator-mxnet/issues/7483 The "MXNet R Reference Manual" on http://www.mxnet.io/api/r/index.html actually contains the Julia reference.
[GitHub] sandeep-krishnamurthy opened a new pull request #7482: Adding developer keys for sandeep
sandeep-krishnamurthy opened a new pull request #7482: Adding developer keys for sandeep URL: https://github.com/apache/incubator-mxnet/pull/7482 @nswamy @lxn2
[GitHub] szha commented on issue #7455: Distributed training is slow
szha commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322544218 Thanks, @ptrendx. @madjam for more context.
[GitHub] kevinthesun opened a new pull request #9: Fix broken links
kevinthesun opened a new pull request #9: Fix broken links URL: https://github.com/apache/incubator-mxnet-site/pull/9
[GitHub] thirdwing commented on a change in pull request #7476: R-package RNN refactor
thirdwing commented on a change in pull request #7476: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7476#discussion_r133259264

## File path: R-package/R/rnn.graph.R ##
@@ -0,0 +1,123 @@
+library(mxnet)
+
+# RNN graph design
+rnn.graph <- function(num.rnn.layer,
+                      input.size,
+                      num.embed,
+                      num.hidden,
+                      num.label,
+                      dropout = 0,
+                      ignore_label = 0,
+                      init.state = NULL,
+                      config,
+                      cell.type="gru",
+                      masking = T,
+                      output_last_state = F) {
+
+  # define input arguments
+  label <- mx.symbol.Variable("label")
+  data <- mx.symbol.Variable("data")
+  seq.mask <- mx.symbol.Variable("seq.mask")
+
+  embed.weight <- mx.symbol.Variable("embed.weight")
+  rnn.params.weight <- mx.symbol.Variable("rnn.params.weight")
+  rnn.state.weight <- mx.symbol.Variable("rnn.state.weight")
+  if (cell.type == "lstm") rnn.state.cell.weight <- mx.symbol.Variable("rnn.state.cell.weight")
+  cls.weight <- mx.symbol.Variable("cls.weight")
+  cls.bias <- mx.symbol.Variable("cls.bias")
+
+  data <- mx.symbol.transpose(data=data)
+  # seq.mask <- mx.symbol.stop_gradient(seq.mask, name="seq.mask")
+
+  embed <- mx.symbol.Embedding(data=data, input_dim=input.size,
+                               weight=embed.weight, output_dim=num.embed, name="embed")
+
+  if (cell.type == "lstm") {
+    rnn <- mx.symbol.RNN(data=embed, state=rnn.state.weight, state_cell = rnn.state.cell.weight, parameters=rnn.params.weight, state.size=num.hidden, num.layers=num.rnn.layer, bidirectional=F, mode=cell.type, state.outputs=F, p=dropout, name=paste(cell.type, num.rnn.layer, "layer", sep="_"))
+
+  } else {
+    rnn <- mx.symbol.RNN(data=embed, state=rnn.state.weight, parameters=rnn.params.weight, state.size=num.hidden, num.layers=num.rnn.layer, bidirectional=F, mode=cell.type, state.outputs=F, p=dropout, name=paste(cell.type, num.rnn.layer, "layer", sep="_"))
+  }
+
+  if (config=="seq-to-one") {
+
+    if (masking) mask <- mx.symbol.SequenceLast(data=rnn[[1]], use.sequence.length = T, sequence_length = seq.mask, name = "mask") else
+      mask <- mx.symbol.identity(data = rnn[[1]], name = "mask")
+
+    fc <- mx.symbol.FullyConnected(data=mask,
+                                   weight=cls.weight,
+                                   bias=cls.bias,
+                                   num.hidden=num.label,
+                                   name = "decode")
+
+    loss <- mx.symbol.SoftmaxOutput(data=fc, name="sm", label=label, ignore_label=ignore_label)
+
+  } else if (config=="one-to-one"){
+
+    if (masking) mask <- mx.symbol.SequenceMask(data = rnn[[1]], use.sequence.length = T, sequence_length = seq.mask, name = "mask") else
+      mask <- mx.symbol.identity(data = rnn[[1]], name = "mask")
+
+    reshape = mx.symbol.reshape(mask, shape=c(num.hidden, -1))
+
+    fc <- mx.symbol.FullyConnected(data=reshape,
+                                   weight=cls.weight,
+                                   bias=cls.bias,
+                                   num.hidden=num.label,
+                                   name = "decode")
+
+    label <- mx.symbol.reshape(data=label, shape=c(-1))
+    loss <- mx.symbol.SoftmaxOutput(data=fc, name="sm", label=label, ignore_label=ignore_label)
+
+  }
+
+  if (output_last_state){
+    # group <- mx.symbol.Group(c(unlist(last.states), loss))
+    # return(group)
+    return(loss)
+  } else return(loss)
+}
+
+
+
+# data <- mx.symbol.Variable("data")

Review comment: Can we remove the testing code?
[GitHub] cheshirecats opened a new issue #7481: [v0.11.0] Amalgamation for Javascript JS has unresolved symbol: __cxa_thread_atexit
cheshirecats opened a new issue #7481: [v0.11.0] Amalgamation for Javascript JS has unresolved symbol: __cxa_thread_atexit URL: https://github.com/apache/incubator-mxnet/issues/7481 The amalgamation for Javascript in mxnet v0.9.2 worked fine. However, with the latest 0.11.0 version, I got this error while running "make clean libmxnet_predict.js MIN=1": `warning: unresolved symbol: __cxa_thread_atexit` and the compiled js code crashed upon loading (with the same unresolved symbol error). It seems this may have something to do with C++11 feature support in emcc.
[GitHub] thirdwing commented on a change in pull request #7476: R-package RNN refactor
thirdwing commented on a change in pull request #7476: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7476#discussion_r133259192

## File path: R-package/R/rnn.infer.R ##
@@ -0,0 +1,77 @@
+library(mxnet)
+
+source("rnn.R")

Review comment: Please remove these two lines.
[GitHub] thirdwing commented on a change in pull request #7476: R-package RNN refactor
thirdwing commented on a change in pull request #7476: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7476#discussion_r133259025

## File path: R-package/R/rnn.graph.R ##
@@ -0,0 +1,123 @@
+library(mxnet)

Review comment: Please remove "library(mxnet)".
[GitHub] piiswrong closed pull request #7480: Fix more broken links
piiswrong closed pull request #7480: Fix more broken links URL: https://github.com/apache/incubator-mxnet/pull/7480
[GitHub] thirdwing commented on a change in pull request #7476: R-package RNN refactor
thirdwing commented on a change in pull request #7476: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7476#discussion_r133258978

## File path: R-package/R/rnn.R ##
@@ -1,342 +1,101 @@
-# rnn cell symbol
-rnn <- function(num.hidden, indata, prev.state, param, seqidx,
-                layeridx, dropout=0., batch.norm=FALSE) {
-    if (dropout > 0. )
-        indata <- mx.symbol.Dropout(data=indata, p=dropout)
-    i2h <- mx.symbol.FullyConnected(data=indata,
-                                    weight=param$i2h.weight,
-                                    bias=param$i2h.bias,
-                                    num.hidden=num.hidden,
-                                    name=paste0("t", seqidx, ".l", layeridx, ".i2h"))
-    h2h <- mx.symbol.FullyConnected(data=prev.state$h,
-                                    weight=param$h2h.weight,
-                                    bias=param$h2h.bias,
-                                    num.hidden=num.hidden,
-                                    name=paste0("t", seqidx, ".l", layeridx, ".h2h"))
-    hidden <- i2h + h2h
+library(mxnet)

Review comment: Please remove "library(mxnet)".
[GitHub] thirdwing commented on a change in pull request #7476: R-package RNN refactor
thirdwing commented on a change in pull request #7476: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7476#discussion_r133258942

## File path: R-package/R/mx.io.bucket.iter.R ##
@@ -0,0 +1,110 @@
+library(mxnet)

Review comment: Please remove "library(mxnet)".
[GitHub] sandeep-krishnamurthy opened a new pull request #7480: Fix more broken links
sandeep-krishnamurthy opened a new pull request #7480: Fix more broken links URL: https://github.com/apache/incubator-mxnet/pull/7480 @kevinthesun @lxn2
[GitHub] larroy commented on a change in pull request #7416: update submoules with android fixes
larroy commented on a change in pull request #7416: update submoules with android fixes URL: https://github.com/apache/incubator-mxnet/pull/7416#discussion_r133256329

## File path: src/operator/c_lapack_api.h ##
@@ -73,8 +73,13 @@ using namespace mshadow;
 extern "C" {
   // Fortran signatures
-  #define MXNET_LAPACK_FSIGNATURE1(func, dtype) \
-    void func##_(char *uplo, int *n, dtype *a, int *lda, int *info);
+  #ifdef __ANDROID__
+    #define MXNET_LAPACK_FSIGNATURE1(func, dtype) \
+      int func##_(char* uplo, int* n, dtype* a, int* lda, int *info);

Review comment: https://github.com/xianyi/OpenBLAS/blob/develop/common_lapack.h#L43
[GitHub] thirdwing commented on issue #7461: [R] update tutorial link. close #6536
thirdwing commented on issue #7461: [R] update tutorial link. close #6536 URL: https://github.com/apache/incubator-mxnet/pull/7461#issuecomment-322538191 @sandeep-krishnamurthy Please have a look at this.
[GitHub] thirdwing commented on issue #7461: [R] update tutorial link. close #6536
thirdwing commented on issue #7461: [R] update tutorial link. close #6536 URL: https://github.com/apache/incubator-mxnet/pull/7461#issuecomment-322538068 This will add an index page at http://mxnet.io/tutorials/r/index.html. All the R tutorials are already on our website (http://mxnet.io/tutorials/r/), but there are no links to them. ![screenshot from 2017-08-15 10-44-53](https://user-images.githubusercontent.com/1547093/29328282-e4909e8a-81a6-11e7-821c-01025d649fc6.png)
[GitHub] piiswrong closed pull request #7478: fix autograd memory cost
piiswrong closed pull request #7478: fix autograd memory cost URL: https://github.com/apache/incubator-mxnet/pull/7478
[GitHub] piiswrong opened a new pull request #7479: Fix autograd memory
piiswrong opened a new pull request #7479: Fix autograd memory URL: https://github.com/apache/incubator-mxnet/pull/7479
[GitHub] piiswrong commented on a change in pull request #7416: update submoules with android fixes
piiswrong commented on a change in pull request #7416: update submoules with android fixes URL: https://github.com/apache/incubator-mxnet/pull/7416#discussion_r133249270

## File path: src/operator/c_lapack_api.h ##
@@ -73,8 +73,13 @@ using namespace mshadow;
 extern "C" {
   // Fortran signatures
-  #define MXNET_LAPACK_FSIGNATURE1(func, dtype) \
-    void func##_(char *uplo, int *n, dtype *a, int *lda, int *info);
+  #ifdef __ANDROID__
+    #define MXNET_LAPACK_FSIGNATURE1(func, dtype) \
+      int func##_(char* uplo, int* n, dtype* a, int* lda, int *info);

Review comment: Why does it need to be int?
[GitHub] piiswrong opened a new pull request #7478: fix autograd memory cost
piiswrong opened a new pull request #7478: fix autograd memory cost URL: https://github.com/apache/incubator-mxnet/pull/7478
[GitHub] piiswrong closed pull request #7477: fix autograd memory cost
piiswrong closed pull request #7477: fix autograd memory cost URL: https://github.com/apache/incubator-mxnet/pull/7477
[GitHub] thomasmooon opened a new issue #7475: Paradox VRAM demand as a function of batch size: Low batch size, high VRAM demand
thomasmooon opened a new issue #7475: Paradox VRAM demand as a function of batch size: Low batch size, high VRAM demand URL: https://github.com/apache/incubator-mxnet/issues/7475

Dear community,

I'm running MXNet in the environment described below, on the following hardware:

- 32 GB RAM
- 8 x NVIDIA 1080 Ti, 8 GB VRAM
- 12 CPUs

MXNet is used to train a stacked denoising autoencoder that reconstructs an array of dimension about 62,000 x 100. I was looking for the batch size with minimum VRAM demand, so I measured the VRAM demand per card for each combination of "number of GPUs" and "array.batch.size". The results are shown in the attached table: columns indicate array.batch.size, rows indicate the number of GPUs, and each entry is the GPU memory usage per card.

![paradox_vram_batchsize](https://user-images.githubusercontent.com/29228225/29312390-5b2b6c60-81b5-11e7-9258-f877551427e1.jpg)

As one can see, VRAM demand rises when array.batch.size is below or above a particular value. With an array.batch.size of 12, computation crashes due to a "cuda memory allocation error" (red). An array.batch.size of 128 (2 and 4 GPUs) or 256 (6 GPUs) results in a minimum VRAM demand of about 3 GB per card (green).

My question is: **Why does VRAM hunger rise below a particular array.batch.size value?**

## Environment info

Operating System: REDHAT 7.3
Package used (Python/R/Scala/Julia): R
MXNet version: 0.1.0.1

R `sessionInfo()`:
R version 3.4.0 (2017-04-21)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.3 (Maipo)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=en_GB.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0
[GitHub] wanderingpj opened a new issue #7474: Training error when using cifar100.
wanderingpj opened a new issue #7474: Training error when using cifar100. URL: https://github.com/apache/incubator-mxnet/issues/7474 I trained the network given in https://github.com/dmlc/mxnet-notebooks/blob/master/python/moved-from-mxnet/cifar-100.ipynb on cifar100, but I get a training error like this: ![qq 20170815175608](https://user-images.githubusercontent.com/24189081/29311528-df0311c6-81e4-11e7-9613-ff7c055b7a12.png) What's the problem? @aileli
[GitHub] wanderingpj closed issue #7473: Training error when using cifar100.
wanderingpj closed issue #7473: Training error when using cifar100. URL: https://github.com/apache/incubator-mxnet/issues/7473
[GitHub] wanderingpj opened a new issue #7473: Training error when using cifar100.
wanderingpj opened a new issue #7473: Training error when using cifar100. URL: https://github.com/apache/incubator-mxnet/issues/7473 I trained the network given in https://github.com/dmlc/mxnet-notebooks/blob/master/python/moved-from-mxnet/cifar-100.ipynb on cifar100, but I get a training error like this: file:///C:/Users/Administrator/Desktop/QQ%E6%88%AA%E5%9B%BE20170815173949.png What's the problem? @aileli
[GitHub] starimpact commented on issue #7455: Distributed training is slow
starimpact commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322418256 I am using mxnet 0.8.0, haha... I noticed that your "one server" run is actually local, because you set "kvstore=device", so the kvstore uses the GPUs to update the parameters. Your "two server" run is the real distributed mode: in a "dist ..." mode, the kvstore uses the CPU to update the parameters. So the slowdown you see is normal.
[GitHub] squidszyd commented on issue #7427: how to set dataiter with multi data?
squidszyd commented on issue #7427: how to set dataiter with multi data? URL: https://github.com/apache/incubator-mxnet/issues/7427#issuecomment-322420119

Use collections.namedtuple:

    Batch = namedtuple('Batch', ['data', 'label'])

    def __iter__(self):
        ...
        yield Batch(data=[data1, data2, data3, ...], label=[...])

    @property
    def provide_data(self):
        return [('data1', shape1), ('data2', shape2), ('data3', shape3), ...]
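The fragment in the comment above can be fleshed out into a runnable sketch. This is only an illustration under stated assumptions: the class name `MultiInputIter`, its constructor arguments, and the `provide_label` property are hypothetical, and plain Python lists stand in for NDArrays so the example runs without MXNet installed. A real iterator would normally subclass MXNet's `DataIter` and yield NDArrays.

```python
from collections import namedtuple

# Same structure as in the comment: a list of inputs and a list of labels.
Batch = namedtuple('Batch', ['data', 'label'])


class MultiInputIter:
    """Hypothetical iterator that yields batches with several named inputs.

    Plain lists stand in for NDArrays; this is not MXNet's own DataIter.
    """

    def __init__(self, inputs, labels, batch_size):
        # inputs: dict mapping an input name to a list of samples
        self.inputs = inputs
        self.labels = labels
        self.batch_size = batch_size
        self.num_samples = len(labels)

    @property
    def provide_data(self):
        # One (name, shape) pair per input; the per-sample width is taken
        # from the first sample of each input.
        return [(name, (self.batch_size, len(samples[0])))
                for name, samples in self.inputs.items()]

    @property
    def provide_label(self):
        return [('label', (self.batch_size,))]

    def __iter__(self):
        # Skip a trailing incomplete batch so every batch matches provide_data.
        last_start = self.num_samples - self.batch_size
        for start in range(0, last_start + 1, self.batch_size):
            end = start + self.batch_size
            yield Batch(
                data=[samples[start:end] for samples in self.inputs.values()],
                label=[self.labels[start:end]],
            )


it = MultiInputIter(
    inputs={'data1': [[1, 2], [3, 4], [5, 6], [7, 8]],
            'data2': [[0.1], [0.2], [0.3], [0.4]]},
    labels=[0, 1, 0, 1],
    batch_size=2,
)
batches = list(it)
```

The order of the entries in `provide_data` matches the order of the arrays in each batch's `data` list, which is the point of the suggestion: each input is bound by name to one slot of the batch.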
[GitHub] kurt-o-sys opened a new issue #7472: continuously train rnn - training data stream?
kurt-o-sys opened a new issue #7472: continuously train rnn - training data stream? URL: https://github.com/apache/incubator-mxnet/issues/7472

## Question

Usually, a neural network is trained using a training, validation and test set. Given a continuous series of data, with an event (new training data) occurring every 1-5 seconds, is it possible to continuously train (update) a recurrent neural network using mxnet? I don't need to reuse previous (training) data points: I just want to update the weights slightly(!) on each new event.

It's for a behaviour/game-like system: depending on the (expressed/intentional) behaviour of the players (the features), the output of the system should be estimated and continuously adapted (for further processing). The system has to learn on the way and be able to cope, to a certain extent, with changing player behaviour, and it needs to remember certain patterns from weeks and, if possible, months ago. (It'd probably be mainly an LSTM.)

Storing all data and retraining the system on that data is close to impossible because:

1. I estimate there's about 10-100GB of data per day (will be varying)
2. retraining every, let's say, 10 seconds on all existing data would take too long.

I want a system that continuously trains itself on the real data, without splitting it into training/testing/validation sets:

1. The training set is the real data, comparing the actual state of the system with the prediction previously made
2. There's no validation, besides the fact that the system validates itself
3. Testing is done on every new event. The predictive power will be continuously determined.

Can this be done with mxnet, having a training data stream?

## Environment info

This is not really relevant, but well, I don't mind providing it :)

Operating System:
```
$ uname -ar
Linux flipflap 4.4.0-57-generic #78-Ubuntu SMP Fri Dec 9 23:50:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
```
Compiler: ?
Package used (Python/R/Scala/Julia): R
MXNet version:
```
> packageVersion("mxnet")
[1] ‘0.10.1’
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=de_BE.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_BE.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_BE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] mxnet_0.10.1 httr_1.2.1 jsonlite_1.5

loaded via a namespace (and not attached):
[1] Rcpp_0.12.12 compiler_3.4.1 RColorBrewer_1.1-2 influenceR_0.1.0 plyr_1.8.4 bindr_0.1 viridis_0.4.0
[8] tools_3.4.1 digest_0.6.12 tibble_1.3.3 gtable_0.2.0 viridisLite_0.2.0 rgexf_0.15.3 pkgconfig_2.0.1
[15] rlang_0.1.1 igraph_1.1.2 rstudioapi_0.6 curl_2.4 bindrcpp_0.2 gridExtra_2.2.1 stringr_1.2.0
[22] DiagrammeR_0.9.0 dplyr_0.7.2 htmlwidgets_0.9 grid_3.4.1 glue_1.1.1 R6_2.2.2 Rook_1.1-1
[29] XML_3.98-1.9 ggplot2_2.2.1 magrittr_1.5 codetools_0.2-15 scales_0.4.1 htmltools_0.3.6 assertthat_0.1
[36] colorspace_1.3-2 brew_1.0-6 stringi_1.1.5 visNetwork_2.0.1 lazyeval_0.2.0 munsell_0.4.3
```
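The update pattern the question describes (predict on each new event, compare with the observed outcome, nudge the weights slightly, discard the event) can be sketched in a few lines. This is a minimal illustration in plain Python, not an answer from the thread and not MXNet code: a linear model stands in for the LSTM, and `OnlineLinearModel`, `event_stream` and the synthetic target are all hypothetical names and data invented for the example. Whether and how the MXNet R API supports this is not established here.

```python
import random


class OnlineLinearModel:
    """Stand-in for a network whose weights are nudged on every event."""

    def __init__(self, num_features, learning_rate=0.01):
        self.weights = [0.0] * num_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

    def update(self, features, target):
        # Predict first ("testing on every new event"), then take one small
        # gradient step on the squared error and return that error.
        error = self.predict(features) - target
        for i, x in enumerate(features):
            self.weights[i] -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        return error * error


def event_stream(num_events, seed=0):
    """Hypothetical event source: target = 2*x0 - x1 plus a little noise."""
    rng = random.Random(seed)
    for _ in range(num_events):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, 2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.01)


model = OnlineLinearModel(num_features=2, learning_rate=0.1)
# Each event is seen exactly once and never stored.
losses = [model.update(x, y) for x, y in event_stream(2000)]
early = sum(losses[:100]) / 100
late = sum(losses[-100:]) / 100
```

Comparing `early` with `late` gives the running measure of predictive power the question asks for: the loss recorded for each event is computed before the model has trained on it.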
[GitHub] cuteding closed issue #7471: Why are resnet's RELU and BN set before CONV?
cuteding closed issue #7471: Why are resnet's RELU and BN set before CONV? URL: https://github.com/apache/incubator-mxnet/issues/7471
[GitHub] thirdwing commented on issue #7470: R-package RNN refactor
thirdwing commented on issue #7470: R-package RNN refactor URL: https://github.com/apache/incubator-mxnet/pull/7470#issuecomment-322392069 Thank you for this. I suggest you not update submodules in this PR.
[GitHub] leoxiaobin commented on issue #7455: Distributed training is slow
leoxiaobin commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322390365 @starimpact, I tried using 4 servers per machine and got almost the same result.
[GitHub] leoxiaobin commented on issue #7455: Distributed training is slow
leoxiaobin commented on issue #7455: Distributed training is slow URL: https://github.com/apache/incubator-mxnet/issues/7455#issuecomment-322390219 @szha, every server has 8 Titan Xp GPUs and 2 Intel Xeon E5-2650 v2 CPUs @ 2.60GHz. The two servers are connected with IB cards. The test uses the --benchmark = 1 configuration, so there is no disk I/O.