Hi Jeongyoon,
This looks like a user limit problem. Can you check whether "max locked
memory" is set to a high enough value or to unlimited (run "ulimit -l")? You
can set memlock in /etc/security/limits.conf, e.g.:
* soft memlock unlimited
* hard memlock unlimited
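
If it helps, here is a minimal sketch for checking the same limit from a script. It assumes Python 3 on Linux and uses only the standard "resource" module; the helper names describe_limit and memlock_limits are made up for illustration and are not part of Crail or DiSNI. It reports the soft and hard memlock limits in KiB, the unit "ulimit -l" prints.

```python
import resource


def describe_limit(value):
    """Render an rlimit value (bytes) in KiB, the unit `ulimit -l` prints."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return describe_limit(soft), describe_limit(hard)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"memlock soft={soft} hard={hard}")
```

Run it as the same user, and from the same kind of session, that starts the storage server. Entries in limits.conf are normally applied by pam_limits at login, so you will most likely need to log in again before the new values show up.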
Regards,
Jonas
On Thu, 20 Jun 2019 16:40:59 +0900
Jeongyoon Eo <[email protected]> wrote:
Hi,
I'm trying the TeraSort example on Spark-Crail over RDMA, using the latest
incubator-crail, disni, crail-spark-io, and crail-spark-terasort from
https://github.com/zrlio.
I'm using two machines running Ubuntu 18.04, one for the CrailNameNode and
the other for the StorageServer.
When running start-crail.sh, the following error appears in the
StorageServer Crail log.
19/06/20 14:58:42 INFO crail: connected to namenode(s) /172.30.100.4:9060
Exception in thread "main" java.io.IOException: j2c::regMr: ibv_reg_mr failed: Cannot allocate memory
    at com.ibm.disni.verbs.impl.NativeDispatcher._regMr(Native Method)
    at com.ibm.disni.verbs.impl.NatRegMrCall.execute(NatRegMrCall.java:91)
    at com.ibm.disni.verbs.impl.NatRegMrCall.execute(NatRegMrCall.java:36)
    at org.apache.crail.storage.rdma.RdmaStorageServer.allocateResource(RdmaStorageServer.java:112)
    at org.apache.crail.storage.StorageServer.main(StorageServer.java:152)
When testing RDMA with C code, ibv_reg_mr succeeded, so I think there might
be some conflict between libdisni.so, which Crail uses (or other Crail
components?), and the underlying RDMA libraries.
Has anyone else run into this kind of "Cannot allocate memory" error?
If so, could you share how you resolved it?
Any other help would be great!
Thank you in advance.
- Jeongyoon