[ 
https://issues.apache.org/jira/browse/CASSANDRA-16538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yolanda Tang updated CASSANDRA-16538:
-------------------------------------
    Description: 
Hi,

 

When switching to Cassandra Medusa for our node data restores, we 
encountered some issues.

When running the command remotely with pssh we get a timeout. Trying the 
command against one Cassandra node, we get:

 
{code:java}
pssh -H XXXX medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
[1] 06:52:08 [FAILURE] sha8392 Timed out, Killed by signal 9
{code}
Looking further into the timeout issue, we get the following logs:
{code:java}
[2021-03-25 02:23:50,113] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/schema.cql?Version=2006-03-01 HTTP/1.1" 200 24005
[2021-03-25 02:23:50,114] DEBUG: [Storage] Getting object sre_dev_cass_sha/10.44.79.15/2021031803/meta/tokenmap.json
[2021-03-25 02:23:50,151] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX HTTP/1.1" 200 0
[2021-03-25 02:23:50,201] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX/XX/10.44.79.15/2021031803/meta/tokenmap.json HTTP/1.1" 200 0
[2021-03-25 02:23:50,202] DEBUG: Downloading /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a/medusa-restore-197b6c82-4cd5-4c5b-b3c2-9d98863c1b3f as single part
[2021-03-25 02:23:50,254] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/tokenmap.json?Version=2006-03-01 HTTP/1.1" 200 1535
[2021-03-25 02:23:50,255] INFO: Stopping Cassandra
+ /usr/bin/nodetool u cassandra -pw if9te8ohKei9xaep drain
+ /usr/bin/nodetool -u cassandra -pw if9te8ohKei9xaep drain
error: null
-- StackTrace --
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:267)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:222)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
    at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
    at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
    at com.sun.proxy.$Proxy8.drain(Unknown Source)
    at org.apache.cassandra.tools.NodeProbe.drain(NodeProbe.java:371)
    at org.apache.cassandra.tools.nodetool.Drain.execute(Drain.java:36)
    at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:244)
    at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:158)
+ ls -l /var/run/cassandra/cassandra.pid
ls: cannot access /var/run/cassandra/cassandra.pid: No such file or directory
+ sleep 10
+ echo -n 'Shutdown Cassandra: '
Shutdown Cassandra: ++ cat /var/run/cassandra/cassandra.pid
cat: /var/run/cassandra/cassandra.pid: No such file or directory
+ su cassandra -c 'kill '
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
++ seq 40
+ for t in '`seq 40`'
+ /etc/init.d/cassandra status
+ break
+ sleep 5
+ echo OK
OK
{code}
However, the command runs successfully on one node when invoked as follows:
{code:java}
export LC_ALL=en_US.UTF-8; export LANG=en_US.UTF-8; export https_proxy=http://proxy.XX:3128 ; export PATH=$PATH:/usr/share/cassandra-medusa/bin; sudo su; mkdir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a; cd /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a;
medusa-wrapper sudo medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
{code}
We are running the command on:
{code:java}
uname -a
Linux XXXX 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
{code}
Could you please have a look at the issue?

Thanks

  was:
Hi,

 

When switching to Cassandra Medusa for our node data restores, we 
encountered some issues.

When running the command remotely with pssh we get a timeout. Trying the 
command against one Cassandra node, we get:

 
{code:java}
pssh -H XXXX medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
[1] 06:52:08 [FAILURE] sha8392 Timed out, Killed by signal 9
{code}
Looking further into the timeout issue, we get the following logs:
{code:java}
[2021-03-25 02:23:50,113] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/schema.cql?Version=2006-03-01 HTTP/1.1" 200 24005
[2021-03-25 02:23:50,114] DEBUG: [Storage] Getting object sre_dev_cass_sha/10.44.79.15/2021031803/meta/tokenmap.json
[2021-03-25 02:23:50,151] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX HTTP/1.1" 200 0
[2021-03-25 02:23:50,201] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX/XX/10.44.79.15/2021031803/meta/tokenmap.json HTTP/1.1" 200 0
[2021-03-25 02:23:50,202] DEBUG: Downloading /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a/medusa-restore-197b6c82-4cd5-4c5b-b3c2-9d98863c1b3f as single part
[2021-03-25 02:23:50,254] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/tokenmap.json?Version=2006-03-01 HTTP/1.1" 200 1535
[2021-03-25 02:23:50,255] INFO: Stopping Cassandra
+ /usr/bin/nodetool u cassandra -pw if9te8ohKei9xaep drain
+ /usr/bin/nodetool -u cassandra -pw if9te8ohKei9xaep drain
error: null
-- StackTrace --
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:267)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:222)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
    at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
    at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
    at com.sun.proxy.$Proxy8.drain(Unknown Source)
    at org.apache.cassandra.tools.NodeProbe.drain(NodeProbe.java:371)
    at org.apache.cassandra.tools.nodetool.Drain.execute(Drain.java:36)
    at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:244)
    at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:158)
+ ls -l /var/run/cassandra/cassandra.pid
ls: cannot access /var/run/cassandra/cassandra.pid: No such file or directory
+ sleep 10
+ echo -n 'Shutdown Cassandra: '
Shutdown Cassandra: ++ cat /var/run/cassandra/cassandra.pid
cat: /var/run/cassandra/cassandra.pid: No such file or directory
+ su cassandra -c 'kill '
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
++ seq 40
+ for t in '`seq 40`'
+ /etc/init.d/cassandra status
+ break
+ sleep 5
+ echo OK
OK
{code}

However, the command runs successfully on one node when invoked as follows:
{code:java}
export LC_ALL=en_US.UTF-8; export LANG=en_US.UTF-8; export https_proxy=http://proxy.XX:3128 ; export PATH=$PATH:/usr/share/cassandra-medusa/bin; sudo su; mkdir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a; cd /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a;
medusa-wrapper sudo medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
{code}
We are running the command on:
{code:java}
uname -a
Linux sha8392 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
{code}
Could you please have a look at the issue?

Thanks


> Cannot run restore for a list of Cassandra nodes
> ------------------------------------------------
>
>                 Key: CASSANDRA-16538
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16538
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Yolanda Tang
>            Priority: Normal
>
> Hi,
>  
> When switching to Cassandra Medusa for our node data restores, we 
> encountered some issues.
> When running the command remotely with pssh we get a timeout. Trying the 
> command against one Cassandra node, we get:
>  
> {code:java}
> pssh -H XXXX medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
> [1] 06:52:08 [FAILURE] sha8392 Timed out, Killed by signal 9
> {code}
> Looking further into the timeout issue, we get the following logs:
> {code:java}
> [2021-03-25 02:23:50,113] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/schema.cql?Version=2006-03-01 HTTP/1.1" 200 24005
> [2021-03-25 02:23:50,114] DEBUG: [Storage] Getting object sre_dev_cass_sha/10.44.79.15/2021031803/meta/tokenmap.json
> [2021-03-25 02:23:50,151] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX HTTP/1.1" 200 0
> [2021-03-25 02:23:50,201] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX/XX/10.44.79.15/2021031803/meta/tokenmap.json HTTP/1.1" 200 0
> [2021-03-25 02:23:50,202] DEBUG: Downloading /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a/medusa-restore-197b6c82-4cd5-4c5b-b3c2-9d98863c1b3f as single part
> [2021-03-25 02:23:50,254] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/tokenmap.json?Version=2006-03-01 HTTP/1.1" 200 1535
> [2021-03-25 02:23:50,255] INFO: Stopping Cassandra
> + /usr/bin/nodetool u cassandra -pw if9te8ohKei9xaep drain
> + /usr/bin/nodetool -u cassandra -pw if9te8ohKei9xaep drain
> error: null
> -- StackTrace --
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:267)
>     at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:222)
>     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
>     at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
>     at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
>     at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
>     at com.sun.proxy.$Proxy8.drain(Unknown Source)
>     at org.apache.cassandra.tools.NodeProbe.drain(NodeProbe.java:371)
>     at org.apache.cassandra.tools.nodetool.Drain.execute(Drain.java:36)
>     at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:244)
>     at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:158)
> + ls -l /var/run/cassandra/cassandra.pid
> ls: cannot access /var/run/cassandra/cassandra.pid: No such file or directory
> + sleep 10
> + echo -n 'Shutdown Cassandra: '
> Shutdown Cassandra: ++ cat /var/run/cassandra/cassandra.pid
> cat: /var/run/cassandra/cassandra.pid: No such file or directory
> + su cassandra -c 'kill '
> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
> ++ seq 40
> + for t in '`seq 40`'
> + /etc/init.d/cassandra status
> + break
> + sleep 5
> + echo OK
> OK
> {code}
> However, the command runs successfully on one node when invoked as follows:
> {code:java}
> export LC_ALL=en_US.UTF-8; export LANG=en_US.UTF-8; export https_proxy=http://proxy.XX:3128 ; export PATH=$PATH:/usr/share/cassandra-medusa/bin; sudo su; mkdir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a; cd /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a;
> medusa-wrapper sudo medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
> {code}
> We are running the command on:
> {code:java}
> uname -a
> Linux XXXX 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> {code}
> Could you please have a look at the issue?
> Thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

