The cqlsh COPY command will not work while any node is down, because the underlying driver attempts to open a connection to every node in the cluster, not just the coordinator you specify. To complete the export you have two options: 1) keep the DOWN node out of the connection pool (see the sketch below), or 2) bring the node back to UP/NORMAL status first.
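For option 1, a possible workaround (a minimal sketch, not something tested in this thread) is to bypass cqlsh COPY and export the rows with the DataStax Python driver, pinning the session to the live nodes with a whitelist load-balancing policy so no pool is ever built for 10.0.0.47. This assumes the 3.x driver API (pip install cassandra-driver); the host addresses, keyspace/table, and column names come from Dmitry's message, while the output path and CSV layout are illustrative assumptions:

import csv

from cassandra.cluster import Cluster
from cassandra.policies import WhiteListRoundRobinPolicy

# Live nodes taken from the nodetool status output below.
LIVE_NODES = ['10.0.0.82', '10.0.0.154', '10.0.0.76', '10.0.0.94']

# The whitelist policy marks every host outside LIVE_NODES as IGNORED,
# so the driver never creates a connection pool for the down node.
cluster = Cluster(
    contact_points=LIVE_NODES,
    load_balancing_policy=WhiteListRoundRobinPolicy(LIVE_NODES),
)
session = cluster.connect()

with open('backup/X.Y.csv', 'w', newline='') as f:  # hypothetical path
    writer = csv.writer(f)
    # Results are paged automatically (fetch_size defaults to 5000),
    # so this also works for tables larger than memory.
    for row in session.execute('SELECT key, column1, value FROM X.Y'):
        writer.writerow([row.key, row.column1, row.value])

cluster.shutdown()

With RF=5 and four replicas still up, reads at consistency ONE or even QUORUM should succeed, so the export itself is not blocked by the missing replica.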
On Mon, Jul 2, 2018 at 9:15 AM, Anup Shirolkar <anup.shirol...@instaclustr.com> wrote:

> Hi,
>
> The error shows that the cqlsh connection to the down node failed, so you
> should debug why that happened.
>
> Although you pointed cqlsh at another node ('10.0.0.154'), my guess is
> that the down node was still present in the connection pool and was
> therefore attempted for connection.
>
> Ideally, the availability of data should not be hampered by the
> unavailability of one replica out of 5. Note also that the stack trace is
> about a 'cqlsh' connection error, not about missing data.
>
> I think once you get your connection sorted out, the COPY should work as
> usual.
>
> Regards,
> Anup
>
> On 30 June 2018 at 15:05, Dmitry Simonov <dimmobor...@gmail.com> wrote:
>
>> Hello!
>>
>> I have a Cassandra cluster with 5 nodes.
>> There is a (relatively small) keyspace X with RF=5.
>> One node goes down.
>>
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>> UN  10.0.0.82   253.64 MB  256     100.0%            839bef9d-79af-422c-a21f-33bdcf4493c1  rack1
>> UN  10.0.0.154  255.92 MB  256     100.0%            ce23f3a7-67d2-47c0-9ece-7a5dd67c4105  rack1
>> UN  10.0.0.76   461.26 MB  256     100.0%            c8e18603-0ede-43f0-b713-3ff47ad92323  rack1
>> UN  10.0.0.94   575.78 MB  256     100.0%            9a324dbc-5ae1-4788-80e4-d86dcaae5a4c  rack1
>> DN  10.0.0.47   ?          256     100.0%            7b628ca2-4e47-457a-ba42-5191f7e5374b  rack1
>>
>> I try to export some data using COPY TO, but it fails after long retries.
>> Why does it fail? How can I make a copy? There must be 4 copies of each
>> row on the other (alive) replicas.
>>
>> cqlsh 10.0.0.154 -e "COPY X.Y TO 'backup/X.Y' WITH NUMPROCESSES=1"
>>
>> Using 1 child processes
>>
>> Starting copy of X.Y with columns [key, column1, value].
>> 2018-06-29 19:12:23,661 Failed to create connection pool for new host 10.0.0.47:
>> Traceback (most recent call last):
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/cluster.py", line 2476, in run_add_or_renew_pool
>>     new_pool = HostConnection(host, distance, self)
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/pool.py", line 332, in __init__
>>     self._connection = session.cluster.connection_factory(host.address)
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/cluster.py", line 1205, in connection_factory
>>     return self.connection_class.factory(address, self.connect_timeout, *args, **kwargs)
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/connection.py", line 332, in factory
>>     conn = cls(host, *args, **kwargs)
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/io/asyncorereactor.py", line 344, in __init__
>>     self._connect_socket()
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/connection.py", line 371, in _connect_socket
>>     raise socket.error(sockerr.errno, "Tried connecting to %s. Last error: %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
>> OSError: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:12:23,665 Host 10.0.0.47 has been marked down
>> 2018-06-29 19:12:29,674 Error attempting to reconnect to 10.0.0.47, scheduling retry in 2.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:12:36,684 Error attempting to reconnect to 10.0.0.47, scheduling retry in 4.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:12:45,696 Error attempting to reconnect to 10.0.0.47, scheduling retry in 8.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:12:58,716 Error attempting to reconnect to 10.0.0.47, scheduling retry in 16.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:13:19,756 Error attempting to reconnect to 10.0.0.47, scheduling retry in 32.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:13:56,834 Error attempting to reconnect to 10.0.0.47, scheduling retry in 64.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:15:05,887 Error attempting to reconnect to 10.0.0.47, scheduling retry in 128.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:17:18,982 Error attempting to reconnect to 10.0.0.47, scheduling retry in 256.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:21:40,064 Error attempting to reconnect to 10.0.0.47, scheduling retry in 512.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
>> <stdin>:1:(4, 'Interrupted system call')
>> IOError:
>> IOError:
>> IOError:
>> IOError:
>> IOError:
>>
>> --
>> Best Regards,
>> Dmitry Simonov
>
> --
> Anup Shirolkar
> Consultant, Instaclustr
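Following up on Anup's point that COPY should work once the connection issue is sorted out: before re-running COPY, it can help to confirm from the client's side that every host is reachable, rather than rediscovering the down node mid-export. A minimal sketch with the same Python driver (the contact point is taken from the thread; host state may take a moment to settle after connect):

from cassandra.cluster import Cluster

cluster = Cluster(['10.0.0.154'])
cluster.connect()  # populates cluster.metadata with peer state

# is_up may be None for hosts the driver has not finished probing,
# so treat anything that is not True as suspect.
down = [h.address for h in cluster.metadata.all_hosts() if not h.is_up]
cluster.shutdown()

if down:
    print('COPY will likely stall; unreachable hosts:', down)
else:
    print('All hosts up; COPY should proceed normally.')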