Re: Bootstraping is failing

2020-05-11 Thread Reid Pinchback
If you’re correct that the issue you linked to is the bug you are hitting, then 
it was fixed in 3.11.3.  You may have no choice but to upgrade.  From the 
discussion, it doesn’t read as if any tuning tweaks avoided the issue; only the 
patch fixed it.

If you do, I’d suggest going to at least 3.11.5.
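
A minimal sketch of checking whether a node's release already includes the fix, assuming GNU `sort -V` is available for version ordering. The helper name `version_ge` is hypothetical, and the commented `nodetool version` line assumes the usual `ReleaseVersion: x.y.z` output format.

```shell
# version_ge A B: succeeds (exit 0) when version A >= version B.
# Relies on GNU sort -V (version sort); hypothetical helper, not part of Cassandra.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The version reported in this thread; on a live node one could instead use
# something like: current=$(nodetool version | awk '{print $2}')
current="3.11.0"
fixed_in="3.11.3"   # release said in this thread to contain the fix

if version_ge "$current" "$fixed_in"; then
    echo "$current already includes the fix"
else
    echo "$current predates $fixed_in; upgrade needed"
fi
```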

Note that usable memory for a heap setting > 31 GB may not be what you think. At 
32 GB you cross a boundary where the JVM can no longer use compressed object 
pointers, so object references double in size.  The only way you really win is 
when an app has only a modest number of objects, but some of those objects have 
large non-object-granularity allocations, e.g. a few huge byte arrays.  C* does 
use some large buffers, but it also generates a lot of small objects.
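
The 32 GB boundary follows from how HotSpot compresses object references: a 32-bit reference scaled by the default 8-byte object alignment can address at most 2^32 times 8 bytes. Below is a rough sketch of that arithmetic, plus a commonly used way to ask the JVM what it would do at a given heap size. The function name `show_compressed_oops` is hypothetical, and the check assumes a HotSpot `java` on the PATH.

```shell
# Largest heap addressable with compressed (32-bit) object references,
# assuming HotSpot's default 8-byte object alignment.
alignment_bytes=8
max_compressed_heap_gib=$(( (1 << 32) * alignment_bytes / (1 << 30) ))
echo "compressed references cover up to ${max_compressed_heap_gib} GiB"

# Hypothetical helper: print whether the JVM would enable compressed
# references for a given -Xmx value, e.g. show_compressed_oops 31g
show_compressed_oops() {
    java -Xmx"$1" -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops
}
```

Running `show_compressed_oops 31g` and `show_compressed_oops 62g` on the node should show the flag going from true to false, which is why a 62 GB heap holds less than twice what a 31 GB heap does.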

I’d consider TCP tunings a likely red herring in this, if you are correct about 
the leak.  Doesn’t mean you can’t have better settings per suggestions made, 
just that it seems like it could be a case of refining behavior on the 
periphery of the problem, not anything directly addressing it.


Re: Bootstraping is failing

2020-05-09 Thread Surbhi Gupta
I tried increasing the heap size from 31 GB to 62 GB on the bootstrapping node
because I noticed that, about midway through bootstrapping, heap usage reached
around 90% or more and the node froze.
The behavior is still the same: it again reached midway, heap usage again
reached 90% or more, and the node froze. None of the nodetool commands
returned any output, and the other nodes removed this node from the joining
state because they could not gossip with it.
We are on 3.11.0.

I took a heap dump while the node was at 90%+ utilization of the 62 GB heap,
opened the leak report, and found 3 leak suspects. Two of the three were as
below:

1. The thread io.netty.util.concurrent.FastThreadLocalThread @ 0x7fbe9533bf98
StreamReceiveTask:26 keeps local variables with total size 16,898,023,552
(31.10%) bytes.
The memory is accumulated in one instance of
"io.netty.util.Recycler$DefaultHandle[]" loaded by
"sun.misc.Launcher$AppClassLoader @ 0x7fb917c76dc8".

2. The thread io.netty.util.concurrent.FastThreadLocalThread @ 0x7fbb846fb800
StreamReceiveTask:29 keeps local variables with total size 11,696,214,424
(21.53%) bytes.
The memory is accumulated in one instance of
"io.netty.util.Recycler$DefaultHandle[]" loaded by
"sun.misc.Launcher$AppClassLoader @ 0x7fb917c76dc8".
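
To put the two suspects in perspective, here is a quick sketch that adds up the sizes reported above. The variable names are illustrative; the byte counts are copied from the leak report.

```shell
# Retained sizes of the two StreamReceiveTask threads, in bytes (from the report).
suspect_26=16898023552
suspect_29=11696214424

total_bytes=$(( suspect_26 + suspect_29 ))
total_gib=$(awk -v b="$total_bytes" 'BEGIN { printf "%.1f", b / (1024 * 1024 * 1024) }')
echo "two suspects retain ${total_bytes} bytes (~${total_gib} GiB)"
```

Together that is 52.63% of the dump by the report's own percentages, all held in Recycler handle arrays.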

Am I getting hit by https://issues.apache.org/jira/browse/CASSANDRA-13929 ?

I haven't changed the TCP settings. My TCP settings are higher than the
recommended values; what I wanted to understand is how TCP settings can affect
the bootstrapping process.

Thanks
Surbhi

Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
When we start the node, it starts bootstrapping automatically and re-streams
all the data again.  It does not resume.

Re: Bootstraping is failing

2020-05-07 Thread Adam Scott
I think you want to run `nodetool bootstrap resume` (
https://cassandra.apache.org/doc/latest/tools/nodetool/bootstrap.html)  to
pick up where it last left off. Sorry for the late reply.


Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
So after a failed bootstrap, if we start Cassandra again on the new node,
will it resume the bootstrap or start over?

Re: Bootstraping is failing

2020-05-07 Thread Adam Scott
I recommend it on all nodes.  This will eliminate that as a source of
trouble further on down the road.


Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
streaming_socket_timeout_in_ms is set to 24 hours.
So should the TCP settings be changed on the new bootstrapping node only, or on
all nodes?


Re: Bootstraping is failing

2020-05-07 Thread Adam Scott
Edit /etc/sysctl.conf:

net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10

Then run sysctl -p to make the kernel reload the settings.

5 minutes (300 seconds) is probably too long.
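
As a rough sketch of what these three values mean together: an idle connection gets its first keepalive probe after `tcp_keepalive_time` seconds, and is declared dead after `tcp_keepalive_probes` unanswered probes sent `tcp_keepalive_intvl` seconds apart. The helper name `keepalive_detect_secs` is hypothetical, and the calculation only applies to sockets that have keepalive enabled.

```shell
# keepalive_detect_secs TIME INTVL PROBES
# Worst-case seconds for the kernel to declare an idle, unresponsive peer dead:
# idle time before the first probe plus one interval per unanswered probe.
keepalive_detect_secs() {
    echo $(( $1 + $2 * $3 ))
}

# Values reported in this thread (time=300, intvl=30, probes=9).
echo "current:  $(keepalive_detect_secs 300 30 9) s"
# Values suggested in this thread (time=60, intvl=10, probes=3).
echo "proposed: $(keepalive_detect_secs 60 10 3) s"

# To read the live values on a Linux host (same data as the /proc files):
#   sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes
```

With the suggested values the first probe goes out after 60 seconds of idleness rather than 300, which may matter if a firewall between nodes drops connections that stay idle for less than five minutes.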

On Thu, May 7, 2020 at 1:09 PM Surbhi Gupta 
wrote:

> [root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_time
>
> 300
>
> [root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
>
> 30
>
> [root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_probes
>
> 9
>
> On Thu, 7 May 2020 at 12:32, Adam Scott wrote:
>
>> Maybe a firewall killing a connection?
>>
>> What does the following show?
>> cat /proc/sys/net/ipv4/tcp_keepalive_time
>> cat /proc/sys/net/ipv4/tcp_keepalive_intvl
>> cat /proc/sys/net/ipv4/tcp_keepalive_probes
>>
>> On Thu, May 7, 2020 at 10:31 AM Surbhi Gupta 
>> wrote:
>>
>>> Hi,
>>>
>>> We are trying to expand a datacenter by adding nodes, but when a node
>>> is bootstrapping, it gets about halfway through and then fails with the
>>> error below. We increased streamthroughput from 200 to 400 for the
>>> second attempt, but it still failed. We are on 3.11.0, using G1GC with
>>> a 31GB heap.
>>>
>>> ERROR [MessagingService-Incoming-/10.X.X.X] 2020-05-07 09:42:38,933
>>> CassandraDaemon.java:228 - Exception in thread
>>> Thread[MessagingService-Incoming-/10.X.X.X,main]
>>>
>>> java.io.IOError: java.io.EOFException: Stream ended prematurely
>>>
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:814)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:425)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> Caused by: java.io.EOFException: Stream ended prematurely
>>>
>>> at
>>> net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218)
>>> ~[lz4-1.3.0.jar:na]
>>>
>>> at
>>> net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150)
>>> ~[lz4-1.3.0.jar:na]
>>>
>>> at
>>> net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117)
>>> ~[lz4-1.3.0.jar:na]
>>>
>>> at java.io.DataInputStream.readFully(DataInputStream.java:195)
>>> ~[na:1.8.0_242]
>>>
>>> at java.io.DataInputStream.readFully(DataInputStream.java:169)
>>> ~[na:1.8.0_242]
>>>
>>> at
>>> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredSerializer.readComplexColumn(UnfilteredSerializer.java:665)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:606)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at
>>> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197)
>>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>>
>>> at 

RE: Bootstraping is failing

2020-05-07 Thread ZAIDI, ASAD
Check the streaming_socket_timeout_in_ms setting in the cassandra.yaml file: is
it large enough that streaming is not interrupted?
~Asad
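
One way to do that check is to list every streaming-related key in cassandra.yaml, including commented-out defaults. This is a minimal sketch: the default path below is an assumption (it varies by install), and show_stream_settings is a made-up helper name. If nothing matching the setting name shows up, it may simply not be present in that version's file.

```shell
#!/bin/sh
# List streaming-related keys in cassandra.yaml, commented out or not,
# with their line numbers.
# show_stream_settings is a hypothetical helper; the default path is an
# assumption and should be replaced with the real location of the file.
show_stream_settings() {
    yaml=${1:-/etc/cassandra/conf/cassandra.yaml}
    grep -n -i -E '^[[:space:]]*#?[[:space:]]*[a-z_]*stream[a-z_]*:' "$yaml"
}

# Usage (pass the real path on your node):
#   show_stream_settings /path/to/cassandra.yaml
```

The pattern only matches lines that look like a key (a name containing "stream" followed by a colon), so explanatory comment text that merely mentions streaming is left out.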




From: Surbhi Gupta [mailto:surbhi.gupt...@gmail.com]
Sent: Thursday, May 7, 2020 3:09 PM
To: user@cassandra.apache.org
Subject: Re: Bootstraping is failing


[root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_time

300

[root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_intvl

30

[root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_probes

9

On Thu, 7 May 2020 at 12:32, Adam Scott wrote:
Maybe a firewall killing a connection?

What does the following show?
cat /proc/sys/net/ipv4/tcp_keepalive_time
cat /proc/sys/net/ipv4/tcp_keepalive_intvl
cat /proc/sys/net/ipv4/tcp_keepalive_probes

On Thu, May 7, 2020 at 10:31 AM Surbhi Gupta wrote:
Hi,

We are trying to expand a datacenter by adding nodes, but when a node is
bootstrapping, it gets about halfway through and then fails with the error
below. We increased streamthroughput from 200 to 400 for the second attempt,
but it still failed. We are on 3.11.0, using G1GC with a 31GB heap.


ERROR [MessagingService-Incoming-/10.X.X.X] 2020-05-07 09:42:38,933 
CassandraDaemon.java:228 - Exception in thread 
Thread[MessagingService-Incoming-/10.X.X.X,main]

java.io.IOError: java.io.EOFException: Stream ended prematurely

at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:814)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:425)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

Caused by: java.io.EOFException: Stream ended prematurely

at 
net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218) 
~[lz4-1.3.0.jar:na]

at 
net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150) 
~[lz4-1.3.0.jar:na]

at 
net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117) 
~[lz4-1.3.0.jar:na]

at java.io.DataInputStream.readFully(DataInputStream.java:195) 
~[na:1.8.0_242]

at java.io.DataInputStream.readFully(DataInputStream.java:169) 
~[na:1.8.0_242]

at 
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.rows.UnfilteredSerializer.readComplexColumn(UnfilteredSerializer.java:665)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:606)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at org.apache.cassandra.db.Columns.apply(Columns.java:377) 
~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600)
 ~[apache-cassandra-3.11.0.jar:3.11.0]

at 
org.apache.cassandra.db.rows.UnfilteredSerializer.de

Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
[root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_time

300

[root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_intvl

30

[root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_probes

9

On Thu, 7 May 2020 at 12:32, Adam Scott wrote:

> Maybe a firewall killing a connection?
>
> What does the following show?
> cat /proc/sys/net/ipv4/tcp_keepalive_time
> cat /proc/sys/net/ipv4/tcp_keepalive_intvl
> cat /proc/sys/net/ipv4/tcp_keepalive_probes
>
> On Thu, May 7, 2020 at 10:31 AM Surbhi Gupta 
> wrote:
>
>> Hi,
>>
>> We are trying to expand a datacenter by adding nodes, but when a node
>> is bootstrapping, it gets about halfway through and then fails with the
>> error below. We increased streamthroughput from 200 to 400 for the
>> second attempt, but it still failed. We are on 3.11.0, using G1GC with
>> a 31GB heap.
>>
>> ERROR [MessagingService-Incoming-/10.X.X.X] 2020-05-07 09:42:38,933
>> CassandraDaemon.java:228 - Exception in thread
>> Thread[MessagingService-Incoming-/10.X.X.X,main]
>>
>> java.io.IOError: java.io.EOFException: Stream ended prematurely
>>
>> at
>> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:814)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:425)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> Caused by: java.io.EOFException: Stream ended prematurely
>>
>> at
>> net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218)
>> ~[lz4-1.3.0.jar:na]
>>
>> at
>> net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150)
>> ~[lz4-1.3.0.jar:na]
>>
>> at
>> net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117)
>> ~[lz4-1.3.0.jar:na]
>>
>> at java.io.DataInputStream.readFully(DataInputStream.java:195)
>> ~[na:1.8.0_242]
>>
>> at java.io.DataInputStream.readFully(DataInputStream.java:169)
>> ~[na:1.8.0_242]
>>
>> at
>> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.rows.UnfilteredSerializer.readComplexColumn(UnfilteredSerializer.java:665)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:606)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at org.apache.cassandra.db.Columns.apply(Columns.java:377)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at
>> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:475)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>
>> at

Re: Bootstraping is failing

2020-05-07 Thread Adam Scott
Maybe a firewall killing a connection?

What does the following show?
cat /proc/sys/net/ipv4/tcp_keepalive_time
cat /proc/sys/net/ipv4/tcp_keepalive_intvl
cat /proc/sys/net/ipv4/tcp_keepalive_probes
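
The same three values can be printed with their names in one go, which makes the output easier to paste into a reply. A small sketch; show_keepalive is a made-up helper name, and the optional directory argument exists only so the function can be pointed at a copy of the files.

```shell
#!/bin/sh
# Print the three TCP keepalive settings as name=value lines.
# show_keepalive is a hypothetical helper; by default it reads the live
# kernel values under /proc/sys/net/ipv4.
show_keepalive() {
    dir=${1:-/proc/sys/net/ipv4}
    for name in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes; do
        printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
    done
}

# Only read the live values if this looks like a Linux host.
if [ -r /proc/sys/net/ipv4/tcp_keepalive_time ]; then
    show_keepalive
fi
```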

On Thu, May 7, 2020 at 10:31 AM Surbhi Gupta 
wrote:

> Hi,
>
> We are trying to expand a datacenter by adding nodes, but when a node
> is bootstrapping, it gets about halfway through and then fails with the
> error below. We increased streamthroughput from 200 to 400 for the
> second attempt, but it still failed. We are on 3.11.0, using G1GC with
> a 31GB heap.
>
> ERROR [MessagingService-Incoming-/10.X.X.X] 2020-05-07 09:42:38,933
> CassandraDaemon.java:228 - Exception in thread
> Thread[MessagingService-Incoming-/10.X.X.X,main]
>
> java.io.IOError: java.io.EOFException: Stream ended prematurely
>
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:814)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:425)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> Caused by: java.io.EOFException: Stream ended prematurely
>
> at
> net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218)
> ~[lz4-1.3.0.jar:na]
>
> at
> net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150)
> ~[lz4-1.3.0.jar:na]
>
> at
> net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117)
> ~[lz4-1.3.0.jar:na]
>
> at java.io.DataInputStream.readFully(DataInputStream.java:195)
> ~[na:1.8.0_242]
>
> at java.io.DataInputStream.readFully(DataInputStream.java:169)
> ~[na:1.8.0_242]
>
> at
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.UnfilteredSerializer.readComplexColumn(UnfilteredSerializer.java:665)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:606)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at org.apache.cassandra.db.Columns.apply(Columns.java:377)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:475)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:431)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222)
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>
> ... 11 common frames omitted
>
> Thanks
> Surbhi
>


Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
Hi,

We are trying to expand a datacenter by adding nodes, but when a node
is bootstrapping, it gets about halfway through and then fails with the
error below. We increased streamthroughput from 200 to 400 for the
second attempt, but it still failed. We are on 3.11.0, using G1GC with
a 31GB heap.

ERROR [MessagingService-Incoming-/10.X.X.X] 2020-05-07 09:42:38,933 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/10.X.X.X,main]
java.io.IOError: java.io.EOFException: Stream ended prematurely
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:814) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:425) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.0.jar:3.11.0]
Caused by: java.io.EOFException: Stream ended prematurely
    at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218) ~[lz4-1.3.0.jar:na]
    at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150) ~[lz4-1.3.0.jar:na]
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117) ~[lz4-1.3.0.jar:na]
    at java.io.DataInputStream.readFully(DataInputStream.java:195) ~[na:1.8.0_242]
    at java.io.DataInputStream.readFully(DataInputStream.java:169) ~[na:1.8.0_242]
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.UnfilteredSerializer.readComplexColumn(UnfilteredSerializer.java:665) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:606) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.Columns.apply(Columns.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:475) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:431) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222) ~[apache-cassandra-3.11.0.jar:3.11.0]
    ... 11 common frames omitted

Thanks
Surbhi
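
For scale on the throughput change mentioned in the post: assuming the 200 and 400 figures refer to the stream throughput cap set with nodetool setstreamthroughput (or stream_throughput_outbound_megabits_per_sec in cassandra.yaml), the unit is megabits per second, not megabytes. A small sketch of the conversion, with a made-up helper name:

```shell
#!/bin/sh
# Convert a stream throughput cap from megabits/s to megabytes/s
# (integer division by 8). mbit_to_mbyte is a hypothetical helper.
mbit_to_mbyte() {
    echo $(( $1 / 8 ))
}

echo "200 Mb/s = $(mbit_to_mbyte 200) MB/s"
echo "400 Mb/s = $(mbit_to_mbyte 400) MB/s"
```

So doubling the cap moves the ceiling from about 25 to about 50 megabytes per second on each streaming node, provided the network and disks can actually sustain it.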