Re: inter dc bandwidth calculation

2020-01-27 Thread Georg Brandemann
Hello,

just as a small addition: the numbers also depend on the consistency level
used for reads. The estimates given earlier in the thread hold if you only
read from local nodes. If you read at ALL, QUORUM, EACH_QUORUM, etc., you
also need to include the read volume in the calculation.
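
To make the point about reads concrete, here is a minimal Python sketch. The function name, the parameter names, the MB/s unit, and the choice of LOCAL_ONE and LOCAL_QUORUM as the "local only" levels are assumptions for illustration. It also assumes the whole read volume crosses the DC link for non-local levels, which is a deliberately pessimistic upper bound, not a measurement.

```python
# Consistency levels assumed here to keep reads inside the local DC.
LOCAL_ONLY_LEVELS = {"LOCAL_ONE", "LOCAL_QUORUM"}


def estimate_with_reads(write_mb_per_s: float,
                        read_mb_per_s: float,
                        read_consistency: str,
                        headroom: float = 2.0) -> float:
    """Rough inter-DC bandwidth in MB/s, including reads when they cross DCs."""
    if read_consistency.upper() in LOCAL_ONLY_LEVELS:
        cross_dc_reads = 0.0            # reads stay on local nodes
    else:
        cross_dc_reads = read_mb_per_s  # pessimistic: count all read volume
    return (write_mb_per_s + cross_dc_reads) * headroom


if __name__ == "__main__":
    for level in ("LOCAL_QUORUM", "QUORUM"):
        print(level, estimate_with_reads(10.0, 5.0, level))
```

Treating every unknown level as cross-DC keeps the estimate on the safe side; real traffic should still be monitored.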

Regards,
Georg



Re: inter dc bandwidth calculation

2020-01-15 Thread Osman Yozgatlıoğlu
Thank you. I have some insight now.

Regards,
Osman


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: inter dc bandwidth calculation

2020-01-15 Thread Reid Pinchback
Oh, duh.  Revise that.  I was forgetting that multi-dc writes are sent to a 
single node in the other dc and tagged to be forwarded to other nodes within 
the dc.

So your quick-and-dirty estimate would be more like (write volume) x 2 to leave 
headroom for random other mechanics.
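
A minimal Python sketch of this revised estimate follows. The function names, the MB/s unit, and the `remote_dcs` parameter (one forwarded copy per remote DC) are assumptions added for illustration; the factor of 2 is the headroom mentioned above.

```python
def revised_inter_dc_estimate(write_mb_per_s: float,
                              remote_dcs: int = 1,
                              headroom: float = 2.0) -> float:
    """(write volume) x headroom, counted once per remote DC, in MB/s.

    Each write crosses the link once per remote DC because a single node
    there forwards it to the other replicas inside that DC.
    """
    if write_mb_per_s < 0 or remote_dcs < 1:
        raise ValueError("write volume must be >= 0 and remote_dcs >= 1")
    return write_mb_per_s * remote_dcs * headroom


def mb_per_s_to_mbit_per_s(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second for link sizing."""
    return mb_per_s * 8.0


if __name__ == "__main__":
    estimate = revised_inter_dc_estimate(10.0)
    print(estimate, "MB/s =", mb_per_s_to_mbit_per_s(estimate), "Mbit/s")
```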

R




Re: inter dc bandwidth calculation

2020-01-15 Thread Reid Pinchback
I would think that it would be largely driven by the replication factor.  It 
isn't that the sstables are forklifted from one dc to another, it's just that 
the writes being made to the memtables are also shipped around by the 
coordinator nodes as the writes happen.  Operations at the sstable level, like 
compactions, are local to the node.

One potential wrinkle that I'm unclear on is related to repairs.  I don't know 
if merkle trees are biased to mostly bounce around only intra-dc, versus how 
often they are communicated inter-dc.  Note that even queries can trigger some 
degree of repair traffic if you have a usage pattern of trying to read data 
recently written, because at the bleeding edge of the recent changes you'll 
have more cases of rows not having had time to settle to a consistent state.

If you want a quick-and-dirty heuristic, I'd probably take (write volume) x 
(replication factor) x 2 as a guesstimate so you have some headroom for C* and 
TCP mechanics, but then monitor to see what your real use is.
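
As a rough illustration of this heuristic, here is a minimal Python sketch. The function name, parameter names, and the MB/s unit are assumptions made for the example, not anything from Cassandra itself; the default factor of 2 is the headroom suggested above.

```python
def first_pass_inter_dc_estimate(write_mb_per_s: float,
                                 replication_factor: int,
                                 headroom: float = 2.0) -> float:
    """(write volume) x (replication factor) x headroom, in MB/s."""
    if write_mb_per_s < 0 or replication_factor < 1:
        raise ValueError("write volume must be >= 0 and replication factor >= 1")
    return write_mb_per_s * replication_factor * headroom


if __name__ == "__main__":
    # Hypothetical inputs: 10 MB/s of writes, replication factor 3.
    print(first_pass_inter_dc_estimate(10.0, 3))
```

The result is only a starting point for provisioning; the advice to monitor real usage still applies.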

R



inter dc bandwidth calculation

2020-01-15 Thread Osman Yozgatlıoğlu
Hello,

Is there any way to calculate inter-DC bandwidth requirements for
proper operation?
I can't find any information on this subject.
Can we say how much of the sstable data collected in one DC has to be
transferred to the other?
I could then calculate the bandwidth from the sstables generated.
I use TWCS with a one-hour window.

Regards,
Osman
