Re: [j-nsp] MX304 Port Layout

2023-07-05 Thread Saku Ytti via juniper-nsp
On Wed, 5 Jul 2023 at 04:45, Mark Tinka  wrote:

> This is one of the reasons I prefer to use Ethernet switches to
> interconnect devices in large data centre deployments.
>
> Connecting stuff directly into the core routers or directly together
> eats up a bunch of ports, without necessarily using all the available
> capacity.
>
> But to be fair, at the scale AWS run, I'm not exactly sure how I'd do
> things.

I'm sure it's perfectly reasonable, with some upsides and some
downsides compared to hiding the overhead ports inside the chassis
fabric instead of exposing them on the front-plate.

-- 
  ++ytti
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-07-04 Thread Mark Tinka via juniper-nsp




On 7/4/23 09:11, Saku Ytti wrote:


You must have misunderstood. When they fully scale the current design,
the design offers 100T of capacity, but they've bought 400T of ports.
3/4 of the ports are overhead to build the design, to connect the
pizzaboxes together. All ports are used, but only 1/4 are revenue.


Thanks, makes sense.

This is one of the reasons I prefer to use Ethernet switches to 
interconnect devices in large data centre deployments.


Connecting stuff directly into the core routers or directly together 
eats up a bunch of ports, without necessarily using all the available 
capacity.


But to be fair, at the scale AWS run, I'm not exactly sure how I'd do 
things.


Mark.


Re: [j-nsp] MX304 Port Layout

2023-07-04 Thread Saku Ytti via juniper-nsp
On Tue, 4 Jul 2023 at 08:34, Mark Tinka  wrote:

> Yes, I watched this NANOG session and was also quite surprised when they
> mentioned that they only plan for 25% usage of the deployed capacity.
> Are they giving themselves room to peak before they move to another chip
> (considering that they are likely in a never-ending installation/upgrade
> cycle), or trying to maintain line-rate across a vast number of packet
> sizes? Or both?

You must have misunderstood. When they fully scale the current design,
the design offers 100T of capacity, but they've bought 400T of ports.
3/4 of the ports are overhead to build the design, to connect the
pizzaboxes together. All ports are used, but only 1/4 are revenue.

-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-07-03 Thread Mark Tinka via juniper-nsp




On 7/2/23 18:04, Saku Ytti wrote:


Not disagreeing here, but how do we define oversubscribed? Are all
boxes oversubscribed which can't do a) 100% at max packet size, b)
100% at min packet size, and c) 100% of packets to delay buffer? I
think this would be a quite reasonable definition, but as far as I
know, no current device of non-modest scale would satisfy all three;
almost all of them would only satisfy a).


Well, the typical operator will use "oversubscribed" in the context of 
number of ports vs. chip capacity. However, it is not unwise to consider 
packet handling as a function of oversubscription too.




Let's consider first-gen Trio serdes:
1) 2/4 goes to fabric (btree replication)
2) 1/4 goes to delay buffer
3) 1/4 goes to WAN ports
(and actually about 0.2 additionally goes to the lookup engine)

So you're selling less than 1/4 of the serdes you ship; more than 3/4
are 'overhead'. Compared to, say, Silicon One, which is partially
buffered, they're selling almost 1/2 of the serdes they ship. You
could in theory put ports on all of these serdes in bps terms, but not
in pps terms, at least not with off-chip memory.


To be fair, although Silicon One is Cisco's first iteration of the chip, 
it's not fair to compare it to Trio 1 :-).


But I take your point.



And in each case, in a pizza box, you could sell those fabric ports,
as there is no fabric. So a given NPU always has ~2x the bps in pizza
box format (but usually no more pps). And in the MX80/MX104 Juniper
did just this: they sell 80G of WAN ports, where in linecard mode it
is only a 40G WAN-port device. I don't consider it oversubscribed,
even though the minimum packet size went up, because the lookup
capacity didn't increase.


Makes sense, but what that means is that you are more concerned with
pps while someone else could be more concerned with bps. I guess it
depends on whether your operation is more pps-heavy or more bps-heavy,
given its average packet size.




Curiously, AMZN told NANOG their ratio: when the design is fully
scaled to 100T, it is 1/4: 400T of bought ports, 100T of useful ports.
It's unclear how long 100T was going to scale, but obviously they
wouldn't launch an architecture which needs to be redone next year, so
when they decided on a 100T cap for the scale, they didn't have a 100T
need yet. This design was with 112Gx128 chips, and the boxes were
single-chip, so all serdes connect to ports, no fabrics, i.e. a true
pizzabox.
I found this very interesting, because the 100T design was, I think, 3
racks? And last year 50T ASICs shipped; next year we'd likely get 100T
ASICs (224Gx512? or 112Gx1024?). So even hyperscalers are growing
slower than silicon, and can basically put their DC in a chip, greatly
reducing cost (both CAPEX and OPEX), as there is no need to waste 3/4
of the investment on overhead.


Yes, I watched this NANOG session and was also quite surprised when they 
mentioned that they only plan for 25% usage of the deployed capacity. 
Are they giving themselves room to peak before they move to another chip 
(considering that they are likely in a never-ending installation/upgrade 
cycle), or trying to maintain line-rate across a vast number of packet 
sizes? Or both?




The scale also surprised me, even though perhaps it should not have.
They quoted +1M network devices; considering they quote +20M Nitro
systems shipped, that's like <20 revenue-generating compute nodes per
network device. Depending on the refresh cycle, this means Amazon is
buying 15-30k network devices per month, which I expect is
significantly more than Cisco+Juniper+Nokia ship combined to SP infra,
so no wonder SPs get little love.


Well, the no-love to service providers has been going on for some time
now. It largely started with the optical vendors, around 2015 when
coherent gave us 400Gbps waves over medium-haul distances, and the
content folk began deploying DCIs. Around the same time, submarine
systems began deploying uncompensated cables, and with most of them
being funded by the content folk, optical vendors focused 90% of their
attention that way, ignoring the service providers.


The content folk are largely IETF people, so they had options around
what they could do to optimize routing and switching (including
building their own gear). But I see that there is some interest in
what they can do with chips from Cisco, Juniper and Nokia if they have
arrangements where those are opened up to them for self-development;
not to mention Broadcom. This means we - as network operators - are
likely to see even less love from routing/switching vendors going
forward.


But with AWS deploying that many nodes, even with tooling, it must be a 
mission staying on top of software (and hardware) upgrades.


Mark.


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Saku Ytti via juniper-nsp
On Sun, 2 Jul 2023 at 17:15, Mark Tinka  wrote:

> Technically, do we not think that an oversubscribed Juniper box with a
> single Trio 6 chip with no fabric is feasible? And is it not being built
> because Juniper don't want to cannibalize their other distributed
> compact boxes?
>
> The MX204, for example, is a single Trio 3 chip that is oversubscribed
> by an extra 240Gbps. So we know they can do it. The issue with the MX204
> is that most customers will run out of ports before they run out of
> bandwidth.

Not disagreeing here, but how do we define oversubscribed? Are all
boxes oversubscribed which can't do a) 100% at max packet size, b)
100% at min packet size, and c) 100% of packets to delay buffer? I
think this would be a quite reasonable definition, but as far as I
know, no current device of non-modest scale would satisfy all three;
almost all of them would only satisfy a).

Let's consider first-gen Trio serdes:
1) 2/4 goes to fabric (btree replication)
2) 1/4 goes to delay buffer
3) 1/4 goes to WAN ports
(and actually about 0.2 additionally goes to the lookup engine)

So you're selling less than 1/4 of the serdes you ship; more than 3/4
are 'overhead'. Compared to, say, Silicon One, which is partially
buffered, they're selling almost 1/2 of the serdes they ship. You
could in theory put ports on all of these serdes in bps terms, but not
in pps terms, at least not with off-chip memory.
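
(To make the arithmetic concrete, a minimal back-of-the-envelope
sketch in Python, using the fractions quoted above; the 400G total is
an assumed illustrative serdes capacity, not a vendor figure:)

TOTAL_SERDES_GBPS = 400  # assumed illustrative total, not a vendor number

budget = {
    "fabric (btree replication)": 2 / 4,
    "delay buffer": 1 / 4,
    "WAN ports": 1 / 4,
}
LOOKUP_EXTRA = 0.2  # "about 0.2 additionally goes to the lookup engine"

for use, share in budget.items():
    print(f"{use:28s} {share * TOTAL_SERDES_GBPS:5.0f}G")

revenue = budget["WAN ports"] / (sum(budget.values()) + LOOKUP_EXTRA)
print(f"revenue fraction of shipped serdes: {revenue:.0%}")  # ~21%, i.e. <1/4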

And in each case, in a pizza box, you could sell those fabric ports,
as there is no fabric. So a given NPU always has ~2x the bps in pizza
box format (but usually no more pps). And in the MX80/MX104 Juniper
did just this: they sell 80G of WAN ports, where in linecard mode it
is only a 40G WAN-port device. I don't consider it oversubscribed,
even though the minimum packet size went up, because the lookup
capacity didn't increase.

Curiously, AMZN told NANOG their ratio: when the design is fully
scaled to 100T, it is 1/4: 400T of bought ports, 100T of useful ports.
It's unclear how long 100T was going to scale, but obviously they
wouldn't launch an architecture which needs to be redone next year, so
when they decided on a 100T cap for the scale, they didn't have a 100T
need yet. This design was with 112Gx128 chips, and the boxes were
single-chip, so all serdes connect to ports, no fabrics, i.e. a true
pizzabox.
I found this very interesting, because the 100T design was, I think, 3
racks? And last year 50T ASICs shipped; next year we'd likely get 100T
ASICs (224Gx512? or 112Gx1024?). So even hyperscalers are growing
slower than silicon, and can basically put their DC in a chip, greatly
reducing cost (both CAPEX and OPEX), as there is no need to waste 3/4
of the investment on overhead.
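
(A quick sanity check of those figures in Python, assuming the quoted
112G x 128-serdes single-chip boxes; the box count is derived here,
not something AMZN stated:)

SERDES_GBPS = 112
SERDES_PER_CHIP = 128

tbps_per_box = SERDES_GBPS * SERDES_PER_CHIP / 1000  # one chip per pizzabox
bought_t, useful_t = 400, 100

print(f"capacity per box: {tbps_per_box:.1f}T")                          # ~14.3T
print(f"boxes for {bought_t}T of ports: {bought_t / tbps_per_box:.0f}")  # ~28
print(f"revenue share of ports: {useful_t / bought_t:.0%}")              # 25%
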
The scale also surprised me, even though perhaps it should not have.
They quoted +1M network devices; considering they quote +20M Nitro
systems shipped, that's like <20 revenue-generating compute nodes per
network device. Depending on the refresh cycle, this means Amazon is
buying 15-30k network devices per month, which I expect is
significantly more than Cisco+Juniper+Nokia ship combined to SP infra,
so no wonder SPs get little love.
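
(The 15-30k/month estimate falls out of the quoted +1M device count
once you assume a refresh cycle; the 3-6 year range below is my
assumption, the device count is from the talk as quoted:)

DEVICES = 1_000_000  # "+1M network devices", as quoted

for refresh_years in (3, 6):  # assumed refresh-cycle range
    per_month = DEVICES / (refresh_years * 12)
    print(f"{refresh_years}-year refresh: ~{per_month / 1e3:.0f}k devices/month")
# 3y -> ~28k/month, 6y -> ~14k/month, bracketing the 15-30k estimate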

-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Mark Tinka via juniper-nsp




On 7/2/23 15:19, Saku Ytti wrote:


Right, as is the MX304.

I don't think this is 'my definition'; everything was centralised
originally, until the Cisco 7500 came out, which had distributed
forwarding capabilities.

Now, does centralisation truly mean a BOM benefit to vendors? Probably
not, but it may allow them to address a lower-margin market which has
lower per-port performance needs, without cannibalising the
larger-margin market.


Technically, do we not think that an oversubscribed Juniper box with a 
single Trio 6 chip with no fabric is feasible? And is it not being built 
because Juniper don't want to cannibalize their other distributed 
compact boxes?


The MX204, for example, is a single Trio 3 chip that is oversubscribed 
by an extra 240Gbps. So we know they can do it. The issue with the MX204 
is that most customers will run out of ports before they run out of 
bandwidth.


I don't think vendors using Broadcom to oversubscribe a high-capacity
chip is the issue. It's that other vendors with in-house silicon won't
do the same with their own.



Mark.


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Saku Ytti via juniper-nsp
On Sun, 2 Jul 2023 at 15:53, Mark Tinka via juniper-nsp
 wrote:

> Well, by your definition, the ASR9903, for example, is a distributed
> platform, which has a fabric ASIC via the RP, with 4x NPU's on the fixed
> line card, 2x NPU's on the 800Gbps PEC and 4x NPU's on the 2Tbps PEC.

Right, as is the MX304.

I don't think this is 'my definition'; everything was centralised
originally, until the Cisco 7500 came out, which had distributed
forwarding capabilities.

Now, does centralisation truly mean a BOM benefit to vendors? Probably
not, but it may allow them to address a lower-margin market which has
lower per-port performance needs, without cannibalising the
larger-margin market.



-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Mark Tinka via juniper-nsp




On 6/28/23 09:29, Saku Ytti via juniper-nsp wrote:


This of course makes it more redundant than a distributed box, because
distributed boxes don't have NPU redundancy.


Well, by your definition, the ASR9903, for example, is a distributed
platform, which has a fabric ASIC via the RP, with 4x NPUs on the fixed
line card, 2x NPUs on the 800Gbps PEC and 4x NPUs on the 2Tbps PEC.


Mark.


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Mark Tinka via juniper-nsp




On 7/2/23 11:18, Saku Ytti wrote:


In this context, these are all distributed platforms, they have
multiple NPUs and fabric. Centralised has a single forwarding chip,
and significantly more ports than bandwidth.


So to clarify your definition of "centralized", even if there is no 
replaceable fabric, and the line cards communicate via a fixed fabric 
ASIC, you'd still define that as a distributed platform?


By your definition, you are speaking about fixed form factor platforms
with neither a replaceable fabric nor a fabric ASIC, like the MX204,
ASR920, ACX7024, 7520-IXR, etc.?


Mark.


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Saku Ytti via juniper-nsp
On Sun, 2 Jul 2023 at 12:11, Mark Tinka  wrote:

> Well, for data centre aggregation, especially for 100Gbps transit ports
> to customers, centralized routers make sense (MX304, MX10003, ASR9903,
> etc.). But those boxes don't make sense as Metro-E routers... they can
> aggregate Metro-E routers, but can't be Metro-E routers due to their cost.

In this context, these are all distributed platforms, they have
multiple NPUs and fabric. Centralised has a single forwarding chip,
and significantly more ports than bandwidth.

-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Mark Tinka via juniper-nsp




On 7/2/23 10:42, Saku Ytti wrote:


Yes. Satellite is basically VLAN aggregation, but a little bit less
broken. Both are much inferior to MPLS.


I agree that using vendor satellites solves this problem. The issue,
IIRC, is what happens when you need to have the satellites in rings.


Satellites work well when fibre is not an issue, and each satellite
can hang off the PE router like a spur. But if you need to build rings
in order to cover as many areas as possible at a reasonable cost,
satellites seemed to struggle to support scalable ring topologies.
This could have changed over time; I'm not sure. I stopped tracking
satellite technologies around 2010.




  But usually that's not the
comparison, due to real or perceived cost reasons. So in the absence
of a vendor selling you the front-plate you need, the option space
often considered is satellite or VLAN aggregation, instead of
connecting some smaller MPLS edge boxes to bigger MPLS aggregation
boxes, which would be, in my opinion, obviously better.


The cost you pay for a small Metro-E router optimized for ring
deployments is more than paid back in the operational simplicity that
comes with MPLS-based rings. Having run such architectures for close
to 15 years now (since the Cisco ME3600X/3800X), I can tell you how
much easier it has been for us to scale and keep customers because we
did not have to run Layer 2 rings like our competitors did.




But as discussed, centralised chassis boxes are appearing as a new
option in the option space.


Well, for data centre aggregation, especially for 100Gbps transit ports
to customers, centralized routers make sense (MX304, MX10003, ASR9903,
etc.). But those boxes don't make sense as Metro-E routers... they can
aggregate Metro-E routers, but can't be Metro-E routers due to their cost.


I think there is still a use-case for distributed boxes like the MX480 
and MX960, for cases where you have to aggregate plenty of 1Gbps and 
10Gbps customers. Those line cards, especially the ones that are now 
EoS/EoL, are extremely cheap and more than capable of supporting 1Gbps 
and 10Gbps services in the data centre. At the moment, with modern 
centralized routers optimized for 100Gbps and 400Gbps, using them to 
aggregate 10Gbps services or lower may be costlier than, say, an MX480 
or MX960 with MPC2E or MPC7E line cards attached to a dense Ethernet 
switch via 802.1Q.


For the moment, the Metro-E router that makes the most sense to us is 
the ACX7024. Despite its Broadcom base, we seem to have found a way to 
make it work for us, and replace the ASR920.


Mark.


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Saku Ytti via juniper-nsp
On Sun, 2 Jul 2023 at 11:38, Mark Tinka  wrote:

> So all the above sounds to me like scenarios where Metro-E rings are
> built on 802.1Q/Q-in-Q/REP/STP/etc., rather than IP/MPLS.

Yes. Satellite is basically VLAN aggregation, but a little bit less
broken. Both are much inferior to MPLS. But usually that's not the
comparison, due to real or perceived cost reasons. So in the absence
of a vendor selling you the front-plate you need, the option space
often considered is satellite or VLAN aggregation, instead of
connecting some smaller MPLS edge boxes to bigger MPLS aggregation
boxes, which would be, in my opinion, obviously better.

But as discussed, centralised chassis boxes are appearing as a new
option in the option space.

-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-07-02 Thread Mark Tinka via juniper-nsp




On 6/28/23 08:44, Saku Ytti wrote:


Apart from obvious stuff like QoS getting difficult, no full feature
parity between VLAN and main interface, or counters becoming less
useful as many are port-level, so identifying the true source port may
not be easy. There are things that you'll just discover over time that
wouldn't even come to mind now, and I don't know what those will be in
your deployment. I can give anecdotes:

2*VXR termination of metro L2 ring
 - everything is 'ok'
 - ethernet pseudowire service is introduced to customers
 - occasionally there are loops now
 - well, the VXR goes to promiscuous mode when you add an ethernet
pseudowire, because while it has VLAN local significance, it doesn't
have a per-VLAN MAC filter.
 - now an unrelated L3 VLAN, which is redundantly terminated on both
VXRs, has a customer CE down in the L2 metro
 - because the ARP timeout is 4h and the MAC timeout is 300s, the
metro will forget the MAC fast, L3 slowly
 - so primary PE gets packet off of internet, sends to metro, metro
floods to all ports, including secondary PE
 - secondary PE sends packet to primary PE, over WAN
 - now you learned 'oh yeah, i should have ensured there is
per-vlan mac filter' and 'oh yeah, my MAC/ARP timeouts are
misconfigured'
 - but these are probably not the examples you'll learn, they'll be
something different
 - when you do satellite, you can solve a lot of the problem scope in
software, as you control both L2 and L3 and can run proprietary code

L2 transparency
 - You do QinQ in the L2 aggregation, to pass customer frames to the
aggregation termination
 - You do MAC rewrite in/out of the L2 aggregation (customer MAC
addresses get rewritten coming in from the customer, and mangled back
to legitimate MACs going out to termination). You need this to pass
STP and such in pseudowires from customer to termination
 - At termination, the hardware physically doesn't consider VLAN+ISIS
a legitimate packet and will kill it, so you have no way of supporting
ISIS inside a pseudowire when you have L2 aggregation to the customer.
Technically it's not valid: ISIS isn't EthernetII, and 802.3 doesn't
have VLANs. But technically correct rarely reduces the red hue in
customers' faces when they inform you about the issues they are
experiencing.
 - even if this works, there are plenty of other ways pseudowire
transparency suffers with L2 aggregation, as you are experiencing the
set of limitations of two boxes instead of one when it comes to
transparency, and these sets won't be identical
 - you will introduce a MAC limit to your point-to-point martini
product, which didn't previously exist, because your L2 ring is
redundant and you need MAC learning. If it's just a single switch, you
can turn off MAC learning per VLAN, and be closer to the satellite
solution

Convergence
 - your termination no longer observes hardware liveness detection,
so you need some solution to transfer L2 port state to the VLAN, which
will occasionally break, as it's new complexity.


So all the above sounds to me like scenarios where Metro-E rings are
built on 802.1Q/Q-in-Q/REP/STP/etc., rather than IP/MPLS.


We run fairly large Metro-E rings, but we run them as IP/MPLS rings,
and all the issues you describe above are the reasons we pushed the
vendors (Cisco in particular) to provide boxes that were optimized for
Metro-E applications but had proper IP/MPLS support. In other words,
these are largely solved problems.


I think many - if not all - of the issues you raise above can be fixed 
by, say, a Cisco ASR920 deployed at scale in the Metro, running IP/MPLS 
for the backbone, end-to-end.


Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-28 Thread Saku Ytti via juniper-nsp
On Tue, 27 Jun 2023 at 19:47, Tarko Tikan via juniper-nsp
 wrote:


> A single NPU doesn't mean non-redundant - those devices run two (or 4
> in the ACX case) BCM NPUs and switch "linecards" over to the backup NPU
> when required. All without a true fabric and distributed NPUs, to keep
> the cost down.

This of course makes it more redundant than a distributed box, because
distributed boxes don't have NPU redundancy.

Somewhat analogous to how an RR makes your network more redundant than
a full mesh: in a full mesh every iBGP flap is out of order, whereas
with an RR a single iBGP flap has no impact. Of course the parallel
continues to the scope of the outage: in a full mesh, losing a single
iBGP session isn't a big outage; with an RR it's binary, either
nothing is broken or everything is.
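
(The scaling difference behind the analogy is the usual n*(n-1)/2
full-mesh session count versus roughly 2n sessions with a redundant RR
pair; a quick illustration in Python:)

def full_mesh_sessions(n):
    return n * (n - 1) // 2

def rr_sessions(n, reflectors=2):
    # each client peers with every reflector, plus the reflector-to-reflector mesh
    clients = n - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)

for n in (10, 50, 200):
    print(f"{n} routers: full mesh {full_mesh_sessions(n)}, "
          f"RR pair {rr_sessions(n)}")
# 200 routers: 19900 sessions full mesh vs. 397 with two reflectors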


-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-06-28 Thread Saku Ytti via juniper-nsp
On Tue, 27 Jun 2023 at 19:32, Mark Tinka  wrote:

> > Yes.
>
> How?

Apart from obvious stuff like QoS getting difficult, no full feature
parity between VLAN and main interface, or counters becoming less
useful as many are port-level, so identifying the true source port may
not be easy. There are things that you'll just discover over time that
wouldn't even come to mind now, and I don't know what those will be in
your deployment. I can give anecdotes:

2*VXR termination of metro L2 ring
- everything is 'ok'
- ethernet pseudowire service is introduced to customers
- occasionally there are loops now
- well, the VXR goes to promiscuous mode when you add an ethernet
pseudowire, because while it has VLAN local significance, it doesn't
have a per-VLAN MAC filter.
- now an unrelated L3 VLAN, which is redundantly terminated on both
VXRs, has a customer CE down in the L2 metro
- because the ARP timeout is 4h and the MAC timeout is 300s, the
metro will forget the MAC fast, L3 slowly (timing sketched after this
list)
- so primary PE gets packet off of internet, sends to metro, metro
floods to all ports, including secondary PE
- secondary PE sends packet to primary PE, over WAN
- now you learned 'oh yeah, i should have ensured there is
per-vlan mac filter' and 'oh yeah, my MAC/ARP timeouts are
misconfigured'
- but these are probably not the examples you'll learn, they'll be
something different
- when you do satellite, you can solve a lot of the problem scope in
software, as you control both L2 and L3 and can run proprietary code
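
(A minimal sketch of the timer mismatch in that anecdote, in Python,
using the quoted values:)

ARP_TIMEOUT_S = 4 * 3600  # 4h, as quoted
MAC_TIMEOUT_S = 300       # 300s, as quoted

flood_window_s = ARP_TIMEOUT_S - MAC_TIMEOUT_S
print(f"L2 forgets the MAC after {MAC_TIMEOUT_S}s")
print(f"L3 keeps forwarding toward it for up to {flood_window_s}s more "
      f"(~{flood_window_s / 3600:.1f}h of unknown-unicast flooding per quiet host)")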

L2 transparency
- You do QinQ in the L2 aggregation, to pass customer frames to the
aggregation termination
- You do MAC rewrite in/out of the L2 aggregation (customer MAC
addresses get rewritten coming in from the customer, and mangled back
to legitimate MACs going out to termination). You need this to pass
STP and such in pseudowires from customer to termination
- At termination, the hardware physically doesn't consider VLAN+ISIS
a legitimate packet and will kill it, so you have no way of supporting
ISIS inside a pseudowire when you have L2 aggregation to the customer.
Technically it's not valid: ISIS isn't EthernetII, and 802.3 doesn't
have VLANs. But technically correct rarely reduces the red hue in
customers' faces when they inform you about the issues they are
experiencing.
- even if this works, there are plenty of other ways pseudowire
transparency suffers with L2 aggregation, as you are experiencing the
set of limitations of two boxes instead of one when it comes to
transparency, and these sets won't be identical
- you will introduce a MAC limit to your point-to-point martini
product, which didn't previously exist, because your L2 ring is
redundant and you need MAC learning. If it's just a single switch, you
can turn off MAC learning per VLAN, and be closer to the satellite
solution

Convergence
- your termination no longer observes hardware liveness detection,
so you need some solution to transfer L2 port state to the VLAN, which
will occasionally break, as it's new complexity.

> > Like cat6500/7600 linecards without DFC, so SP gear with centralised
> > logic, and dumb 'low performance' linecards. Given low performance
> > these days is multi Tbps chips.
>
> While I'm not sure operators want that, they will take a look if the
> lower price does not impact performance.
>
> There is more to it than just raw speed.

I mean of course it affects performance, as you now have all ports on
a single chip instead of many chips. But when it comes to PPS, people
are confused about performance; no one* (well, maybe 1 in 100k running
some esoteric application) cares about wire-rate.
If you are running a card like a 4x100 ASR9k, you absolutely want
wire-speed, because there is 1 chip per port, and you want the pool
the port is drawing from to have 1 port's wire rate free, to ingest a
DoS on a mostly idle interface. But if you have 128x100GE on a chip,
you're happy with 1/3 PPS easily, probably much less, because you're
not going to exhaust that massive pool in any practical scenario, and
several interfaces can simultaneously ingest a DoS.




-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Mark Tinka via juniper-nsp




On 6/27/23 19:44, Gert Doering wrote:


The issues we see / have with "satellites that are not real satellites"
are

  - l2 link down -> l3 not going down
  - (H-)QoS on the L3 device to avoid sending 10/100G worth of traffic
down to a 100M customer port, saturating the "vlan on trunk" link
for no good reason

(we run config-management-system managed EX3400s off an Arista MLAG pair,
so the benefits outweigh the drawbacks for us)


Same here.

I think this is the architecture most operators run, especially
because despite centralized routers being less disjointed, they are
still pricier than attaching a switch to a router port via an 802.1Q
trunk. If you are aggregating plenty of 1Gbps or 10Gbps ports that
aren't each carrying lots of traffic, even the cheapest and most dense
centralized router's ports will be more expensive than a traditional
switch<=>router architecture.


I think centralized routers make a lot of sense when aggregating 100Gbps 
services, because doing these on a switch<=>router architecture is going 
to be tough when you try to aggregate N x 100Gbps links, or even a 
handful of 400Gbps links for the 802.1Q trunk, as most of those 
customers will be riding their 100Gbps port really hard, i.e., there is 
enough distance between 10Gbps and 100Gbps, while there really isn't any 
between 100Gbps and 400Gbps.


One of the biggest risks I find with a standard switch<=>router
architecture is the outage that could happen if that trunk goes down.
Yes, you could mitigate that by making the trunk a LAG of multiple
links, but the challenge that creates is how to do policing on the
router sub-interface, because the policer is split across the member
links in the LAG. On the other hand, one could get around this
commercially by letting customers know the SLA associated with being
connected to only "one device", and providing the option of connecting
to "a second device".


The other issue I find with the switch<=>router architecture is where
to do policing. With LAGs, shaping on the switch ports makes sense
because you don't run into having to police across member links in a
LAG on the router port. Most aggregation switches with Broadcom chips
do not support egress policing, so shaping is your only tool. It has
worked well enough for us on the Arista switches that we have deployed
for this, but was not a reasonable option when we used to run the
EX4550, which only had 4MB of packet buffers.
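
(For reference, the shaping leaned on here is essentially a token
bucket that queues excess rather than dropping it; a toy Python
sketch, with made-up rates:)

import time

class TokenBucketShaper:
    def __init__(self, rate_bps, burst_bits):
        self.rate = rate_bps
        self.burst = burst_bits
        self.tokens = burst_bits
        self.last = time.monotonic()

    def can_send(self, packet_bits):
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bits:
            self.tokens -= packet_bits
            return True   # transmit now
        return False      # a shaper queues and retries later; a policer would drop

shaper = TokenBucketShaper(rate_bps=100e6, burst_bits=1_000_000)  # 100M port
print(shaper.can_send(1500 * 8))  # one 1500-byte packet -> True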


Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Gert Doering via juniper-nsp
Hi,

On Tue, Jun 27, 2023 at 06:32:49PM +0200, Mark Tinka via juniper-nsp wrote:
> How?

The issues we see / have with "satellites that are not real satellites" 
are

 - l2 link down -> l3 not going down
 - (H-)QoS on the L3 device to avoid sending 10/100G worth of traffic
   down to a 100M customer port, saturating the "vlan on trunk" link
   for no good reason

(we run config-management-system managed EX3400s off an Arista MLAG pair,
so the benefits outweigh the drawbacks for us)

gert
-- 
"If was one thing all people took for granted, was conviction that if you 
 feed honest figures into a computer, honest figures come out. Never doubted 
 it myself till I met a computer with a sense of humor."
 Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany g...@greenie.muc.de




Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Mark Tinka via juniper-nsp




On 6/27/23 18:45, Tarko Tikan via juniper-nsp wrote:

The previously mentioned centralized boxes are actually becoming more
and more common now that a single NPU can do 2+ Tbps (in addition to
the non-redundant pizzabox form factor that has been available for
ages). For BCM J2 see the ACX7509, Nokia IXR-R6d or certain Cisco NCS
models. All pretty good fits for SP aggregation networks.


A single NPU doesn't mean non-redundant - those devices run two (or 4
in the ACX case) BCM NPUs and switch "linecards" over to the backup
NPU when required. All without a true fabric and distributed NPUs, to
keep the cost down.


IMHO these are very compelling options for the SP market: you get a
selection of linecards for different applications/architectures, yet
the overall size of the device (and its power consumption) is small.
I've been trying to explain to one of the major vendors that they
should build a similar device with their own silicon, so you get
similar benefits while keeping the rich feature set that comes with
vendor silicon compared to BCM.


Ah, okay - I'm with you now.

I had a totally different definition in my head for what "centralized" 
meant.


Yes, agreed - these make sense because of just how cheap, yet fast, 
Broadcom chips are.


Like you, I've been pushing our friendly vendors to build similar 
architectures with their in-house silicon, and like you, it has fallen 
on deaf ears.


Because IP traffic is becoming more and more public, the need for
features that drove end-to-end on-net VPNs in silicon will continue to
decline. If that helps operators to simplify their product technical
configuration to where Broadcom can handle 90% of all scenarios, giving
up the other 10% may be worth the capex/opex savings. It is that 10%
that many operators want to keep, and the traditional vendors are
locking up that 10% in their own silicon to print money.


With every passing year, Broadcom adds more of the things I want that
it previously could not do. I am close to the 90% mark for some of
these platforms, to where I can consider them now.


Mark.



Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Tarko Tikan via juniper-nsp

hey,

While I'm not sure operators want that, they will take a look if the 
lower price does not impact performance.


The previously mentioned centralized boxes are actually becoming more
and more common now that a single NPU can do 2+ Tbps (in addition to
the non-redundant pizzabox form factor that has been available for
ages). For BCM J2 see the ACX7509, Nokia IXR-R6d or certain Cisco NCS
models. All pretty good fits for SP aggregation networks.


A single NPU doesn't mean non-redundant - those devices run two (or 4
in the ACX case) BCM NPUs and switch "linecards" over to the backup
NPU when required. All without a true fabric and distributed NPUs, to
keep the cost down.


IMHO these are very compelling options for the SP market: you get a
selection of linecards for different applications/architectures, yet
the overall size of the device (and its power consumption) is small.
I've been trying to explain to one of the major vendors that they
should build a similar device with their own silicon, so you get
similar benefits while keeping the rich feature set that comes with
vendor silicon compared to BCM.


--
tarko



Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Mark Tinka via juniper-nsp




On 6/27/23 17:07, Saku Ytti wrote:



Yes.


How?



Like cat6500/7600 linecards without DFC, so SP gear with centralised
logic and dumb 'low performance' linecards. Given that low performance
these days means multi-Tbps chips.


While I'm not sure operators want that, they will take a look if the 
lower price does not impact performance.


There is more to it than just raw speed.

Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Saku Ytti via juniper-nsp
On Tue, 27 Jun 2023 at 17:40, Mark Tinka  wrote:

> Would that be high-density face-plate solutions for access aggregation
> in the data centre, that they are struggling with?
>
> Are you suggesting standard 802.1Q/Q-in-Q trunking from a switch into a
> "pricey" router line card that supports locally-significant VLANs per
> port is problematic?

Yes.

> I'm still a bit unclear on what you mean by "centralized"... in the
> context of satellite, or standalone?

Like cat6500/7600 linecards without DFC, so SP gear with centralised
logic and dumb 'low performance' linecards. Given that low performance
these days means multi-Tbps chips.

--
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Mark Tinka via juniper-nsp




On 6/27/23 09:02, Saku Ytti wrote:


Juniper messaging seems to be geo-specific; in the EU their sales
seem to sell them more willingly than in the US. My understanding is
that Fusion is basically dead, but they don't actually have a solution
for the access/SP-market front-plate, so some sales channels are still
pitching it as the solution.


Would that be high-density face-plate solutions for access aggregation 
in the data centre, that they are struggling with?


I haven't used their EX platform since the EX4600 broke things, and I
have never tried their QFX platform. So I'm not sure whether Juniper
have any decent switch offerings for large-scale data centre
aggregation in 1U form factors that customers are actually happy with.




Nokia seems very committed to it.

I think the solution space is
a) centralised lookup engines - so you have cheap(er) line cards
for high density low pps/bps
b) satellite
c) vlan aggregation

Satellite is basically a specific scenario of c), but it does bring
significant derisking compared to VLAN aggregation, as a single party
is designing it and can solve some problems better than
vendor-agnostic VLAN aggregation can. VLAN aggregation looks very
simple on the surface but is fraught with problems, many of which are
slightly better solved in satellites, and these problems will not be
identified ahead of time but during the next two decades of
operation.


Are you suggesting standard 802.1Q/Q-in-Q trunking from a switch into a
"pricey" router line card that supports locally-significant VLANs per
port is problematic?




Centralised boxes haven't been available for quite a few years, but
hopefully Cisco is changing that; I think it's the right compromise
for SPs.


I'm still a bit unclear on what you mean by "centralized"... in the 
context of satellite, or standalone?


Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Sander Steffann via juniper-nsp
Hi,

> On 26 Jun 2023, at 20:56, Jackson, William via juniper-nsp 
>  wrote:
> 
>> The MX204 is an MPC7E, so whatever H-QoS is on the MPC7E is what the 
>> MX204 will also do.
> 
>> We have used them as an edge router on a temporary basis at new sites, 
>> with an Arista switch hanging off of them via an 802.1Q trunk, until we 
>> can get our standard MX480 to site. They are capable for such a 
>> use-case. But usually, we use them for peering and value-added traffic.
> 
> Similar use case here but we use a QFX as a fusion satellite if port 
> expansion is required.
> Works well as a small-site start-up option.

I have done the same. One thing to remember is that even though you can run 
dual-RE on the MX304, the QFX won’t be able to do an upgrade without rebooting. 
Don’t plan to do ISSU in such a setup.

Cheers,
Sander



Re: [j-nsp] MX304 Port Layout

2023-06-27 Thread Saku Ytti via juniper-nsp
On Tue, 27 Jun 2023 at 06:02, Mark Tinka via juniper-nsp
 wrote:

> > Similar use case here but we use a QFX as a fusion satellite if port 
> > expansion is required.
> > Works well as a small-site start-up option.
>
> Are vendors still pushing their satellite switches :-)?
>
> That technology looked dodgy to me when Cisco first proposed it with
> 9000v, and then Juniper and Nokia followed with their own implementations.

Juniper messaging seems to be geo-specific; in the EU their sales
seem to sell them more willingly than in the US. My understanding is
that Fusion is basically dead, but they don't actually have a solution
for the access/SP-market front-plate, so some sales channels are still
pitching it as the solution.

Nokia seems very committed to it.

I think the solution space is
   a) centralised lookup engines - so you have cheap(er) line cards
for high density low pps/bps
   b) satellite
   c) vlan aggregation

Satellite is basically a specific scenario of c), but it does bring
significant derisking compared to VLAN aggregation, as a single party
is designing it and can solve some problems better than
vendor-agnostic VLAN aggregation can. VLAN aggregation looks very
simple on the surface but is fraught with problems, many of which are
slightly better solved in satellites, and these problems will not be
identified ahead of time but during the next two decades of
operation.

Centralised boxes haven't been available for quite a few years, but
hopefully Cisco is changing that; I think it's the right compromise
for SPs.

But in reality I'm not sure if centralised actually makes sense,
since I don't think we can axiomatically assume it costs the vendor
less: even though there is less BOM, the centralised design does add
more engineering cost. It might basically be a way to sell boxes to
some market at lower margins, while ensuring that hyperscalers don't
buy them, instead of customers directly benefiting from the cost
reduction.





-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-06-26 Thread Mark Tinka via juniper-nsp




On 6/26/23 20:56, Jackson, William via juniper-nsp wrote:


Similar use case here but we use a QFX as a fusion satellite if port expansion 
is required.
Works well as a small-site start-up option.


Are vendors still pushing their satellite switches :-)?

That technology looked dodgy to me when Cisco first proposed it with 
9000v, and then Juniper and Nokia followed with their own implementations.


Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-26 Thread Jackson, William via juniper-nsp
>The MX204 is an MPC7E, so whatever H-QoS is on the MPC7E is what the 
>MX204 will also do.

>We have used them as an edge router on a temporary basis at new sites, 
>with an Arista switch hanging off of them via an 802.1Q trunk, until we 
>can get our standard MX480 to site. They are capable for such a 
>use-case. But usually, we use them for peering and value-added traffic.

Similar use case here but we use a QFX as a fusion satellite if port expansion 
is required.
Works well as a small-site start-up option.

-Original Message-
From: juniper-nsp  On Behalf Of Mark Tinka 
via juniper-nsp
Sent: Saturday, June 10, 2023 11:03 AM
To: juniper-nsp@puck.nether.net
Subject: Re: [j-nsp] MX304 Port Layout



On 6/9/23 17:46, Andrey Kostin via juniper-nsp wrote:

>  We have two MX204s running as a pair, with 2x100G taken for links
> between them, and the remaining BW is 6x100G for actual forwarding in/out.
> In this case it's kind of at the same level for price/100G value.

Yeah, using the MX204 like this (edge router functions) is costly on the ports 
it's already lacking.


>
> I agree, and that's why I asked about HQoS experience, just to add 
> more inexpensive low-speed switch ports via trunk but still be able to 
> treat them more like separate ports from a router perspective.

The MX204 is an MPC7E, so whatever H-QoS is on the MPC7E is what the 
MX204 will also do.

We have used them as an edge router on a temporary basis at new sites, 
with an Arista switch hanging off of them via an 802.1Q trunk, until we 
can get our standard MX480 to site. They are capable for such a 
use-case. But usually, we use them for peering and value-added traffic.

Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-17 Thread Mark Tinka via juniper-nsp

Junos 22.4R2, which fixes this issue, has been released...

Mark.

On 6/10/23 13:55, Mark Tinka wrote:



On 6/10/23 13:50, Jason Lixfeld wrote:


Do either of you two have PRs for your respective issues?  If you could share, 
I, for one anyway, would be grateful :)


For the PTX1000 issue:

https://supportportal.juniper.net/s/article/PTX1000-resources-exhaustion-causing-host-loopback-wedge
https://prsearch.juniper.net/problemreport/PR1695183

Mark.




Re: [j-nsp] MX304 Port Layout

2023-06-10 Thread Mark Tinka via juniper-nsp




On 6/10/23 13:50, Jason Lixfeld wrote:


Do either of you two have PRs for your respective issues?  If you could share, 
I, for one anyway, would be grateful :)


For the PTX1000 issue:

https://supportportal.juniper.net/s/article/PTX1000-resources-exhaustion-causing-host-loopback-wedge
https://prsearch.juniper.net/problemreport/PR1695183

Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-10 Thread Mark Tinka via juniper-nsp



On 6/9/23 17:46, Andrey Kostin via juniper-nsp wrote:

 We have two MX204s running as a pair, with 2x100G taken for links
between them, and the remaining BW is 6x100G for actual forwarding in/out.
In this case it's kind of at the same level for price/100G value.


Yeah, using the MX204 like this (edge router functions) is costly on the 
ports it's already lacking.





I agree, and that's why I asked about HQoS experience, just to add 
more inexpensive low-speed switch ports via trunk but still be able to 
treat them more like separate ports from a router perspective.


The MX204 is an MPC7E, so whatever H-QoS is on the MPC7E is what the 
MX204 will also do.


We have used them as an edge router on a temporary basis at new sites, 
with an Arista switch hanging off of them via an 802.1Q trunk, until we 
can get our standard MX480 to site. They are capable for such a 
use-case. But usually, we use them for peering and value-added traffic.


Mark.


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Saku Ytti via juniper-nsp
On Fri, 9 Jun 2023 at 20:37, Andrey Kostin  wrote:

> Sounds more like a datacenter setup, and for DC operator it could be
> attractive to do at scale. For a traditional ISP with relatively small
> PoPs spread across the country it may be not the case.

Sure, not suggesting everyone is in the target market, but suggesting
the target market includes people who are not developers and have no
interest in being one. For a typical access network with multiple
PoPs, it may be the wrong optimisation point.

-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Andrey Kostin via juniper-nsp

Hi Saku,

Saku Ytti wrote on 2023-06-09 12:09:

On Fri, 9 Jun 2023 at 18:46, Andrey Kostin  wrote:


I'm not in this market; I have no qualifications or resources for
development. The demand for such devices would have to be really
massive to justify a process like this.


Are you not? You use a lot of open source software, because someone
else did the hard work, and you have something practical.

The same would be the thesis here: you order the PCI NPU from Newegg,
and you have an ecosystem of practical software to pull from various
sources. Maybe you'll contribute something back, maybe not.


Well, technically maybe I could do it. But putting it in production
is another story. I have to not only make it run but also make sure
that there are people who can support it 24x7. I think you said it
before, and I agree, that the cost of capital investment in routers is
just a small fraction of expenses for service providers. Cable
infrastructure, facilities, payroll, etc. make up a bigger part, but
the risk of a router failure extends to business risks like reputation
and financial loss, and may have a catastrophic impact. We all know
how long and difficult troubleshooting and fixing a complex issue with
a vendor's TAC can be, but I consider the price we pay hardware
vendors for their TAC support partially as liability insurance.



A very typical network is a border router or two, which needs
features and performance, then switches to connect to compute. People
who have no resources or competence to write software could still be
users in this market.


Sounds more like a datacenter setup, and for a DC operator it could
be attractive to do at scale. For a traditional ISP with relatively
small PoPs spread across the country, it may not be the case.


Kind regards,
Andrey


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Andrey Kostin via juniper-nsp
Thank you very much, Jeff, for sharing your experience. Will watch
the Release Notes for upcoming Junos releases closely. And kudos to
Juniper for finding and fixing it; 1.5 weeks is a very fast reaction!


Kind regards,
Andrey

Litterick, Jeff (BIT) wrote on 2023-06-09 12:41:

This is why we got the MX304. It was a test to replace our MX10008
chassis, which we bought a few of because we had to get into 100G at
high density at a reasonable price at multiple sites a few years back
now. Though we really only need 4 line cards, with 2 being for
redundancy. The MX10004 was not available at the time back then (wish
it had been; the MX10008 is a heavy beast indeed, and we had to use
forklifts to move them around into the data centers). But after
handling the MX304 we will most likely go to the MX10004 line for 400G
in the future, and just use the MX304 at very small edge sites if
needed, mainly due to full FPC redundancy requirements at many of our
locations. And yes, we had multiple full FPC failures in the past on
the MX10008 line. We went through at first an RMA cycle with multiple
line cards, which in the end was due to just 1 line card causing full
FPC failure on a different line card in the chassis around every 3
months or so. Only having everything redundant across both FPCs
allowed us not to have serious downtime.





Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Litterick, Jeff (BIT) via juniper-nsp
This is why we got the MX304. It was a test to replace our MX10008
chassis, which we bought a few of because we had to get into 100G at
high density at a reasonable price at multiple sites a few years back
now. Though we really only need 4 line cards, with 2 being for
redundancy. The MX10004 was not available at the time back then (wish
it had been; the MX10008 is a heavy beast indeed, and we had to use
forklifts to move them around into the data centers). But after
handling the MX304 we will most likely go to the MX10004 line for 400G
in the future, and just use the MX304 at very small edge sites if
needed, mainly due to full FPC redundancy requirements at many of our
locations. And yes, we had multiple full FPC failures in the past on
the MX10008 line. We went through at first an RMA cycle with multiple
line cards, which in the end was due to just 1 line card causing full
FPC failure on a different line card in the chassis around every 3
months or so. Only having everything redundant across both FPCs
allowed us not to have serious downtime.


-Original Message-
From: juniper-nsp  On Behalf Of Andrey 
Kostin via juniper-nsp
Sent: Friday, June 9, 2023 11:09 AM
To: Mark Tinka 
Cc: Saku Ytti ; juniper-nsp 
Subject: Re: [EXT] [j-nsp] MX304 Port Layout

Mark Tinka wrote on 2023-06-09 10:26:
> On 6/9/23 16:12, Saku Ytti wrote:
> 
>> I expect many people in this list have no need for more performance 
>> than single Trio YT in any pop at all, yet they need ports. And they 
>> are not adequately addressed by vendors. But they do need the deep 
>> features of NPU.
> 
> This.
> 
> There is sufficient performance in Trio today (even a single Trio chip 
> on the board) that people are willing to take an oversubscribed box or 
> line card because in real life, they will run out of ports long before 
> they run out of aggregate forwarding capacity.
> 
> The MX204, even though it's a pizza box, is a good example of how it 
> could do with 8x 100Gbps ports, even though Trio on it will only 
> forward 400Gbps. Most use-cases will require another MX204 chassis, 
> just for ports, before the existing one has hit anywhere close to 
> capacity.

Agree, there is a gap between the 204 and the 304, but don't forget
that they belong to different generations. The 304 is shiny and new,
with next-level performance, and is replacing the MX10003. The
previous generation was announced for retirement, but the life of the
MX204 was extended because Juniper realized that they don't have
anything atm to replace it and would probably lose revenue. Maybe this
gap was caused by COVID slowing down the new platform. And possibly we
may see a single-NPU model based on the new-gen chip, because chips
for the 204 are finite. At least it would be logical to make one,
considering the success of the MX204.
> 
> Really, folk are just chasing the Trio capability, otherwise they'd 
> have long solved their port-count problems by choosing any 
> Broadcom-based box on the market. Juniper know this, and they are 
> using it against their customers, knowingly or otherwise. Cisco was 
> good at this back in the day, over-subscribing line cards on their 
> switches and routers. Juniper have always been a little more purist, 
> but the market can't handle it because the rate of traffic growth is 
> being out-paced by what a single Trio chip can do for a couple of 
> ports, in the edge.

I think that it's not rational to make another chipset with lower
bandwidth; it's easier to limit an existing, more powerful chip. That
leads to the MX5/MX10/MX40/MX80 hardware and licensing model. It could
be a single Trio 6 with up to 1.6T of access ports and 1.6T of uplink
ports with a reduced feature set. Maybe it will come, who knows, let's
watch ;)

Kind regards,
Andrey


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Saku Ytti via juniper-nsp
On Fri, 9 Jun 2023 at 19:15, Andrey Kostin  wrote:

> Can anything else be inserted in this socket? If not, then what's the
> point? For server CPUs there are many models with different clocking and
> number of cores, so socket provides a flexibility. If there is only one
> chip that fits the socket, then the socket is a redundant part.

Not that I know of. I think the point may be decoupling. BRCM doesn't
want to do business with just anyone. This allows someone to build
the switch without providing the chips; customers can then buy the
switch from this vendor and the chips directly from BRCM.
I could imagine some big players like FB and AMZN designing their own
switch and having some random shop actually build it, but Broadcom
saying 'no, we don't do business with you'. This way they could
actually get the switch from anywhere, while having a direct chip
relationship with BRCM.



-- 
  ++ytti


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Litterick, Jeff (BIT) via juniper-nsp
Not sure, but anything shipped before May, if not a bit into May,
most likely would be affected, since we were one of the first, if not
the first, customer to get the fix applied to the equipment we got at
the end of March.

We never knew the real root cause, other than that when it happened
the primary RE would lock up solid and not respond to any command or
input or allow access to any management port, and there were no crash
dumps or logs. The backup RE would NOT take over, getting stuck in a
loop trying to take over mastership; it would still respond to
management, but even a reboot of it would not allow it to take
mastership. The only solution was a full power-plug removal, or RE
removal from the chassis, for a power reset. But they were able to
find this in the lab at Juniper right after we reported it, and they
worked on a fix and got it to us about 1.5 weeks later. We got lucky
in that one of the 3 boxes would never last more than 6 hours after a
reboot before the master RE locked up (no matter whether RE0 or RE1
was master); the other 2 boxes could go a week or more before locking
up. So we were a good test of whether the fix worked, since in the lab
it could take up to 8 days before locking up.

FYI:  The one that would lock up inside 6 hours was our spare, and had no
traffic at all, not even an optic plugged into any port, nor the test traffic
the other 2 had going.  We did not go into production until 2 weeks after
the fix was applied, to make sure.

This problem would only surface if you had more than one RE plugged into
the system, even if failover was not configured.  It was just the presence of
the 2nd RE that would trigger it.   I understand that the engineering team is
now fully regression-testing all releases with multiple REs.  I guess that
was not true before we found the bug.

-Original Message-
From: juniper-nsp  On Behalf Of Mark Tinka 
via juniper-nsp
Sent: Thursday, June 8, 2023 10:53 PM
To: juniper-nsp@puck.nether.net
Subject: Re: [EXT] [j-nsp] MX304 Port Layout



On 6/9/23 00:03, Litterick, Jeff (BIT) via juniper-nsp wrote:

> The big issue we ran into is if you have redundant REs then there is a super
> bad bug that after 6 hours (1 of our 3 would lock up quickly after reboot and
> the other 2 would take a very long time) to 8 days will lock the entire
> chassis up solid, where we had to pull the REs physically out to reboot them.
>  It is fixed now, but they had to manually poke new firmware into the ASICs
> on each RE while they were in a half-powered state.  It was a very complex
> procedure with tech support and the MX304 engineering team.  It took about 3
> hours to do all 3 MX304s, one RE at a time.   We have not seen an update with
> this built-in yet.  (We just did this back at the end of April)

Oh dear, that's pretty nasty. So did they say new units shipping today would 
come with the RE's already fixed?

We've been suffering a somewhat similar issue on the PTX1000, where a bug
introduced via regression in Junos 21.4, 22.1 and 22.2 causes CPU queues
to fill up with unknown MAC address frames that are never cleared. It takes
64 days for this packet accumulation to grow to the point where the queues are
exhausted, causing a host loopback wedge.

You would see an error like this in the logs:

   alarmd[27630]: Alarm set: FPC id=150995048, color=RED, class=CHASSIS,
      reason=FPC 0 Major Errors
   fpc0 Performing action cmalarm for error
      /fpc/0/pfe/0/cm/0/Host_Loopback/0/HOST_LOOPBACK_MAKE_CMERROR_ID[1]
      (0x20002) in module: Host Loopback with scope: pfe category:
      functional level: major
   fpc0 Cmerror Op Set: Host Loopback: HOST LOOPBACK WEDGE DETECTED IN
      PATH ID 1 (URI:
      /fpc/0/pfe/0/cm/0/Host_Loopback/0/HOST_LOOPBACK_MAKE_CMERROR_ID[1])
   Apr 1 03:52:28 PTX1000 fpc0 CMError:
      /fpc/0/pfe/0/cm/0/Host_Loopback/0/HOST_LOOPBACK_MAKE_CMERROR_ID[3]
      (0x20004), in module: Host Loopback with scope: pfe category:
      functional level: major

This causes the router to drop all control plane traffic, which, basically, 
makes it unusable. One has to reboot the box to get it back up and running, 
until it happens again 64 days later.

The issue is resolved in Junos 21.4R3-S4, 22.4R2, 23.2R1 and 23.3R1.

However, these releases are not shipping yet, so Juniper gave us a workaround 
SLAX script that automatically runs and clears the CPU queues before the 64 
days are up.
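
For illustration, here is a minimal sketch of the same idea in Python with
PyEZ (junos-eznc), assuming key-based SSH auth; the host list and the
command string are placeholders, since Juniper's actual SLAX script and its
internal clear command are not public:

    import time
    from jnpr.junos import Device

    HOSTS = ["ptx1000-lab.example.net"]        # hypothetical inventory
    CLEAR_CMD = "show system processes brief"  # placeholder; NOT the real clear command
    INTERVAL = 30 * 24 * 3600                  # every 30 days, well inside the 64-day window

    def run_cleanup(host):
        # Open a NETCONF session to the box and run the cleanup command.
        with Device(host=host) as dev:
            out = dev.cli(CLEAR_CMD, warning=False)
            print(f"{host}: cleanup ran, {len(out)} bytes of output")

    while True:
        for h in HOSTS:
            run_cleanup(h)
        time.sleep(INTERVAL)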

We are currently running Junos 22.1R3.9 on this platform, and will move to 
22.4R2 in a few weeks to permanently fix this.

Junos 20.2, 20.3 and 20.4 are not affected, nor is anything after 23.2R1.

I understand it may also affect the QFX and MX, but I don't have details on 
that.

Mark.

___
juniper-nsp mailing list juniper-nsp@puck.nether.net 
https://puck.nether.net/mailman/listinfo/juniper-nsp

Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Andrey Kostin via juniper-nsp

Saku Ytti wrote 2023-06-09 10:35:


LGA8371 socketed BRCM TH4. Ostensibly this allows a lot more switches
to appear in the market, as the switch maker doesn't need to be
friendly with BRCM. They make the switch, the customer buys the chip
and sockets it. Wouldn't surprise me if FB, AMZN and the likes would
have pressed for something like this, so they could use cheaper
sources to make the rest of the switch, sources which BRCM didn't want
to play ball with.


Can anything else be inserted in this socket? If not, then what's the
point? For server CPUs there are many models with different clocking and
number of cores, so the socket provides flexibility. If there is only one
chip that fits the socket, then the socket is a redundant part.


Kind regards,
Andrey
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Saku Ytti via juniper-nsp
On Fri, 9 Jun 2023 at 18:46, Andrey Kostin  wrote:

> I'm not in this market; I have no qualifications or resources for
> development. The demand for such devices would have to be really massive to
> justify a process like this.

Are you not? You use a lot of open source software, because someone
else did the hard work, and you have something practical.

The same would be the thesis here: you order the PCI NPU from newegg,
and you have an ecosystem of practical software to pull from various
sources. Maybe you'll contribute something back, maybe not.

Very typical network is a border router or two, which needs features
and performance, then switches to connect to compute. People who have
no resources or competence to write software could still be users in
this market.
-- 
  ++ytti
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Andrey Kostin via juniper-nsp

Mark Tinka wrote 2023-06-09 10:26:

On 6/9/23 16:12, Saku Ytti wrote:


I expect many people in this list have no need for more performance
than single Trio YT in any pop at all, yet they need ports. And they
are not adequately addressed by vendors. But they do need the deep
features of NPU.


This.

There is sufficient performance in Trio today (even a single Trio chip
on the board) that people are willing to take an oversubscribed box or
line card because in real life, they will run out of ports long before
they run out of aggregate forwarding capacity.

The MX204, even though it's a pizza box, is a good example of how it
could do with 8x 100Gbps ports, even though Trio on it will only
forward 400Gbps. Most use-cases will require another MX204 chassis,
just for ports, before the existing one has hit anywhere close to
capacity.


Agree, there is a gap between 204 and 304, but don't forget that they
belong to different generations. The 304 is shiny and new, with next-level
performance, and is replacing the MX10k3. The previous generation was
announced for retirement, but the life of the MX204 was extended because
Juniper realized they don't have anything at the moment to replace it and
would probably lose revenue. Maybe this gap was caused by covid slowing
down the new platform. And possibly we may see a single-NPU model based on
the new gen chip, because chips for the 204 are finite. At least it would be
logical to make one, considering the success of the MX204.


Really, folk are just chasing the Trio capability, otherwise they'd
have long solved their port-count problems by choosing any
Broadcom-based box on the market. Juniper know this, and they are
using it against their customers, knowingly or otherwise. Cisco was
good at this back in the day, over-subscribing line cards on their
switches and routers. Juniper have always been a little more purist,
but the market can't handle it because the rate of traffic growth is
being out-paced by what a single Trio chip can do for a couple of
ports, in the edge.


I think it's not rational to make another chipset with lower
bandwidth; it's easier to limit an existing, more powerful chip. That
leads to the MX5/MX10/MX40/MX80 hardware and licensing model. It could
be a single Trio6 with up to 1.6T in access ports and 1.6T in uplink
ports with a reduced feature set. Maybe it will come, who knows, let's
watch ;)


Kind regards,
Andrey
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Andrey Kostin via juniper-nsp

Saku Ytti wrote 2023-06-09 10:12:

On Fri, 9 Jun 2023 at 16:58, Andrey Kostin via juniper-nsp
 wrote:


Not sure why it's eye-watering. The price of a fully populated MX304 is
basically the same as its predecessor MX10003, but it provides 3.2T BW
capacity vs 2.4T. If you compare with MX204, then MX304 is about 20% more
expensive for the same total BW, but MX204 doesn't have a redundant RE, and
if you use it in a redundant chassis configuration you will have to spend
some BW on "fabric" links, effectively leveling the price if calculated
for the same BW. I'm just comparing numbers, not considering any real


That's not it, RE doesn't attach to fabric serdes.


Sorry, I mixed two different points. I wanted to say that the redundant RE
adds more cost to the MX304, unrelated to forwarding BW. But if you want to
run MX204s in a redundant configuration, some ports have to be sacrificed
for connectivity between them. We have two MX204s running as a pair with
2x100G taken for the links between them, leaving 6x100G for
actual forwarding in/out. In that case it's roughly at the same level
for price/100G value.
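
As a quick sanity check, that bookkeeping in a few lines of Python (port
counts are the ones from this thread; nothing vendor-specific is assumed):

    # Two MX204s joined as a pair; 2x100G per box burned on interlinks.
    boxes = 2
    ports_per_box = 8          # 8x100G-equivalent front panel per MX204
    interlink_per_box = 2      # 2x100G consumed to join the pair

    total = boxes * ports_per_box * 100          # 1600G of ports bought
    overhead = boxes * interlink_per_box * 100   # 400G spent on interlinks
    revenue = total - overhead                   # 1200G left to sell

    print(f"{revenue}G revenue of {total}G bought ({overhead/total:.0%} overhead)")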




I expect many people in this list have no need for more performance
than single Trio YT in any pop at all, yet they need ports. And they
are not adequately addressed by vendors. But they do need the deep
features of NPU.


I agree, and that's why I asked about HQoS experience, just to add more 
inexpensive low-speed switch ports via trunk but still be able to treat 
them more like separate ports from a router perspective.



I keep hoping that someone is so disruptive that they take the
nvidia/gpu approach to npu. That is, you can buy Trio PCI from newegg
for 2 grand, and can program it as you wish. I think this market
remains unidentified and even adjusting to cannibalization would
increase market size.
I can't understand why JNPR is not trying this, they've lost for 20
years to inflation in valuation, what do they have to lose?


I'm not in this market; I have no qualifications or resources for
development. The demand for such devices would have to be really massive to
justify a process like this.


Kind regards,
Andrey
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Mark Tinka via juniper-nsp




On 6/9/23 16:35, Saku Ytti wrote:


I'm not convinced at all that leaba is being sold. I think it's sold
conditionally when customers would otherwise be lost.


Probably - it's a "grain of salt" situation when you hear the news.

I don't think Meta and Microsoft have bought zero of the C8000...
but not to the degree that they would ignore more primary options, I think.




But an NPU from newegg, with the community writing the code, doesn't exist;
I think it should, and there would be volume in it, but no large volume
to any single customer.


Not enough foresight from traditional OEMs to see the potential here.

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Saku Ytti via juniper-nsp
On Fri, 9 Jun 2023 at 17:26, Mark Tinka  wrote:

> Well, the story is that Cisco are doing this with Meta and Microsoft on
> their C8000 platform, and apparently, doing billions of US$ in business
> on the back of that.

I'm not convinced at all that leaba is being sold. I think it's sold
conditionally when customers would otherwise be lost.

I am reminded of this:
https://www.servethehome.com/this-is-a-broadcom-tomahawk-4-64-port-400gbe-switch-chip-lga8371-intel-amd-ampere/

LGA8371 socketed BRCM TH4. Ostensibly this allows a lot more switches
to appear in the market, as the switch maker doesn't need to be
friendly with BRCM. They make the switch, the customer buys the chip
and sockets it. Wouldn't surprise me if FB, AMZN and the likes would
have pressed for something like this, so they could use cheaper
sources to make the rest of the switch, sources which BRCM didn't want
to play ball with.

But an NPU from newegg, with the community writing the code, doesn't exist;
I think it should, and there would be volume in it, but no large volume
to any single customer.

--
  ++ytti
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Mark Tinka via juniper-nsp




On 6/9/23 16:12, Saku Ytti wrote:


I expect many people in this list have no need for more performance
than single Trio YT in any pop at all, yet they need ports. And they
are not adequately addressed by vendors. But they do need the deep
features of NPU.


This.

There is sufficient performance in Trio today (even a single Trio chip 
on the board) that people are willing to take an oversubscribed box or 
line card because in real life, they will run out of ports long before 
they run out of aggregate forwarding capacity.


The MX204, even though it's a pizza box, is a good example of how it 
could do with 8x 100Gbps ports, even though Trio on it will only forward 
400Gbps. Most use-cases will require another MX204 chassis, just for 
ports, before the existing one has hit anywhere close to capacity.


Really, folk are just chasing the Trio capability, otherwise they'd have 
long solved their port-count problems by choosing any Broadcom-based box 
on the market. Juniper know this, and they are using it against their 
customers, knowingly or otherwise. Cisco was good at this back in the 
day, over-subscribing line cards on their switches and routers. Juniper 
have always been a little more purist, but the market can't handle it 
because the rate of traffic growth is being out-paced by what a single 
Trio chip can do for a couple of ports, in the edge.




I keep hoping that someone is so disruptive that they take the
nvidia/gpu approach to npu. That is, you can buy Trio PCI from newegg
for 2 grand, and can program it as you wish. I think this market
remains unidentified and even adjusting to cannibalization would
increase market size.
I can't understand why JNPR is not trying this, they've lost for 20
years to inflation in valuation, what do they have to lose?


Well, the story is that Cisco are doing this with Meta and Microsoft on 
their C8000 platform, and apparently, doing billions of US$ in business 
on the back of that.


Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Mark Tinka via juniper-nsp




On 6/9/23 15:57, Andrey Kostin wrote:


Hi Mark,

Not sure why it's eye-watering. The price of a fully populated MX304 is
basically the same as its predecessor MX10003, but it provides 3.2T BW
capacity vs 2.4T.


That's true, but the premium being paid for 400Gbps capability that some 
houses may not yet need is probably what is pushing that price up in 
comparison to the MX10003, which does not support 400Gbps.


But to be fair, it will come down to the discounts you can negotiate 
with Juniper. I'm perhaps more concerned because we got good pricing on 
the MX10003, even when we did a like-for-somewhat-like comparison with 
the MX304.


As much as we have struggled with Cisco in the past, this is forcing me 
to see what is available in their ASR99xx boxes. But off-the-bat, form 
factor and port density is poor on the Cisco side, compared to the MX304 
and MX10003.



If you compare with MX204, then MX304 is about 20% more expensive for the
same total BW, but MX204 doesn't have a redundant RE, and if you use it
in a redundant chassis configuration you will have to spend some BW on
"fabric" links, effectively leveling the price if calculated for the
same BW. I'm just comparing numbers, not considering any real
topology, which is another can of worms. Most probably it's not worth
trying to scale MX204s to more than a pair of devices; at least, I
wouldn't do it or consider it ;)


The use-case for MX204 and MX304 is very very different. As you say, 
MX304 is a better alternative for the MX10003 (which I am getting 
conflicting information about re: sale availability from Juniper).


We use the MX204 extensively, but only for peering and routing for 
value-added services too small to plug into a larger MX.



I'd rather call eye-watering the prices for MPC7 and MPC10 to upgrade
existing MX480 routers if you still want to use their low-speed ports. Two
MPC10s with an SCB3 upgrade cost more than an MX304, but give 30% less BW
capacity. For MPC7 this ratio is even worse.


Agreed - the MPC7 and MPC10's only make sense for large capacity 
aggregation or backbone links, not as an access port for 100Gbps 
customers. The MX10003 and MX304 are better boxes for 100Gbps access for 
customers.


Conversely, trying to use the MX304 or MX10003 as a core box is too 
costly, since you are paying the premium for edge features in Trio, when 
all you need is basic Ethernet, IS-IS and MPLS.


So the MPC7/MPC10 vs. MX304/10003 use-cases are clearly defined, if 
money is an object.



This brings a question: does anybody have experience with HQoS on the
MX304? I mean just per-subinterface queueing on an interface to a
switch, not BNG subscriber CoS, which is probably another big topic.
At least I don't dare try the MX304 in a BNG role yet, maybe later ;)


In a world where the kind of traffic you will be pushing through an
MX304 is most likely majority off-net content, do you really need
H-QoS :-)?


Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Saku Ytti via juniper-nsp
On Fri, 9 Jun 2023 at 16:58, Andrey Kostin via juniper-nsp
 wrote:

> Not sure why it's eye-watering. The price of a fully populated MX304 is
> basically the same as its predecessor MX10003, but it provides 3.2T BW
> capacity vs 2.4T. If you compare with MX204, then MX304 is about 20% more
> expensive for the same total BW, but MX204 doesn't have a redundant RE, and
> if you use it in a redundant chassis configuration you will have to spend
> some BW on "fabric" links, effectively leveling the price if calculated
> for the same BW. I'm just comparing numbers, not considering any real

That's not it, RE doesn't attach to fabric serdes.

You are right that the MX304 is the successor of the MX10003, not the MX204.

MX80, MX104 and MX204 are unique in that they are true pizzabox Trios.
They have exactly 1 Trio, and both the WAN and FAB sides connect to WAN
ports (not sure if the MX204 just leaves them unconnected). Therefore, say, a
40G Trio in linecard mode is an 80G Trio in pizza mode (albeit PPS stays
the same), as you're not wasting capacity on non-revenue fabric ports.
This single-Trio design makes the box very cost effective, as not only
do you have just one Trio and double the capacity per Trio, but you
also don't have any fabric chip or fabric serdes.
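
To make the arithmetic concrete, a tiny sketch; the 40G figure is just the
example from this paragraph, not a spec:

    # Same Trio, two packagings: in a linecard, roughly half the serdes
    # face the fabric (non-revenue); in a pizzabox they become WAN ports.
    TRIO_WAN_GBPS = 40                       # illustrative linecard-mode WAN capacity

    linecard_revenue = TRIO_WAN_GBPS         # another ~40G is burned on fabric
    pizzabox_revenue = TRIO_WAN_GBPS * 2     # fabric serdes sold as WAN instead

    print(f"linecard mode: {linecard_revenue}G revenue per Trio")
    print(f"pizzabox mode: {pizzabox_revenue}G revenue per Trio (same PPS)")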

MX304 however has Trio in the linecard, so it really is very much a
normal chassis box. And having multiple Trios it needs fabric.

I do think Juniper and the rest of the vendors keep struggling to
identify 'few to many' markets, and are only good at identifying 'many
to few' markets. MX304 and ever denser 512x112G serdes chips represent
this.

I expect many people in this list have no need for more performance
than single Trio YT in any pop at all, yet they need ports. And they
are not adequately addressed by vendors. But they do need the deep
features of NPU.

I keep hoping that someone is so disruptive that they take the
nvidia/gpu approach to npu. That is, you can buy Trio PCI from newegg
for 2 grand, and can program it as you wish. I think this market
remains unidentified and even adjusting to cannibalization would
increase market size.
I can't understand why JNPR is not trying this, they've lost for 20
years to inflation in valuation, what do they have to lose?

-- 
  ++ytti
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Andrey Kostin via juniper-nsp

Hi Mark,

Not sure why it's eye-watering. The price of a fully populated MX304 is
basically the same as its predecessor MX10003, but it provides 3.2T BW
capacity vs 2.4T. If you compare with MX204, then MX304 is about 20% more
expensive for the same total BW, but MX204 doesn't have a redundant RE, and
if you use it in a redundant chassis configuration you will have to spend
some BW on "fabric" links, effectively leveling the price if calculated
for the same BW. I'm just comparing numbers, not considering any real
topology, which is another can of worms. Most probably it's not worth
trying to scale MX204s to more than a pair of devices; at least, I
wouldn't do it or consider it ;)
I'd rather call eye-watering the prices for MPC7 and MPC10 to upgrade
existing MX480 routers if you still want to use their low-speed ports. Two
MPC10s with an SCB3 upgrade cost more than an MX304, but give 30% less BW
capacity. For MPC7 this ratio is even worse.
This brings a question: does anybody have experience with HQoS on the
MX304? I mean just per-subinterface queueing on an interface to a
switch, not BNG subscriber CoS, which is probably another big topic. At
least I don't dare try the MX304 in a BNG role yet, maybe later ;)


Kind regards,
Andrey

Mark Tinka via juniper-nsp wrote 2023-06-08 12:04:


Trio capacity aside, based on our experience with the MPC7E, MX204 and
MX10003, we expect it to be fairly straightforward.

What is holding us back is the cost. The license for each 16-port line
card is eye-watering. While I don't see anything comparable in ASR99xx
Cisco-land (in terms of form factor and 100Gbps port density), those
prices are certainly going to force Juniper customers to look at other
options. They would do well to get that under control.



___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-09 Thread Andrey Kostin via juniper-nsp

Hi Jeff,

Thank you very much for sharing this information. Do you know in what
publicly available release it's going to be fixed? Knowing the PR number
would be best, but I guess it may be internal-only.


Kind regards,
Andrey

Litterick, Jeff (BIT) via juniper-nsp wrote 2023-06-08 18:03:

No, that is not quite right.  We have 2 MX304 chassis in production
today and 1 spare, all with redundant REs.   You do not need all the
ports filled in a port group.   I know, since we mixed in some 40G, and
40G is ONLY supported on the bottom row of ports, so we have a mix and
had to break stuff out, leaving empty ports because of that limitation,
and it is running just fine.   But you do have to be careful which
type of optics get plugged into which ports, i.e. port 0/2 vs port 1/3
in a grouping, if you are not using 100G optics.

The big issue we ran into is if you have redundant REs then there is a
super bad bug that after 6 hours (1 of our 3 would lock up quickly after
reboot and the other 2 would take a very long time) to 8 days
will lock the entire chassis up solid, where we had to pull the REs
physically out to reboot them.   It is fixed now, but they had to
manually poke new firmware into the ASICs on each RE while they were in
a half-powered state.  It was a very complex procedure with tech support
and the MX304 engineering team.  It took about 3 hours to do all 3
MX304s, one RE at a time.   We have not seen an update with this
built-in yet.  (We just did this back at the end of April)



___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Mark Tinka via juniper-nsp



On 6/9/23 00:03, Litterick, Jeff (BIT) via juniper-nsp wrote:


The big issue we ran into is if you have redundant REs then there is a super
bad bug that after 6 hours (1 of our 3 would lock up quickly after reboot and
the other 2 would take a very long time) to 8 days will lock the entire chassis
up solid, where we had to pull the REs physically out to reboot them.   It is
fixed now, but they had to manually poke new firmware into the ASICs on each RE
while they were in a half-powered state.  It was a very complex procedure with
tech support and the MX304 engineering team.  It took about 3 hours to do all 3
MX304s, one RE at a time.   We have not seen an update with this built-in yet.
(We just did this back at the end of April)


Oh dear, that's pretty nasty. So did they say new units shipping today 
would come with the RE's already fixed?


We've been suffering a somewhat similar issue on the PTX1000, where a
bug introduced via regression in Junos 21.4, 22.1 and 22.2
causes CPU queues to fill up with unknown MAC address frames that
are never cleared. It takes 64 days for this packet accumulation to grow
to the point where the queues are exhausted, causing a host loopback wedge.


You would see an error like this in the logs:

   alarmd[27630]: Alarm set: FPC id=150995048, color=RED, class=CHASSIS,
      reason=FPC 0 Major Errors
   fpc0 Performing action cmalarm for error
      /fpc/0/pfe/0/cm/0/Host_Loopback/0/HOST_LOOPBACK_MAKE_CMERROR_ID[1]
      (0x20002) in module: Host Loopback with scope: pfe category:
      functional level: major
   fpc0 Cmerror Op Set: Host Loopback: HOST LOOPBACK WEDGE DETECTED IN
      PATH ID 1 (URI:
      /fpc/0/pfe/0/cm/0/Host_Loopback/0/HOST_LOOPBACK_MAKE_CMERROR_ID[1])
   Apr 1 03:52:28 PTX1000 fpc0 CMError:
      /fpc/0/pfe/0/cm/0/Host_Loopback/0/HOST_LOOPBACK_MAKE_CMERROR_ID[3]
      (0x20004), in module: Host Loopback with scope: pfe category:
      functional level: major


This causes the router to drop all control plane traffic, which, 
basically, makes it unusable. One has to reboot the box to get it back 
up and running, until it happens again 64 days later.


The issue is resolved in Junos 21.4R3-S4, 22.4R2, 23.2R1 and 23.3R1.

However, these releases are not shipping yet, so Juniper gave us a 
workaround SLAX script that automatically runs and clears the CPU queues 
before the 64 days are up.


We are currently running Junos 22.1R3.9 on this platform, and will move 
to 22.4R2 in a few weeks to permanently fix this.


Junos 20.2, 20.3 and 20.4 are not affected, nor is anything after 23.2R1.

I understand it may also affect the QFX and MX, but I don't have details 
on that.


Mark.

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Litterick, Jeff (BIT) via juniper-nsp
No, that is not quite right.  We have 2 MX304 chassis in production today
and 1 spare, all with redundant REs.   You do not need all the ports filled in a
port group.   I know, since we mixed in some 40G, and 40G is ONLY supported on
the bottom row of ports, so we have a mix and had to break stuff out, leaving
empty ports because of that limitation, and it is running just fine.   But you
do have to be careful which type of optics get plugged into which ports, i.e.
port 0/2 vs port 1/3 in a grouping, if you are not using 100G optics.

The big issue we ran into is if you have redundant REs then there is a super
bad bug that after 6 hours (1 of our 3 would lock up quickly after reboot and
the other 2 would take a very long time) to 8 days will lock the entire chassis
up solid, where we had to pull the REs physically out to reboot them.   It is
fixed now, but they had to manually poke new firmware into the ASICs on each RE
while they were in a half-powered state.  It was a very complex procedure with
tech support and the MX304 engineering team.  It took about 3 hours to do all 3
MX304s, one RE at a time.   We have not seen an update with this built-in yet.
(We just did this back at the end of April)


-Original Message-
From: juniper-nsp  On Behalf Of Thomas 
Bellman via juniper-nsp
Sent: Thursday, June 8, 2023 2:09 PM
To: juniper-nsp 
Subject: Re: [EXT] [j-nsp] MX304 Port Layout

On 2023-06-08 17:18, Kevin Shymkiw via juniper-nsp wrote:

> Along with this - I would suggest looking at Port Checker ( 
> https://apps.juniper.net/home/port-checker/index.html ) to make sure 
> your port combinations are valid.

The port checker claims an interesting "feature": if you have anything in port
3, then *all* the other ports in that port group must also be occupied.  So if
you use all those four ports for e.g. 100GE, everything is fine, but if you
then want to stop using any of ports 0, 1 or 2, the configuration becomes
invalid...

(And similarly for ports 5, 8 and 14 in their respective groups.)

I hope that's a bug in the port checker, not actual behaviour by the MX304...


--
Thomas Bellman,  National Supercomputer Centre,  Linköping Univ., Sweden
"We don't understand the software, and sometimes we don't understand
 the hardware, but we can *see* the blinking lights!"

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Thomas Bellman via juniper-nsp
On 2023-06-08 17:18, Kevin Shymkiw via juniper-nsp wrote:

> Along with this - I would suggest looking at Port Checker (
> https://apps.juniper.net/home/port-checker/index.html ) to make sure
> your port combinations are valid.

The port checker claims an interesting "feature": if you have
anything in port 3, then *all* the other ports in that port group
must also be occupied.  So if you use all those four ports for
e.g. 100GE, everything is fine, but if you then want to stop using
any of ports 0, 1 or 2, the configuration becomes invalid...

(And similarly for ports 5, 8 and 14 in their respective groups.)

I hope that's a bug in the port checker, not actual behaviour by
the MX304...
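
For what it's worth, a small Python sketch of the rule as reported, just to
make the behaviour concrete; the group layout (four groups of four on a
16-port LMIC) and the trigger ports are my reading of this thread, not
Juniper data:

    # Port groups and the ports the checker appears to key on.
    GROUPS = [range(0, 4), range(4, 8), range(8, 12), range(12, 16)]
    TRIGGERS = {3, 5, 8, 14}   # per the posts above; assumed, not verified

    def config_valid(used_ports):
        # Reject configs where a trigger port is in use but its group
        # still has empty ports -- the (likely buggy) checker behaviour.
        used = set(used_ports)
        for group in GROUPS:
            if used & TRIGGERS & set(group) and not set(group) <= used:
                return False
        return True

    print(config_valid({0, 1, 2, 3}))   # True: whole group populated
    print(config_valid({1, 2, 3}))      # False: port 3 used, port 0 empty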


-- 
Thomas Bellman,  National Supercomputer Centre,  Linköping Univ., Sweden
"We don't understand the software, and sometimes we don't understand
 the hardware, but we can *see* the blinking lights!"



___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Mark Tinka via juniper-nsp



On 6/8/23 18:39, Giuliano C. Medalha wrote:

but you have the Flex model, with licenses for capacity and features:
Advanced and Premium.


Which isn't a new thing with vendors. The prices are just terrible, even 
with discounts.




fib is better now - 12M

sampling rate for ipfix is better too.

but you have other parameters for mpls and bng too


Yes, like I said, increased Trio capacity aside, it's straightforward.
I'm just not sure the price is justified. I expect a premium for custom
silicon over Broadcom, but that seems excessive.




But you need the correct drivers on Junos


Yes, that is assumed, of course, especially if you want to talk to a 
ROADM. There is varying support amongst vendors, but as with everything, 
they will converge in time.



Juniper now has good prices ( common optics ) for 400G ( JCO part 
numbers )


Mixing $vendor_name and "good optics prices" has always ended in tears.



Low 40 km or maximum 80 km direct with ZR high power ( end of the year )


Okay.

Our use-case is 100Gbps customer edge, so within the data centre.

We operate an optical network for anything longer than 80km.

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Giuliano C. Medalha via juniper-nsp




> Hello good afternoon.
>
> Please have a look at the following documentation:
>
>
> https://community.juniper.net/blogs/reema-ray/2023/03/28/mx304-deepdive

Thanks, this is most useful!


> It will have everything you need to do with it, including the pictures.
>
> Our first boxes are arriving this next month in Brazil.
>
> By the specs of the new chipset (TRIO6) it's a very good box. A lot of 
> enhancements.

Trio capacity aside, based on our experience with the MPC7E, MX204 and
MX10003, we expect it to be fairly straightforward.

What is holding us back is the cost. The license for each 16-port line
card is eye-watering. While I don't see anything comparable in ASR99xx
Cisco-land (in terms of form factor and 100Gbps port density), those
prices are certainly going to force Juniper customers to look at other
options. They would do well to get that under control.


but you have the Flex model, with licenses for capacity and features: Advanced
and Premium.

fib is better now - 12M

sampling rate for ipfix is better too.

but you have other parameters for mpls and bng too




> And it supports 400G already ( ZR and ZR+ need to check ) ( 16 x 100 or 4 x 
> 400 ) per LMIC.

The LMIC won't care whether it's ZR, ZR+, FR4 or DR4. It will be compatible 
with whatever pluggable is used, as long as it can do 400Gbps.



But you need the correct drivers on Junos

And the chassis must support temperature and power budgets

Juniper now has good prices ( common optics ) for 400G ( JCO part numbers )



Unless, of course, you mean whether Juniper provide an interface into
the optic for use-cases where you are plugging into a ROADM... that, I
don't know.

Are you intending to use this router for long-distance applications?



Low 40 km or maximum 80 km direct with ZR high power ( end of the year )

Thanks

Mark.

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Mark Tinka via juniper-nsp




On 6/8/23 17:35, Giuliano C. Medalha wrote:


Hello good afternoon.

Please have a look at the following documentation:


https://community.juniper.net/blogs/reema-ray/2023/03/28/mx304-deepdive


Thanks, this is most useful!



It will have everything you need to do with it, including the pictures.

Our first boxes are arriving this next month in Brazil.

By the specs of the new chipset (TRIO6) it's a very good box. A lot of 
enhancements.


Trio capacity aside, based on our experience with the MPC7E, MX204 and 
MX10003, we expect it to be fairly straightforward.


What is holding us back is the cost. The license for each 16-port line 
card is eye-watering. While I don't see anything comparable in ASR99xx 
Cisco-land (in terms of form factor and 100Gbps port density), those 
prices are certainly going to force Juniper customers to look at other 
options. They would do well to get that under control.




And it supports 400G already ( ZR and ZR+ need to check ) ( 16 x 100 or 4 x 400 
) per LMIC.


The LMIC won't care whether it's ZR, ZR+, FR4 or DR4. It will be 
compatible with whatever pluggable is used, as long as it can do 400Gbps.


Unless, of course, you mean whether Juniper provide an interface into 
the optic for use-cases where you are plugging into a ROADM... that, I 
don't know.


Are you intending to use this router for long-distance applications?

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Giuliano C. Medalha via juniper-nsp
Hello good afternoon.

Please have a look at the following documentation:


https://community.juniper.net/blogs/reema-ray/2023/03/28/mx304-deepdive


It will have everything you need to do with it, including the pictures.

Our first boxes are arriving this next month in Brazil.

By the specs of the new chipset (TRIO6) it's a very good box. A lot of 
enhancements.

And it supports 400G already ( ZR and ZR+ need to check ) ( 16 x 100 or 4 x 400 
) per LMIC.

Best regards

Giuliano



-Original Message-
From: juniper-nsp  On Behalf Of Mark Tinka 
via juniper-nsp
Sent: Thursday, June 8, 2023 12:25 PM
To: Kevin Shymkiw 
Cc: juniper-nsp 
Subject: Re: [j-nsp] MX304 Port Layout



On 6/8/23 17:18, Kevin Shymkiw wrote:
> Along with this - I would suggest looking at Port Checker (
> https://apps.juniper.net/home/port-checker/index.html ) to make sure
> your port combinations are valid.

We've had ample experience with Juniper's MPC7E, MX204, PTX1000 and
PTX10001 to know how they structure this from a philosophical standpoint. So 
not a major drama there.

It's just interesting to me that the data sheet does not mention needing to
sacrifice an RE to get to the chassis' advertised full port complement. Unless
the data sheet was updated and I missed it.

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Mark Tinka via juniper-nsp




On 6/8/23 17:18, Kevin Shymkiw wrote:

Along with this - I would suggest looking at Port Checker (
https://apps.juniper.net/home/port-checker/index.html ) to make sure
your port combinations are valid.


We've had ample experience with Juniper's MPC7E, MX204, PTX1000 and 
PTX10001 to know how they structure this from a philosophical 
standpoint. So not a major drama there.


It's just interesting to me that the data sheet does not mention needing
to sacrifice an RE to get to the chassis' advertised full port
complement. Unless the data sheet was updated and I missed it.


Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] MX304 Port Layout

2023-06-08 Thread Kevin Shymkiw via juniper-nsp
Along with this - I would suggest looking at Port Checker (
https://apps.juniper.net/home/port-checker/index.html ) to make sure
your port combinations are valid.

Kevin

On Thu, Jun 8, 2023 at 9:16 AM Mark Tinka via juniper-nsp
 wrote:
>
> So, we decided to give the MX304 another sniff, and needed to find out
> why Juniper charge a license for 16x 100Gbps ports per line card, and
> yet the data sheet suggests the box can handle 48x 100Gbps ports
> chassis-wide.
>
> Well, turns out that if you deploy it with redundant RE's, you get 32x
> 100Gbps ports (2x line cards of 16x ports each).
>
> However, to get another 16 ports, taking you to 48x 100Gbps ports,
> you need to sacrifice one RE and use its slot :-).
>
> Mark.
> ___
> juniper-nsp mailing list juniper-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


[j-nsp] MX304 Port Layout

2023-06-08 Thread Mark Tinka via juniper-nsp
So, we decided to give the MX304 another sniff, and needed to find out 
why Juniper charge a license for 16x 100Gbps ports per line card, and 
yet the data sheet suggests the box can handle 48x 100Gbps ports 
chassis-wide.


Well, turns out that if you deploy it with redundant RE's, you get 32x 
100Gbps ports (2x line cards of 16x ports each).


However, to get another 16 ports, taking you to 48x 100Gbps ports,
you need to sacrifice one RE and use its slot :-).
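
As a sketch of that slot math (my reading of this thread, not a Juniper
spec):

    PORTS_PER_LMIC = 16   # 16x100G per line card, per the license note above

    def usable_100g_ports(redundant_re: bool) -> int:
        # The second RE occupies a slot that could otherwise hold an LMIC.
        lmic_slots = 2 if redundant_re else 3
        return lmic_slots * PORTS_PER_LMIC

    print(usable_100g_ports(redundant_re=True))   # 32
    print(usable_100g_ports(redundant_re=False))  # 48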


Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp