> Saku Ytti
> Sent: Tuesday, December 20, 2016 7:22 PM
>
> On 20 December 2016 at 18:42,  <adamv0...@netconsultings.com> wrote:
>
> > Both CRS-X and NCS6k are powered by nPower X1e NPU.
> > And my understanding is that it's Homogeneous(Same PPE type) MPSoC
> > i.e. Symmetric MultiProcessing (SMP), much like all the chips out there
> > (used in ASR9k or MX and PTX, ...).
> > The difference I understand is in the instruction set that the PPE is
> > running.
> > And my guess is that threads on each PPE are using run to completion
> > scheduling.
> > Let me know your thoughts please.
> >
> > And by pipeline with regards to NPU design I understand pipelining of
> > arrays of PPEs where each array in the pipeline consists of PPEs
> > dedicated to a specific function(parse search modify). -like in ASR9k.
>
> Current gen ASR9k, EZchip, is like Trio, ALU FP or Huawei Solar, many
> identical cores, fully programmable, essentially you're only limited by
> time in what you can do. Where as NCS5k/Arista/Jericho, PTX are
> ASIC/pipelines, with much more specialised hardware with lot less
> flexibility, but what they do do, they do far more efficiently, which
> means denser boxes are pragmatic.
> Roughly speaking pipeline/ASIC is great for core, DC, in Edge you often may
> require richer features offered by NPU designs, and density isn't that
> crucial.
>
With regards to raw processing speed, I don't think it matters that much 
whether it's SMP (a single PPE completely processes the packet head) or a 
pipeline (the packet head is processed through a pipeline of PPE stages, each 
specialized for a different function, i.e. a different instruction set).
I think what matters most is how much data the PPE gets (the size of the 
packet head to be processed) and the number of instructions executed on it 
(the number of computations/lookups and the resulting memory accesses).
Apart from the clock rate and the number of threads per PPE, of course.
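
To make the point above a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is only a toy model under my own assumptions, not how any vendor's NPU actually schedules work: the function name `estimate_pps` and all of its parameters are hypothetical, one instruction is assumed to cost one cycle, and threads are assumed to overlap memory stalls perfectly.

```python
def estimate_pps(num_ppes, clock_hz, instructions_per_packet,
                 memory_accesses_per_packet=0, stall_cycles_per_access=0,
                 threads_per_ppe=1):
    """Toy estimate of packets per second for an array of identical PPEs.

    Assumptions (hypothetical, for illustration only):
    - one instruction costs one clock cycle
    - each memory access stalls the issuing thread for a fixed number of cycles
    - other threads on the same PPE can run while one thread is stalled
    """
    compute_cycles = instructions_per_packet
    stall_cycles = memory_accesses_per_packet * stall_cycles_per_access

    # A single thread needs compute + stall cycles per packet. With T threads
    # the stalls overlap, but the PPE can never go below the pure compute cost.
    cycles_per_packet = max(compute_cycles,
                            (compute_cycles + stall_cycles) / threads_per_ppe)

    return num_ppes * clock_hz / cycles_per_packet
```

In this model the pps figure scales with the PPE count, the clock rate and (up to a point) the thread count, and inversely with the per-packet instruction count and memory accesses. How the PPEs are wired together does not appear in it at all.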

A good example is QFP (ASR1K) versus QFA (CRS3).
Same SMP architecture, but a QFP PPE gets whole packet bodies and executes a 
massive instruction set on each, resulting in very limited pps performance, 
whereas a QFA PPE gets only packet heads and executes a limited instruction 
set, resulting in a massive improvement in pps performance.
Another good example is hyper-mode on the MX PFE: by reducing the instruction 
set that each PPE executes on every packet head it processes, you gain 
some extra pps performance.
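
As a rough illustration of why trimming the per-packet instruction count buys pps, here is a tiny sketch. It assumes the PPE is purely instruction-bound (pps inversely proportional to instructions per packet), which is a simplification; `pps_gain` is a hypothetical helper and the fraction passed to it is an arbitrary input, not a measured figure for any platform.

```python
def pps_gain(instruction_reduction):
    """Relative pps gain when the per-packet instruction count is cut by
    the given fraction, assuming pps is inversely proportional to the
    number of instructions executed per packet (hypothetical model)."""
    if not 0 <= instruction_reduction < 1:
        raise ValueError("instruction_reduction must be in [0, 1)")
    return 1.0 / (1.0 - instruction_reduction) - 1.0
```

For example, under this assumption, executing 20% fewer instructions per packet head gives about 25% more pps, not 20%, because the gain is 1/(1 - r) - 1.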

What I'm trying to say is that it doesn't matter that much how the PPEs are 
organized on the NPU chip (SMP, pipeline or even a SIMD architecture).
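
A simplified way to see this, as a sketch under my own assumptions (the function names are hypothetical, one instruction is one cycle, memory stalls and inter-stage queuing are ignored): take the same per-packet work and either run it whole on each of N identical PPEs, or split it across N pipeline stages.

```python
def smp_pps(num_ppes, clock_hz, cycles_per_packet):
    """SMP / run-to-completion: every PPE does all of the work for a packet."""
    return num_ppes * clock_hz / cycles_per_packet


def pipeline_pps(clock_hz, stage_cycles):
    """Pipeline: one PPE per stage, throughput is set by the slowest stage."""
    return clock_hz / max(stage_cycles)


# Same total work (400 cycles per packet head) on four PPEs at 1 GHz.
smp = smp_pps(4, 1e9, 400)
balanced = pipeline_pps(1e9, [100, 100, 100, 100])
unbalanced = pipeline_pps(1e9, [50, 150, 100, 100])
```

With perfectly balanced stages both organizations come out at the same pps in this model. The pipeline only falls behind when the stages are unbalanced, since the slowest stage then sets the rate. Either way the figure is driven by the work done per packet rather than by the layout as such.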

adam


        Adam Vitkovsky
        IP Engineer

T:      0333 006 5936
E:      adam.vitkov...@gamma.co.uk
W:      www.gamma.co.uk

_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
