Re: Reply: Re: Reply to comments -- Comparison between FAR and OSPF

Peter Psenak Wed, 11 Jun 2014 01:08:33 -0700

Richard,

On 6/11/14 06:11 , [email protected] wrote:
> Peter:
> 
> "the arguments you use against existing IGPs would be valid 20 years 
> ago, but not today."
> No matter in which year the arguments are referenced, the difference is 
> significant, which is determined by two different design concepts.


that is not a valid argument. The fact that there is a difference does
not mean anything, unless the difference itself makes an impact.

> 
> "First, links have high bandwidth, CPUs are fast and any serious IGP 
> implementation has addressed the bottlenecks you are talking about."
> As the number of nodes increases and high bandwidth is required, the
> number of protocol packets that must be transmitted and processed
> grows exponentially. But the CPU of a switch can only process a few
> hundred packets per second; therefore, the processing capacity of the
> CPU limits the growth in the number of nodes. I have tried to tune the
> CPU processing capacity in actual commercial systems: it can be raised
> to thousands of packets per second in some way, but at the expense of
> other protocol processing performance. Therefore, in a large-scale
> fat-tree system, the CPU processing capacity of those commercial
> systems cannot cope with a large number of OSPF protocol packets.
> 
> 'These days IGPs can support thousands of nodes in an area without any
> problem, and converge sub-second, with precomputed backups, even within
> a few tens of milliseconds.'
> Thousands of nodes is still a small scale. Once it comes to tens of
> thousands of nodes, an IGP will not do the job.

above is an academic statement.

First, you need to provide a real world scenario where tens of thousands
of nodes need to be deployed in a flat area. Secondly you need to
describe why the current IGPs would not be able to do the job or be
improved to do it.
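
For a sense of scale on the periodic-refresh load debated in this
thread, here is a rough back-of-the-envelope sketch. The only protocol
fact it uses is the 1800-second LSA refresh interval mentioned below;
the LSA counts and the function name are hypothetical, and the model
assumes refreshes are spread evenly over the interval. It ignores
duplicate copies received during flooding, acknowledgements, and the
packing of several LSAs into one update packet, so treat the output as
an order-of-magnitude estimate only.

```python
# OSPF refreshes each LSA every 1800 seconds (LSRefreshTime).
# Everything else in this sketch is an assumed, illustrative input.
LS_REFRESH_TIME_S = 1800


def refresh_rate_per_second(num_lsas: int) -> float:
    """Average number of LSA refreshes per second in one area,
    assuming refreshes are spread evenly over the refresh interval."""
    return num_lsas / LS_REFRESH_TIME_S


if __name__ == "__main__":
    # Hypothetical LSA counts for a small, a large and a very large area.
    for num_lsas in (2_000, 20_000, 200_000):
        rate = refresh_rate_per_second(num_lsas)
        print(f"{num_lsas:>7} LSAs: about {rate:.1f} refreshes/s on average")
```

Under these assumptions the steady-state refresh load grows linearly
with the number of LSAs in the area, which is one way to frame the
question of whether it actually becomes a bottleneck.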


> 
> 'There are real deployments that clearly prove it, it's not an academic 
> statement.'
> I have developed and implemented a lot of routing and switching 
> commercial systems. These issues are not academic statements, but the 
> real lessons.
> 
> 'Even the periodic flooding can be avoided completely using RFC 4136.'
> RFC 4136 is only applicable to medium-sized networks. In terms of tens 
> of thousands of nodes in the data centers nowadays, it cannot do the job.
> 
> We need to be open-minded to adapt to the increasing scales of data 
> centers.

being open minded is fine. Defining a new protocol requires more than
just being open minded, otherwise we would have hundreds of routing
protocols defined already.

thanks,
Peter

> 
> Regards,
> 
> Richard Bin Liu
> 
> 
> 
> *Peter Psenak <[email protected]>*
> 
> 2014/05/15 14:41
> 
> To:
>       [email protected], Hannes Gredler <[email protected]>, "Alvaro 
> Retana (aretana)" <[email protected]>
> Cc:
>       [email protected], ytsun <[email protected]>, 
> [email protected], [email protected]
> Subject:
>       Re: Reply to comments -- Comparison between FAR and OSPF
> 
> Liu,
> 
> the arguments you use against existing IGPs would be valid 20 years ago,
> but not today. First, links have high bandwidth, CPUs are fast and any
> serious IGP implementation has addressed the bottlenecks you are talking
> about. These days IGPs can support thousands of nodes in an area without
> any problem, and converge sub-second, with precomputed backups, even
> within a few tens of milliseconds. There are real deployments that
> clearly prove it; it's not an academic statement. Even the periodic
> flooding can be avoided completely using RFC 4136.
> 
> regards,
> Peter
> 
> 
> On 5/15/14 07:08 , [email protected] wrote:
>  >
>  > Greetings all:
>  >
>  > I have read the comments on our draft; thank you for them.
>  >
>  > Some of the questions were already answered in our latest FAR
>  > presentation material (not presented at the meeting because of the
>  > hard deadline):
>  >
>  > --------"Draft is highly subjective. Data Centers are using existing
>  > protocols without problems."
>  > Why do OSPF and other conventional routing methods not work well in a
>  > large-scale network with several thousand routers?
>  > As everyone knows, the OSPF protocol uses multiple databases, exchanges
>  > more topology information (as seen in the following example) and uses
>  > a complicated algorithm. It requires routers to consume more memory
>  > and CPU processing capacity, but the rate at which a CPU can process
>  > protocol messages is very limited. When the network expands, the CPU
>  > quickly approaches its processing limit, and at that point OSPF
>  > cannot manage a larger network. The SPF algorithm itself does not
>  > thoroughly solve these problems.
>  >
>  > By contrast, FAR has neither the convergence delay nor the
>  > additional CPU overhead that SPF requires, because FAR already knows
>  > the regular structure of the whole network topology at the initial
>  > stage and does not need to run SPF periodically.
>  >
>  > One example of "more topological exchange information":
>  > In the OSPF protocol, LSAs are re-flooded every 1800 seconds. Especially
>  > in a larger network, the CPU and bandwidth consumed will soon reach
>  > the router’s performance bottleneck.
>  > To reduce these adverse effects, OSPF introduced the concept of
>  > Area (which still has not solved the problem thoroughly). By dividing
>  > the OSPF domain into several areas, the routers in one area do not
>  > need to know the topological details outside their area. (In comparison
>  > with FAR, once OSPF introduces the concept of Area, equivalent
>  > paths can no longer be selected across the whole network.)
>  >
>  > With Areas, OSPF can achieve the following:
>  > 1) Routers only need to maintain the same link state database as other
>  > routers within the same Area, rather than the same link state
>  > database as all routers in the whole OSPF domain;
>  > 2) A smaller link state database means fewer LSAs to process,
>  > which reduces the CPU consumption of routers;
>  > 3) The large number of LSAs is flooded only within the same Area.
>  > The negative effect, however, is that fewer routers can be managed
>  > in each OSPF area.
>  > By contrast, because FAR does not have the above disadvantages, FAR
>  > can manage a large-scale network even without dividing it into Areas.
>  >
>  > The aging time in OSPF exists to cope with the frequent routing
>  > changes and protocol message exchanges that occur in an
>  > irregular topology. Its negative effect is that,
>  > even when the network does not change, each LSA must be refreshed every
>  > 1800 seconds to reset its age. In a regular topology, where the
>  > routes are fixed, complex protocol message exchange and aging rules
>  > are not needed to reflect routing changes; the LFA mechanism in FAR
>  > is sufficient.
>  >
>  > Therefore, FAR can omit much unnecessary processing and packet
>  > exchange. The benefits are fast convergence and a much larger
>  > network scale than other dynamic routing protocols can support.
>  > There are already some successful implementations of simplified
>  > routing in regular topologies in HPC environments.
>  > Conclusion: As FAR needs few routing entries and the topology is
>  > regular, the database does not need to be updated regularly. Without
>  > aging, there is none of the CPU and bandwidth overhead caused
>  > by LSA flooding every 30 minutes, so expanding the network has no
>  > obvious effect on the performance of FAR, in contrast to OSPF.
>  >
>  > --------"Network convergence doesn't follow link state
>  >            dynamics - Fast reroute exists. "
>  >
>  > Comparison of convergence time:
>  > The OSPF spf_delay and spf_hold_time settings affect the
>  > convergence time. The convergence time of a network with 2480 nodes
>  > is about 15-20 seconds (as seen in the following pages), while FAR
>  > does not need to run SPF, so there is no such convergence time.
>  > These issues *still exist* in the fast-convergence techniques of OSPF
>  > and IS-IS (such as I-SPF). Convergence speed and network scale
>  > constrain each other. FAR does not have these problems, and its
>  > convergence time is almost negligible.
>  >
>  > The test data is included in a separate presentation named OSPF in
>  > DCN(2).pptx, which can be downloaded from the IETF.
>  >
>  > Looking forward to further discussion.
>  >
>  > Best.
>  >
>  > Richard Bin Liu
>  >
>  >
>  > _______________________________________________
>  > rtgwg mailing list
>  > [email protected]
>  > https://www.ietf.org/mailman/listinfo/rtgwg
>  >
> 
> 

