Ram, I understand everything you say below. But you aren't commenting on my main point: why is it worth the *extra* cost of running several hashes in parallel to exclude a *small* fraction of mis-identified large flows? To me, that is not a useful engineering trade-off.
Regards Brian On 14/01/2013 17:48, ramki Krishnan wrote: > Hi Brian, > > > >>> I think didn't explain my point sufficiently. I understand why you suggest >>> using several hashes. But this is statistical load balancing we are talking >>> >>about. If a few % of the time, you mistakenly treat several medium flows >>> as one large flow, and therefore rebalance them as a single unit, so what? >>> >>You will still balance the traffic reasonably well. > > > > Each of the flows is eventually learnt/programmed in a flow table (hardware > table resource where the flow is finally committed); there is never a case of > bundling multiple flows. More details below > > > > 1) Scalable detection of long-lived large flows > > a. The goal is to keep the size of the flow table (hardware table resource > where the long-lived large flow is finally committed) bounded and the > processing requirements (CPU utilization) for flow learning bounded. > > b. For satisfying the above goal, several hashes helps. > > > > 2) Scalable load-balancing of long-lived large flows > > a. The goal is to have a scalable load balancing solution which produces > meaningful results while keeping the processing requirements (CPU > utilization) for load-balancing bounded. > > b. For satisfying the above goal, it is not worthwhile to load-balance > medium/small flows. > > > > Thanks, > > ram > > > > -----Original Message----- > From: Brian E Carpenter [mailto:brian.e.carpen...@gmail.com] > Sent: Sunday, January 13, 2013 12:06 AM > To: ramki Krishnan > Cc: draft-krishnan-opsawg-large-flow-load-balanc...@tools.ietf.org; > opsawg@ietf.org > Subject: Re: [OPSAWG] I-D Action: > draft-krishnan-opsawg-large-flow-load-balancing-02.txt > > > > Ram, > > > > On 13/01/2013 05:14, ramki Krishnan wrote: > >> Hi Brian, > > >> Thanks a lot for your comments. Please find answers to some of your >> comments. We will respond to your other comments shortly. > > >>> There may be some false positives due to multiple other flows > >>> masquerading as a large flow; the amount of false positives is > >>> reduced by parallel hashing using different hash functions > > >> Brian: > > >>> To give you some data, with a 20 bit ID space, the FNV1a-32 hash algorithm >>> gives at most 5% collisions, based on IPv6 headers in real packet traces. >>> [https://researchspace.auckland.ac.nz/handle/2292/13240]. I wonder whether >>> the overhead of running several hashes in parallel is justified by this >>> collision rate? > > >> Ram: > > >> The need for multiple hashes is specific to the suggested algorithm on >> automatic hardware identification - this algorithm is similar to a bloom >> filter which uses multiple hash functions. > > > > I think didn't explain my point sufficiently. I understand why you suggest > using several hashes. But this is statistical load balancing we are talking > about. If a few % of the time, you mistakenly treat several medium flows as > one large flow, and therefore rebalance them as a single unit, so what? You > will still balance the traffic reasonably well. > > > > Brian > > > >> "On packet arrival, a new flow is looked up in parallel in all the hash >> tables and the corresponding counter is incremented. If the counter exceeds >> a programmed threshold in a given time interval in all the hash table >> entries, a candidate large flow is learnt and programmed in a hardware table >> resource like TCAM. > > > >> For a short-lived flow to masquerade as a long-lived lived flow, it needs to >> match all the hash table entries which is a joint probability event - thus, >> the amount of false positives due to short-lived flows is reduced. > > > >> Thanks, > > >> ram > > >> -----Original Message----- > >> From: opsawg-boun...@ietf.org<mailto:opsawg-boun...@ietf.org> >> [mailto:opsawg-boun...@ietf.org] On > >> Behalf Of Brian E Carpenter > >> Sent: Saturday, January 12, 2013 8:40 AM > >> To: >> draft-krishnan-opsawg-large-flow-load-balanc...@tools.ietf.org<mailto:draft-krishnan-opsawg-large-flow-load-balanc...@tools.ietf.org> > >> Cc: opsawg@ietf.org<mailto:opsawg@ietf.org> > >> Subject: Re: [OPSAWG] I-D Action: > >> draft-krishnan-opsawg-large-flow-load-balancing-02.txt > > >> Hi, > > >> My comments are on the discussion of flow IDs and hashing. I'm not >> commenting at all on the overall proposal, because I can't judge whether the >> problem is real or the solution is practical. > > >>> A large space of the flow identifications, i.e. finer > > >>> granularity of the flows, conducts more random in spreading the flows > >>> over a set of component links. > > >> That isn't accurate. The requirement is an ID space in which the IDs belong >> to a uniform distribution. Technically speaking, if you have two links, a >> one-bit flow ID is sufficient, as long as the values 0 and 1 are equally >> likely to appear. > >> Therefore, the practical issue is not the size of the ID space but the >> quality of the hash function used to generate the ID of each flow. > >> However, whatever the initial ID space, the final hash has to be down to >> 0..N if you have N+1 alternative paths. > >> I think the reason that your model needs a larger ID space is to reduce the >> probability of two flows colliding by chance in the ID space. > >> That would defeat your wish to separate out large flows. > > >>> The advantages of hashing based load > >>> distribution are the preservation of the packet sequence in a flow > >>> and the real time distribution with the stateless of individual > >>> flows. If the traffic flows randomly spread in the flow > >>> identification space, the flow rates are much smaller compared to the > >>> link capacity, > > >> That sounds like magic. I don't think you mean that at all. > > >>> and the rate differences are not dramatic, > > >> Do you mean that the total traffic rate is more fairly distributed across >> the links? In any case, "dramatic" isn't an engineering term. > > > >>> the hashing > >>> algorithm works very well in general. > > >> How can you say that without specifying a particular algorithm? Also, "very >> well in general" isn't an engineering term either. > > >>> There may be some false positives due to multiple other flows > >>> masquerading as a large flow; the amount of false positives is > >>> reduced by parallel hashing using different hash functions > > >> To give you some data, with a 20 bit ID space, the FNV1a-32 hash algorithm >> gives at most 5% collisions, based on IPv6 headers in real packet traces. > >> [https://researchspace.auckland.ac.nz/handle/2292/13240] > >> I wonder whether the overhead of running several hashes in parallel is >> justified by this collision rate? > > >> Regards > > >> Brian Carpenter > >> _______________________________________________ > > >> OPSAWG mailing list > >> OPSAWG@ietf.org<mailto:OPSAWG@ietf.org<mailto:OPSAWG@ietf.org%3cmailto:OPSAWG@ietf.org>> > >> https://www.ietf.org/mailman/listinfo/opsawg _______________________________________________ OPSAWG mailing list OPSAWG@ietf.org https://www.ietf.org/mailman/listinfo/opsawg