after yesterday's discussion, I decided to run a basic benchmark of how scalable the method is - to answer the question "is this going to be a bottleneck - yes or no?".
This is a back-of-envelope, worst-case analysis, without _any_ attempt at optimisation (such as: using faster protobuf encoding methods than per-scalar, substituting member names by ID references, grouping into submessages with different reporting-interval needs, and so forth).

what size problem?
------------------

Assume we were to report EMC_STAT contents through HAL messaging. Since part of the reporting effort comes from per-value serialisation overhead of EMC_STAT members, we need an estimate for the number of scalars.

Looking at EMC_STAT with the following assumptions:

- just counting scalars
- leaving out strings (these occur only in EMC_TASK_STAT, which doesn't originate in HAL!)
- multiplying each EmcPose field by 9, assuming it is unrolled into x, y, z etc. discrete scalars
- unrolling arrays
- leaving out the tool table altogether
- assuming current array-size build parameters (NUM_DIO, NUM_AIO, MAX_AXES etc.)

EMC_STAT contains:

EMC_TASK_STAT   =  73 scalars total
EMC_IO_STAT     =   8 scalars (plus the tool table, which will not live there in the future)
EMC_MOTION_STAT = 508 scalars total. NB: of these, 256 are digital/analog I/O pins; by default only 4 of the 64 possible are exported in HAL, though.

in total: 589 scalars (the arithmetic is spelled out in the PS below).

Test:
-----

I created a fake HAL group of 500 float signals and ran that through the halreport component, which is the single CPU user in the scheme (the group setup is sketched in the PS as well).

run variants:

- running at 100 and 1000 reports/sec
- running with and without protobuf encoding plus ZeroMQ message sending (to gauge encoding/transmit cost)

results:

- on an Intel Core2, 100 reports/sec creates about 3% load on one core with encoding/sending, and about 1-2% without encoding/sending values.
- increasing the reporting frequency to 1000/sec (which is absurdly high) raises this to 18% CPU with encoding. Connecting a client adds another 1-2%.
- removing protobuf encoding and the ZeroMQ publishing drops this to about 11% at 1000/sec.

Actually, at 1000/sec optimisation is needed on the client side, not the publisher side: the simple client I referred to uses the (notoriously slow) pure-Python protobuf decoding, even though a compiled C-extension mapping is already supported, which I didn't build yet. Unoptimized, the client maxes out a core at 1000/sec (see the client sketch in the PS).

---

Summary: this is entirely feasible, even with a completely dumb brute-force approach to mapping EMC_STAT.

- Michael
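
PS: the scalar arithmetic spelled out, for the record. Nothing here is measured; it just sums the three subtotals from above so the estimate is reproducible:

    # tally of EMC_STAT scalars under the assumptions listed above
    # (strings and tool table excluded, EmcPose unrolled 9-fold,
    #  arrays unrolled at current build parameters)
    task_stat   = 73     # EMC_TASK_STAT
    io_stat     = 8      # EMC_IO_STAT, tool table excluded
    motion_stat = 508    # EMC_MOTION_STAT, incl. 256 dio/aio pins
    print(task_stat + io_stat + motion_stat)    # -> 589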
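
The fake group was generated along these lines - a sketch only: newsig is stock halcmd, but the group commands used here (newg/newm) come from the hal group work and their names/syntax may differ in your tree:

    # emit a halcmd script creating 500 float signals in one group
    N = 500
    print("newg fakegroup")                     # assumed group-create command
    for i in range(N):
        print("newsig fake-%d float" % i)       # stock halcmd signal creation
        print("newm fakegroup fake-%d" % i)     # assumed group-member command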
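
And the unoptimized client is essentially the following - again a sketch: the endpoint, the report_pb2 module and the HalReport message name are placeholders, not the actual names in the tree:

    import zmq
    import report_pb2                        # placeholder generated protobuf module

    context = zmq.Context()
    sock = context.socket(zmq.SUB)
    sock.connect("tcp://127.0.0.1:5500")     # placeholder endpoint
    sock.setsockopt(zmq.SUBSCRIBE, b"")      # subscribe to all topics

    while True:
        parts = sock.recv_multipart()
        report = report_pb2.HalReport()      # placeholder message type
        # the pure-Python ParseFromString() is the hotspot at 1000 msgs/sec;
        # payload assumed to be the last frame
        report.ParseFromString(parts[-1])

Rebuilding the protobuf Python bindings with the C++ extension (PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp) should take care of most of that decode cost.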