Václav S wrote:
>> The damping dispatcher takes 89% of cpu time to do
>> something else, namely :
>> - LocateMultivirtualFunctor : 50%
>>
> Good to know.
>
To be more precise: I'm not sure the dispatching mechanism is slow by itself. What takes a lot of time in the dispatcher is, for instance, BodyMacroParameters::getBaseClassIndex() (around 50%), or the find() function I mentioned before (10%).
I'm attaching a file with the list of function costs in LocateMultivirtualFunctor.
Note that getBaseClassIndex() is 10 times slower than getClassIndex(); I wonder why...

Bruno

> Another way would be to have boost::ublas::compressed_matrix for the 2D
> case, if 2D array of some 10000 elements is too much (OK, if we have
> 1000 classes once, that would mean 1e6 which is not nice). Should be
> reasonably fast as well. No 3D functors, though ;-).
>
> Will play with that if I find some time. That would sound like big speed
> improvement (there is quite a few dispatchers in the simulation, if they
> take 30% of time, that means 15% for dispatching... wow).
>
> I thought originally this stuff was part of Loki and was well optimized.
>
>
>> A simple example :
>>
>> shared_ptr<PhysicalAction>& PhysicalActionVectorVector::find(unsigned int id, int actionIndex)
>> {
>>     if( current_size <= id ) // this is very rarely executed, only at the beginning.
>>     // somebody is accessing out of bounds, make sure he will find what
>>     // he needs - a reset PhysicalAction of his type
>>     {
>>         ....
>>     }
>>     usedIds[id] = true;
>>     return physicalActions[id][actionIndex];
>> }
>>
>> 1. There is this test at the beginning, which is useless all the time
>> except at iteration 1.
>> 2. There is this usedIds[id] assignment (same remark again, only
>> modified at iteration 1).
>> 3. Then it returns a reference to a shared pointer (while shared_ptr
>> operations are slower than normal).
>>
>>
> You're right, I thought that 1. and 2. was the business of ::prepare,
> which is called from PhysicalActionContainerInitializer. It should be
> the user's responsibility to use valid indices, right?
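
For what it's worth, here is a minimal sketch of what that split could look like. It is not the actual YADE code: ActionTable, Action, prepare() and get() are made-up names, and std::shared_ptr (C++11) stands in for whatever shared_ptr the real container uses. All sizing happens once in prepare(), and the hot path is a plain index whose bounds are only checked by assert().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for PhysicalAction.
struct Action {
    double value = 0.0;
};

// Hypothetical container: rows are body ids, columns are action type indices.
class ActionTable {
public:
    // One-time setup: allocate every slot so the hot path never has to grow.
    void prepare(std::size_t numBodies, std::size_t numActionTypes) {
        slots_.assign(numBodies,
                      std::vector<std::shared_ptr<Action>>(numActionTypes));
        for (auto& row : slots_)
            for (auto& slot : row)
                slot = std::make_shared<Action>();
    }

    // Hot path: no resize test and no usedIds bookkeeping, just an index.
    // The asserts document the caller's obligation to pass valid indices.
    Action& get(std::size_t id, std::size_t actionIndex) {
        assert(id < slots_.size());
        assert(actionIndex < slots_[id].size());
        return *slots_[id][actionIndex];
    }

    std::size_t size() const { return slots_.size(); }

private:
    std::vector<std::vector<std::shared_ptr<Action>>> slots_;
};
```

With this layout an out-of-range index fails loudly in a debug build instead of silently growing the table; whether that trade-off is acceptable is for the container's users to decide.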
>
> UsedIds, that's a different story; it says what index contains a
> non-null action so that we can iterate over non-null actions. But this
> flag could be put into the action instance itself. Then
> physicalActions[id][actionIndex] could be used directly.
>
>> Of course, this function "find" is used many many times. Result: 3.81% of
>> total CPU time just for it, while we could just use
>> physicalActions[id][actionIndex] instead of find(), and reduce that time
>> a lot. But of course physicalActions is private...
>>
>>
> Just change that in the header and make it public, no problem. If the
> "user" screws stuff, it is his problem, not mine. I think Olivier liked
> having lot of stuff private, but then we have overheads for
> accessors. Sadly, c++ has no way to say: this member is public for
> reading and private for writing, which could help in many cases.
>
>> Janek : I know that containers will be changed anyway, and that is why I
>> think this is the right moment to cry about speed. :)
>> Users don't care about Godwin's law, they need SPEED!!!
>>
>>
> Cosurgi, should I fiddle with PhysicalActionContainer or is it going to
> be changed anyway?
>
>> Yes, I thought of that too, but I was not expert enough to be sure.
>> Can't the compiler check that nothing will be affected in some situations?
>>
>>
> It can in some, but I don't know in which ones. Putting const everywhere
> also checks that there are no side-effects of the method on its
> instance, for example.
>
> Another thing is that many methods should be inlined, but grepping
> through headers reveals that only very small subset is: for the most
> part, the openGl wrapper. Inlining functions is turned on by the -O3
> flag to g++, so we probably gain nothing here. But gcc manpage
> says for -finline-limit: " This option is particularly useful for
> programs that use inlining heavily such as those based on recursive
> templates with C++." which is just our case.
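
On the "public for reading, private for writing" point, the usual workaround is a const accessor defined in the class body. Such a member is implicitly inline, so with optimization on, compilers can usually reduce the call to a direct member access. A minimal sketch with hypothetical names (Actions, forces(), addForce() and totalForce() are not YADE identifiers):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Actions {
public:
    explicit Actions(std::size_t n) : forces_(n, 0.0) {}

    // Read-only view of the private data. Defined in the class body,
    // so it is implicitly inline.
    const std::vector<double>& forces() const { return forces_; }

    // Writes go through a separate non-const member.
    void addForce(std::size_t id, double f) {
        assert(id < forces_.size());
        forces_[id] += f;
    }

private:
    std::vector<double> forces_;
};

// Takes a const reference: the compiler rejects any attempt to call a
// non-const member such as addForce() on 'a' inside this function.
inline double totalForce(const Actions& a) {
    double sum = 0.0;
    for (double f : a.forces()) sum += f;
    return sum;
}
```

Readers get direct access to the data through forces(), while every write still has to go through addForce().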
>
> And by the way, I just found out that assert(...) is not compiled-out in
> optimized builds of scons, since we don't define NDEBUG. Will be fixed
> in next commit. No big deal ;-)
> _______________________________________________
> yade-dev mailing list
> yade-dev@lists.berlios.de
> https://lists.berlios.de/mailman/listinfo/yade-dev
>
>

-- 
_______________
Chareyre Bruno
Maitre de conference
Institut National Polytechnique de Grenoble
Laboratoire 3S (Soils Solids Structures) - bureau E145
BP 53 - 38041, Grenoble cedex 9 - France
Tél : 33 4 56 52 86 21
Fax : 33 4 76 82 70 43
________________
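
P.S. A small sketch of what the NDEBUG remark means in practice. The function names are hypothetical; the only thing relied on is standard behaviour: when NDEBUG is defined before <cassert> is included (for example by passing -DNDEBUG to g++), assert(expr) expands to nothing and expr is never evaluated.

```cpp
#include <cassert>

// Reports whether assert() is active in this translation unit.
constexpr bool assertsEnabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Returns how many times the asserted expression was evaluated:
// 1 in a build without NDEBUG, 0 when NDEBUG is defined.
// The side effect inside assert() is deliberate and for demonstration
// only; real code should never rely on it.
inline int evaluationsOfAssertedExpression() {
    int evaluations = 0;
    assert(++evaluations > 0);
    return evaluations;
}
```

In other words, without NDEBUG every assert in a hot path such as find() is still paid for in the optimized build.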