Yes, 32 bits.

Filip

> One quick question: 32 or 64 bits? Looks like 32, right?
>
> Christian
>
> --
> Christian Schulte, www.it.kth.se/~cschulte/
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Filip Konvicka
> Sent: Thursday, February 12, 2009 2:21 PM
> To: [email protected]
> Cc: Luboš Moric
> Subject: [gecode-users] Cloning problems
>
> Hi,
>
> [Sorry, this is a looong message...]
>
> we're hunting a serious bug that occurs during space cloning in 2.2.0.
> The bug occurs very rarely, but we have a test case that triggers this
> behavior.
>
> We have many constraints in the problem instance and the solver should
> post as many propagators as possible. We have a custom branching for
> this, which posts one propagator at a time in commit(), while the
> alternative is not to post the propagator (i.e. a no-op). Because we're
> only looking for the first solution, in the case of a failure we no
> longer need the path back to the root in the recomputation tree, so we
> decided to use our own simple search engine for this. The standard DFS
> search engine exhibits exactly the same behavior (both with
> recomputation on and off), and we don't see any problems with our search
> engine.
>
> Everything seems to work for the vast majority of the test cases, but
> there are a few instances that cause problems, probably during cloning
> (though they could also be caused by some earlier bad subscribe or
> unsubscribe). From our point of view, there is nothing wrong or special
> about the instances. The crashes occur at the same location both on
> Linux and Windows, in both release and debug builds. Changing memory
> management (e.g. never deleting Spaces in the search engine) can cause
> the crash to occur at slightly different places (e.g. some propagation
> during status() after clone() finishes).
>
> One particular case we're looking at now crashes at core.icc:2270, where
> f[0] is a bad pointer (0xfeeefeee on Windows). We're not sure how this
> can happen. We know that in this case n==2 at core.icc:2255, so idx[0]
> is a bad pointer at core.icc:2252. This is also what Valgrind says on
> Linux (bad read of size 4).
>
> When we were trying to debug the other cases, we found out that the
> subscription list in a variable in the cloned space contained an actor
> link that was probably copied incorrectly: it looked like a pure
> ActorLink such as Space::a_actors, with a totally different address than
> the rest of the actors (probably belonging to the original space
> object). When we tried to find out when this actor link entered the
> list, we ended up in VarImp<VIC>::update again.
>
> We're (of course:-)) using FloatVars in the model, and we eliminated all
> other kinds of variables and propagators. In our case, pc_max==1 and
> free_bits==0.
>
> We find it difficult to understand what is happening during cloning. We
> would appreciate it if someone explained the basic idea. We only have
> FloatVars, propagators and one branching (no advisors or other types of
> actors).
>
> We know how VarImp<VIC>::resize works, that's easy. In
> VarImp<VIC>::enter, we can't see why you do "--idx[0];" as the first
> iteration of the for loop overwrites it (as long as pc>0, of course).
> Maybe it is just an optimization. As for VarImp<VIC>::update, we can
> only guess. We suspect that a) the original x->idx[0] is destroyed
> somewhere, so it needs to be restored from a memcpy backup at idx[0],
> and b) ActorLink::_prev is probably used to map old actors to new ones
> (thus the "->prev()"). We did not dig deep enough to be sure though, so
> we'd welcome some guidance here.
>
> Cheers,
> Filip
>
>
> _______________________________________________
> Gecode users mailing list
> [email protected]
> https://www.gecode.org/mailman/listinfo/gecode-users
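
A note on the 0xfeeefeee value in the quoted message: on Windows this is one of several well-known fill patterns, and seeing it as a pointer usually means the memory was read after it had been freed. The helper below is a minimal sketch for recognizing such values while debugging. The function name describe_fill_pattern is hypothetical and is not part of Gecode or any library. The patterns are the documented Windows heap and MSVC debug runtime fills.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical debugging helper (not part of Gecode): returns a short
// description if a 32-bit value matches a well-known Windows/MSVC fill
// pattern, and an empty string otherwise.
std::string describe_fill_pattern(std::uint32_t value) {
    switch (value) {
        case 0xFEEEFEEEu:
            return "heap memory already freed (HeapFree fill)";
        case 0xDDDDDDDDu:
            return "heap memory already freed (debug CRT fill)";
        case 0xCDCDCDCDu:
            return "uninitialized heap memory (debug CRT fill)";
        case 0xCCCCCCCCu:
            return "uninitialized stack memory (MSVC runtime checks)";
        case 0xFDFDFDFDu:
            return "guard bytes around a heap block (debug CRT)";
        default:
            return "";
    }
}
```

Calling describe_fill_pattern(0xFEEEFEEEu) reports freed heap memory, which would be consistent with the "bad read" that Valgrind reports on Linux, although it does not say who freed the block.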

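
On guess (b) in the quoted message, that a link in the original actor is used to map old actors to new ones: a general way to do this kind of mapping during a copy is to leave a forwarding pointer in each original object. The sketch below shows only that general technique in standalone C++. It is not Gecode's code, and whether VarImp<VIC>::update works this way is exactly the open question. All names (Actor, Var, Store, fwd, clone_store, subscriptions_are_local) are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a propagator; not a Gecode type.
struct Actor {
    int id;
    Actor* fwd = nullptr;  // forwarding pointer, valid only during a copy
    explicit Actor(int i) : id(i) {}
};

// Hypothetical stand-in for a variable with a subscription list.
struct Var {
    std::vector<Actor*> subs;  // non-owning pointers into Store::actors
};

struct Store {
    std::vector<std::unique_ptr<Actor>> actors;
    std::vector<std::unique_ptr<Var>> vars;
};

std::unique_ptr<Store> clone_store(Store& src) {
    auto dst = std::make_unique<Store>();
    // Pass 1: copy every actor and record where its copy lives.
    for (auto& a : src.actors) {
        dst->actors.push_back(std::make_unique<Actor>(a->id));
        a->fwd = dst->actors.back().get();
    }
    // Pass 2: copy variables, redirecting each subscription to the copy.
    for (auto& v : src.vars) {
        auto c = std::make_unique<Var>();
        for (Actor* a : v->subs) {
            assert(a->fwd != nullptr);  // subscriber must have been copied
            c->subs.push_back(a->fwd);
        }
        dst->vars.push_back(std::move(c));
    }
    // Pass 3: restore the originals so no stale forwarding pointer remains.
    for (auto& a : src.actors) a->fwd = nullptr;
    return dst;
}

// True if every subscription in s points at an actor owned by s itself.
bool subscriptions_are_local(const Store& s) {
    for (const auto& v : s.vars) {
        for (const Actor* sub : v->subs) {
            bool found = false;
            for (const auto& a : s.actors) {
                if (a.get() == sub) { found = true; break; }
            }
            if (!found) return false;
        }
    }
    return true;
}
```

A check like subscriptions_are_local, run on the clone, is the kind of test that would flag the symptom described in the message, where a subscription list in the cloned space holds a link whose address belongs with the original space. In this sketch, a subscription to an actor that was never copied trips the assertion in pass 2 instead of silently leaking an old pointer into the clone.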