Hi Gem5-dev,

We have a question about Garnet that I'm hoping someone out there can 
answer.  We are hitting a bug when sending ordered messages across Garnet: 
two request messages sent on the same ordered virtual network (4 virtual 
channels per virtual network) are delivered out of order.  The connection is 
NIC->fixed router->NIC, with only one router on the path.

The offending code appears in VCallocator_d::is_invc_candidate().  
The relevant part of this function is pasted below:

        if ((m_router->get_net_ptr())->isVNetOrdered(vnet)) {
            for (int vc_offset = 0; vc_offset < m_vc_per_vnet; vc_offset++) {
                int temp_vc = invc_base + vc_offset;
                if (m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_,
                                                          m_router->curCycle()) &&
                    (m_input_unit[inport_iter]->get_route(temp_vc) == outport) &&
                    (m_input_unit[inport_iter]->get_enqueue_time(temp_vc) <
                         t_enqueue_time)) {

Unfortunately, like most of Garnet, this code has no comments, so we are left 
to guess what it is trying to do.  The three && conditions in the if statement 
are particularly confusing, especially the "need_stage" call.  Does anyone 
understand the intent?  It appears to be trying to ensure that ordered 
messages are delivered in order.  The problem is that if the three conditions 
are never all satisfied for any of the VCs, the function falls through and 
returns true.  The result is that two messages that should be delivered 
in order are enqueued on different virtual channels, so the round-robin 
scheduler can easily select the second message first in the next stage.
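
The fall-through described above can be sketched as a small standalone C++ 
model.  This is not the gem5 code: VcState, is_invc_candidate_model and 
trace_scenario_allows_newer_vc are hypothetical names, and the sketch assumes 
that the body of the if statement returns false (the fragment above does not 
show it).  The values in the scenario are the ones reported in the gdb session 
further down: three sibling VCs enqueued at 10000 for which need_stage is 
false, and the VC under test enqueued at 38936.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-VC state that the real code reads
// through need_stage(), get_route() and get_enqueue_time().
struct VcState {
    bool needs_va;          // models need_stage(vc, VC_AB_, VA_, curCycle)
    int route;              // models get_route(vc)
    std::uint64_t enqueue;  // models get_enqueue_time(vc)
};

// Assumed behaviour: the VC at index `self` is rejected only if some VC in
// the same ordered vnet is waiting for VC allocation, targets the same
// outport, and was enqueued strictly earlier.  Otherwise the loop falls
// through and the VC is accepted.
bool is_invc_candidate_model(const std::vector<VcState>& vnet_vcs,
                             std::size_t self, int outport)
{
    assert(self < vnet_vcs.size());
    const std::uint64_t t_enqueue_time = vnet_vcs[self].enqueue;
    for (const VcState& vc : vnet_vcs) {
        if (vc.needs_va && vc.route == outport &&
            vc.enqueue < t_enqueue_time) {
            return false;  // an older waiting message goes first
        }
    }
    return true;  // fall-through: nothing blocked this VC
}

// Scenario built from the gdb values: the older siblings share the route
// but need_stage is false for them, so they never block the newer VC.
bool trace_scenario_allows_newer_vc()
{
    const int outport = 0;
    const std::vector<VcState> vcs = {
        {false, 0, 10000},
        {false, 0, 10000},
        {false, 0, 10000},
        {true, 0, 38936},  // the VC being tested
    };
    return is_invc_candidate_model(vcs, 3, outport);
}
```

In this model the older VCs can only block the newer one while they are 
themselves waiting for VC allocation.  Once need_stage is false for them, the 
enqueue-time comparison no longer has any effect, which matches the 
fall-through seen in the trace.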

Below is the detailed analysis from gdb for the 4 iterations of the loop.

We are planning to implement a fix for this, but first we need to determine 
what the code was trying to do.

Thanks,

Brad

(gdb) display m_input_unit[inport_iter]->get_enqueue_time(temp_vc)
1: m_input_unit[inport_iter]->get_enqueue_time(temp_vc) = {
  c = 10000
}
(gdb) display t_enqueue_time
2: t_enqueue_time = {
  c = 38936
}
(gdb) display m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_, m_router->curCycle())
3: m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_, m_router->curCycle()) = false
(gdb) display (m_input_unit[inport_iter]->get_route(temp_vc) == outport)
4: (m_input_unit[inport_iter]->get_route(temp_vc) == outport) = true
(gdb) display m_input_unit[inport_iter]->get_route(temp_vc)
5: m_input_unit[inport_iter]->get_route(temp_vc) = 0
(gdb) p m_vc_per_vnet
$1 = 4
(gdb) c
Continuing.

Breakpoint 1, VCallocator_d::is_invc_candidate (this=0x47607e0, inport_iter=1, invc_iter=15)
    at build/X86_MESI_Three_Level/mem/ruby/network/garnet/fixed-pipeline/VCallocator_d.cc:139
139                 if (m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_,
5: m_input_unit[inport_iter]->get_route(temp_vc) = 0
4: (m_input_unit[inport_iter]->get_route(temp_vc) == outport) = true
3: m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_, m_router->curCycle()) = false
2: t_enqueue_time = {
  c = 38936
}
1: m_input_unit[inport_iter]->get_enqueue_time(temp_vc) = {
  c = 10000
}
(gdb) c
Continuing.

Breakpoint 1, VCallocator_d::is_invc_candidate (this=0x47607e0, inport_iter=1, invc_iter=15)
    at build/X86_MESI_Three_Level/mem/ruby/network/garnet/fixed-pipeline/VCallocator_d.cc:139
139                 if (m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_,
5: m_input_unit[inport_iter]->get_route(temp_vc) = 0
4: (m_input_unit[inport_iter]->get_route(temp_vc) == outport) = true
3: m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_, m_router->curCycle()) = false
2: t_enqueue_time = {
  c = 38936
}
1: m_input_unit[inport_iter]->get_enqueue_time(temp_vc) = {
  c = 10000
}
(gdb) c
Continuing.

Breakpoint 1, VCallocator_d::is_invc_candidate (this=0x47607e0, inport_iter=1, invc_iter=15)
    at build/X86_MESI_Three_Level/mem/ruby/network/garnet/fixed-pipeline/VCallocator_d.cc:139
139                 if (m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_,
5: m_input_unit[inport_iter]->get_route(temp_vc) = 0
4: (m_input_unit[inport_iter]->get_route(temp_vc) == outport) = true
3: m_input_unit[inport_iter]->need_stage(temp_vc, VC_AB_, VA_, m_router->curCycle()) = true
2: t_enqueue_time = {
  c = 38936
}
1: m_input_unit[inport_iter]->get_enqueue_time(temp_vc) = {
  c = 38936
}

