Hi Mark.  Spelunking through the hashpipe_pktsock.h header file, I see that

#define PKT_UDP_DATA(p) (PKT_NET(p) + 0x1c)

In your code you posted earlier, you have this:

// Ignore both UDP (8 bytes) and packet header (8 bytes)
memcpy(dest_p, payload, PKT_UDP_SIZE(frame) - 16);

Have you verified that all these magic numbers add up, that is, the 16, the
0x1c, and any other constants like them?  From your description it seems
clear that the code is reading from unallocated memory, but it's difficult
to see where from the snippets of code we have.  Also, make sure that any
pointer arithmetic casts the pointer to the correct type before adding the
offsets.
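
As a rough way to check those constants, here is a minimal sketch. It assumes that PKT_NET(p) points at the start of an IPv4 header, in which case 0x1c (28) is a 20-byte IPv4 header with no options plus the 8-byte UDP header. The function names are hypothetical and are not part of hashpipe. The second helper shows why the cast matters: adding n to a uint32_t pointer advances 4*n bytes, not n.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    UDP_HDR_LEN = 8 /* source port, destination port, length, checksum */
};

/* Offset of the UDP payload from the start of the IPv4 header.
 * The low nibble of the first IPv4 byte (IHL) is the header length in
 * 32-bit words, so a header without options is 5 words = 20 bytes. */
static size_t udp_payload_offset(const uint8_t *ip_hdr)
{
    size_t ihl_bytes = (size_t)(ip_hdr[0] & 0x0f) * 4;
    return ihl_bytes + UDP_HDR_LEN;
}

/* Number of bytes actually skipped when n is added to a uint32_t pointer.
 * p + n must stay inside (or one past) the array p points into. */
static ptrdiff_t bytes_advanced_u32(const uint32_t *p, size_t n)
{
    return (const uint8_t *)(p + n) - (const uint8_t *)p;
}
```

For a header whose first byte is 0x45 the offset comes out as 0x1c, matching the macro. If the sender ever adds IPv4 options (first byte 0x46 or higher), a fixed 0x1c would land inside the UDP header rather than at the payload.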



On Mon, Nov 30, 2020 at 3:45 PM Mark Ruzindana <ruziem...@gmail.com> wrote:

> Hi David,
>
> Hope everything is fine. It's okay if you haven't seen it yet or forgot,
> but I'm still struggling with this issue. Would you mind giving me some
> thoughts on it if you have any? Here is the issue again, along with a
> summary of what I did to catch you up, just in case you need it:
>
> I was able to install hashpipe with the suid bit set as you suggested
> previously. So far, I have been able to capture data with the first round
> of frames of the circular buffer i.e. if I have 160 frames, I am able to
> capture packets of frames 0 to 159 at which point right at the memcpy()
> in the process_packet() function of the net thread, I get a segmentation
> fault.
>
> And the suggestions that you provided were very helpful with diagnosis,
> but the problem hasn't been resolved yet.
>
> I'm currently using gdb to debug and it either tells me that I have a
> segmentation fault at the memcpy() in process_packet() or something very
> strange happens where the starting mcnt of a block greatly exceeds the mcnt
> corresponding to the packet being processed and there's no segmentation
> fault because the mcnt distance becomes negative so the memcpy() is
> skipped. Hopefully that wasn't too hard to track. Very strange problem that
> only occurs with gdb and not when I run hashpipe without it. Without gdb, I
> get the same segmentation fault at the end of the circular buffer as
> mentioned above.
>
> I also omitted the "+ input_databuf_idx(...)" to test for buffer overflow,
> and got the same result (segmentation fault).
>
> I checked to make sure that the blocks are large enough for the number of
> frames. Right now, I have 480 total frames and 60 blocks so 8 frames per
> block. And my frame size (8192) is a multiple of the kernel page size
> (4096). I've also tried frame sizes 4096, and 16384 with the same results.
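
The geometry described above can be checked mechanically. This is a minimal sketch, assuming the constraints that the Linux packet_mmap documentation places on a receive ring as far as they are understood here: the block size is a multiple of the page size, the frame size is a multiple of the frame alignment (taken to be 16), and the frame count equals frames per block times the block count. The function names and the FRAME_ALIGNMENT constant are hypothetical, and the 65536-byte block size is derived from the quoted numbers (8 frames of 8192 bytes), not stated in the message.

```c
#include <stdbool.h>
#include <stddef.h>

#define FRAME_ALIGNMENT 16u /* assumption: matches the kernel's frame alignment */

/* True when the ring parameters are mutually consistent. */
static bool ring_geometry_ok(size_t block_size, size_t block_nr,
                             size_t frame_size, size_t frame_nr,
                             size_t page_size)
{
    if (block_size == 0 || frame_size == 0 || page_size == 0) return false;
    if (block_size % page_size != 0) return false;       /* whole pages only */
    if (frame_size % FRAME_ALIGNMENT != 0) return false; /* aligned frames */
    if (frame_size > block_size) return false;           /* frame fits a block */
    size_t frames_per_block = block_size / frame_size;
    return frame_nr == frames_per_block * block_nr;
}

/* Byte offset of frame i inside the mapped ring, for i in [0, frame_nr).
 * Frames never straddle a block boundary. */
static size_t frame_offset(size_t i, size_t block_size, size_t frame_size)
{
    size_t frames_per_block = block_size / frame_size;
    return (i / frames_per_block) * block_size
         + (i % frames_per_block) * frame_size;
}
```

With 60 blocks of 65536 bytes and 480 frames of 8192 bytes the check passes, and the last frame starts exactly one frame short of the end of the mapping. A frame pointer computed any other way (for example with a stale frame count) would point past the mapped region once the ring wraps.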
>
> I tried using 'hashpipe_dump_databuf -b "block number"' and I see binary
> symbols in stdout regardless of what values I put in memset(). So that part
> wasn't as helpful with diagnosis as I'd hoped.
>
> I should also mention that there is data being received on the same
> interface from other ports, but the code ignores data from them as far as I
> can tell, and only captures/processes data from the user suggested port.
> But maybe somehow it's causing these issues and I'm not able to see how.
>
> As a test, I also tried removing the release_frame() function after
> process_packet() is called and I got the same segmentation fault. So I
> still think there's something about the implementation of the
> release_frame() function that I'm not doing or it's not releasing the
> frame. I'm not sure.
>
> I appreciate any feedback. I'll respond ASAP if you have any questions.
>
> Thanks,
>
> Mark Ruzindana
>
> On Fri, Oct 2, 2020 at 12:23 AM Mark Ruzindana <ruziem...@gmail.com>
> wrote:
>
>> Hi David,
>>
>> Sorry it's been a while, I've been working on other tasks besides the
>> packet socket implementation and I've gotten the opportunity to come back
>> to it. I know you have access to the previous emails, but just to catch you
>> up with a summary of what the issue was in implementing packet sockets:
>>
>> I was able to install hashpipe with the suid bit set as you suggested
>> previously. So far, I have been able to capture data with the first round
>> of frames of the circular buffer i.e. if I have 160 frames, I am able to
>> capture packets of frames 0 to 159 at which point right at the memcpy() in
>> the process_packet() function of the net thread, I get a segmentation fault.
>>
>> And the suggestions that you provided were very helpful with diagnosis,
>> but the problem hasn't been resolved yet.
>>
>> I'm currently using gdb to debug and it either tells me that I have a
>> segmentation fault at the memcpy() in process_packet() or something very
>> strange happens where the starting mcnt of a block greatly exceeds the mcnt
>> corresponding to the packet being processed and there's no segmentation
>> fault because the mcnt distance becomes negative so the memcpy() is
>> skipped. Hopefully that wasn't too hard to track. Very strange problem that
>> only occurs with gdb and not when I run hashpipe without it. Without gdb, I
>> get the same segmentation fault at the end of the circular buffer as
>> mentioned above.
>>
>> I also omitted the "+ input_databuf_idx(...)" to test for buffer
>> overflow, and got the same result (segmentation fault).
>>
>> I checked to make sure that the blocks are large enough for the number of
>> frames. Right now, I have 480 total frames and 60 blocks so 8 frames per
>> block. And my frame size (8192) is a multiple of the kernel page size
>> (4096). I've also tried frame sizes 4096, and 16384 with the same results.
>>
>> I tried using 'hashpipe_dump_databuf -b "block number"' and I see binary
>> symbols in stdout regardless of what values I put in memset(). So that part
>> wasn't as helpful with diagnosis as I'd hoped.
>>
>> I should also mention that there is data being received on the same
>> interface from other ports, but the code ignores data from them as far as I
>> can tell, and only captures/processes data from the user suggested port.
>> But maybe somehow it's causing these issues and I'm not able to see how.
>>
>> As a test, I also tried removing the release_frame() function after
>> process_packet() is called and I got the same segmentation fault. So I
>> still think there's something about the implementation of the
>> release_frame() function that I'm not doing or it's not releasing the
>> frame. I'm not sure.
>>
>> I appreciate any feedback. I'll respond ASAP if you have any questions.
>>
>> Thanks,
>>
>> Mark Ruzindana
>>
>>
>>
>>
>> On Mon, May 25, 2020 at 6:14 PM Mark Ruzindana <ruziem...@gmail.com>
>> wrote:
>>
>>> Thanks for the additional suggestions. I will try those and let you know
>>> what happens.
>>>
>>> Mark
>>>
>>> On Mon, May 25, 2020 at 6:07 PM David MacMahon <dav...@berkeley.edu>
>>> wrote:
>>>
>>>> A few more suggestions:
>>>>
>>>> 1) Enable core dumps.  Usually you have to run "ulimit -c unlimited"
>>>> and for suid executables there's an extra step related to
>>>> /proc/sys/fs/suid_dumpable.  See "man 5 core" and "man 5 proc" for
>>>> details.  Once you have a core file, you can use gdb to examine the state
>>>> of things when the segfault happened.  You might want to recompile your
>>>> plug-in with debugging enabled and fewer optimizations to get the most out
>>>> of this approach: "gdb /path/to/hashpipe /path/to/core".  (Gotta love how
>>>> it's still called "core"!).  gdb can be a bit cryptic, but it's also very
>>>> powerful.
>>>>
>>>> 2) Another idea, just for diagnostic purposes, is to omit the "+
>>>> input_databuf_idx(...)" part of the dest_p assignment.  That will write all
>>>> payloads to the first part of the data block, so no buffer overflow for
>>>> sure (assuming idx is in range :)).  It's just a way to eliminate a
>>>> variable.
>>>>
>>>> 3) Make sure the packet socket blocks are large enough for the packet
>>>> frames.  I agree it looks like you're not reading past the end of the
>>>> packet payload size, but maybe the payload itself goes beyond the end of
>>>> the packet socket blocks?  The kernel might silently truncate the packets
>>>> in that case.
>>>>
>>>> 4) If you're using tagged VLANs the PKT_UDP_xxx macros won't work
>>>> right.  It sounds like that's not happening because you're seeing the
>>>> expected size, but it's worth mentioning for mail archive completeness.
>>>>
>>>> 5) You can use hashpipe_dump_databuf to examine the 159 payloads you
>>>> were able to copy before the segfault to see whether every byte is
>>>> properly positioned and has believable values.  You could change
>>>> memcpy(..) to memset(dest_p, 'X', PKT_UDP_SIZE(frame)-16) so you'll know
>>>> the exact value that every byte should have.  Instead of 'X' you could use
>>>> pkt_num+1 (i.e. a 1-based packet counter) so you'll know which bytes
>>>> correspond to which packets.  Using memset() would also eliminate reading
>>>> from the packet socket blocks (another variable gone).
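
The marker idea in suggestion 5 can be sketched as a pair of helpers: one fills a payload slot with a 1-based packet counter, the other reports the first byte that does not carry the expected marker. The names mark_slot and first_bad_byte are hypothetical, and pkt_num is assumed to be a 0-based count of packets processed so far.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill one payload slot with a 1-based packet marker.
 * The marker is a single byte, so it wraps after 255 packets. */
static void mark_slot(uint8_t *dest, size_t len, unsigned pkt_num)
{
    memset(dest, (int)((pkt_num + 1u) & 0xffu), len);
}

/* Index of the first byte that differs from the expected marker,
 * or len when the whole slot matches. */
static size_t first_bad_byte(const uint8_t *slot, size_t len, unsigned pkt_num)
{
    uint8_t want = (uint8_t)((pkt_num + 1u) & 0xffu);
    for (size_t i = 0; i < len; i++) {
        if (slot[i] != want) return i;
    }
    return len;
}
```

Marker values below 32 are not printable characters, so a raw dump of the block to a terminal will look like binary noise even when every byte is correct; viewing the dump in hex avoids that ambiguity.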
>>>>
>>>> Happy hunting,
>>>> Dave
>>>>
>>>> On May 25, 2020, at 16:33, Mark Ruzindana <ruziem...@gmail.com> wrote:
>>>>
>>>> Thanks for the suggestions. I neglected to mention that I'm printing
>>>> out the PKT_UDP_SIZE() and PKT_UDP_DST() right before the memcpy(). I take
>>>> into account the 8 byte UDP header, and the size and port are correct. When
>>>> performing the memcpy(), I am taking into account that PKT_UDP_DATA()
>>>> returns a pointer to the payload and excludes the UDP header. However, I
>>>> also have an 8 byte packet header within that payload (this gives me the
>>>> mcnt, f-engine, and x-engine indices) and I exclude it when performing the
>>>> memcpy(). This is what it looks like:
>>>>
>>>> // This macro index shifts every mcnt and f-engine index
>>>> uint8_t * dest_p = db->block[idx].data + input_databuf_idx(m, f, 0,0,0);
>>>> // Ignore packet header
>>>> const uint8_t * payload = (uint8_t *)(PKT_UDP_DATA(frame)+8);
>>>>
>>>> fprintf(...); // prints PKT_UDP_SIZE() and PKT_UDP_DST()
>>>> // Ignore both UDP (8 bytes) and packet header (8 bytes)
>>>> memcpy(dest_p, payload, PKT_UDP_SIZE(frame) - 16);
>>>>
>>>> I will look into the other possible issues that you suggested, but as
>>>> far as I can tell, it doesn't seem like there should be a segfault given
>>>> what I'm doing before that memcpy(). I will let you know what else I find.
>>>>
>>>> Thanks again, I really appreciate the help.
>>>>
>>>> Mark
>>>>
>>>> On Mon, May 25, 2020 at 4:30 PM David MacMahon <dav...@berkeley.edu>
>>>> wrote:
>>>>
>>>>> Hi, Mark,
>>>>>
>>>>> Sounds like progress!
>>>>>
>>>>> On May 25, 2020, at 13:56, Mark Ruzindana <ruziem...@gmail.com> wrote:
>>>>>
>>>>> I have been able to capture data with the first round of frames of the
>>>>> circular buffer i.e. if I have 160 frames, I am able to capture packets of
>>>>> frames 0 to 159 at which point right at the memcpy() in the
>>>>> process_packet() function of the net thread, I get a segmentation fault.
>>>>>
>>>>>
>>>>> The fact that you get the segfault right at the memcpy of the final
>>>>> frame of the ring buffer suggests that there is a problem with the
>>>>> parameters passed to memcpy.  Most likely src+length-1 exceeds the end
>>>>> of the frame, so you get a segfault when memcpy tries to read from
>>>>> beyond the allocated memory.  This would explain why it segfaults on
>>>>> the final frame and not the previous frames: reading beyond a previous
>>>>> frame still reads from "legal" (though incorrect) memory locations.
>>>>> It's also possible that the segfault happens due to a bad address on
>>>>> the destination side of the memcpy(), but unless the destination buffer
>>>>> is also 160 frames in size that seems less likely.
>>>>>
>>>>> The release_frame function is not likely to be a culprit here unless
>>>>> the pointer you are passing it differs from the pointer that the
>>>>> pktsock_recv function returned.
>>>>>
>>>>> For debugging, I suggest logging dst, src, len before calling memcpy.
>>>>> Normally you wouldn't generate a log message for every packet because that
>>>>> would ruin your throughput, but since you know it's going to crash after
>>>>> the first 160 packets there's not much throughput to ruin. :)
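
One way to act on the logging suggestion is to wrap the copy so that it logs dst, src, and len and refuses to copy when either range falls outside its buffer. This is a minimal sketch with hypothetical names (struct region, in_region, logged_copy); it assumes the caller knows the base address and total size of both the packet ring mapping and the destination data block.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A byte range [base, base + size). */
struct region {
    const uint8_t *base;
    size_t size;
};

/* True when [p, p + len) lies entirely inside r. */
static int in_region(struct region r, const uint8_t *p, size_t len)
{
    uintptr_t lo = (uintptr_t)r.base;
    uintptr_t a = (uintptr_t)p;
    return a >= lo && len <= r.size && a - lo <= r.size - len;
}

/* Log the memcpy parameters; copy only when both ranges are in bounds.
 * Returns 0 on success and -1 when the copy was refused. */
static int logged_copy(struct region dst_r, uint8_t *dst,
                       struct region src_r, const uint8_t *src, size_t len)
{
    int ok = in_region(dst_r, dst, len) && in_region(src_r, src, len);
    fprintf(stderr, "memcpy dst=%p src=%p len=%zu %s\n",
            (void *)dst, (const void *)src, len, ok ? "ok" : "OUT OF RANGE");
    if (!ok) return -1;
    memcpy(dst, src, len);
    return 0;
}
```

Because len is a size_t, a length expression that goes negative (for example a size smaller than 16 minus 16) becomes a huge unsigned value; the len <= r.size test catches that case as well.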
>>>>>
>>>>> One thing to remember is that PKT_UDP_DATA() evaluates to a pointer to
>>>>> the UDP payload of the packet, but PKT_UDP_SIZE() evaluates to the total
>>>>> UDP size (i.e. 8 bytes for the UDP header plus the length of the UDP
>>>>> payload).  Passing PKT_UDP_SIZE() as "len" to memcpy without subtracting 8
>>>>> for the header bytes is not correct and could potentially cause this
>>>>> problem.
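
The length arithmetic described here can be written out explicitly. This is a minimal sketch, relying on the standard UDP header layout (the 16-bit big-endian length field sits at byte offset 4 and counts the 8-byte header plus the payload). The function names are hypothetical, and APP_HDR_LEN stands for the 8-byte per-packet header mentioned elsewhere in this thread.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    UDP_HDR_LEN = 8, /* fixed size of the UDP header */
    APP_HDR_LEN = 8  /* assumption: application packet header from this thread */
};

/* Read the big-endian UDP length field; udp_hdr points at the UDP header. */
static unsigned udp_total_len(const uint8_t *udp_hdr)
{
    return ((unsigned)udp_hdr[4] << 8) | udp_hdr[5];
}

/* Bytes left to copy after skipping the UDP and application headers.
 * Returns 0 for a packet too short to hold both headers, so the result
 * can never wrap around to a huge unsigned value. */
static size_t copy_len(unsigned udp_len)
{
    const unsigned skip = UDP_HDR_LEN + APP_HDR_LEN;
    return udp_len > skip ? (size_t)(udp_len - skip) : 0;
}
```

For a UDP length of 8208 this gives 8192 bytes to copy; a runt packet with a UDP length of 12 gives 0 instead of a wrapped-around length.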
>>>>>
>>>>> HTH,
>>>>> Dave
>>>>>
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "casper@lists.berkeley.edu" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to casper+unsubscr...@lists.berkeley.edu.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/297C1709-AE9C-488D-9110-FD0832BF5951%40berkeley.edu
>>>>> <https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/297C1709-AE9C-488D-9110-FD0832BF5951%40berkeley.edu?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>

