Hi Mark.  Since the newer version has a script called
"hashpipe_irqaffinity.sh", I would think that the most expedient thing
to do is to upgrade.  It's likely to fix some or all of this.

That said, there are a lot of things that you can check, and not only
the irq affinity.  First, make sure that your network tuning is good,
e.g. that the kernel's socket receive buffers are large enough for your
packet rate (a minimal sketch of requesting a bigger one follows).
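
For example, something along these lines (an untested sketch; sock_fd
is your already-open UDP socket, 64 MiB is an arbitrary placeholder,
and the size the kernel actually grants is capped by the
net.core.rmem_max sysctl, so check that too):

    /* Sketch: request a large receive buffer on an open UDP socket,
     * then read back what the kernel actually granted. */
    #include <stdio.h>
    #include <sys/socket.h>

    int set_big_rcvbuf(int sock_fd)
    {
        int req = 64 * 1024 * 1024;   /* placeholder: 64 MiB */
        int got = 0;
        socklen_t len = sizeof(got);

        if (setsockopt(sock_fd, SOL_SOCKET, SO_RCVBUF,
                       &req, sizeof(req)) < 0) {
            perror("setsockopt(SO_RCVBUF)");
            return -1;
        }
        getsockopt(sock_fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
        printf("SO_RCVBUF granted: %d bytes\n", got);
        return 0;
    }
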
Second, make sure that your network card's irqs are attached to cores
on the processor whose memory is local to the NIC, and that the
hashpipe threads are mapped to processor cores that are also local to
that memory.  Sometimes it's counterproductive to map processes to
processor cores by themselves if they need data that is produced by a
different core that's far away, NUMA-wise; a sketch of pinning an irq
and a thread to the same core follows.
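
Something like this untested sketch can do both; irq 123 and core 2
are placeholders you'd replace after looking at /proc/interrupts and
/sys/class/net/<ifname>/device/numa_node, and note that a running
irqbalance daemon can silently rewrite the irq mask behind your back:

    /* Sketch: route irq 123 (placeholder) to core 2 (placeholder) and
     * pin the calling thread to the same core.  Writing the irq mask
     * needs root. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>

    int pin_irq_and_self(void)
    {
        FILE *f = fopen("/proc/irq/123/smp_affinity", "w");
        if (f == NULL) {
            perror("fopen smp_affinity");
            return -1;
        }
        fprintf(f, "%x\n", 1u << 2);  /* hex cpu mask: bit 2 = core 2 */
        fclose(f);

        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(2, &set);
        int err = pthread_setaffinity_np(pthread_self(),
                                         sizeof(set), &set);
        if (err != 0) {
            fprintf(stderr, "pthread_setaffinity_np: error %d\n", err);
            return -1;
        }
        return 0;
    }
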
And lock all the memory in core with mlockall() or one of its friends.
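
A minimal sketch (as a non-root user you may first need to raise the
memlock rlimit, e.g. with ulimit -l or /etc/security/limits.conf):

    /* Sketch: lock all current and future pages in RAM so the capture
     * threads never stall on a page fault. */
    #include <stdio.h>
    #include <sys/mman.h>

    int lock_memory(void)
    {
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
            perror("mlockall");
            return -1;
        }
        return 0;
    }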

Good luck with it!

John




On Tue, Mar 31, 2020 at 12:09 PM Mark Ruzindana <ruziem...@gmail.com> wrote:

> Hi all,
>
> I am fairly new to asking questions on a forum, so if I need to provide
> more details, please let me know.
>
> Worth noting that just as I was about to send this out, I checked and I
> don't have the most recent version of HASHPIPE, which includes
> hashpipe_irqaffinity.sh among other additions and modifications. So
> upgrading might fix my problem, but maybe not, and someone else may have
> more insight. I will update everyone if it does.
>
> I am trying to reduce the number of packets lost/dropped when running
> HASHPIPE on a 32-core RHEL 7 server. I have run enough tests and
> diagnostics to be confident that the problem is not any HASHPIPE thread
> running for too long. The percentage of packets dropped on any given
> scan is between about 0.3% and 0.8%, i.e. roughly 5,000 of the 1,650,000
> packets in a 30 second scan at the low end. So while it's a small
> percentage, the number of packets lost is still quite large. I have also
> run tests with 'top' and 'iostat', as well as timing HASHPIPE between
> windows where no packets are dropped, to diagnose the issue further. My
> colleagues and I have come to the conclusion that the kernel is allowing
> other processes to interrupt HASHPIPE while it is running.
>
> So I have researched and run tests involving 'niceness', and I am
> currently trying to configure SMP affinities and IRQ balancing, but the
> changes I make to the smp_affinity files aren't doing anything. My plan
> was to have the interrupts run on the 20 cores that aren't being used by
> HASHPIPE. Disabling 'irqbalance' didn't do anything either. I also
> restarted the machine to see whether the changes are permanent, but the
> system reverts to what it was.
>
> I might be missing something, or trying the wrong things. Has anyone
> experienced this? And could you point me in the right direction if you have
> any insight?
>
> If you need any more details, please let me know. I didn't include as
> much as I could have because I wanted to keep this message a reasonable
> size.
>
> Thanks,
>
> Mark Ruzindana
>
