On Wed, Mar 6, 2019 at 7:01 PM Oleg Bondarev <obonda...@mirantis.com> wrote:
>
> I'm thinking if this can be malloc() not returning memory to the system
> after peak loads:
> *"Occasionally, free can actually return memory to the operating system
> and make the process smaller. Usually, all it can do is allow a later call
> to malloc to reuse the space. In the meantime, the space remains in your
> program as part of a free-list used internally by malloc." [1]*
>
> Does it sound sane? If yes, what would be a best way to check that?
>

Seems that's not the case. On one of the nodes memory usage by ovs-vswitchd
grew from 84G to 87G for the past week, and on other nodes grows gradually
as well.

>
> [1] http://www.gnu.org/software/libc/manual/pdf/libc.pdf
>
> Thanks,
> Oleg
>
> On Wed, Mar 6, 2019 at 12:34 PM Oleg Bondarev <obonda...@mirantis.com>
> wrote:
>
>> Hi,
>>
>> On Wed, Mar 6, 2019 at 1:08 AM Ben Pfaff <b...@ovn.org> wrote:
>>
>>> Starting from 0x30, this looks like a "minimatch" data structure, which
>>> is a kind of compressed bitwise match against a flow.
>>>
>>> 00000030: 0000 0000 0000 4014 0000 0000 0000 0000
>>> 00000040: 0000 0000 0000 0000 fa16 3e2b c5d5 0000 0000 0022 0000 0000
>>>
>>> 00000058: 0000 0000 0000 4014 0000 0000 0000 0000
>>> 00000068: 0000 0000 ffff ffff ffff ffff ffff 0000 0000 0fff 0000 0000
>>>
>>> I think this corresponds to a flow of this form:
>>>
>>> pkt_mark=0xc5d5/0xffff,skb_priority=0x3e2bfa16,reg13=0,mpls_label=2,mpls_tc=1,mpls_ttl=0,mpls_bos=0
>>>
>>> Is that at all meaningful? Does it match anything that appears in the
>>> OpenFlow flow table?
>>>
>>
>> Not sure, actually fa:16:3e:2b:c5:d5 is a mac address of a neutron port
>> (this is an OpenStack cluster) - the port is a VM port.
>> fa:16:3e/fa:16:3f - are standard neutron mac prefixes. That makes me
>> think those might be some actual eth packets (broadcasts?) that somehow
>> stuck in memory..
>> So I didn't find anything similar in the flow tables. I'm attaching flows
>> of all 5 OVS bridges on the node.
>>
>>>
>>> Are you using the kernel or DPDK datapath?
>>>
>>
>> It's kernel datapath, no DPDK. Ubuntu with 4.13.0-45 kernel.
>>
>>>
>>> On Tue, Mar 05, 2019 at 08:42:14PM +0400, Oleg Bondarev wrote:
>>> > Hi,
>>> >
>>> > thanks for your help!
>>> >
>>> > On Tue, Mar 5, 2019 at 7:26 PM Ben Pfaff <b...@ovn.org> wrote:
>>> >
>>> > > You're talking about the email where you dumped out a repeating sequence
>>> > > from some blocks? That might be the root of the problem, if you can
>>> > > provide some more context. I didn't see from the message where you
>>> > > found the sequence (was it just at the beginning of each of the 4 MB
>>> > > blocks you reported separately, or somewhere else), how many copies of
>>> > > it, or if you were able to figure out how long each of the blocks was.
>>> > > If you can provide that information I might be able to learn some
>>> > > things.
>>> > >
>>> >
>>> > Yes, those were beginnings of 0x4000000 size blocks reported by the script.
>>> > I also checked 0x8000000 blocks reported and the content is the same.
>>> > Examples of how those blocks end:
>>> > - https://pastebin.com/D9M6T2BA
>>> > - https://pastebin.com/gNT7XEGn
>>> > - https://pastebin.com/fqy4XDbN
>>> >
>>> > So basically contents of the blocks are sequences of:
>>> >
>>> > *00000020: 0000 0000 0000 0000 6500 0000 0000 0000  ........e.......*
>>> > *00000030: 0000 0000 0000 4014 0000 0000 0000 0000  ......@.........*
>>> > *00000040: 0000 0000 0000 0000 fa16 3e2b c5d5 0000  ..........>+....*
>>> > *00000050: 0000 0022 0000 0000 0000 0000 0000 4014  ..."..........@.*
>>> > *00000060: 0000 0000 0000 0000 0000 0000 ffff ffff  ................*
>>> > *00000070: ffff ffff ffff 0000 0000 0fff 0000 0000  ................*
>>> >
>>> > following each other and sometimes separated by sequences like this:
>>> >
>>> > *00001040: 6861 6e64 6c65 7232 3537 0000 0000 0000  handler257......*
>>> >
>>> > I ran the scripts against several core dumps of several compute nodes with
>>> > the issue and
>>> > the picture is pretty much the same: 0x4000000 blocks and less 0x8000000
>>> > blocks.
>>> > I checked the core dump from a compute node where OVS memory consumption
>>> > was ok:
>>> > no such block sizes reported.
>>> >
>>> > >
>>> > > On Tue, Mar 05, 2019 at 09:07:55AM +0400, Oleg Bondarev wrote:
>>> > > > Hi Ben,
>>> > > >
>>> > > > I didn't have a chance to debug the scripts yet, but just in case you
>>> > > > missed my last email with examples of repeatable blocks
>>> > > > and sequences - do you think we still need to analyze further, will the
>>> > > > scripts tell more about the heap?
>>> > > >
>>> > > > Thanks,
>>> > > > Oleg
>>> > > >
>>> > > > On Thu, Feb 28, 2019 at 10:14 PM Ben Pfaff <b...@ovn.org> wrote:
>>> > > >
>>> > > > > On Tue, Feb 26, 2019 at 01:41:45PM +0400, Oleg Bondarev wrote:
>>> > > > > > Hi,
>>> > > > > >
>>> > > > > > thanks for the scripts, so here's the output for a 24G core dump:
>>> > > > > > https://pastebin.com/hWa3R9Fx
>>> > > > > > there's 271 entries of 4MB - does it seem something we should take a closer
>>> > > > > > look at?
>>> > > > >
>>> > > > > I think that this output really just indicates that the script failed.
>>> > > > > It analyzed a lot of regions but didn't output anything useful. If it
>>> > > > > had worked properly, it would have told us a lot about data blocks that
>>> > > > > had been allocated and freed.
>>> > > > >
>>> > > > > The next step would have to be to debug the script. It definitely
>>> > > > > worked for me before, because I have fixed at least 3 or 4 bugs based on
>>> > > > > it, but it also definitely is a quick hack and not something that I can
>>> > > > > stand behind. I'm not sure how to debug it at a distance. It has a
>>> > > > > large comment that describes what it's trying to do. Maybe that would
>>> > > > > help you, if you want to try to debug it yourself. I guess it's also
>>> > > > > possible that glibc has changed its malloc implementation; if so, then
>>> > > > > it would probably be necessary to start over and build a new script.
>>> > > > >
>>> > > >
>>> >
>>
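
The questions quoted above (how many copies of the repeating sequence there
are, and how long each record is) could be answered by scanning one extracted
block for the MAC bytes. A minimal sketch in Python, not the script discussed
in this thread. It assumes a single block has already been saved to a file
and that the MAC fa:16:3e:2b:c5:d5 from the hex dump is a usable marker for
each record. The function names and the file name are hypothetical.

```python
from collections import Counter

# Marker bytes taken from the hex dump quoted in the thread: the neutron
# MAC fa:16:3e:2b:c5:d5. Using it as a per-record marker is an assumption.
SIGNATURE = bytes.fromhex("fa163e2bc5d5")


def find_offsets(data, signature=SIGNATURE):
    """Return every offset at which `signature` occurs in `data`."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets


def stride_histogram(offsets):
    """Count the distances between consecutive occurrences.

    One dominant distance would suggest fixed-size records that are
    laid out back to back.
    """
    return Counter(b - a for a, b in zip(offsets, offsets[1:]))


def summarize(path):
    """Read one extracted block (not a whole core dump) and summarize it.

    Returns (number of copies, three most common strides).
    """
    with open(path, "rb") as f:
        data = f.read()
    offsets = find_offsets(data)
    return len(offsets), stride_histogram(offsets).most_common(3)
```

Calling summarize("block.bin") on one of the 0x4000000 blocks would return
the number of copies and the most common spacing between them. The file is
read into memory in one go, so this is meant for a single block rather than
for the whole 24G core dump.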
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss