Luca,
Here is the code:
    if(sessionToPurge->session_info != NULL)
      free(sessionToPurge->session_info);
According to gdb, session_info points to address 0xffffffff, which causes
a segfault when free() gets called.
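
For context, a NULL check like the one above only protects free() if every
code path that releases session_info also resets the pointer. A minimal
sketch of that idiom follows. The struct layout and the helper name
free_session_info are assumptions for illustration, not ntop's actual
definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ntop's session structure; only the
 * field discussed in this thread is modelled. */
typedef struct {
  void *session_info;
} DemoSession;

/* Free session_info and reset it to NULL so that a later NULL check
 * really does mean "nothing left to free". Returns 1 if memory was
 * released, 0 if there was nothing to do. */
int free_session_info(DemoSession *s) {
  if (s == NULL || s->session_info == NULL)
    return 0;
  free(s->session_info);
  s->session_info = NULL; /* avoid leaving a stale pointer behind */
  return 1;
}
```

Note that this only helps against a second free of the same pointer. If the
field is being overwritten with a non-pointer value such as 0xffffffff, the
cause is more likely memory corruption somewhere else.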
--Brian
________________________________________
From: [email protected]
[[email protected]] on behalf of Luca Deri [[email protected]]
Sent: Monday, November 07, 2011 2:29 AM
To: [email protected]
Subject: Re: [Ntop-misc] Easily Reproducable Segfaults
Brian
I agree with you that there's something wrong with sessions. However,
sessions.c:343 contains something different from what you reported. Can
you please send me the source code around line 343 so I can see what you
mean?
Thanks Luca
On 11/05/2011 08:09 PM, Brian Behrens wrote:
> No problem,
>
> I did some more work on this and found that line 343 in sessions.c is the
> culprit. Here is a breakdown of what's happening.
>
> That line attempts to free the memory pointed to by
> sessionToPurge->session_info. When you dump session_info, it contains
> 0xffffffff. Since this is not NULL, the code calls free() on that address,
> which is out of bounds and causes a segfault.
>
> So, in perspective, it's most likely trying to free memory that has already
> been freed. The question becomes: why does the code think there is still a
> valid memory address at that pointer? I have an idea why that might be. I
> started watching the session counters and, even though I have specified an
> upper limit of 65536 sessions, I can see the count does actually get this
> high. When the count gets that high, it clears and starts over. I have not
> investigated what actually happens when this reset occurs, but my guess is
> that the code still thinks there are sessions to purge that were already
> purged by the clearing.
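
To make the suspected failure mode concrete, here is a sketch of a session
table whose reset path releases every entry and clears the slot, so that a
purge pass running afterwards finds nothing stale. All names (DemoTable,
demo_add, demo_purge, demo_reset, DEMO_MAX_SESSIONS) are hypothetical, and
whether ntop's own reset behaves differently has not been checked here.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_SESSIONS 8 /* hypothetical; stands in for the session limit */

typedef struct {
  void *session_info; /* NULL means the slot holds nothing to free */
} DemoSlot;

typedef struct {
  DemoSlot slots[DEMO_MAX_SESSIONS];
  unsigned int num_sessions;
  unsigned int num_frees; /* counts real free() calls, for checking */
} DemoTable;

/* Add a session; returns 0 on success, -1 if the table is full or
 * the allocation fails. */
int demo_add(DemoTable *t) {
  void *info;
  if (t->num_sessions >= DEMO_MAX_SESSIONS)
    return -1;
  info = malloc(32);
  if (info == NULL)
    return -1;
  t->slots[t->num_sessions++].session_info = info;
  return 0;
}

/* Purge one slot. Safe to call on a slot that was already released,
 * because the pointer is reset to NULL after every free(). */
void demo_purge(DemoTable *t, unsigned int idx) {
  DemoSlot *s;
  if (idx >= DEMO_MAX_SESSIONS)
    return;
  s = &t->slots[idx];
  if (s->session_info != NULL) {
    free(s->session_info);
    s->session_info = NULL;
    t->num_frees++;
  }
}

/* Reset the whole table, as might happen when the limit is reached. */
void demo_reset(DemoTable *t) {
  unsigned int i;
  for (i = 0; i < DEMO_MAX_SESSIONS; i++)
    demo_purge(t, i);
  t->num_sessions = 0;
}
```

With this layout, calling demo_purge() on a slot after demo_reset() is a
no-op. If the reset instead freed the memory but left the old pointer value
in place, the same purge call would free it a second time.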
>
> I have also noticed that once that bound is reached, the count seems to stay
> around 14k sessions. The ESX server I am running this on has 98 GB of memory,
> so memory constraints are not really a concern. This might just be a matter
> of tuning the max sessions high enough that the purge cycle that is supposed
> to purge these idle sessions can do its job effectively.
>
> I would think that this might be occurring on the lower-load networks too, as
> DEFAULT_NTOP_MAX_NUM_SESSIONS is set lower, so the limit might also be
> reached and trigger the clear routine. The segfault could follow because
> 0xffffffff is used in various places and could easily end up stored in many
> memory locations.
>
> So, I might try to work around this by raising
> DEFAULT_NTOP_MAX_NUM_SESSIONS to see if that helps. Taking a deeper look at
> what happens when this bound is reached might also help me understand and
> eliminate this.
>
> I hope this helps out some as I have seen similar postings to this in the
> threads.
>
> --Brian
> ________________________________________
> From: [email protected]
> [[email protected]] on behalf of Luca Deri
> [[email protected]]
> Sent: Saturday, November 05, 2011 6:22 AM
> To: [email protected]
> Subject: Re: [Ntop-misc] Easily Reproducable Segfaults
>
> Brian
> thanks for your report. I do not have the ability to reproduce the crash you
> reported using the code in SVN (this is the only version I can support). Can
> you please crash ntop, generate a core and analyze it a bit so that I can
> understand where the problem could be? Before doing that, please resync with
> SVN.
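
Whether a crash leaves a core file depends on the process's RLIMIT_CORE
limit, which is usually set from the shell before starting the program. The
same limit can be raised from C with the POSIX getrlimit()/setrlimit() calls.
The sketch below is generic and not taken from ntop; the function name
enable_core_dumps is made up for illustration.

```c
#include <assert.h>
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit for the
 * current process. Returns 0 on success, -1 on failure. */
int enable_core_dumps(void) {
  struct rlimit rl;
  if (getrlimit(RLIMIT_CORE, &rl) != 0)
    return -1;
  rl.rlim_cur = rl.rlim_max; /* soft limit may always go up to the hard limit */
  return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit itself is 0, this changes nothing and the limit has to be
raised by whatever starts the process.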
>
> Thanks for your support Luca
>
> On Nov 4, 2011, at 5:09 PM, Brian Behrens wrote:
>
>> Hello,
>>
>> I have been working for days trying to resolve a segfault issue like the
>> following:
>>
>> Nov 4 10:46:54 NTOP-SC kernel: ntop[25479]: segfault at 645 ip
>> 00007f95f3cf3395 sp 00007f95e9b75ae8 error 6 in
>> libntop-4.1.1.so[7f95f3cb9000+56000]
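
As a worked example of reading the kernel line above: the bracketed pair is
the library's mapping base and size, so subtracting the base from the
instruction pointer gives the offset inside libntop-4.1.1.so, which a
debugger can map to a source line if symbols are available. On x86 the error
value is a bit mask (bit 0: protection violation rather than a missing page,
bit 1: write access, bit 2: user mode). The helper names below are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the faulting instruction inside the mapped library. */
uint64_t fault_offset(uint64_t ip, uint64_t map_base) {
  return ip - map_base;
}

/* x86 page-fault error code bits as reported in the kernel log. */
int fault_is_protection(unsigned int err) { return (err & 0x1) != 0; }
int fault_is_write(unsigned int err)      { return (err & 0x2) != 0; }
int fault_is_user(unsigned int err)       { return (err & 0x4) != 0; }
```

With the values from the log, fault_offset(0x7f95f3cf3395, 0x7f95f3cb9000)
is 0x3a395, which lies inside the 0x56000-byte mapping. Error 6 has the write
and user bits set and the protection bit clear, meaning a user-mode write to
an unmapped page, here at the low address 0x645.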
>>
>> The environment is an ESX 5 VM.
>>
>> Guest OS I have tried:
>>
>> 1. CentOS 6
>> 2. Fedora 15
>> 3. Network Security Toolkit (uses 4865 of the current dev tree)
>>
>> Versions I have tried:
>>
>> 1. Current dev tree.
>> 2. Current stable version (4.1.0)
>>
>> The time it takes for these faults to occur varies, but it correlates with
>> network load.
>>
>> My test networks:
>>
>> 1. Simple home network with all packets going to NTOP.
>> 2. High load work network that can see 25 Gig in 15 mins.
>>
>> The most stable I have seen is a clean CentOS install, build ntop from trunk
>> tree, install and run.
>>
>> The quickest segfault I can obtain is when I use PF_RING with an e1000
>> card in the VM and the PF_RING-aware e1000 driver. I can usually get a
>> segfault within 30 mins on the busy network.
>>
>> The common theme is the segfaulting. I did attempt a gdb on the device one
>> time and saw a malloc issue, but all these VMs have 4GB memory and I have
>> tried tuning different hash sizes to see how this impacts the issue, but it
>> really never does. Use smaller hash values, and I get more messages of low
>> memory, etc.
>>
>> I am really not sure what else to do, if there is anything I can do to
>> present more information, please let me know as I would like to stop this
>> incessant segfaulting.
>>
>>
>> _______________________________________________
>> Ntop-misc mailing list
>> [email protected]
>> http://listgateway.unipi.it/mailman/listinfo/ntop-misc
> ---
> We can't solve problems by using the same kind of thinking we used when we
> created them - Albert Einstein