[webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
hi,

In the Linux GTK port, with WebKit revision 33493, I see that the resource handles (curl backend) never get released after completing the data transfer for a request. This results in big leaks in resource handles as well as in curl's internal data structures (~800 KB on opening nytimes.com and closing the connection).

The reason is that the ResourceHandle ref count never drops to 0. Resource loaders drop their refcount correctly, but the ref the ResourceHandle takes on itself (source below) before handing over to the ResourceHandleManager is not matched with a deref.

ResourceHandleCurl.cpp:

    bool ResourceHandle::start(Frame* frame)
    {
        ASSERT(frame);
        ref();
        ResourceHandleManager::sharedInstance()->add(this);
        return true;
    }

The fix that works is to deref in ResourceHandleManager::removeFromCurl; however, we do not know the impact. A brief look at the latest code suggests this has not changed much, but I can still verify against it.

BTW, why does handing the ResourceHandle to the ResourceHandleManager need to be protected at all? I guess a weak pointer would do. Also, I don't see this done in the other ports (gtk/soup), though the interfaces are different.

Thanks in advance for any inputs.

regards,
Zaheer

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
Re: [webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
On Wed, 10 Sep 2008 at 13:26 +0530, zaheer ahmad wrote:
> In the Linux GTK port, with WebKit revision 33493, I see that the
> resource handles (curl backend) never get released after completing the
> data transfer for a request. This results in big leaks in resource
> handles as well as in curl's internal data structures (~800 KB on
> opening nytimes.com and closing the connection).

I started some days ago to write some smart pointer classes for the GTK port (using g_free or g_object_ref/unref) that should be able to fix some issues (I have already fixed several memory leaks). I have also just started to use valgrind to find other memory leaks, so this and other issues should hopefully be fixed soon.

Thanks!

--
Marco Barisione
Re: [webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
On Wed, 10 Sep 2008 at 13:26 +0530, zaheer ahmad wrote:
> The reason is that the ResourceHandle ref count never drops to 0.
> Resource loaders drop their refcount correctly, but the ref the
> ResourceHandle takes on itself (source below) before handing over to
> the ResourceHandleManager is not matched with a deref.

It seems that the ref was added to all the ports to adapt to the change in r16803, so that part is right. The problem is that the resource handle should delete itself when there is an error or the load is completed, but that doesn't happen in the curl backend. Later I will also check the soup backend, as it may have the same problem.

Thanks for your report; you saved me the time of trying to find where the extra reference should have been added or removed.

--
Marco Barisione
Re: [webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
On Wed, 10 Sep 2008 at 07:22 -0700, Mike Emmel wrote:
> This leak is fixed in the ncurl port.

Is it possible to backport the fix?

--
Marco Barisione
Re: [webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
hi,

The fix only helps a little, as we see the bigger leaks inside curl itself. Feedback from curl experts suggests that this design is correct. Let me know if you are aware of this issue. Here's the mail snapshot:

==
We are seeing big leaks in curl (Curl_connect: 600-800 KB and Curl_open: ~200 KB) when we browse through as few as a handful of websites. These values keep increasing as we browse more sites. Here's the high-level logic of the WebKit/curl interaction:

1. Create a multi handle at the start of the program.
2. Keep creating easy handles, one for each request.
3. When a request is done, remove it from the multi handle and clean up the easy handle.
4. The multi handle is never released (it stays until the end of the program).

This design assumes that the multi handle has bounded memory usage as we keep adding and removing easy handles, but given the leaks that seems not to be true.
==

> Now I just started to use valgrind to find other memory leaks, so this
> and other issues should hopefully be fixed soon.

These are not traditional memory leaks; you are holding on to things longer than you should, so they are more like functional leaks. Does valgrind help with that too?

thanks,
Zaheer
Re: [webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
Look, I had to change to one multi handle per easy handle, basically just an asynchronous handle, to get everything to clean up. It's a significant refactoring and a better design. What it points to on the curl side is the need for an asynchronous simple handle.

Also, polling was removed as much as possible; curl does not send decent timeouts if it has real work to perform, so this is still an issue. However, open file handles are handled in the event loop's select. Curl needs to be extended to have the concept of a work request and a longer-term watch timeout.

So in my opinion the issues are fixed, at least to the extent possible without help from the curl team.

On Wed, Sep 10, 2008 at 7:53 AM, zaheer ahmad [EMAIL PROTECTED] wrote:
> The fix only helps a little, as we see the bigger leaks inside curl
> itself. Feedback from curl experts suggests that this design is
> correct. Let me know if you are aware of this issue.
Re: [webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
hi Mike,

The ncurl port is not yet in the official builds; meanwhile, how do you suggest fixing this in the current baseline? One change that does help is to periodically clean up the multi handle when the running job count drops to 0 and recreate it on the next request (this is just a temporary fix until we find the real issue in curl).

I have a few comments on the ncurl patch (https://bugs.webkit.org/show_bug.cgi?id=17972):

- Is it better than the current timer-driven behavior? The glib main loop polls the fds faster if it is free, but that seems a small gain since the timeout is very small.
- In the current implementation, curl_multi_perform may block if there is a lot of data queued up on multiple handles, but that can easily be mitigated by returning frequently from curl_multi_perform.
- What about doing the select in a separate thread, as glibcurl does? I think this is safe, as the perform happens in the main thread.

thanks,
Zaheer
Re: [webkit-dev] Curl resourcehandle leaks in Linux/Gtk port
On Wed, Sep 10, 2008 at 10:15 PM, zaheer ahmad [EMAIL PROTECTED] wrote:
> The ncurl port is not yet in the official builds; meanwhile, how do you
> suggest fixing this in the current baseline?

No suggestions. I wrote the original code and did not care much for it even when I wrote it. I was waiting on the newer versions of curl that had callbacks for file descriptors. That approach was copied from the examples, which were intended for command-line apps, not browsers. The new file handle callbacks were added to address this problem and make curl more UI-friendly.

> Is it better than the current timer-driven behavior? The glib main loop
> polls the fds faster if it is free, but that seems a small gain since
> the timeout is very small.

My primary goal with the changes to the event loop was to work toward eliminating most timeouts. My focus is on battery-powered systems, and firing a timer rapidly for minutes at a time drains the battery.

> In the current implementation, curl_multi_perform may block if there is
> a lot of data queued up on multiple handles, but that can easily be
> mitigated by returning frequently from curl_multi_perform.

In my new implementation, with only one simple handle per multi, the main event loop runs correctly and checks for events from the user in a timely fashion.

> What about doing the select in a separate thread, as glibcurl does? I
> think this is safe, as the perform happens in the main thread.

Why use a different thread to wait in select? That's a design decision left to the implementor of the main loop; it's outside the scope of the curl binding to try to make this decision. In general, servicing these data file handles needs to be cooperative with user input, however you implement it. How that happens depends on the platform.