[Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)
I'm debugging odd situations w Sugar datastore, and I wonder whether there is any tricks to debugging python programs that are run under dbus. I can see the sugar-datastore PID growing steadily while I test, which means that it's dying and respawning plenty, likely from unhandled exceptions... thanks, m -- martin.langh...@gmail.com mar...@laptop.org -- Software Architect - OLPC - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)
On Wed, Sep 19, 2012 at 4:15 PM, Martin Langhoff martin.langh...@gmail.com wrote: I'm debugging odd situations w Sugar datastore, and I wonder whether there is any tricks to debugging python programs that are run under dbus. I can see the sugar-datastore PID growing steadily while I test, which means that it's dying and respawning plenty, likely from unhandled exceptions... Adding to /usr/bin/datastore-service # debugging logerr=open(os.path.join(log_dir, 'datastore.err'), 'w') sys.stderr=logerr does not help in the least. The file is created, but nothing is ever written there. Any hints? m -- martin.langh...@gmail.com mar...@laptop.org -- Software Architect - OLPC - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)
You already have the line export SUGAR_LOGGER_LEVEL=debug uncommented in .sugar/debug, right? Gonzalo On Wed, Sep 19, 2012 at 5:41 PM, Martin Langhoff martin.langh...@gmail.comwrote: On Wed, Sep 19, 2012 at 4:15 PM, Martin Langhoff martin.langh...@gmail.com wrote: I'm debugging odd situations w Sugar datastore, and I wonder whether there is any tricks to debugging python programs that are run under dbus. I can see the sugar-datastore PID growing steadily while I test, which means that it's dying and respawning plenty, likely from unhandled exceptions... Adding to /usr/bin/datastore-service # debugging logerr=open(os.path.join(log_dir, 'datastore.err'), 'w') sys.stderr=logerr does not help in the least. The file is created, but nothing is ever written there. Any hints? m -- martin.langh...@gmail.com mar...@laptop.org -- Software Architect - OLPC - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)
On Wed, Sep 19, 2012 at 4:54 PM, Gonzalo Odiard gonz...@laptop.org wrote: You already have the line export SUGAR_LOGGER_LEVEL=debug uncommented in .sugar/debug, right? Yes. And with that, datastore.log is very chatty, which is good. But when the datastore code hits an unhandled exception, the exception does not get logged. DBus restarts the process transparently, but that isn't so useful :-/ In terms of progress, I have these tracks underway, - Add a ds_clean flag file, rm it on the start of every create/update/delete op, create it on completion. Complements index_updated, and between the two we get good coverage. Missing either one, we get an index rebuild. This is done, and working well. - During index rebuild, delete incomplete entries. Not working well yet. With the two above, we can operate with the system at ENOSPC, and essentially we don't lose existing user data. New files fail to save, and metadata changes fail to get stored (so your edit, ie: a rename, is reverted in the journal, visibly). But we don't lose the existing Journal entries -- so behaviour is _almost_ civilized now :-) Behind the scenes, however, the datastore is dying and respawning. Which complicates things because what we are missing is to be able to delete entries while we are at ENOSPC. A delete is pretty straightforward, because we are removing files, and that succeeds even on ENOSPC. Except that we need to be able to maintain the xapian indexes. My plan right now is to catch the Xapian-is-hitting-ENOSPC exception, and trigger a Xapian index re-creation, putting the Xapian files in /tmp (which is a tmpfs in our builds and in Fedora upstream too). We'll make sure that the index_updated file on disk is removed so after a reboot the index is recreated once more. With Xapian index on /tmp, deleting Journal entries should just work. All dbus sockets are on tmpfs. However, to do all of the above with some sense that it's working correctly... yeah, I need to see the exceptions I hit. Right now I am guessing what could be going wrong in this black box...? which isn't a productive mode of life. m -- martin.langh...@gmail.com mar...@laptop.org -- Software Architect - OLPC - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)
I don't know if will be useful for you, but I have used SystemTap. You can see at least a trace of the functions called. Info: http://fedoraproject.org/wiki/Features/SystemtapStaticProbes http://sourceware.org/systemtap/wiki/PythonMarkers With stap /usr/share/doc/python-libs-2.7.3/systemtap-example.stp -x pid_datastore you can see all the functions called in the datastore (of course if the process changes is not so easy :( Gonzalo On Wed, Sep 19, 2012 at 6:49 PM, Martin Langhoff martin.langh...@gmail.comwrote: On Wed, Sep 19, 2012 at 4:54 PM, Gonzalo Odiard gonz...@laptop.org wrote: You already have the line export SUGAR_LOGGER_LEVEL=debug uncommented in .sugar/debug, right? Yes. And with that, datastore.log is very chatty, which is good. But when the datastore code hits an unhandled exception, the exception does not get logged. DBus restarts the process transparently, but that isn't so useful :-/ In terms of progress, I have these tracks underway, - Add a ds_clean flag file, rm it on the start of every create/update/delete op, create it on completion. Complements index_updated, and between the two we get good coverage. Missing either one, we get an index rebuild. This is done, and working well. - During index rebuild, delete incomplete entries. Not working well yet. With the two above, we can operate with the system at ENOSPC, and essentially we don't lose existing user data. New files fail to save, and metadata changes fail to get stored (so your edit, ie: a rename, is reverted in the journal, visibly). But we don't lose the existing Journal entries -- so behaviour is _almost_ civilized now :-) Behind the scenes, however, the datastore is dying and respawning. Which complicates things because what we are missing is to be able to delete entries while we are at ENOSPC. A delete is pretty straightforward, because we are removing files, and that succeeds even on ENOSPC. Except that we need to be able to maintain the xapian indexes. My plan right now is to catch the Xapian-is-hitting-ENOSPC exception, and trigger a Xapian index re-creation, putting the Xapian files in /tmp (which is a tmpfs in our builds and in Fedora upstream too). We'll make sure that the index_updated file on disk is removed so after a reboot the index is recreated once more. With Xapian index on /tmp, deleting Journal entries should just work. All dbus sockets are on tmpfs. However, to do all of the above with some sense that it's working correctly... yeah, I need to see the exceptions I hit. Right now I am guessing what could be going wrong in this black box...? which isn't a productive mode of life. m -- martin.langh...@gmail.com mar...@laptop.org -- Software Architect - OLPC - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel