[Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)

2012-09-19 Thread Martin Langhoff
I'm debugging odd situations w Sugar datastore, and I wonder whether
there is any tricks to debugging python programs that are run under
dbus.

I can see the sugar-datastore PID growing steadily while I test, which
means that it's dying and respawning plenty, likely from unhandled
exceptions...

thanks,



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)

2012-09-19 Thread Martin Langhoff
On Wed, Sep 19, 2012 at 4:15 PM, Martin Langhoff
martin.langh...@gmail.com wrote:
 I'm debugging odd situations w Sugar datastore, and I wonder whether
 there is any tricks to debugging python programs that are run under
 dbus.

 I can see the sugar-datastore PID growing steadily while I test, which
 means that it's dying and respawning plenty, likely from unhandled
 exceptions...

Adding to /usr/bin/datastore-service

# debugging
logerr=open(os.path.join(log_dir, 'datastore.err'), 'w')
sys.stderr=logerr

does not help in the least. The file is created, but nothing is ever
written there.

Any hints?



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)

2012-09-19 Thread Gonzalo Odiard
You already have the line
export SUGAR_LOGGER_LEVEL=debug
uncommented in .sugar/debug, right?

Gonzalo

On Wed, Sep 19, 2012 at 5:41 PM, Martin Langhoff
martin.langh...@gmail.comwrote:

 On Wed, Sep 19, 2012 at 4:15 PM, Martin Langhoff
 martin.langh...@gmail.com wrote:
  I'm debugging odd situations w Sugar datastore, and I wonder whether
  there is any tricks to debugging python programs that are run under
  dbus.
 
  I can see the sugar-datastore PID growing steadily while I test, which
  means that it's dying and respawning plenty, likely from unhandled
  exceptions...

 Adding to /usr/bin/datastore-service

 # debugging
 logerr=open(os.path.join(log_dir, 'datastore.err'), 'w')
 sys.stderr=logerr

 does not help in the least. The file is created, but nothing is ever
 written there.

 Any hints?



 m
 --
  martin.langh...@gmail.com
  mar...@laptop.org -- Software Architect - OLPC
  - ask interesting questions
  - don't get distracted with shiny stuff  - working code first
  - http://wiki.laptop.org/go/User:Martinlanghoff
 ___
 Sugar-devel mailing list
 Sugar-devel@lists.sugarlabs.org
 http://lists.sugarlabs.org/listinfo/sugar-devel

___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)

2012-09-19 Thread Martin Langhoff
On Wed, Sep 19, 2012 at 4:54 PM, Gonzalo Odiard gonz...@laptop.org wrote:
 You already have the line
 export SUGAR_LOGGER_LEVEL=debug
 uncommented in .sugar/debug, right?

Yes. And with that, datastore.log is very chatty, which is good.

But when the datastore code hits an unhandled exception, the exception
does not get logged. DBus restarts the process transparently, but that
isn't so useful :-/

In terms of progress, I have these tracks underway,

 - Add a ds_clean flag file, rm it on the start of every
create/update/delete op, create it on completion. Complements
index_updated, and between the two we get good coverage. Missing
either one, we get an index rebuild. This is done, and working well.

 - During index rebuild, delete incomplete entries. Not working well yet.

With the two above, we can operate with the system at ENOSPC, and
essentially we don't lose existing user data. New files fail to save,
and metadata changes fail to get stored (so your edit, ie: a rename,
is reverted in the journal, visibly). But we don't lose the existing
Journal entries -- so behaviour is _almost_ civilized now :-)

Behind the scenes, however, the datastore is dying and respawning.
Which complicates things because what we are missing is to be able to
delete entries while we are at ENOSPC. A delete is pretty
straightforward, because we are removing files, and that succeeds even
on ENOSPC. Except that we need to be able to maintain the xapian
indexes.

My plan right now is to catch the Xapian-is-hitting-ENOSPC exception,
and trigger a Xapian index re-creation, putting the Xapian files in
/tmp (which is a tmpfs in our builds and in Fedora upstream too).
We'll make sure that the index_updated file on disk is removed so
after a reboot the index is recreated once more.

With Xapian index on /tmp, deleting Journal entries should just work.
All dbus sockets are on tmpfs.

However, to do all of the above with some sense that it's working
correctly... yeah, I need to see the exceptions I hit. Right now I am
guessing what could be going wrong in this black box...? which isn't
a productive mode of life.



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] Debugging sugar-datastore (and similar dbus-connected services)

2012-09-19 Thread Gonzalo Odiard
I don't know if will be useful for you, but I have used SystemTap.
You can see at least a trace of the functions called.
Info:
http://fedoraproject.org/wiki/Features/SystemtapStaticProbes
http://sourceware.org/systemtap/wiki/PythonMarkers

With
stap /usr/share/doc/python-libs-2.7.3/systemtap-example.stp  -x
pid_datastore
you can see all the functions called in the datastore
(of course if the process changes is not so easy :(

Gonzalo

On Wed, Sep 19, 2012 at 6:49 PM, Martin Langhoff
martin.langh...@gmail.comwrote:

 On Wed, Sep 19, 2012 at 4:54 PM, Gonzalo Odiard gonz...@laptop.org
 wrote:
  You already have the line
  export SUGAR_LOGGER_LEVEL=debug
  uncommented in .sugar/debug, right?

 Yes. And with that, datastore.log is very chatty, which is good.

 But when the datastore code hits an unhandled exception, the exception
 does not get logged. DBus restarts the process transparently, but that
 isn't so useful :-/

 In terms of progress, I have these tracks underway,

  - Add a ds_clean flag file, rm it on the start of every
 create/update/delete op, create it on completion. Complements
 index_updated, and between the two we get good coverage. Missing
 either one, we get an index rebuild. This is done, and working well.

  - During index rebuild, delete incomplete entries. Not working well yet.

 With the two above, we can operate with the system at ENOSPC, and
 essentially we don't lose existing user data. New files fail to save,
 and metadata changes fail to get stored (so your edit, ie: a rename,
 is reverted in the journal, visibly). But we don't lose the existing
 Journal entries -- so behaviour is _almost_ civilized now :-)

 Behind the scenes, however, the datastore is dying and respawning.
 Which complicates things because what we are missing is to be able to
 delete entries while we are at ENOSPC. A delete is pretty
 straightforward, because we are removing files, and that succeeds even
 on ENOSPC. Except that we need to be able to maintain the xapian
 indexes.

 My plan right now is to catch the Xapian-is-hitting-ENOSPC exception,
 and trigger a Xapian index re-creation, putting the Xapian files in
 /tmp (which is a tmpfs in our builds and in Fedora upstream too).
 We'll make sure that the index_updated file on disk is removed so
 after a reboot the index is recreated once more.

 With Xapian index on /tmp, deleting Journal entries should just work.
 All dbus sockets are on tmpfs.

 However, to do all of the above with some sense that it's working
 correctly... yeah, I need to see the exceptions I hit. Right now I am
 guessing what could be going wrong in this black box...? which isn't
 a productive mode of life.



 m
 --
  martin.langh...@gmail.com
  mar...@laptop.org -- Software Architect - OLPC
  - ask interesting questions
  - don't get distracted with shiny stuff  - working code first
  - http://wiki.laptop.org/go/User:Martinlanghoff

___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel