Re: [Mailman-Users] Indexing mail right after delivery

2010-03-16 Thread Cédric Jeanneret
Done for Launchpad. Thanks again!

On Mon, Mar 15, 2010 at 5:40 PM, Mark Sapiro m...@msapiro.net wrote:
 Cedric Jeanneret wrote:

Maybe we should delete my bug on launchpad, or directly link it to your 
FAQ page ?

I just added my code in the function, and now it indexes, and archives 
correctly.


 I suggest you just delete the two existing attachments and attach your
 current code with a note that it is based on the template in the FAQ.

 That way the xappy/Xapian code will be available there if others wish
 to use it.

 --
 Mark Sapiro m...@msapiro.net        The highway is for gamblers,
 San Francisco Bay Area, California    better use your sense - B. Dylan


--
Mailman-Users mailing list Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Re: [Mailman-Users] Indexing mail right after delivery

2010-03-15 Thread Cedric Jeanneret
On Sun, 14 Mar 2010 17:38:16 -0700
Mark Sapiro m...@msapiro.net wrote:

 To follow up on this thread, there is now a FAQ at
 http://wiki.list.org/x/RAKJ which contains an attached template,
 Ext_Arch.py, which can be used as an external archiver and which will
 add the message to the pipermail archive, and then call a stub function
 with arguments of the list name, host name, the URL to the just archived
 message, the file system path to the just archived message and the
 message object. The stub can be coded to call a search indexer or do
 other things one may wish to do with the archived message.
 

Hello Mark,

It works like magic! Thank you so much!

Maybe we should delete my bug on Launchpad, or link it directly to your FAQ
page?

I just added my code in the function, and now it indexes, and archives 
correctly.

Thanks again!

See you

C.

-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-15 Thread Mark Sapiro
Cedric Jeanneret wrote:

Maybe we should delete my bug on launchpad, or directly link it to your FAQ 
page ?

I just added my code in the function, and now it indexes, and archives 
correctly.


I suggest you just delete the two existing attachments and attach your
current code with a note that it is based on the template in the FAQ.

That way the xappy/Xapian code will be available there if others wish
to use it.

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-14 Thread Mark Sapiro
To follow up on this thread, there is now a FAQ at
http://wiki.list.org/x/RAKJ which contains an attached template,
Ext_Arch.py, which can be used as an external archiver and which will
add the message to the pipermail archive, and then call a stub function
with arguments of the list name, host name, the URL to the just archived
message, the file system path to the just archived message and the
message object. The stub can be coded to call a search indexer or do
other things one may wish to do with the archived message.
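
As a rough illustration of the stub described above, here is the shape such a
function might take. The names index_archived_message and build_index_record,
and the layout of the returned record, are assumptions made for this example
and are not taken from the FAQ's Ext_Arch.py; only the five arguments follow
the description above.

```python
from email.message import Message


def build_index_record(listname, hostname, url, filepath, msg):
    """Collect fields a search indexer would typically want.

    Hypothetical helper: the dict layout is an assumption for illustration.
    """
    return {
        "list": "%s@%s" % (listname, hostname),
        "url": url,            # URL of the just archived message
        "path": filepath,      # file system path of the archived message
        "subject": msg.get("Subject", ""),
        "from": msg.get("From", ""),
        "message_id": msg.get("Message-ID", ""),
    }


def index_archived_message(listname, hostname, url, filepath, msg):
    """Stand-in for the stub: build a record and hand it to an indexer."""
    record = build_index_record(listname, hostname, url, filepath, msg)
    # A real stub would call the search indexer here (e.g. xappy/Xapian).
    return record
```

Returning the record keeps the sketch easy to exercise by hand before any
real indexer is wired in.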

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-04 Thread Cedric Jeanneret
On Wed, 03 Mar 2010 10:04:31 -0800
Mark Sapiro m...@msapiro.net wrote:

 On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
  
  Maybe a python version? What is really strange is that it works inside
  the archiver I tried to NOT use email.message_from_file (so use
  directly StringIO on sys.stdin), and it worked fine. In fact, the
  error was that Message doesn't have tell() method...
 
 
 Which says you are passing a Message object, not a StringIO or file
 object. I considered at one point just passing sys.stdin directly, but
 that won't work because sys.stdin does not have seek() or tell() methods.
 
 
  Another error was really annoying : ALL worked. almost. I couldn't do
  my mlist.Save(), as there was an error for the lockfile.
  
  I did :
  mlist = MailList.MailList('toto', lock=False)
  # other code
  mlist.Save()
 
 
 Right. I overlooked the fact that you can't Save() an unlocked list.
 But, I don't think you need to. I don't think the archiver actually
 updates your list instance in it's processing, so you should be OK if
 you just remove the Save() from your code.
 
 
  - crashed. After poking into MailList code, I saw that it refreshes
  the lockfile. Commenting out this line made it work again more or
  less : message was in mbox, but wasn't in pipermail archives
 
 
 Don't do that. It won't work anyway because the locked list object in
 ArchRunner will be saved after you're done and will undo any changes you
 made to your list object. But, as I say, you shouldn't need to save your
 list object. It is only passed to the HyperArch.HyperArchive()
 constructor so the archiver knows where to find the archive. I don't
 think it is updated.
 
 
  Poking on the Net, I found this post
  http://www.mail-archive.com/mailman-users@python.org/msg47499.html you
  answered some months (well, years) ago. I tried this way :
  applying the patch, so that it uses mailman internal archiver, and it
  calls my indexer right after.
  That's not really clean, it's not really a portable way, but it works.
  The fact that I have to patch a file from mailman package annoy me a
  bit, but... I didn't have any success with the ways you showed me :(
  
  
  To be honnest, maybe I'll try to put a handler (like XapianIndexer.py)
  for this. As I saw how to debug my scripts (thank you for the tip), I
  guess it would be the best way, instead of patching a code (which will
  be overriden on the next update).
  
  Or maybe there's a variable in mm_config (or defaults) which tell
  mailman to call a script after archiving ? I didn't see such a thing,
  I guess that's the role a the GLOBAL_PIPELINE and its handlers
  chain...
 
 As I tried to point out in my initial reply
 http://mail.python.org/pipermail/mailman-users/2010-February/068900.html,
 that won't work.
 
 The pipeline includes ToArchive which only queues the message in the
 archive queue for ArchRunner. Then IncomingRunner continues processing
 the pipeline. When it gets to your handler, there's no guarantee that
 ArchRunner has yet archived the message so how do you index something
 that may not yet even be there.
 
 We were almost there with the external archiver method. Let's try to
 make that work.
 
 What do you have now in the external archiver code and in the
 PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what
 is the problem?
 

Hello again !

I think I found what the problem is:
the script works now, but since I wrote my own archiver, it doesn't do the
pipermail part (i.e. update the mails in the archive)... I thought that this code:

mlist = MailList.MailList(maillist, lock=False)
msg = email.message_from_file(sys.stdin, Message.Message)
f = StringIO(str(sys.stdin))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
f.close()

did it all, but after reading a bit of the code, it doesn't exactly. It saves to
the .mbox file, right?

I tried to find where it does the pipermail stuff, but it's a bit complicated
[I'm not that comfortable with Python].

Any clue?

Thank you

-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-04 Thread Cedric Jeanneret
On Wed, 03 Mar 2010 10:04:31 -0800
Mark Sapiro m...@msapiro.net wrote:

 On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
  
  Maybe a python version? What is really strange is that it works inside
  the archiver I tried to NOT use email.message_from_file (so use
  directly StringIO on sys.stdin), and it worked fine. In fact, the
  error was that Message doesn't have tell() method...
 
 
 Which says you are passing a Message object, not a StringIO or file
 object. I considered at one point just passing sys.stdin directly, but
 that won't work because sys.stdin does not have seek() or tell() methods.
 
 
  Another error was really annoying : ALL worked. almost. I couldn't do
  my mlist.Save(), as there was an error for the lockfile.
  
  I did :
  mlist = MailList.MailList('toto', lock=False)
  # other code
  mlist.Save()
 
 
 Right. I overlooked the fact that you can't Save() an unlocked list.
 But, I don't think you need to. I don't think the archiver actually
 updates your list instance in it's processing, so you should be OK if
 you just remove the Save() from your code.
 
 
  - crashed. After poking into MailList code, I saw that it refreshes
  the lockfile. Commenting out this line made it work again more or
  less : message was in mbox, but wasn't in pipermail archives
 
 
 Don't do that. It won't work anyway because the locked list object in
 ArchRunner will be saved after you're done and will undo any changes you
 made to your list object. But, as I say, you shouldn't need to save your
 list object. It is only passed to the HyperArch.HyperArchive()
 constructor so the archiver knows where to find the archive. I don't
 think it is updated.
 
 
  Poking on the Net, I found this post
  http://www.mail-archive.com/mailman-users@python.org/msg47499.html you
  answered some months (well, years) ago. I tried this way :
  applying the patch, so that it uses mailman internal archiver, and it
  calls my indexer right after.
  That's not really clean, it's not really a portable way, but it works.
  The fact that I have to patch a file from mailman package annoy me a
  bit, but... I didn't have any success with the ways you showed me :(
  
  
  To be honnest, maybe I'll try to put a handler (like XapianIndexer.py)
  for this. As I saw how to debug my scripts (thank you for the tip), I
  guess it would be the best way, instead of patching a code (which will
  be overriden on the next update).
  
  Or maybe there's a variable in mm_config (or defaults) which tell
  mailman to call a script after archiving ? I didn't see such a thing,
  I guess that's the role a the GLOBAL_PIPELINE and its handlers
  chain...
 
 As I tried to point out in my initial reply
 http://mail.python.org/pipermail/mailman-users/2010-February/068900.html,
 that won't work.
 
 The pipeline includes ToArchive which only queues the message in the
 archive queue for ArchRunner. Then IncomingRunner continues processing
 the pipeline. When it gets to your handler, there's no guarantee that
 ArchRunner has yet archived the message so how do you index something
 that may not yet even be there.
 
 We were almost there with the external archiver method. Let's try to
 make that work.
 
 What do you have now in the external archiver code and in the
 PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what
 is the problem?
 

uho, found it !!
mailman/bin/arch toto

I guess that's all :))

-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-04 Thread Mark Sapiro
On 3/4/2010 4:23 AM, Cedric Jeanneret wrote:
 
 I think I found what's the problem is : the script works now, but as
 I write my own archiver, it doesn't do the pipermail part (i.e.
 update mails in archive)... I thought that this code :
 
 mlist = MailList.MailList(maillist, lock=False)
 msg = email.message_from_file(sys.stdin, Message.Message)
 f = StringIO(str(sys.stdin))
 h = HyperArch.HyperArchive(mlist) 
 h.processUnixMailbox(f)
 f.close()
 
 did all, but after reading a bit of code, it doesn't exactly. It
 saves to .mbox file, right ?


No. It doesn't save to the .mbox file. If you look at the ArchiveMail()
method in Mailman/Archiver/Archiver.py, it first saves to the .mbox by
doing

if mm_cfg.ARCHIVE_TO_MBOX in (1, 2):
    self.__archive_to_mbox(msg)

Then it either calls the external archiver or executes essentially the
above to archive the mail in the pipermail archive.

What you are missing is

h.close()

and that's why it doesn't work.
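
One way to make sure the final close() is never skipped is to wrap the
archive object in contextlib.closing. The sketch below uses a stand-in class,
FakeArchive, so that it runs without Mailman; it is not
HyperArch.HyperArchive, and its "nothing is published until close()"
behaviour is only an assumption chosen to mirror the symptom described above.
It also uses Python 3's io.StringIO rather than cStringIO.

```python
import contextlib
import io


class FakeArchive:
    """Hypothetical stand-in for an archive object that commits on close()."""

    def __init__(self):
        self.pending = []    # articles processed but not yet written out
        self.published = []  # articles visible in the archive

    def processUnixMailbox(self, fileobj):
        # Treat the whole buffer as one article, for illustration only.
        self.pending.append(fileobj.read())

    def close(self):
        # Only here do pending articles become visible.
        self.published.extend(self.pending)
        self.pending = []


def archive_text(archive, text):
    """Process one message and guarantee close() runs, even on error."""
    with contextlib.closing(archive) as h:
        with io.StringIO(text) as f:
            h.processUnixMailbox(f)
    return archive.published
```

With the real archiver the same pattern would simply wrap the
HyperArch.HyperArchive(mlist) instance instead of FakeArchive.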


 I tried to find where it does the pipermail stuff, but it's a bit
 complicated [I'm not so at ease with Python].


Yes, the archiver is very convoluted because classes are subclassed and
methods overridden all over. Don't feel bad. I've been looking at it for
years and still only barely understand it.

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-04 Thread Mark Sapiro
On 3/4/2010 4:46 AM, Cedric Jeanneret wrote:
 
 uho, found it !!
 mailman/bin/arch toto
 
 I guess that's all :))


You may or may not be able to use bin/arch, but you can't use it in
conjunction with an external archiver because of list locking. If you
call bin/arch from your external archiver and wait for it to return, you
will have a deadlock, and if you don't wait, it won't run until after
your external archiver finishes.

I.e., an external archiver command like

'|/path/bin/arch %(listname)s;/path/myscript.py %(listname)s'

creates a deadlock, and one like

'|/path/bin/arch %(listname)s & /path/myscript.py %(listname)s'

doesn't work because myscript.py has to complete before bin/arch can
obtain the list lock.
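
For reference, the settings these command strings go into are ordinary
Python format strings; as far as I know, Mailman 2.1 fills in %(listname)s
and %(hostname)s before running the command. The sketch below reproduces
that expansion with a plain dict. The script path and the function name
expand_archiver_command are placeholders invented for the example, and the
argument order (hostname first, then list name) is chosen to match the
sys.argv usage in the script posted earlier in this thread.

```python
# Hypothetical values for mm_cfg.py; the path is a placeholder.
PUBLIC_EXTERNAL_ARCHIVER = '/path/to/myscript.py %(hostname)s %(listname)s'
PRIVATE_EXTERNAL_ARCHIVER = PUBLIC_EXTERNAL_ARCHIVER


def expand_archiver_command(template, listname, hostname):
    """Expand the placeholders the way a %-format string is expanded."""
    return template % {'listname': listname, 'hostname': hostname}
```

Running a single script that archives and then indexes, as discussed in this
thread, avoids having two commands compete for the list lock.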

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-04 Thread Cedric Jeanneret
On Thu, 04 Mar 2010 06:49:54 -0800
Mark Sapiro m...@msapiro.net wrote:

 On 3/4/2010 4:23 AM, Cedric Jeanneret wrote:
  
  I think I found what's the problem is : the script works now, but as
  I write my own archiver, it doesn't do the pipermail part (i.e.
  update mails in archive)... I thought that this code :
  
  mlist = MailList.MailList(maillist, lock=False)
  msg = email.message_from_file(sys.stdin, Message.Message)
  f = StringIO(str(sys.stdin))
  h = HyperArch.HyperArchive(mlist) 
  h.processUnixMailbox(f)
  f.close()
  
  did all, but after reading a bit of code, it doesn't exactly. It
  saves to .mbox file, right ?
 
 
 No. It doesn't save to the .mbox file. If you look at the ArchiveMail()
 method in Mailman/Archiver/Archiver.py, it first saves to the .mbox by
 doing
 
 if mm_cfg.ARCHIVE_TO_MBOX in (1, 2):
     self.__archive_to_mbox(msg)
 
 Then it either calls the external archiver or executes essentially the
 above to archive the mail in the pipermail archive.
 
 What you are missing is
 
 h.close()
 
 and that's why it doesn't work.
 
 
  I tried to find where it does the pipermail stuff, but it's a bit
  complicated [I'm not so at ease with Python].
 
 
 Yes, the archiver is very convoluted because classes are subclassed and
 methods overridden all over. Don't feel bad. I've been looking at it for
 years and still only barely understand it.
 

Hmm, I call h.close() a bit later (I grab its latest ID so that I can
build the direct URL for my indexer). But for now, I guess I'm done.
I've opened a bug (I couldn't figure out where else to put my stuff) on Launchpad:
https://bugs.launchpad.net/mailman/+bug/531942
It contains my scripts, and some information on how to use them.

Indeed, the arch script uses locks. I copied it, removed the lock stuff, and used
this version. Everything works fine now.

I'm happy I could understand a bit (well... a very little bit) of how Mailman works.

Thanks again !


-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-04 Thread Mark Sapiro
On 3/4/2010 7:10 AM, Cedric Jeanneret wrote:
 
 Hmm, I call h.close() a bit later (I grab its latest ID so that
 I can build the direct URL for my indexer). But for now, I guess I'm
 done. I've opened a bug (I couldn't figure out where else to put my stuff) on
 Launchpad: https://bugs.launchpad.net/mailman/+bug/531942 It contains
 my scripts, and some information on how to use them.


I've seen your bug in the tracker. It's too bad Launchpad calls
everything a bug, but that's the right place.


 Indeed, arch script uses locks. I copied it, removed the lock
 stuff, and used this version. All work fine now.


I will have some comments after I look at this more. I think there is
redundant stuff, but I'll comment further after I look in detail.

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-03 Thread Cedric Jeanneret
On Tue, 02 Mar 2010 11:34:25 -0800
Mark Sapiro m...@msapiro.net wrote:

 On 3/2/2010 3:41 AM, Cedric Jeanneret wrote:
  On Fri, 26 Feb 2010 10:15:13 -0800
  Mark Sapiro m...@msapiro.net wrote:
 
  At this point, you have a list object (locked) and a message object. You
  might think you could just do
 
  mlist.ArchiveMail(msg)
 
  to archive the mail to the listname.mbox file and the pipermail archive,
  but that wouldn't quite work because that method would re-invoke the
  external archiver. Also, you don't need to worry about the listname.mbox
  file because the ArchiveMail() method already did that before invoking
  the external archiver, so what you would need is
 
  from Mailman.Archiver import HyperArch
  from cStringIO import StringIO
  f = StringIO(str(msg))
  h = HyperArch.HyperArchive(mlist)
  h.processUnixMailbox(f)
  h.close()
  f.close()
 
  Which is what the ArchiveMail() method would do. Now you still have the
  mlist and msg objects, and you need to save and unlock the list at some
  point
 
  mlist.Save()
  mlist.Unlock()
 
  and the message is now in the pipermail archive and can be indexed.
 
  
  Hello again,
  
  I'm having some troubles with my code. According to what Mark said, I've 
  done this :
  
  #!/usr/bin/env python
  import sys
  sys.path.insert(0,'/usr/lib/mailman')
  
  import syslog
  
  syslog.syslog('begin script')
  
  import email
  from Mailman import MailList
  from Mailman import Message
  ## archive part
  from Mailman.Archiver import HyperArch
  from cStringIO import StringIO
  
  maillist = sys.argv[2]
  hostname = sys.argv[1]
  
  msg = email.message_from_file(sys.stdin, Message.Message)
  syslog.syslog(maillist)
  
  mlist = MailList.MailList(maillist, lock=True)
  
  syslog.syslog('processing archiver')
  ## let archive it
  f = StringIO(str(msg))
  h = HyperArch.HyperArchive(mlist)
  h.processUnixMailbox(f)
  h.close()
  f.close()
  mlist.Save()
  mlist.Unlock()
  
  mlist.ArchiveMail(msg)
 
 
 Here is one problem. Remove the above line. As I tried to say above you
 can't do this. The lines above from f = StringIO(str(msg)) through
 f.close() archive the message. When you call mlist.ArchiveMail(msg),
 it reinvokes your external archiver in an endless loop.
 
 You need to remove the mlist.ArchiveMail(msg).
 
 The locking problem is something else. The external archiver is called
 with the list locked, thus when we try to instantiate the list 'locked',
 we have a deadlock. Thus, you never saw the loop because of the deadlock.
 
 The good news is we don't have to pass a locked list instance to
 HyperArch.HyperArchive() as it uses a special archiver lock.
 
 So, replace
 
 mlist = MailList.MailList(maillist, lock=True)
 
 with
 
 mlist = MailList.MailList(maillist, lock=False)
 
 and remove the mlist.Unlock() as your instance isn't locked, and
 ArchRunner will unlock its list instance when you exit.
 
 
  syslog.syslog('processing indexer')
  ### coming soon
  
  syslog.syslog('exiting - all ok')
  sys.exit(0)
  
  syslog is for debug purpose only.
  
  And if I send an email on my ML, I have this kind of error:
  
  Mar 02 12:38:33 2010 (28380) toto.lock lifetime has expired, breaking
 
 

Hmm, it seems it crashes in pipermail.py, in function processUnixMailbox:
we have a
pos = input.tell() on line 564, but unfortunately input does NOT have any 
tell() method...
It returns a 41 status.

-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL

Re: [Mailman-Users] Indexing mail right after delivery

2010-03-03 Thread Mark Sapiro
On 3/2/2010 11:02 PM, Cedric Jeanneret wrote:
 
 Whoops, right. It was commented out in my code. For now, I'm poking
 around with some other problems, such as my external archiver returning
 a non-zero status. It seems to crash at
 h.processUnixMailbox(f). Is there any way to get a backtrace of
 Python errors (e.g. by testing it through the shell)? I guess I can
 write a file with the full email content, headers included, and pipe it into
 my script. Right?


There are several choices.

You could try adding 'filename' to your external archiver command
string. That will probably work.

You can do as you suggest above.

You can replace your import syslog with

from Mailman.Logging.Syslog import syslog
from Mailman.Logging.Utils import LogStdErr

and add

LogStdErr('debug', 'mailmanctl', manual_reprime=0)

and change your syslog.syslog('debug text') statements to

syslog('debug', 'debug text')

This will write all stderr output plus your 'debug text' entries to a
log named debug in Mailman's logs directory. (You can name the log
anything you want. It will be created if it doesn't exist.)
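
Outside a Mailman installation, the same idea (capture the traceback in a log
file instead of losing it) can be sketched with the standard library alone.
Note that this swaps Mailman's Syslog and LogStdErr helpers for the logging
and traceback modules; the function names and the logger name are invented
for the example.

```python
import logging
import traceback


def make_debug_logger(path):
    """Return a logger that appends to the given file."""
    logger = logging.getLogger("ext_archiver_debug")
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:
        # FileHandler creates the file if it doesn't exist.
        logger.addHandler(logging.FileHandler(path))
    return logger


def run_logged(logger, func, *args):
    """Call func(*args); on failure, log the full traceback and return 1."""
    try:
        func(*args)
    except Exception:
        logger.error("external archiver failed:\n%s", traceback.format_exc())
        return 1
    logger.debug("external archiver finished")
    return 0
```

The return value can be passed to sys.exit() so that the caller still sees a
non-zero status when the script fails.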

I see you've gotten further. I'll respond to that post.

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-03 Thread Mark Sapiro
On 3/3/2010 12:57 AM, Cedric Jeanneret wrote:
 On Tue, 02 Mar 2010 11:34:25 -0800
 Mark Sapiro m...@msapiro.net wrote:
 
 On 3/2/2010 3:41 AM, Cedric Jeanneret wrote:
[...]
 from cStringIO import StringIO
[...]
 f = StringIO(str(msg))
 h = HyperArch.HyperArchive(mlist)
 h.processUnixMailbox(f)
[...]
 
 Hmm, it seems it crashes in pipermail.py, in function processUnixMailbox:
 we have a
 pos = input.tell() on line 564, but unfortunately input does NOT have any 
 tell() method...
 It returns a 41 status.


Something is strange. The input object in 'pos = input.tell()' is the
StringIO instance you passed as 'f', and StringIO objects do have a tell
method. Also, the above code snippet is exactly what the builtin
archiver uses, and I tested it and it worked for me.

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-03 Thread Cédric Jeanneret
On Wed, Mar 3, 2010 at 4:44 PM, Mark Sapiro m...@msapiro.net wrote:
 On 3/3/2010 12:57 AM, Cedric Jeanneret wrote:
 On Tue, 02 Mar 2010 11:34:25 -0800
 Mark Sapiro m...@msapiro.net wrote:

 On 3/2/2010 3:41 AM, Cedric Jeanneret wrote:
 [...]
 from cStringIO import StringIO
 [...]
 f = StringIO(str(msg))
 h = HyperArch.HyperArchive(mlist)
 h.processUnixMailbox(f)
 [...]

 Hmm, it seems it crashes in pipermail.py, in function processUnixMailbox:
 we have a
 pos = input.tell() on line 564, but unfortunately input does NOT have any 
 tell() method...
 It returns a 41 status.


 Something is strange. The input object in 'pos = input.tell()' is the
 StringIO instance you passed as 'f', and StringIO objects do have a tell
 method. Also, the above code snippet is exactly what the builtin
 archiver uses, and I tested it and it worked for me.

 --
 Mark Sapiro m...@msapiro.net        The highway is for gamblers,
 San Francisco Bay Area, California    better use your sense - B. Dylan



Maybe a Python version issue? What is really strange is that it works inside
the archiver. I tried NOT using email.message_from_file (so using
StringIO directly on sys.stdin), and it worked fine. In fact, the
error was that Message doesn't have a tell() method...

Another error was really annoying: everything worked, almost. I couldn't do
my mlist.Save(), as there was a lockfile error.

I did :
mlist = MailList.MailList('toto', lock=False)
# other code
mlist.Save()

- crashed. After poking into the MailList code, I saw that it refreshes
the lockfile. Commenting out this line made it work again, more or
less: the message was in the mbox, but wasn't in the pipermail archives.

Poking around on the Net, I found this post
http://www.mail-archive.com/mailman-users@python.org/msg47499.html that you
answered some months (well, years) ago. I tried it this way:
applying the patch, so that it uses Mailman's internal archiver and
calls my indexer right after.
That's not really clean, and it's not really portable, but it works.
The fact that I have to patch a file from the Mailman package annoys me a
bit, but... I didn't have any success with the ways you showed me :(


To be honest, maybe I'll try to write a handler (like XapianIndexer.py)
for this. Now that I've seen how to debug my scripts (thank you for the tip), I
guess that would be the best way, instead of patching code (which will
be overwritten on the next update).

Or maybe there's a variable in mm_cfg (or Defaults) which tells
Mailman to call a script after archiving? I didn't see such a thing;
I guess that's the role of the GLOBAL_PIPELINE and its chain of
handlers...


Thank you for the time you spend on my problem.

Best regards,

C.

Re: [Mailman-Users] Indexing mail right after delivery

2010-03-03 Thread Mark Sapiro
On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
 
 Maybe a python version? What is really strange is that it works inside
 the archiver I tried to NOT use email.message_from_file (so use
 directly StringIO on sys.stdin), and it worked fine. In fact, the
 error was that Message doesn't have tell() method...


Which says you are passing a Message object, not a StringIO or file
object. I considered at one point just passing sys.stdin directly, but
that won't work because sys.stdin does not have seek() or tell() methods.
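
The distinction can be checked outside Mailman. The sketch below uses Python
3's io.StringIO in place of the cStringIO module used in this thread, and the
helper name mailbox_file_from_message is made up for the example. It only
shows that a parsed message has no file methods, while an in-memory buffer
built from its text does.

```python
import email
import io


def mailbox_file_from_message(msg):
    """Return a seekable in-memory file holding the message text."""
    return io.StringIO(msg.as_string())


raw = "From: a@example.com\nSubject: test\n\nbody\n"
msg = email.message_from_string(raw)   # a Message object: no seek()/tell()
f = mailbox_file_from_message(msg)     # a file-like object: has both
```

Passing msg itself where f is expected is what produces the "no tell()
method" error.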


 Another error was really annoying : ALL worked. almost. I couldn't do
 my mlist.Save(), as there was an error for the lockfile.
 
 I did :
 mlist = MailList.MailList('toto', lock=False)
 # other code
 mlist.Save()


Right. I overlooked the fact that you can't Save() an unlocked list.
But, I don't think you need to. I don't think the archiver actually
updates your list instance in its processing, so you should be OK if
you just remove the Save() from your code.


 - crashed. After poking into MailList code, I saw that it refreshes
 the lockfile. Commenting out this line made it work again more or
 less : message was in mbox, but wasn't in pipermail archives


Don't do that. It won't work anyway because the locked list object in
ArchRunner will be saved after you're done and will undo any changes you
made to your list object. But, as I say, you shouldn't need to save your
list object. It is only passed to the HyperArch.HyperArchive()
constructor so the archiver knows where to find the archive. I don't
think it is updated.


 Poking on the Net, I found this post
 http://www.mail-archive.com/mailman-users@python.org/msg47499.html you
 answered some months (well, years) ago. I tried this way :
 applying the patch, so that it uses mailman internal archiver, and it
 calls my indexer right after.
 That's not really clean, it's not really a portable way, but it works.
 The fact that I have to patch a file from mailman package annoy me a
 bit, but... I didn't have any success with the ways you showed me :(
 
 
 To be honnest, maybe I'll try to put a handler (like XapianIndexer.py)
 for this. As I saw how to debug my scripts (thank you for the tip), I
 guess it would be the best way, instead of patching a code (which will
 be overriden on the next update).
 
 Or maybe there's a variable in mm_config (or defaults) which tell
 mailman to call a script after archiving ? I didn't see such a thing,
 I guess that's the role of the GLOBAL_PIPELINE and its handlers
 chain...

As I tried to point out in my initial reply
http://mail.python.org/pipermail/mailman-users/2010-February/068900.html,
that won't work.

The pipeline includes ToArchive which only queues the message in the
archive queue for ArchRunner. Then IncomingRunner continues processing
the pipeline. When it gets to your handler, there's no guarantee that
ArchRunner has yet archived the message, so how do you index something
that may not yet even be there?

We were almost there with the external archiver method. Let's try to
make that work.

What do you have now in the external archiver code and in the
PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what
is the problem?

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


Re: [Mailman-Users] Indexing mail right after delivery

2010-03-03 Thread Cedric Jeanneret
On Wed, 03 Mar 2010 10:04:31 -0800
Mark Sapiro m...@msapiro.net wrote:

 On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
  
  Maybe a python version? What is really strange is that it works inside
  the archiver I tried to NOT use email.message_from_file (so use
  directly StringIO on sys.stdin), and it worked fine. In fact, the
  error was that Message doesn't have tell() method...
 
 
 Which says you are passing a Message object, not a StringIO or file
 object. I considered at one point just passing sys.stdin directly, but
 that won't work because sys.stdin does not have seek() or tell() methods.
 
 
  Another error was really annoying : ALL worked. almost. I couldn't do
  my mlist.Save(), as there was an error for the lockfile.
  
  I did :
  mlist = MailList.MailList('toto', lock=False)
  # other code
  mlist.Save()
 
 
 Right. I overlooked the fact that you can't Save() an unlocked list.
 But, I don't think you need to. I don't think the archiver actually
 updates your list instance in its processing, so you should be OK if
 you just remove the Save() from your code.
 
 
  - crashed. After poking into MailList code, I saw that it refreshes
  the lockfile. Commenting out this line made it work again more or
  less : message was in mbox, but wasn't in pipermail archives
 
 
 Don't do that. It won't work anyway because the locked list object in
 ArchRunner will be saved after you're done and will undo any changes you
 made to your list object. But, as I say, you shouldn't need to save your
 list object. It is only passed to the HyperArch.HyperArchive()
 constructor so the archiver knows where to find the archive. I don't
 think it is updated.
 
 
  Poking on the Net, I found this post
  http://www.mail-archive.com/mailman-users@python.org/msg47499.html you
  answered some months (well, years) ago. I tried this way :
  applying the patch, so that it uses mailman internal archiver, and it
  calls my indexer right after.
  That's not really clean, it's not really a portable way, but it works.
  The fact that I have to patch a file from mailman package annoy me a
  bit, but... I didn't have any success with the ways you showed me :(
  
  
  To be honnest, maybe I'll try to put a handler (like XapianIndexer.py)
  for this. As I saw how to debug my scripts (thank you for the tip), I
  guess it would be the best way, instead of patching a code (which will
  be overriden on the next update).
  
  Or maybe there's a variable in mm_config (or defaults) which tell
  mailman to call a script after archiving ? I didn't see such a thing,
  I guess that's the role of the GLOBAL_PIPELINE and its handlers
  chain...
 
 As I tried to point out in my initial reply
 http://mail.python.org/pipermail/mailman-users/2010-February/068900.html,
 that won't work.
 
 The pipeline includes ToArchive which only queues the message in the
 archive queue for ArchRunner. Then IncomingRunner continues processing
 the pipeline. When it gets to your handler, there's no guarantee that
 ArchRunner has yet archived the message so how do you index something
 that may not yet even be there.
 
 We were almost there with the external archiver method. Let's try to
 make that work.
 
 What do you have now in the external archiver code and in the
 PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what
 is the problem?
 

Hello again,

First of all, I want to thank you for the time you are spending on my case. I
really appreciate it.

Now, for my code:
I attached the latest (buggy) version of my archive-and-index.py script. I've
rolled back to the approach you described, so that we don't go in all directions.
You'll find another attachment: a debug file that I produce this way:
PUBLIC_EXTERNAL_ARCHIVER = '/root/archive-and-index.py %(hostname)s %(listname)s /var/log/mailman/archiver'

It seems that the Message.Message stays, even if we create a new StringIO
variable... weird.
Just in case:
python --version
Python 2.5.2

Maybe there's a problem with this version? If so, that would be a bit of a
problem, as it's the lenny version.

I'll keep on trying, and I'll update you as soon as I have something new.

Thanks again.

Best regards,

C.


-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-02 Thread Cedric Jeanneret
On Fri, 26 Feb 2010 10:15:13 -0800
Mark Sapiro m...@msapiro.net wrote:

 On 2/26/2010 4:20 AM, Cedric Jeanneret wrote:
  On Thu, 25 Feb 2010 17:08:06 -0800 Mark Sapiro m...@msapiro.net
  wrote:
  
  Cedric Jeanneret wrote:
  
  I'm trying to create a xapian[1] indexer for our mailing list. As
  mailman is written in Python and there are python bindings for
  xapian, I guess I can maybe create a plugin for that. My first
  question is : is there already such a thing ? I searched on the
  net, but nothing appeared My second one : can we create a plugin
  for mailman, if so, where should I go to have some doc ? seems
  there's nothing in the wiki
  (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=pluginsearchQuery.spaceKey=conf_all)
 
 
  
 Just to explain why I'd like to do that: we already have a xapian search
 engine in here, indexing a fileserver, request tracker queues and
 moinmoin wikis... so we'd like to aggregate all our stuff in one app for
 searching.
  
  
  This will be quite doable with Mailman 3 which is still in
  development.
  
  There are problems trying to do this in Mailman 2.1.x. There is a 
  plugin capability of sorts in the form of custom handlers that can
  be added to the incoming message processing pipeline. See the FAQ
  at http://wiki.list.org/x/l4A9. However, archiving is
  asynchronous with incoming message processing, so it is not
  possible for a custom handler to know the URL that will ultimately
  retrieve the message from the archive.
  
  A different approach which might be workable is to use the 
  PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If
  you set
  
  PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py' 
  PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
  
  in mm_cfg.py, then that script will be invoked to do the archiving.
  The script in turn could invoke the standard pipermail archiving
  process and then invoke xapian to index the archived message.
  
  
  
  Hello again,
  
  Just one question : what do mlist, msg, msgdata stand for ? As I read
  I've to create my module and define a process(mlist, msg, msgdata)
  inside it, I'd like to know what are those objects. I discovered that
  mlist stands for a Mailman.MailList.MailList('list-name'), but for
  the others, it's a bit hard to find...
 
 
 Only custom handlers need to define process(mlist, msg, msgdata). That
 is the entry point to the handler and three objects are passed
 
 mlist is the Mailman.MailList.MailList() instance for the current list
 
 msg is a Mailman.Message.Message() (subclass of email.Message.Message)
 instance for the current message
 
 msgdata is a dictionary of the message metadata accumulated so far.
 
 The important thing is these are passed in as arguments to the handler
 process() function.
 
 In your case, you are defining a module which is going to be invoked
 like the following.
 
 Suppose that
 
 PUBLIC_EXTERNAL_ARCHIVER = '/path/to/myarch.py %(hostname)s %(listname)s'
 
 It will be invoked in a pipe similar to
 
 cat raw_message | /path/to/myarch.py HOST LIST
 
 i.e. the command string with %(hostname)s and %(listname)s replaced by
 the actual host name and list name of the list will be invoked and the
 message piped to it.
 
 So, it could begin something like:
 
 #!python
 import sys
 sys.path.insert(0, 'path/to/mailman/bin')
 # The above line can be skipped if myarch.py is in Mailman's
 # bin directory.
 import paths
 
 import email
 from Mailman import MailList
 from Mailman import Message
 
 msg = email.message_from_file(sys.stdin, Message.Message)
 mlist = MailList.MailList(sys.argv[1], lock=True)
 
 
 At this point, you have a list object (locked) and a message object. You
 might think you could just do
 
 mlist.ArchiveMail(msg)
 
 to archive the mail to the listname.mbox file and the pipermail archive,
 but that wouldn't quite work because that method would re-invoke the
 external archiver. Also, you don't need to worry about the listname.mbox
 file because the ArchiveMail() method already did that before invoking
 the external archiver, so what you would need is
 
 from Mailman.Archiver import HyperArch
 from cStringIO import StringIO
 f = StringIO(str(msg))
 h = HyperArch.HyperArchive(mlist)
 h.processUnixMailbox(f)
 h.close()
 f.close()
 
 Which is what the ArchiveMail() method would do. Now you still have the
 mlist and msg objects, and you need to save and unlock the list at some
 point
 
 mlist.Save()
 mlist.Unlock()
 
 and the message is now in the pipermail archive and can be indexed.
 

Hello again,

I'm having some troubles with my code. According to what Mark said, I've done 
this :

#!/usr/bin/env python
import sys
sys.path.insert(0,'/usr/lib/mailman')

import syslog

syslog.syslog('begin script')

import email
from Mailman import MailList
from Mailman import Message
## archive part
from Mailman.Archiver import HyperArch
from cStringIO import StringIO

maillist = sys.argv[2]
hostname = sys.argv[1]

msg = 

Re: [Mailman-Users] Indexing mail right after delivery

2010-03-02 Thread Mark Sapiro
On 3/2/2010 3:41 AM, Cedric Jeanneret wrote:
 On Fri, 26 Feb 2010 10:15:13 -0800
 Mark Sapiro m...@msapiro.net wrote:

 At this point, you have a list object (locked) and a message object. You
 might think you could just do

 mlist.ArchiveMail(msg)

 to archive the mail to the listname.mbox file and the pipermail archive,
 but that wouldn't quite work because that method would re-invoke the
 external archiver. Also, you don't need to worry about the listname.mbox
 file because the ArchiveMail() method already did that before invoking
 the external archiver, so what you would need is

 from Mailman.Archiver import HyperArch
 from cStringIO import StringIO
 f = StringIO(str(msg))
 h = HyperArch.HyperArchive(mlist)
 h.processUnixMailbox(f)
 h.close()
 f.close()

 Which is what the ArchiveMail() method would do. Now you still have the
 mlist and msg objects, and you need to save and unlock the list at some
 point

 mlist.Save()
 mlist.Unlock()

 and the message is now in the pipermail archive and can be indexed.

 
 Hello again,
 
 I'm having some troubles with my code. According to what Mark said, I've done 
 this :
 
 #!/usr/bin/env python
 import sys
 sys.path.insert(0,'/usr/lib/mailman')
 
 import syslog
 
 syslog.syslog('begin script')
 
 import email
 from Mailman import MailList
 from Mailman import Message
 ## archive part
 from Mailman.Archiver import HyperArch
 from cStringIO import StringIO
 
 maillist = sys.argv[2]
 hostname = sys.argv[1]
 
 msg = email.message_from_file(sys.stdin, Message.Message)
 syslog.syslog(maillist)
 
 mlist = MailList.MailList(maillist, lock=True)
 
 syslog.syslog('processing archiver')
 ## let archive it
 f = StringIO(str(msg))
 h = HyperArch.HyperArchive(mlist)
 h.processUnixMailbox(f)
 h.close()
 f.close()
 mlist.Save()
 mlist.Unlock()
 
 mlist.ArchiveMail(msg)


Here is one problem. Remove the above line. As I tried to say above you
can't do this. The lines above from f = StringIO(str(msg)) through
f.close() archive the message. When you call mlist.ArchiveMail(msg),
it reinvokes your external archiver in an endless loop.

You need to remove the mlist.ArchiveMail(msg).

The locking problem is something else. The external archiver is called
with the list locked, thus when we try to instantiate the list 'locked',
we have a deadlock. Thus, you never saw the loop because of the deadlock.

The good news is we don't have to pass a locked list instance to
HyperArch.HyperArchive() as it uses a special archiver lock.

So, replace

mlist = MailList.MailList(maillist, lock=True)

with

mlist = MailList.MailList(maillist, lock=False)

and remove the mlist.Unlock() as your instance isn't locked, and
ArchRunner will unlock its list instance when you exit.
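
Putting the corrections above together, the archiving part of the script
might look roughly like the sketch below. It is untested and assumes a
Mailman 2.1 installation running under Python 2; the install path is a
placeholder, the function names are only for illustration, and the argument
order follows a command string of the form
'/path/to/myarch.py %(hostname)s %(listname)s' (host name first, list name
second).

```python
import sys


def parse_args(argv):
    """Return (hostname, listname) from the external archiver command line."""
    return argv[1], argv[2]


def archive(listname, stream):
    """Add the message read from stream to the list's pipermail archive."""
    # Placeholder path: point this at Mailman's bin directory so that
    # "import paths" can set up the module search path.
    sys.path.insert(0, '/path/to/mailman/bin')
    import paths  # noqa: provided by Mailman

    import email
    from cStringIO import StringIO  # Python 2 only
    from Mailman import MailList
    from Mailman import Message
    from Mailman.Archiver import HyperArch

    msg = email.message_from_file(stream, Message.Message)
    # ArchRunner already holds the list lock, so open the list unlocked.
    mlist = MailList.MailList(listname, lock=False)

    f = StringIO(str(msg))
    h = HyperArch.HyperArchive(mlist)
    h.processUnixMailbox(f)
    h.close()
    f.close()
    # No mlist.ArchiveMail(msg) here: it would re-invoke this script.
    # No mlist.Unlock() either: this instance was never locked.


def main(argv, stream):
    """Archive the piped message, then leave room for the indexer."""
    hostname, listname = parse_args(argv)
    archive(listname, stream)
    # The indexer would be called here, once the message is archived.
    return 0
```

In the real script, the last line would be something like
sys.exit(main(sys.argv, sys.stdin)).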


 syslog.syslog('processing indexer')
 ### coming soon
 
 syslog.syslog('exiting - all ok')
 sys.exit(0)
 
 syslog is for debug purpose only.
 
 And if I send an email on my ML, I have this kind of error:
 
 Mar 02 12:38:33 2010 (28380) toto.lock lifetime has expired, breaking


-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-03-02 Thread Cedric Jeanneret
On Tue, 02 Mar 2010 11:34:25 -0800
Mark Sapiro m...@msapiro.net wrote:

 On 3/2/2010 3:41 AM, Cedric Jeanneret wrote:
  On Fri, 26 Feb 2010 10:15:13 -0800
  Mark Sapiro m...@msapiro.net wrote:
 
  At this point, you have a list object (locked) and a message object. You
  might think you could just do
 
  mlist.ArchiveMail(msg)
 
  to archive the mail to the listname.mbox file and the pipermail archive,
  but that wouldn't quite work because that method would re-invoke the
  external archiver. Also, you don't need to worry about the listname.mbox
  file because the ArchiveMail() method already did that before invoking
  the external archiver, so what you would need is
 
  from Mailman.Archiver import HyperArch
  from cStringIO import StringIO
  f = StringIO(str(msg))
  h = HyperArch.HyperArchive(mlist)
  h.processUnixMailbox(f)
  h.close()
  f.close()
 
  Which is what the ArchiveMail() method would do. Now you still have the
  mlist and msg objects, and you need to save and unlock the list at some
  point
 
  mlist.Save()
  mlist.Unlock()
 
  and the message is now in the pipermail archive and can be indexed.
 
  
  Hello again,
  
  I'm having some troubles with my code. According to what Mark said, I've 
  done this :
  
  #!/usr/bin/env python
  import sys
  sys.path.insert(0,'/usr/lib/mailman')
  
  import syslog
  
  syslog.syslog('begin script')
  
  import email
  from Mailman import MailList
  from Mailman import Message
  ## archive part
  from Mailman.Archiver import HyperArch
  from cStringIO import StringIO
  
  maillist = sys.argv[2]
  hostname = sys.argv[1]
  
  msg = email.message_from_file(sys.stdin, Message.Message)
  syslog.syslog(maillist)
  
  mlist = MailList.MailList(maillist, lock=True)
  
  syslog.syslog('processing archiver')
  ## let archive it
  f = StringIO(str(msg))
  h = HyperArch.HyperArchive(mlist)
  h.processUnixMailbox(f)
  h.close()
  f.close()
  mlist.Save()
  mlist.Unlock()
  
  mlist.ArchiveMail(msg)
 
 
 Here is one problem. Remove the above line. As I tried to say above you
 can't do this. The lines above from f = StringIO(str(msg)) through
 f.close() archive the message. When you call mlist.ArchiveMail(msg),
 it reinvokes your external archiver in an endless loop.
 
 You need to remove the mlist.ArchiveMail(msg).
 
 The locking problem is something else. The external archiver is called
 with the list locked, thus when we try to instantiate the list 'locked',
 we have a deadlock. Thus, you never saw the loop because of the deadlock.
 
 The good news is we don't have to pass a locked list instance to
 HyperArch.HyperArchive() as it uses a special archiver lock.
 
 So, replace
 
 mlist = MailList.MailList(maillist, lock=True)
 
 with
 
 mlist = MailList.MailList(maillist, lock=False)
 
 and remove the mlist.Unlock() as your instance isn't locked, and
 ArchRunner will unlock its list instance when you exit.
 
 
  syslog.syslog('processing indexer')
  ### coming soon
  
  syslog.syslog('exiting - all ok')
  sys.exit(0)
  
  syslog is for debug purpose only.
  
  And if I send an email on my ML, I have this kind of error:
  
  Mar 02 12:38:33 2010 (28380) toto.lock lifetime has expired, breaking
 
 

Whoops, right, it was commented out in my code. For now, I'm poking around at
some other problems, such as my external archiver returning a non-zero status. It
seems to crash at
h.processUnixMailbox(f)
Is there any way to get a backtrace of the Python errors (e.g. by testing it from
the shell)? I guess I can write the whole email, headers included, to a file
and pipe it into my script. Right?

Thank you!

C.


-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL



Re: [Mailman-Users] Indexing mail right after delivery

2010-02-27 Thread Cédric Jeanneret
On Fri, Feb 26, 2010 at 7:15 PM, Mark Sapiro m...@msapiro.net wrote:
 On 2/26/2010 4:20 AM, Cedric Jeanneret wrote:
 On Thu, 25 Feb 2010 17:08:06 -0800 Mark Sapiro m...@msapiro.net
 wrote:

 Cedric Jeanneret wrote:

 I'm trying to create a xapian[1] indexer for our mailing list. As
 mailman is written in Python and there are python bindings for
 xapian, I guess I can maybe create a plugin for that. My first
 question is : is there already such a thing ? I searched on the
 net, but nothing appeared My second one : can we create a plugin
 for mailman, if so, where should I go to have some doc ? seems
 there's nothing in the wiki
 (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=pluginsearchQuery.spaceKey=conf_all)



 Just to explain why I'd like to do that: we already have a xapian search
 engine in here, indexing a fileserver, request tracker queues and
 moinmoin wikis... so we'd like to aggregate all our stuff in one app for
 searching.


 This will be quite doable with Mailman 3 which is still in
 development.

 There are problems trying to do this in Mailman 2.1.x. There is a
 plugin capability of sorts in the form of custom handlers that can
 be added to the incoming message processing pipeline. See the FAQ
 at http://wiki.list.org/x/l4A9. However, archiving is
 asynchronous with incoming message processing, so it is not
 possible for a custom handler to know the URL that will ultimately
 retrieve the message from the archive.

 A different approach which might be workable is to use the
 PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If
 you set

 PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
 PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'

 in mm_cfg.py, then that script will be invoked to do the archiving.
 The script in turn could invoke the standard pipermail archiving
 process and then invoke xapian to index the archived message.



 Hello again,

 Just one question : what do mlist, msg, msgdata stand for ? As I read
 I've to create my module and define a process(mlist, msg, msgdata)
 inside it, I'd like to know what are those objects. I discovered that
 mlist stands for a Mailman.MailList.MailList('list-name'), but for
 the others, it's a bit hard to find...


 Only custom handlers need to define process(mlist, msg, msgdata). That
 is the entry point to the handler and three objects are passed

 mlist is the Mailman.MailList.MailList() instance for the current list

 msg is a Mailman.Message.Message() (subclass of email.Message.Message)
    instance for the current message

 msgdata is a dictionary of the message metadata accumulated so far.

 The important thing is these are passed in as arguments to the handler
 process() function.

 In your case, you are defining a module which is going to be invoked
 like the following.

 Suppose that

 PUBLIC_EXTERNAL_ARCHIVER = '/path/to/myarch.py %(hostname)s %(listname)s'

 It will be invoked in a pipe similar to

 cat raw_message | /path/to/myarch.py HOST LIST

 i.e. the command string with %(hostname)s and %(listname)s replaced by
 the actual host name and list name of the list will be invoked and the
 message piped to it.

 So, it could begin something like:

 #!python
 import sys
 sys.path.insert(0, 'path/to/mailman/bin')
 # The above line can be skipped if myarch.py is in Mailman's
 # bin directory.
 import paths

 import email
 from Mailman import MailList
 from Mailman import Message

 msg = email.message_from_file(sys.stdin, Message.Message)
 mlist = MailList.MailList(sys.argv[1], lock=True)


 At this point, you have a list object (locked) and a message object. You
 might think you could just do

 mlist.ArchiveMail(msg)

 to archive the mail to the listname.mbox file and the pipermail archive,
 but that wouldn't quite work because that method would re-invoke the
 external archiver. Also, you don't need to worry about the listname.mbox
 file because the ArchiveMail() method already did that before invoking
 the external archiver, so what you would need is

 from Mailman.Archiver import HyperArch
 from cStringIO import StringIO
 f = StringIO(str(msg))
 h = HyperArch.HyperArchive(mlist)
 h.processUnixMailbox(f)
 h.close()
 f.close()

 Which is what the ArchiveMail() method would do. Now you still have the
 mlist and msg objects, and you need to save and unlock the list at some
 point

 mlist.Save()
 mlist.Unlock()

 and the message is now in the pipermail archive and can be indexed.

 --
 Mark Sapiro m...@msapiro.net        The highway is for gamblers,
 San Francisco Bay Area, California    better use your sense - B. Dylan



wow, thanks a lot, with all this I'll be able to do what I want!

I'll post all my stuff as soon as I've done it, hopefully next week :).

Thanks again.

Best regards,

C.

Re: [Mailman-Users] Indexing mail right after delivery

2010-02-26 Thread Cedric Jeanneret
On Thu, 25 Feb 2010 17:08:06 -0800
Mark Sapiro m...@msapiro.net wrote:

 Cedric Jeanneret wrote:
 
 I'm trying to create a xapian[1] indexer for our mailing list. As mailman is 
 written in Python and there are python bindings for xapian, I guess I can 
 maybe create a plugin for that.
 My first question is : is there already such a thing ? I searched on the 
 net, but nothing appeared
 My second one : can we create a plugin for mailman, if so, where should I go 
 to have some doc ? seems there's nothing in the wiki 
 (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=pluginsearchQuery.spaceKey=conf_all)
 
 Just to explain why I'd like to do that: we already have a xapian search 
 engine in here, indexing a fileserver, request tracker queues and moinmoin 
 wikis... so we'd like to aggregate all our stuff in one app for searching.
 
 
 This will be quite doable with Mailman 3 which is still in development.
 
 There are problems trying to do this in Mailman 2.1.x. There is a
 plugin capability of sorts in the form of custom handlers that can be
 added to the incoming message processing pipeline. See the FAQ at
 http://wiki.list.org/x/l4A9. However, archiving is asynchronous with
 incoming message processing, so it is not possible for a custom
 handler to know the URL that will ultimately retrieve the message from
 the archive.
 
 A different approach which might be workable is to use the
 PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If you
 set
 
 PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
 PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
 
 in mm_cfg.py, then that script will be invoked to do the archiving. The
 script in turn could invoke the standard pipermail archiving process
 and then invoke xapian to index the archived message.
 


Hello again,

Just one question :
what do mlist, msg, msgdata stand for? As I understand it, I have to create my
module and define a process(mlist, msg, msgdata) function inside it, so I'd like
to know what those objects are. I discovered that mlist stands for a
Mailman.MailList.MailList('list-name'), but the others are a bit hard to
find...

Thanks in advance.

C.

-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL



Re: [Mailman-Users] Indexing mail right after delivery

2010-02-26 Thread Mark Sapiro
Cedric Jeanneret wrote:

Thank you very much for your answer. I guess the cleanest way would be to 
override the PUBLIC_EXTERNAL_ARCHIVER (we don't want to index our private for 
now). I'll give it a try as soon as possible.
Do you think my script will interest some people ? if so, where should I post 
it ?


Yes, I think it may be of interest. The best place is the tracker at
https://bugs.launchpad.net/mailman plus a note to this list.

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-02-26 Thread Mark Sapiro
On 2/26/2010 4:20 AM, Cedric Jeanneret wrote:
 On Thu, 25 Feb 2010 17:08:06 -0800 Mark Sapiro m...@msapiro.net
 wrote:
 
 Cedric Jeanneret wrote:
 
 I'm trying to create a xapian[1] indexer for our mailing list. As
 mailman is written in Python and there are python bindings for
 xapian, I guess I can maybe create a plugin for that. My first
 question is : is there already such a thing ? I searched on the
 net, but nothing appeared My second one : can we create a plugin
 for mailman, if so, where should I go to have some doc ? seems
 there's nothing in the wiki
 (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=pluginsearchQuery.spaceKey=conf_all)


 
Just to explain why I'd like to do that: we already have a xapian search
engine in here, indexing a fileserver, request tracker queues and
moinmoin wikis... so we'd like to aggregate all our stuff in one app for
searching.
 
 
 This will be quite doable with Mailman 3 which is still in
 development.
 
 There are problems trying to do this in Mailman 2.1.x. There is a 
 plugin capability of sorts in the form of custom handlers that can
 be added to the incoming message processing pipeline. See the FAQ
 at http://wiki.list.org/x/l4A9. However, archiving is
 asynchronous with incoming message processing, so it is not
 possible for a custom handler to know the URL that will ultimately
 retrieve the message from the archive.
 
 A different approach which might be workable is to use the 
 PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If
 you set
 
 PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py' 
 PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
 
 in mm_cfg.py, then that script will be invoked to do the archiving.
 The script in turn could invoke the standard pipermail archiving
 process and then invoke xapian to index the archived message.
 
 
 
 Hello again,
 
 Just one question : what do mlist, msg, msgdata stand for ? As I read
 I've to create my module and define a process(mlist, msg, msgdata)
 inside it, I'd like to know what are those objects. I discovered that
 mlist stands for a Mailman.MailList.MailList('list-name'), but for
 the others, it's a bit hard to find...


Only custom handlers need to define process(mlist, msg, msgdata). That
is the entry point to the handler, and three objects are passed:

mlist is the Mailman.MailList.MailList() instance for the current list

msg is a Mailman.Message.Message() (subclass of email.Message.Message)
instance for the current message

msgdata is a dictionary of the message metadata accumulated so far.

The important thing is these are passed in as arguments to the handler
process() function.
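
To make the handler signature concrete, here is a minimal sketch of a custom
handler module. The body is purely illustrative: the msgdata key
'example_note' is invented for this example, and the only things it relies on
are mlist.internal_name() and dictionary-style header lookup on msg.

```python
def process(mlist, msg, msgdata):
    """Entry point that Mailman calls for each message reaching this handler."""
    # mlist: the MailList instance for the current list
    # msg: the message object, which supports header lookup like a dict
    # msgdata: a plain dict of metadata shared along the pipeline
    subject = msg.get('subject', '(no subject)')
    # 'example_note' is a made-up key, used only for illustration.
    msgdata['example_note'] = '%s: %s' % (mlist.internal_name(), subject)
```

A real handler would do its work at this point, but as explained elsewhere in
this thread, it cannot rely on the message having been archived yet.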

In your case, you are defining a module which is going to be invoked
like the following.

Suppose that

PUBLIC_EXTERNAL_ARCHIVER = '/path/to/myarch.py %(hostname)s %(listname)s'

It will be invoked in a pipe similar to

cat raw_message | /path/to/myarch.py HOST LIST

i.e. the command string with %(hostname)s and %(listname)s replaced by
the actual host name and list name of the list will be invoked and the
message piped to it.
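
The substitution described above is ordinary Python %-formatting with a
dictionary. The snippet below only illustrates that mechanism and is not
Mailman's actual code; the host and list names are made-up values.

```python
# Template as it would appear in mm_cfg.py (the path is a placeholder).
template = '/path/to/myarch.py %(hostname)s %(listname)s'

# Hypothetical values for one list.
values = {'hostname': 'lists.example.com', 'listname': 'toto'}

command = template % values
print(command)  # /path/to/myarch.py lists.example.com toto
```

Inside the script, the two values then arrive as sys.argv[1] and sys.argv[2],
in the order they appear in the template.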

So, it could begin something like:

#!python
import sys
sys.path.insert(0, 'path/to/mailman/bin')
# The above line can be skipped if myarch.py is in Mailman's
# bin directory.
import paths

import email
from Mailman import MailList
from Mailman import Message

msg = email.message_from_file(sys.stdin, Message.Message)
# With the command above, sys.argv[1] is the host name and
# sys.argv[2] is the list name.
mlist = MailList.MailList(sys.argv[2], lock=True)


At this point, you have a list object (locked) and a message object. You
might think you could just do

mlist.ArchiveMail(msg)

to archive the mail to the listname.mbox file and the pipermail archive,
but that wouldn't quite work because that method would re-invoke the
external archiver. Also, you don't need to worry about the listname.mbox
file because the ArchiveMail() method already did that before invoking
the external archiver, so what you would need is

from Mailman.Archiver import HyperArch
from cStringIO import StringIO
f = StringIO(str(msg))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
h.close()
f.close()

Which is what the ArchiveMail() method would do. Now you still have the
mlist and msg objects, and you need to save and unlock the list at some
point

mlist.Save()
mlist.Unlock()

and the message is now in the pipermail archive and can be indexed.
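
For the indexing step, one possible starting point is to pull out the pieces of
the message an indexer would store. The sketch below uses only the standard
email package; index_fields is a hypothetical helper, and the actual Xapian
call is left out because it depends on how your index is set up.

```python
import email


def index_fields(msg):
    """Return the header and body text a search indexer might store."""
    if msg.is_multipart():
        # Keep only the text/plain parts (payloads are not decoded here).
        parts = [part.get_payload() for part in msg.walk()
                 if part.get_content_type() == 'text/plain']
        body = '\n'.join(parts)
    else:
        body = msg.get_payload()
    return {
        'message_id': msg.get('message-id', ''),
        'subject': msg.get('subject', ''),
        'sender': msg.get('from', ''),
        'body': body,
    }


# Made-up message standing in for the one read from stdin.
raw = (
    'From: user@example.com\n'
    'Subject: Test post\n'
    'Message-ID: <1@example.com>\n'
    '\n'
    'Hello list\n'
)
fields = index_fields(email.message_from_string(raw))
print(fields['subject'])
```

The resulting dictionary could then be handed to whatever indexer you use.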

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-02-25 Thread Mark Sapiro
Cedric Jeanneret wrote:

I'm trying to create a xapian[1] indexer for our mailing list. As mailman is 
written in Python and there are python bindings for xapian, I guess I can 
maybe create a plugin for that.
My first question is : is there already such a thing ? I searched on the net, 
but nothing appeared
My second one : can we create a plugin for mailman, if so, where should I go 
to have some doc ? seems there's nothing in the wiki 
(http://wiki.list.org/dosearchsite.action?searchQuery.queryString=plugin&searchQuery.spaceKey=conf_all)

Just to explain why I'd like to do that: we already have a xapian search 
engine in here, indexing a fileserver, request tracker queues and moinmoin 
wikis... so we'd like to aggregate all our stuff in one app for searching.


This will be quite doable with Mailman 3 which is still in development.

There are problems trying to do this in Mailman 2.1.x. There is a
plugin capability of sorts in the form of custom handlers that can be
added to the incoming message processing pipeline. See the FAQ at
http://wiki.list.org/x/l4A9. However, archiving is asynchronous with
incoming message processing, so it is not possible for a custom
handler to know the URL that will ultimately retrieve the message from
the archive.

A different approach which might be workable is to use the
PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If you
set

PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'

in mm_cfg.py, then that script will be invoked to do the archiving. The
script in turn could invoke the standard pipermail archiving process
and then invoke xapian to index the archived message.

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Indexing mail right after delivery

2010-02-25 Thread Cedric Jeanneret
On Thu, 25 Feb 2010 17:08:06 -0800
Mark Sapiro m...@msapiro.net wrote:

 Cedric Jeanneret wrote:
 
 I'm trying to create a xapian[1] indexer for our mailing list. As mailman is 
 written in Python and there are python bindings for xapian, I guess I can 
 maybe create a plugin for that.
 My first question is : is there already such a thing ? I searched on the 
 net, but nothing appeared
 My second one : can we create a plugin for mailman, if so, where should I go 
 to have some doc ? seems there's nothing in the wiki 
 (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=plugin&searchQuery.spaceKey=conf_all)
 
 Just to explain why I'd like to do that: we already have a xapian search 
 engine in here, indexing a fileserver, request tracker queues and moinmoin 
 wikis... so we'd like to aggregate all our stuff in one app for searching.
 
 
 This will be quite doable with Mailman 3 which is still in development.
 
 There are problems trying to do this in Mailman 2.1.x. There is a
 plugin capability of sorts in the form of custom handlers that can be
 added to the incoming message processing pipeline. See the FAQ at
 http://wiki.list.org/x/l4A9. However, archiving is asynchronous with
 incoming message processing, so it is not possible for a custom
 handler to know the URL that will ultimately retrieve the message from
 the archive.
 
 A different approach which might be workable is to use the
 PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If you
 set
 
 PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
 PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
 
 in mm_cfg.py, then that script will be invoked to do the archiving. The
 script in turn could invoke the standard pipermail archiving process
 and then invoke xapian to index the archived message.
 

Hello Mark,

Thank you very much for your answer. I guess the cleanest way would be to 
override the PUBLIC_EXTERNAL_ARCHIVER (we don't want to index our private for 
now). I'll give it a try as soon as possible.
Do you think my script will interest some people ? if so, where should I post 
it ?

Thanks again

Best regards,

C.


-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL