Cool, this is sort of what I thought. My first instinct was your first suggestion: modify the generated index into something easily parsed, then run through it, collapsing each thread into one file. Grossly inefficient, but lacking an equally easy alternative, I think it should be acceptable for what I'm shooting for.
You could take everything I know about mhonarc, put it in a dixie cup, and then throw the dixie cup away, so I'm going to stay away from modifying it for the time being. I'll keep it in mind, though, if things get out of hand.
Thanks for the pointers
-scott
From: Earl Hood <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: conversion question
Date: Thu, 24 Jul 2003 22:58:48 -0500
On July 24, 2003 at 18:08, "Scott Noone" wrote:
> I have a directory full of messages from an NNTP server, each file
> corresponding to one post. What I need to do is generate one HTML file per
> thread, ending up with something that looks like what you get in Google
> Groups. This is for a specific purpose and not something that readers of the
> archive are ever going to see, so I'm not worried about navigation or making
> it pretty. I have a plan on how to do it already, but it seems like overkill
> and I feel like I'm missing something obvious. Ideas?
I actually did a custom contract job to develop a program that does exactly this. Of course, there is more than one way to do it, and the solution to go with depends on how much you know about MHonArc's internals.
If you know nothing, you can always do a post-processing step on the files themselves to generate your thread pages (in the program I did, I called them discussion pages to avoid confusion with thread index pages). The easiest approach I can think of is to use the OTHERINDEXES resource to create a special file that lists all the threads in an easily parsable format to facilitate post-processing.
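A minimal sketch of that post-processing step, assuming the special index has been configured to emit a plain-text listing. The format is a made-up one for illustration, not something MHonArc produces by default: a line reading THREAD starts each thread, and every following non-blank line is the filename of one message page in that thread.

```python
def parse_thread_listing(text):
    """Group message filenames by thread.

    Assumed (hypothetical) input format:
      THREAD           <- marks the start of a new thread
      msg00000.html    <- one message page per line, in thread order
    """
    threads = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line == "THREAD":
            threads.append([])  # start collecting a new thread
        elif threads:
            threads[-1].append(line)  # filename belongs to current thread
    return threads
```

Each inner list can then be fed to whatever writes one discussion page per thread.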
The alternative is to use some of MHonArc's internals for better performance (along with custom resource settings). My first idea was to use SSIs. Each message page layout would be configured so that it can be included via an SSI. Then a post-processing step would create the discussion pages and use an SSI for each message of the thread. That way, the HTTP server generates the complete page when it is requested. This is basically an extension of the blog.mrc example provided in the MHonArc docs.
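As a rough illustration of the SSI idea, a discussion page can be little more than a list of include directives. The sketch below assumes an Apache-style server with SSI enabled and message pages that are valid HTML fragments; the function name and the page skeleton are hypothetical.

```python
import html


def ssi_discussion_page(title, message_files):
    """Build a discussion page that pulls in each message via an SSI.

    Assumes Apache-style '<!--#include virtual="..." -->' directives and
    that the filenames are plain, trusted names such as msg00000.html.
    """
    lines = [
        "<html>",
        "<head><title>%s</title></head>" % html.escape(title),
        "<body>",
    ]
    for name in message_files:
        # The server expands this directive when the page is requested.
        lines.append('<!--#include virtual="%s" -->' % name)
    lines.extend(["</body>", "</html>"])
    return "\n".join(lines) + "\n"
```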
However, the client wanted both the normal single-message pages and the discussion pages, and was concerned about the overhead of SSI processing. Therefore, the post-processing step would extract the "meat" of each message of a thread to generate the discussion page. I used page layout resources to set markers that delimit what the script should extract.
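The extraction step can be sketched as below. The marker strings are placeholders chosen for this example (the actual markers used in that program are not given here); they are assumed to be HTML comments written into each message page by the page layout resources.

```python
# Hypothetical marker comments, assumed to be emitted around the message body.
BEGIN_MARK = "<!--DISCUSSION-BEGIN-->"
END_MARK = "<!--DISCUSSION-END-->"


def extract_meat(page):
    """Return the text between the two markers, or None if either is missing."""
    start = page.find(BEGIN_MARK)
    if start == -1:
        return None
    start += len(BEGIN_MARK)
    end = page.find(END_MARK, start)
    if end == -1:
        return None
    return page[start:end].strip()
```

Returning None for pages without markers lets the caller skip or report them instead of silently writing an empty section.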
I used some of MHonArc's internals to quickly walk the threads to generate the discussion pages, rather than having to peek at a bunch of message pages. Also, by using the internals, I was able to optimize discussion page updates by updating only those pages that needed it when messages are added, rather than blindly re-creating all discussion pages each time.
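Without touching the internals, a cruder way to get a similar "only update what changed" effect is to compare file modification times. This is a swapped-in technique, not the approach described above, and the function names are made up for the sketch.

```python
import os


def mtime_or_none(path):
    """Return the file's modification time, or None if it does not exist."""
    try:
        return os.path.getmtime(path)
    except OSError:
        return None


def needs_rebuild(page_path, message_paths, mtime=mtime_or_none):
    """True if the discussion page is missing or older than any of its messages."""
    page_time = mtime(page_path)
    if page_time is None:
        return True  # page was never generated
    for path in message_paths:
        msg_time = mtime(path)
        if msg_time is not None and msg_time > page_time:
            return True  # a message was added or rewritten since the last build
    return False
```

The mtime parameter exists only so the check can be exercised without real files.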
BTW, things like navigation were important, so the script provided customization features so that navigational links could be included.
--ewh
--------------------------------------------------------------------- To sign-off this list, send email to [EMAIL PROTECTED] with the message text UNSUBSCRIBE MHONARC-USERS