Donna,
 
What do you mean by "massive amount"? Are the files huge, or do you have
1,000 files?
 
Some people use the Word to HTML to DocBook route, some a more direct
conversion, like Majix. Either way, how much cleanup you have to do after
the conversion is important.
 
I use Majix and I have tweaked it so that it produces fairly clean DocBook
(i.e., usually no editing afterwards). I could make an install package
available to you if you like. I am a maintainer of the Majix package, but I
have not released new packages lately, and there have been many
DocBook-related changes in the past year. I got on the maintainer list
because it appeared to be an inactive project and I had many
DocBook-related changes that needed to get into the source tree.
 
I would suggest you try several of the packages out there to see what they
produce for a single file. The quality of that output may be more important
than which package you pick, because if you have a "massive amount", any
imperfections could equate to a lot of "fix up", and I assume you want to
limit your post-conversion handling to zero ;-)
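
One rough way to compare what several packages produce for the same file is
to check each output for well-formedness and count the elements it contains.
The sketch below is only an illustration, not something any of the tools
ship: the function names are made up, and it uses nothing but the Python
standard library. Note that it will also report a parse error for entities
that are declared only in an external DTD.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def summarize_docbook(xml_text):
    """Return a rough quality summary for one converter's output (hypothetical helper)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Not well-formed: this output needs hand repair before anything else.
        return {"well_formed": False, "error": str(err), "elements": Counter()}
    # Drop any namespace part ("{uri}para" -> "para") so namespaced and
    # non-namespaced DocBook outputs can be compared side by side.
    names = (el.tag.rsplit("}", 1)[-1] for el in root.iter())
    return {"well_formed": True, "error": None, "elements": Counter(names)}


def compare_outputs(outputs):
    """Print one line per converter; `outputs` maps a label to its XML text."""
    for label, xml_text in outputs.items():
        summary = summarize_docbook(xml_text)
        if not summary["well_formed"]:
            print(f"{label}: NOT well-formed ({summary['error']})")
            continue
        top = ", ".join(f"{n}={c}" for n, c in summary["elements"].most_common(5))
        print(f"{label}: {top}")
```

Feeding it the output of each candidate for the same source document shows
quickly which one lost structure, for example one that turns every heading
into a plain para instead of producing section and title elements.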
 
One other caution: Word formats span 20+ years, and not all Word versions
produce the same RTF format. I have had problems converting Word97 files,
and then I saved them as Word2003 and they were fine, and vice versa. Weird.
I would be interested to see what FrameMaker RTF looks like and how it
goes through Majix.
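
Before testing, it can help to sort a large pile of files by what they
really are, since a file named .doc is sometimes RTF inside. The sketch
below only looks at the first few bytes of each file. It cannot tell which
Word version wrote a file, and the function names and labels are made up
for illustration. The OLE2 and ZIP signatures are shared with other Office
file types, so treat the result as a hint.

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 container, used by binary .doc
RTF_MAGIC = b"{\\rtf"                            # every RTF file starts with {\rtf
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP container, used by .docx


def sniff_word_format(header):
    """Classify a file from its leading bytes (hypothetical helper)."""
    if header.startswith(RTF_MAGIC):
        return "rtf"
    if header.startswith(OLE2_MAGIC):
        return "ole2-doc"
    if header.startswith(ZIP_MAGIC):
        return "zip-docx"
    return "unknown"


def sniff_file(path):
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as fh:
        return sniff_word_format(fh.read(8))
```

Grouping the files this way makes it easier to pick one sample from each
group for the trial conversions.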
 
Regards,
Dean Nelson
 
 
 
 
In a message dated 12/2/2011 8:53:26 A.M. Pacific Standard Time,  
b...@sagehill.net writes:

I've had good results using dbdoclet. I first let Word convert the
content to HTML using Save As -> Webpage (filtered), and then apply dbdoclet
to the HTML to generate DocBook XML. That approach lets Word handle all of
Word's many coding options and quirks, filtering them down to something more
standardized to convert. dbdoclet is a Java toolkit, one part of which is
for converting HTML to DocBook.
 
Bob Stayton
Sagehill Enterprises
bobs@sagehill.net
 
 

----- Original Message ----- 
From: Donna Saporito (dsapor...@antennasoftware.com)
To: docbook-apps@lists.oasis-open.org
Sent: Friday, December 02, 2011 7:07 AM
Subject: [docbook-apps] Easiest way to convert Word .doc or .rtf to DocBook?



Hi, 

I have to convert a massive amount of Word documents over to DocBook for
my company. I also have a few FrameMaker documents that will need to be
converted. I figure that I can save the .fm files as .rtf and then .doc files
and follow the conversion process I will use for Word (once I figure out
what to use).

I am willing to purchase a tool such as WordPlay by Docsoft, Inc. or
Upcast by InfinityLoop. I looked into MajiX, but I thought the installation
instructions were rather confusing. (They may not be confusing to others, but
the user doc said I’d have to run some command from Sun’s virtual machine,
etc.) I also looked into Yawc briefly, but it calls for attaching a new
template in Microsoft Word. I am nervous about doing this in case I corrupt
any existing Word templates that I have.

Is there a simple, straightforward way to convert Word to XML? I am
willing to ask my company to purchase the best tool.

Any input will be appreciated. Thanks so much. - Donna
Donna Saporito | Technical Writer | Technology and Engineering
O: 201-217-3382 | M: 551-655-5721 | H: 973-845-6594 | E: dsapor...@antennasoftware.com

Antenna | Deploy Happiness | www.antennasoftware.com
