On Mar 16, 2011, at 2:51 PM, Hafiz Badrie Lubis wrote:

Hi People,

I just joined the group and I want to ask about a problem I have.
I'm still learning Ruby on Rails, and now I have a task to parse
Microsoft Word documents and store their content in a database.

Do you have any suggestions on how to do it?

FYI, I am developing it in a Unix environment, so I can't use
win32ole there, CMIIW.

I have also searched the internet about this, but all I found is that I
need either to use JRuby combined with Apache POI, or to use
win32ole. As far as I know, to use JRuby I would also need to create the
Rails project with JRuby, but unfortunately I have already created the
project with plain Ruby.

So, I don't know what to do anymore. Does anybody have a clue?

I did a project in PHP quite a few years ago, and I used some venerable Unix CLI converters to do this. I stored the files as is, then used these converters to extract their text, which I stored in the database for searching. They aren't perfect, but they do a good enough job for search results.

// These translators all write to stdout, so shell_exec() returns their text output.
// The %s in each command is a placeholder for the file name (without extension).
$translators = array(
        'pdf' => '/usr/local/bin/pdftotext ./pdf/%s.pdf -',
        'ppt' => '/usr/local/bin/catppt -d ascii ./ppt/%s.ppt',
        'xls' => '/usr/local/bin/xls2csv -d ascii ./xls/%s.xls',
        'doc' => '/usr/local/bin/catdoc -d ascii ./doc/%s.doc'
);
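
Since the question is about Rails, here is a rough sketch of how the same idea might look in plain Ruby. It assumes the converters (catdoc, catppt, xls2csv, pdftotext) are installed and on the PATH; the names TRANSLATORS, translator_command and extract_text are made up for illustration and are not part of any library.

```ruby
require "open3"

# Converter commands, mirroring the PHP array above.
# Assumption: the binaries are on PATH (adjust to e.g. /usr/local/bin/catdoc).
TRANSLATORS = {
  "pdf" => ->(path) { ["pdftotext", path, "-"] },
  "ppt" => ->(path) { ["catppt", "-d", "ascii", path] },
  "xls" => ->(path) { ["xls2csv", "-d", "ascii", path] },
  "doc" => ->(path) { ["catdoc", "-d", "ascii", path] }
}.freeze

# Hypothetical helper: pick the converter from the file extension and
# return the command as an argv array (no shell involved).
def translator_command(path)
  ext = File.extname(path).delete_prefix(".").downcase
  builder = TRANSLATORS.fetch(ext) do
    raise ArgumentError, "no translator for .#{ext}"
  end
  builder.call(path)
end

# Hypothetical helper: run the converter and return whatever it wrote
# to stdout, which is the extracted text.
def extract_text(path)
  stdout, status = Open3.capture2(*translator_command(path))
  raise "converter failed for #{path}" unless status.success?
  stdout
end
```

In a Rails app, the string returned by extract_text("./doc/report.doc") could then be saved to a text column of whatever model holds the uploaded file. Passing the command as an array rather than a single string keeps file names away from the shell, which matters if the names come from user uploads.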

Walter


Regards,

Hafiz Badrie Lubis

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails- t...@googlegroups.com. To unsubscribe from this group, send email to rubyonrails-talk+unsubscr...@googlegroups.com . For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en .

