Prakash,

There is an open source library to extract text from a Word document here:
http://www.textmining.org 

-Ryan

-----Original Message-----
From: prakash jaya [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 27, 2005 9:28 AM
To: [email protected]
Subject: Re: how can i extract text from Powerpointfiles,Ms word files

Hello friends, thanks for your suggestions.
I tried them and got some results, but I am not getting the whole document.
I am getting only the first line of the document. Please suggest a solution.
Here is my code:

////////////////////////////////////////////////////////////////////
import java.io.*;

import org.apache.poi.hwpf.usermodel.*;
import org.apache.poi.hwpf.HWPFDocument;

public class Test11
{
  public Test11()
  {
  }

  public static void main(String[] args)throws IOException
  {
    try
    {
      HWPFDocument doc = new HWPFDocument(new FileInputStream(fin));
      Range r = doc.getRange();
      FileOutputStream out = new FileOutputStream("d:\\example.txt");

      for (int x = 0; x < r.numSections(); x++)
      {
        Section s = r.getSection(x);
        for (int y = 0; y < s.numParagraphs(); y++)
        {
          Paragraph p = s.getParagraph(y);
          for (int z = 0; z < p.numCharacterRuns(); z++)
          {
            // character run
            CharacterRun run = p.getCharacterRun(z);
            // character run text
            String text = run.text();
            byte[] b1 = text.getBytes();

            // show us the text
            out.write(b1);
          }

          out.close();
        }
      }

    }
    catch (Throwable t)
    {
      t.printStackTrace();
    }
  }

}



///////////////////////////////////////////////////////
My original document is:


I want to read a powerpoit file "A" and write it content to create another 
powerpoint "B".

The simple way is to use FileInputStream to read a byte array from file 
A.ppt and FileOutputStream to write the byte array to B.ppt. It's work.

But today, i don't want to use raw byte array to write to B.ppt 
immediately(Just the program's demand, i do not want to do this too!!:<). I 
translate the byte array to "String", then translate it back to byte array 
and write it to B.ppt. One problem happens!!


The program's output is:
I want to read a powerpoit file "A" and write it content to create another 
powerpoint "B".


Please suggest a solution to this problem. I would be thankful for any help.
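
A note on the code above: out.close() sits inside the paragraph loop, so the stream is closed as soon as the first paragraph has been written. The next out.write() then throws an IOException, which the catch block prints, and that would explain why only the first paragraph reaches d:\example.txt. The sketch below shows the stream handling in isolation, using only the standard library. The class name ParagraphDump, its writeParagraphs method, and the list of strings standing in for the text of each paragraph are assumptions made for illustration; in the real program the strings would come from run.text(). It also names the charset explicitly instead of relying on the platform default that text.getBytes() uses.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: each string stands in for the text of one paragraph.
final class ParagraphDump {

    // Writes every paragraph to the target file and closes the stream once,
    // after the loop has finished.
    static void writeParagraphs(List<String> paragraphs, Path target) throws IOException {
        // try-with-resources closes the stream when the block exits,
        // whether normally or because of an exception.
        try (OutputStream out = Files.newOutputStream(target)) {
            for (String text : paragraphs) {
                // Name the charset instead of relying on the platform default.
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("example", ".txt");
        writeParagraphs(Arrays.asList("First paragraph.\n", "Second paragraph.\n"), target);
        // Read the file back to confirm that both paragraphs were written.
        System.out.print(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

In the original program the same effect comes from moving out.close() to after the outermost for loop, or into a finally block.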

>From: Rama Subba Reddy <[EMAIL PROTECTED]>
>Reply-To: "POI Users List" <[email protected]>
>To: POI Users List <[email protected]>
>Subject: Re: how can i extract text from Powerpointfiles,Ms word files
>Date: Thu, 27 Oct 2005 12:53:07 +0100 (BST)
>
>Hello,
>   Use the following code to extract the text:
>HWPFDocument doc = new HWPFDocument(fin);
>Range range = doc.getRange();
>int totParagraphs = range.numParagraphs();
>for (int i = 0; i < totParagraphs; i++) {
>    Paragraph para = range.getParagraph(i);
>    // get each character run from para, then the text and properties from the run
>    for (int j = 0; j < para.numCharacterRuns(); j++) {
>        CharacterRun run = para.getCharacterRun(j);
>        String text = run.text();
>    }
>}
>prakash jaya <[EMAIL PROTECTED]> wrote:
>
>Hello friend, good morning.
>I am getting text from PowerPoint presentations using the
>PowerPointExtractor class of POI, but how do I get text from MS Word
>files? I ran the HWPFDocument.java class. According to the specification,
>it takes two arguments (one is the source file, the other is the
>destination file). It does not give any result, and it does not create
>any destination file. Can you please suggest a solution to this problem?
>I would be thankful for any help.
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>Mailing List: http://jakarta.apache.org/site/mail2.html#poi
>The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
