Natarajan,

This won't work because there is no class called WordTextPiece in POI. This
is just code regurgitated from the internals of the textmining.org
libraries. When you steal my code, at least give me credit. The Apache
license under which I distribute the textmining.org libraries requires it.
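
To make the missing piece concrete: the loop in the quoted code needs an
object that exposes getStart(), getLength() and usesUnicode(). The following
is only a minimal sketch of that idea. TextPiece and PieceDecoder are
hypothetical names, not POI classes and not the textmining.org
implementation, and the sketch assumes two bytes per character for Unicode
pieces instead of the "multiple" value the quoted code gets from findText.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for the piece descriptor the quoted loop expects.
// It is NOT part of POI and is not the textmining.org class.
final class TextPiece {
    private final int start;       // byte offset of the piece in the stream
    private final int length;      // length of the piece in characters
    private final boolean unicode; // true if the piece is stored as UTF-16LE

    TextPiece(int start, int length, boolean unicode) {
        this.start = start;
        this.length = length;
        this.unicode = unicode;
    }

    int getStart() { return start; }
    int getLength() { return length; }
    boolean usesUnicode() { return unicode; }
}

final class PieceDecoder {
    // Decodes each piece from the byte stream and concatenates the results.
    static String decode(byte[] stream, List<TextPiece> pieces) {
        StringBuilder sb = new StringBuilder();
        for (TextPiece piece : pieces) {
            if (piece.usesUnicode()) {
                // Assumption: two bytes per character for UTF-16LE pieces.
                sb.append(new String(stream, piece.getStart(),
                        piece.getLength() * 2, StandardCharsets.UTF_16LE));
            } else {
                // One byte per character for 8-bit pieces.
                sb.append(new String(stream, piece.getStart(),
                        piece.getLength(), StandardCharsets.ISO_8859_1));
            }
        }
        return sb.toString();
    }
}
```

Unlike the quoted loop, this appends every decoded piece rather than keeping
only the last one.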

-Ryan

----- Original Message ----- 
From: "Natarajan.T" <[EMAIL PROTECTED]>
To: "'Lucene Users List'" <[EMAIL PROTECTED]>
Sent: Tuesday, August 24, 2004 8:11 AM
Subject: RE: worddoucments search


> Hi Santhosh,
>
> Try the code below (POI.jar must be on your classpath).
>
>
> public String getContent(InputStream reader) throws IOException {
>     ArrayList text = new ArrayList();
>     POIFSFileSystem fsys = new POIFSFileSystem(reader);
>
>     DocumentEntry headerProps =
>         (DocumentEntry)fsys.getRoot().getEntry("WordDocument");
>     DocumentInputStream din =
>         fsys.createDocumentInputStream("WordDocument");
>     byte[] header = new byte[headerProps.getSize()];
>
>     din.read(header);
>     din.close();
>
>     //Get the information we need from the header
>     int info = LittleEndian.getShort(header, 0xa);
>     boolean useTable1 = (info & 0x200) != 0;
>
>     //get the location of the piece table
>     int complexOffset = LittleEndian.getInt(header, 0x1a2);
>
>     String tableName = null;
>     if (useTable1) {
>         tableName = "1Table";
>     }
>     else {
>         tableName = "0Table";
>     }
>
>     DocumentEntry table =
>         (DocumentEntry)fsys.getRoot().getEntry(tableName);
>     byte[] tableStream = new byte[table.getSize()];
>     din = fsys.createDocumentInputStream(tableName);
>     din.read(tableStream);
>     din.close();
>
>     din = null;
>     fsys = null;
>     table = null;
>     headerProps = null;
>
>     int multiple = findText(tableStream, complexOffset, text);
>
>     StringBuffer sb = new StringBuffer();
>     int size = text.size();
>     tableStream = null;
>
>     WordTextPiece nextPiece = null;
>     int start;
>     int length;
>     String toStr;
>     for (int x = 0; x < size; x++) {
>         nextPiece = (WordTextPiece)text.get(x);
>         start = nextPiece.getStart();
>         length = nextPiece.getLength();
>
>         boolean unicode = nextPiece.usesUnicode();
>         if (unicode) {
>             toStr = new String(header, start, length * multiple,
>                                "UTF-16LE");
>         }
>         else {
>             toStr = new String(header, start, length, "ISO-8859-1");
>         }
>         //collect every piece instead of keeping only the last one
>         sb.append(toStr);
>     }
>
>     reader.close();
>     return sb.toString();
> }
>
>
> Regards,
> Natarajan.
>
>
>
> -----Original Message-----
> From: Santosh [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, August 24, 2004 5:46 PM
> To: Lucene Users List
> Subject: worddoucments search
>
> Can Lucene search Word documents? If so, please give me information
> about it.
>
> regards
> Santosh kumar
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>

