RE: Getting a PIC - And errors found in EscherDump (got fixes if want)

2005-04-21 Thread Robert Paris
No, they are stored in the Datastream BUT not in the format that the 
documentation states. But using the Escher format you should be able to grab 
MOST (not all) picture data, as images inserted into a Word file from Word 
'97 or later are now stored as Escher objects, even if they're not drawings 
but jpegs, etc.

The documentation states that the file is saved as a PIC header followed by 
the filename as a Pascal string and then the file data. That is not even 
remotely close to what actually exists there. Instead, there's the PIC 
header structure, then IF it's an Escher object, you've got the insane 
Escher heading structure (similar to, but even worse than the grppls of 
sprms) and then the actual file data.
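
The layout described above (PIC header, then the Escher record headers, then the picture bytes) can be sketched in Java. This is only an illustration, not POI code: the class PicPeek and its methods are hypothetical names. It assumes the PIC begins with a 4-byte lcb and a 2-byte cbHeader as the Word 97 documentation describes, that all fields are little-endian, and that the Escher data starts right after cbHeader bytes, which this thread suggests but does not confirm for every picture type.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of POI: peeks at one picture entry in the Data stream.
final class PicPeek {

    // Decoded form of the 8-byte header that starts every Escher record.
    static final class EscherHeader {
        final int version;   // low 4 bits of the options word; 0xF marks a container
        final int instance;  // high 12 bits of the options word
        final int recordId;  // record type, e.g. 0xF007 for a BSE record
        final int length;    // number of bytes that follow the header

        EscherHeader(int version, int instance, int recordId, int length) {
            this.version = version;
            this.instance = instance;
            this.recordId = recordId;
            this.length = length;
        }
    }

    // Assumption: the PIC starts with lcb (4 bytes) and cbHeader (2 bytes).
    // Returns the offset of the first byte after the PIC header.
    static int endOfPicHeader(byte[] data, int picOffset) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int lcb = buf.getInt(picOffset);
        int cbHeader = buf.getShort(picOffset + 4) & 0xFFFF;
        if (cbHeader < 6 || cbHeader > lcb) {
            throw new IllegalArgumentException("implausible PIC header size: " + cbHeader);
        }
        return picOffset + cbHeader;
    }

    // Reads the options word, record id and length at the given offset.
    static EscherHeader readEscherHeader(byte[] data, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int options = buf.getShort(offset) & 0xFFFF;
        int recordId = buf.getShort(offset + 2) & 0xFFFF;
        int length = buf.getInt(offset + 4);
        return new EscherHeader(options & 0x0F, options >>> 4, recordId, length);
    }

    public static void main(String[] args) {
        // Synthetic bytes: a 0x44-byte PIC header followed by one empty container record.
        byte[] data = new byte[0x4C];
        data[0] = 0x4C;            // lcb = 76
        data[4] = 0x44;            // cbHeader = 68
        data[0x44] = 0x0F;         // options word: version 0xF, instance 0
        data[0x46] = 0x04;         // record id 0xF004, low byte first
        data[0x47] = (byte) 0xF0;
        EscherHeader h = readEscherHeader(data, endOfPicHeader(data, 0));
        System.out.printf("id=0x%04X container=%b length=%d%n",
                h.recordId, h.version == 0xF, h.length);
    }
}
```

A header whose version nibble is 0xF is a container, so its body is itself a sequence of records rather than raw picture bytes.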

Hope this helps!
(BTW, did anyone notice the oddly sexual nature of the Word naming 
structure? A whole host of sprms everywhere, which are linked to STDs, which 
of course require a PAP to discern and was all preceded by a whole lot of 
grppl-ing) -JK


From: Kais Dukes [EMAIL PROTECTED]
Reply-To: POI Developers List poi-dev@jakarta.apache.org
To: POI Developers List poi-dev@jakarta.apache.org
Subject: RE: Getting a PIC
Date: Wed, 20 Apr 2005 23:33:59 +0100
Hi Robert,
I am most interested in what you have found. Are you saying that the picture
data for some Escher images are not stored in the expected place (the
document's Data stream?) but are instead embedded as part of the complex
stream?

Kind Regards,
Kais
-----Original Message-----
From: Robert Paris [mailto:[EMAIL PROTECTED]
Sent: 20 April 2005 22:06
To: poi-dev@jakarta.apache.org
Subject: RE: Getting a PIC
Thanks for the reply.
OH, if only it were so simple. I believe I found it, and as with all other
Word formats, the thing is a mess. You have to loop through and when you
find the right record (and check a thousand fWhateverBooleans and option
shorts), you then have to parse the complex data, and it appears to be
stored in there.
Of course, none of this follows the MS Binary Format writings and is found
pretty much nowhere on the web. Ugh. But thankfully it appears the good
folks at POI (non-scratchpad area) have done some great work in this area to
get me started.
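
The "loop through until you find the right record" step can be sketched as a recursive walk over Escher records. This is a rough illustration, not the POI implementation: EscherWalker and find are hypothetical names, and the sketch assumes the usual 8-byte record header (options word, record id, 4-byte length, all little-endian) with version nibble 0xF marking a container. It does none of the flag checking mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: finds records of one type in Escher data.
final class EscherWalker {
    static final int BSE_RECORD_ID = 0xF007;

    // Returns the start offset of every record in [start, end) whose id equals wantedId.
    static List<Integer> find(byte[] data, int start, int end, int wantedId) {
        List<Integer> hits = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        walk(buf, start, end, wantedId, hits);
        return hits;
    }

    private static void walk(ByteBuffer buf, int start, int end, int wantedId,
                             List<Integer> hits) {
        int pos = start;
        while (pos + 8 <= end) {
            int options = buf.getShort(pos) & 0xFFFF;
            int recordId = buf.getShort(pos + 2) & 0xFFFF;
            int length = buf.getInt(pos + 4);
            int bodyStart = pos + 8;
            if (length < 0 || length > end - bodyStart) {
                break; // length runs past the enclosing range: stop rather than guess
            }
            if (recordId == wantedId) {
                hits.add(pos);
            }
            if ((options & 0x0F) == 0x0F) {
                // Container: its body is a nested sequence of records.
                walk(buf, bodyStart, bodyStart + length, wantedId, hits);
            }
            pos = bodyStart + length;
        }
    }

    public static void main(String[] args) {
        // Synthetic data: one container holding a single empty 0xF007 record.
        byte[] data = {
            0x0F, 0, 0x01, (byte) 0xF0, 8, 0, 0, 0,
            0x02, 0, 0x07, (byte) 0xF0, 0, 0, 0, 0
        };
        System.out.println(find(data, 0, data.length, BSE_RECORD_ID));
    }
}
```

Stopping on an implausible length is a deliberate choice here: a wrong starting offset tends to show up as a length that runs past the end of the data.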

Thanks again!
From: Kais Dukes [EMAIL PROTECTED]
Reply-To: POI Developers List poi-dev@jakarta.apache.org
To: POI Developers List poi-dev@jakarta.apache.org
Subject: RE: Getting a PIC
Date: Wed, 20 Apr 2005 18:49:02 +0100

Hi Robert,

Although I have not looked at the BSE record code myself, I have some
information from my own work on Escher diagrams.
A BSE record contains a fixed size header, and then may be followed by an
optional string (2 bytes per character). Could this string be the file name
you have described?
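
A string stored at 2 bytes per character can be decoded as below. This is a small sketch that assumes the encoding is UTF-16LE, which the thread does not state outright; WideStringDemo and decode are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes a run of 2-byte characters from a raw buffer.
final class WideStringDemo {

    // charCount is the number of characters, so charCount * 2 bytes are consumed.
    static String decode(byte[] data, int offset, int charCount) {
        return new String(data, offset, charCount * 2, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Each ASCII character is followed by a zero byte, which is why a raw dump
        // of such a string looks like it has a gap after every letter.
        byte[] raw = { 'D', 0, ':', 0, '\\', 0 };
        System.out.println(decode(raw, 0, 3));
    }
}
```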

-- Kais

-----Original Message-----
From: Robert Paris [mailto:[EMAIL PROTECTED]
Sent: 20 April 2005 18:26
To: poi-dev@jakarta.apache.org
Subject: Re: Getting a PIC


Thanks for the reply. Yes, it does appear to be an Escher BSE Record,
however, there seems to be an issue with grabbing some of the info inside
it.

When I look at the actual data in the byte stream, I can see the file path
and name in the data (e.g. D : \ F i l e s \ S o m e I m a g e . j p g ),
yet I cannot find that data anywhere inside either POI's EscherBSE Record
reading (from 0xF007) nor in any other documentation I've found on that.
None of the tags seem to hold that info. Any idea where I read it from?

Attempts to read from the case 0xF007 don't work because by the time it hits
that tag marker, it's already past the path/filename string and when it
reads the name length (at offset 33), it always has length = 0.
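
For reference, the name-length read described above can be sketched as follows. It only shows the read that comes back as 0 in this case; it does not explain where the path seen earlier in the stream lives. Several things here are assumptions rather than facts from the thread: that offset 33 is counted from the end of the 8-byte record header, that the fixed part of the BSE body is 36 bytes with the name right after it, that the length is a byte count, and that the name is UTF-16LE. BseNamePeek and readName are hypothetical names, not POI API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI: reads the optional name of a BSE record.
final class BseNamePeek {
    static final int HEADER_SIZE = 8;         // Escher record header
    static final int NAME_LENGTH_OFFSET = 33; // offset of the name length in the body
    static final int FIXED_SIZE = 36;         // assumed size of the fixed BSE fields

    // recordStart is the offset of the record header (the 0xF007 record).
    static String readName(byte[] data, int recordStart) {
        int body = recordStart + HEADER_SIZE;
        int nameBytes = data[body + NAME_LENGTH_OFFSET] & 0xFF;
        if (nameBytes == 0) {
            return ""; // the case reported in this thread
        }
        String name = new String(data, body + FIXED_SIZE, nameBytes,
                StandardCharsets.UTF_16LE);
        // Drop any trailing NUL terminator that was counted in the length.
        int end = name.length();
        while (end > 0 && name.charAt(end - 1) == '\0') {
            end--;
        }
        return name.substring(0, end);
    }

    public static void main(String[] args) {
        // Synthetic record: header and fixed fields zeroed, name "a" plus terminator.
        byte[] data = new byte[HEADER_SIZE + FIXED_SIZE + 4];
        data[HEADER_SIZE + NAME_LENGTH_OFFSET] = 4;
        data[HEADER_SIZE + FIXED_SIZE] = 'a';
        System.out.println(readName(data, 0));
    }
}
```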

Thanks again for your help and time!



 From: Avik Sengupta [EMAIL PROTECTED]
 Reply-To: POI Developers List poi-dev@jakarta.apache.org
 To: POI Developers List poi-dev@jakarta.apache.org
 Subject: Re: Getting a PIC
 Date: Wed, 20 Apr 2005 12:39:53 +0530
 
 Have you seen the drawing code in HSSF? Maybe it's similar/same?
 
 On Wed, 2005-04-20 at 03:02 +, Robert Paris wrote:
   I'm working on the part of Word that stores pictures and I've run into a
   problem. I'm able to grab the PIC structure (from the SPRM
   sprmCPicLocation). However, once I've gone through that, I have a chunk of
   data that I believe is an Office Shape Format. Unfortunately, I am unable
   to find the definition for this structure anywhere. Does anyone know where
   it is?
  
   The documentation for Word 97 says that all pictures inserted with Word 97
   are in the new Office shape format (documented elsewhere). Without that
   documentation, I have no way to read this data!
  
   Anyone?
  
  
  
   
-
   To unsubscribe, e-mail: [EMAIL PROTECTED]
   Mailing List:http://jakarta.apache.org/site/mail2.html#poi
   The Apache Jakarta POI Project: http://jakarta.apache.org/poi/
  

Re: Getting a PIC

2005-04-20 Thread Avik Sengupta
Have you seen the drawing code in HSSF? Maybe it's similar/same?

 
 
 



RE: Getting a PIC

2005-04-20 Thread Kais Dukes
Hi Robert,

Although I have not looked at the BSE record code myself, I have some
information from my own work on Escher diagrams.
A BSE record contains a fixed size header, and then may be followed by an
optional string (2 bytes per character). Could this string be the file name
you have described?

-- Kais

 
 
 



RE: Getting a PIC

2005-04-20 Thread Robert Paris
Thanks for the reply.
OH, if only it were so simple. I believe I found it, and as with all other 
Word formats, the thing is a mess. You have to loop through and when you 
find the right record (and check a thousand fWhateverBooleans and option 
shorts), you then have to parse the complex data, and it appears to be 
stored in there.

Of course, none of this follows the MS Binary Format writings and is found 
pretty much nowhere on the web. Ugh. But thankfully it appears the good 
folks at POI (non-scratchpad area) have done some great work in this area to 
get me started.

Thanks again!
 
 
 