[HSLF] Bug 40143 - Font size
Hi List,

Is there any news on bug #40143 (http://issues.apache.org/bugzilla/show_bug.cgi?id=40143)? I'm seeing the problem described there in some files. The font size returned is wrong; sometimes the correct value is stored in another style property, such as asian_or_complex, char_unknown_2, or even font.color. This is really annoying.

Thanks!
--
Tales Paiva

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
Re: PPT Unicode
Hi,

I'm not changing the text, I just read it. My problem occurs whenever there is a TextCharsAtom, because the platform I am using doesn't support Unicode, only ISO-8859-1. So I had to change the code, replacing UTF-16LE with ISO-8859-1. I think I have no way out but to show the text without styles.

Thanks a lot,
--
Tales Paiva

Nick Burch wrote:
> On Tue, 5 Dec 2006, Tales Paiva Nogueira wrote:
>> When PowerPoint stores text in Unicode, an unknown char (byte value = 0) is placed between every normal char, making the text twice as long as it really is.
>
> TextCharsAtoms, and other Unicode-containing fields in PowerPoint files, are stored as UTF-16. That means two bytes are used to store every character. US-ASCII will be stored with the second byte zero, but other characters will need to make some use of the second byte.
>
> If you call getText() on a TextCharsAtom, it'll convert it to a string for you. You should really be using that, not getting the bytes directly.
>
>> Is there any way to keep the style information and get the text as a TextByteAtom, instead of TextCharsAtom?
>
> Why? PowerPoint decided to make it a TextCharsAtom, rather than a TextByteAtom, since your string contained at least one character that couldn't be represented in a TextByteAtom.
>
> HSLF supports upgrading a TextByteAtom to a TextCharsAtom if you try to set text that can't be held in a TextByteAtom. It doesn't do the other way around.
>
> If you really want just the low-order bytes, call getText() on the TextCharsAtom, and mangle the string yourself. Not sure why you'd want to though.
>
> Nick

Yegor Kozlov wrote:
> Hi,
>
> Could you provide a test case? As I understand it, you did something like this:
> - take a ppt file with some text
> - programmatically change the text using the HSLF API
> - save the file
> - the style information is wrong after the save
>
> Is that correct?
>
> Yegor
>
> TPN> Hi List,
> TPN> When PowerPoint stores text in Unicode, an unknown char (byte value = 0) is placed between every normal char, making the text twice as long as it really is.
> TPN> I can ignore these garbage chars, but then I lose the text style information, as its indexes are based on the original Unicode text with all that Unicode trash. :(
> TPN> Is there any way to keep the style information and get the text as a TextByteAtom, instead of TextCharsAtom?
> TPN> Thank you very much.
> TPN> --
> TPN> Tales Paiva
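Nick's suggestion in this thread (let the text be decoded as UTF-16, then reduce the resulting string yourself) can be sketched without POI. This is only an illustration: the class and method names below are made up, and it assumes that the style run lengths count characters rather than bytes, so that a one-for-one character substitution would leave the style indexes usable.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI: shows that decoding UTF-16LE first
// and then substituting characters keeps the character count unchanged.
class Latin1TextSketch {

    /** Decodes the raw UTF-16LE bytes of a Unicode text record into a String. */
    static String decodeUtf16le(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    /** Replaces every char outside ISO-8859-1 with '?', one for one. */
    static String toLatin1Safe(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            sb.append(c <= 0xFF ? c : '?');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Hi" followed by U+20AC, stored low byte first
        byte[] raw = {'H', 0, 'i', 0, (byte) 0xAC, 0x20};
        String text = decodeUtf16le(raw);
        String safe = toLatin1Safe(text);
        // One byte per character after encoding as ISO-8859-1
        byte[] latin1 = safe.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(text.length() + " chars, " + latin1.length + " bytes: " + safe);
    }
}
```

Reading the raw bytes as ISO-8859-1 instead would double the length and shift every index, which matches the symptom described in the thread.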
Unsupported features
Hi list,

The questions I'm about to ask are probably about features POI doesn't support, but I'd like some hints on how to solve them, even if I have to implement the features myself. Nick, Yegor, and anyone else working on POI: I'd appreciate any directions you can give.

1. I'm trying to retrieve an image from the _pictures PictureData array. The problem arises when the picture is a slide background, because there is no Picture model object from which to get the corresponding image index in the array.

2. Another background problem is getting background colors. Every background has its own color scheme or follows the master scheme, but when I change the background of a single slide, I can't get that color (the one that doesn't follow the master background).

3. I've already asked here about numbered lists: is there any flag that indicates whether a list is numbered or bulleted? Some para_unknown field?

4. When there is a table in a ppt file and I resize it by its border lines, the anchor isn't updated. But when I change the size of any column or row, the anchor is updated, and I get the correct table size and position (anchor).

Thanks in advance,
Tales Paiva
Re: Numbering formats of lines of text in PPT files
Hi List,

I'm having trouble distinguishing numbered lists from bulleted lists. The paragraph flags should indicate the difference, shouldn't they?

Thanks in advance,
Tales Paiva

Erez Eisenstein wrote:
> Thanks, I'll try that :)
>
> -----Original Message-----
> From: Yegor Kozlov [mailto:[EMAIL PROTECTED]
> Sent: Monday, October 09, 2006 6:20 PM
> To: POI Users List
> Subject: Re: Numbering formats of lines of text in PPT files
>
> Hi,
>
> Recently we added RichTextRun.getIndentLevel(), which returns the indentation level of a line. In your example it should return 0 for first line, second line and third line, and 1 for the sub-lines:
>
> 1. first line          //indent=0
>    a. first sub-line   //indent=1
>    b. second sub-line  //indent=1
> 2. second line         //indent=0
> 3. third line          //indent=0
>
> Regards,
> Yegor
>
> EE> Hi Nick, I'm having the following problem:
> EE> After extracting several lines of text from a TextBox, if the lines were numbered, with sub-numbering as well, I can't tell which line is which.
> EE> Example:
> EE> 1. first line
> EE>    a. first sub-line
> EE>    b. second sub-line
> EE> 2. second line
> EE> 3. third line
> EE> will come out looking like this:
> EE> * first line
> EE> * first sub-line
> EE> * second sub-line
> EE> * second line
> EE> * third line
> EE> All of these lines appear in the same TextRun instance, which I got by invoking the getTextRun() method of the TextBox class. I can't tell from each line's attributes (color, font, size, etc.) which line is a sub-line of which.
> EE> Is there a solution to this that I'm overlooking?
> EE> Thanks
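Once an indentation level is available for each line, as Yegor describes, the outline in Erez's example can be rebuilt from the levels alone. The sketch below is a stand-alone illustration, not POI code: the class name, the fixed depth cap, and the choice of digits for even levels and letters for odd levels are all assumptions, and it does not answer whether a paragraph is numbered or bulleted in the first place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: rebuilds "1." / "a." style labels from indent levels.
class OutlineNumberingSketch {

    /** Returns one labelled, indented string per input line. */
    static List<String> label(List<String> lines, int[] indents) {
        int[] counters = new int[10]; // assumed cap on nesting depth
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            int level = indents[i];
            counters[level]++;
            // Reset deeper counters so that nested lists restart at 1 / a
            for (int d = level + 1; d < counters.length; d++) {
                counters[d] = 0;
            }
            // Digits on even levels, lowercase letters on odd levels
            // (letters only work for up to 26 items in this sketch)
            String marker = (level % 2 == 0)
                    ? counters[level] + "."
                    : (char) ('a' + counters[level] - 1) + ".";
            StringBuilder sb = new StringBuilder();
            for (int d = 0; d < level; d++) {
                sb.append("  ");
            }
            out.add(sb.append(marker).append(' ').append(lines.get(i)).toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "first line", "first sub-line", "second sub-line", "second line", "third line");
        int[] indents = {0, 1, 1, 0, 0}; // the values getIndentLevel() is said to return
        for (String s : label(lines, indents)) {
            System.out.println(s);
        }
    }
}
```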
[HSLF] Indentation level identification
Hi,

I need to identify the indentation level of the text in a TextBox. For this I added a method to the RichTextRun class, as follows:

    public int getTextOffset() {
        // the property name is passed as a string
        return getParaTextPropVal("text.offset");
    }

This assumes that the property holding this information is named text.offset. Is that right?

When I call it from my main class, it returns -1. I know this is because the RichTextRun inherits the property from the master. Is there any other way to identify the indentation level for a given RichTextRun?

Thanks a lot,
Tales Paiva
RE: TextBox bug
Hi,

I noticed this problem a few days ago. I made a small change to the code, and it worked properly for my needs. The change was to add a test to ensure that _txtrun isn't null. Below is the code for setSheet(Sheet sheet) in the TextBox class:

    public void setSheet(Sheet sheet){
        _sheet = sheet;

        //initialize _txtrun object.
        //we can't do it in the constructor because the sheet is not assigned yet
        if(_txtrun == null) initTextRun();

        //skip the rest if there is still no text run
        if (_txtrun != null){
            RichTextRun[] rt = _txtrun.getRichTextRuns();
            for (int i = 0; i < rt.length; i++) {
                rt[i].supplySlideShow(_sheet.getSlideShow());
            }
            if (_fontname != null) {
                setFontName(_fontname);
                _fontname = null;
            }
        }
    }

I hope it helps.
--
Tales

> I am not creating a TextBox myself... I have an instance of Slide, and I call the getShapes() method (some of the shapes are TextBox instances).
>
> -----Original Message-----
> From: Nick Burch [mailto:[EMAIL PROTECTED]
> Sent: Thursday, July 13, 2006 11:23 AM
> To: POI Users List
> Subject: Re: TextBox bug
>
> On Thu, 13 Jul 2006, Erez Eisenstein wrote:
>> If I create a new ppt file with an empty textbox in it (one that says: click to add subtitle), then the _txtrun is null. This causes the TextBox class to throw NullPointerExceptions in the methods getText() and setSheet().
>
> When you create a TextBox, it does create and set _txtrun (line 160 in createSpContainer), so I'm not sure where your problem is. Could you supply the code you use to generate the NullPointerExceptions?
>
> Nick
RE: TextBox bug
This piece of code doesn't work in all cases. I ran some tests and the NullPointerException still happens. This time it happens in TextBox.getRichTextRuns(), when it tries to return _txtrun.getRichTextRuns().
--
Tales

> Hi,
>
> These are the changes I made (they work just the same):
>
>     RichTextRun[] rt;
>     if (_txtrun == null)
>         rt = new RichTextRun[0];
>     else
>         rt = _txtrun.getRichTextRuns();
>
> Thanks
>
> -----Original Message-----
> From: Tales Paiva Nogueira [mailto:[EMAIL PROTECTED]
> Sent: Thursday, July 13, 2006 3:15 PM
> To: POI Users List
> Subject: RE: TextBox bug
>
> Hi,
>
> I noticed this problem a few days ago. I made a small change to the code, and it worked properly for my needs. The change was to add a test to ensure that _txtrun isn't null. Below is the code for setSheet(Sheet sheet) in the TextBox class:
>
>     public void setSheet(Sheet sheet){
>         _sheet = sheet;
>
>         //initialize _txtrun object.
>         //we can't do it in the constructor because the sheet is not assigned yet
>         if(_txtrun == null) initTextRun();
>
>         if (_txtrun != null){
>             RichTextRun[] rt = _txtrun.getRichTextRuns();
>             for (int i = 0; i < rt.length; i++) {
>                 rt[i].supplySlideShow(_sheet.getSlideShow());
>             }
>             if (_fontname != null) {
>                 setFontName(_fontname);
>                 _fontname = null;
>             }
>         }
>     }
>
> I hope it helps.
> --
> Tales
>
>> I am not creating a TextBox myself... I have an instance of Slide, and I call the getShapes() method (some of the shapes are TextBox instances).
>>
>> -----Original Message-----
>> From: Nick Burch [mailto:[EMAIL PROTECTED]
>> Sent: Thursday, July 13, 2006 11:23 AM
>> To: POI Users List
>> Subject: Re: TextBox bug
>>
>> On Thu, 13 Jul 2006, Erez Eisenstein wrote:
>>> If I create a new ppt file with an empty textbox in it (one that says: click to add subtitle), then the _txtrun is null. This causes the TextBox class to throw NullPointerExceptions in the methods getText() and setSheet().
>>
>> When you create a TextBox, it does create and set _txtrun (line 160 in createSpContainer), so I'm not sure where your problem is. Could you supply the code you use to generate the NullPointerExceptions?
>>
>> Nick
PPT Pictures
Hi,

I'm having trouble extracting images from PPT files. When the same image appears more than once, the API identifies only one image, even if the displayed size differs from one occurrence to another. I use the readPictures method in the HSLFSlideShow class, which puts the image streams into a pictstream vector. For instance, if there are 3 pictures, two of which are the same picture, the length of the returned vector is 2. What can I do to get the real number of images?

Thanks,
Tales
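One way to read the behaviour described above is that the file keeps each distinct image once and that every picture placed on a slide refers to its image by index. That is an assumption here, not something confirmed in this thread. Under that assumption, the number of images on the slides would be the number of references, not the number of stored entries. The sketch below is plain Java with made-up names; the index array stands in for whatever per-shape picture index the API exposes.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: stored pictures versus picture references.
class PictureUsageSketch {

    /** Counts how many shapes refer to each stored picture index. */
    static Map<Integer, Integer> usesPerPicture(int[] shapePictureIndexes) {
        Map<Integer, Integer> uses = new TreeMap<>();
        for (int idx : shapePictureIndexes) {
            uses.merge(idx, 1, Integer::sum);
        }
        return uses;
    }

    public static void main(String[] args) {
        // Assumed input: three picture shapes, two of which share picture 1
        int[] refs = {1, 1, 2};
        Map<Integer, Integer> uses = usesPerPicture(refs);
        System.out.println("stored pictures: " + uses.size());
        System.out.println("pictures on slides: " + refs.length);
    }
}
```

With this model the stored count is 2 and the on-slide count is 3, which is the mismatch reported in the message.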