Re: PDFBox not rendering all text when converting a specific file to image - Bug?

2022-08-17 Thread Tilman Hausherr
pdate java if you haven't done already. Tilman On Wed, Aug 17, 2022 at 2:58 PM Tilman Hausherr wrote: Am 17.08.2022 um 20:24 schrieb Daniel Skiles: https://drive.google.com/drive/folders/1JujjHzdQEGm8Z544dB5IoVauXf9UdJam?usp=sharing I've rehosted the files on Google Drive. They should b

Re: PDFBox not rendering all text when converting a specific file to image - Bug?

2022-08-17 Thread Tilman Hausherr
should be all you need. Yeah it works with pdfbox-app. I also tried your code, it works too. Tilman On Wed, Aug 17, 2022 at 1:16 PM Tilman Hausherr wrote: Please try with 2.0.26 and with a snapshot https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.0-SNA

Re: PDFBox not rendering all text when converting a specific file to image - Bug?

2022-08-17 Thread Tilman Hausherr
Please try with 2.0.26 and with a snapshot https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.0-SNAPSHOT/ The PNG isn't attached, you need to upload this and the PDF on a file hoster. Tilman Am 17.08.2022 um 18:32 schrieb Daniel Skiles: When I use PDFBox

Re: Fill value in the PDF Checkbox Field using PDAcroForm.

2022-08-17 Thread Tilman Hausherr
Hi, If this is really a PDCheckBox then call check() or uncheck(). I can't see the image, maybe this was an attachment. Tilman Am 17.08.2022 um 08:26 schrieb Damaji Kalunge: Hi Team,   In the editable PDF we have a checkbox as shown below. image.png *Indeterminate / Intersex / Unspecified

Re: ExtractImages command test - images appear different

2022-08-05 Thread Tilman Hausherr
Am 04.08.2022 um 20:41 schrieb Tilman Hausherr: It gets weirder: I get a wrong rendering when using the twelvemonkeys library. fixed in https://issues.apache.org/jira/browse/PDFBOX-5488 (but I still don't know if that was your problem) Tilman

Re: ExtractImages command test - images appear different

2022-08-04 Thread Tilman Hausherr
It gets weirder: I get a wrong rendering when using the twelvemonkeys library. Tilman Am 04.08.2022 um 20:20 schrieb Tilman Hausherr: Hi, Could you upload the image that you got? Here's my first page: Tilman Am 02.08.2022 um 18:09 schrieb Daniel Earwicker: Cool, thanks - that has made

Re: ExtractImages command test - images appear different

2022-08-04 Thread Tilman Hausherr
Hi, Could you upload the image that you got? Here's my first page: Tilman Am 02.08.2022 um 18:09 schrieb Daniel Earwicker: Cool, thanks - that has made it crop the pages as expected but the colour is still as before. I tried specifying each of the documented color depths with the -color

Re: page margins and line spacing

2022-08-04 Thread Tilman Hausherr
Oops, now I see that was a posting by somebody else. I'm somewhat behind reading the messages. Tilman Am 04.08.2022 um 19:23 schrieb Tilman Hausherr: Am 04.08.2022 um 16:52 schrieb Mehmet Fatih ÇİN: sorry please i copied and pasted a message this message was sent by mistake I'm confused

Re: page margins and line spacing

2022-08-04 Thread Tilman Hausherr
Am 04.08.2022 um 16:52 schrieb Mehmet Fatih ÇİN: sorry please i copied and pasted a message this message was sent by mistake I'm confused because this looked like a valid question. What you need is PDFToImage, not ExtractImages. Tilman

Re: page margins and line spacing

2022-08-04 Thread Tilman Hausherr
Hello Mehmet, PDFBox is low level, so you need to set the start X coordinate at 1.5 cm yourself. 1 PDF unit is 1/72 inch, so 1.5 cm would be 1 / (10 * 2.54) * 72 * 15. Note that the y 0 coordinate starts at the bottom. Spacing is also done by you, either by setting the y value or by calling
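
The unit arithmetic in the reply above can be sketched in plain Java. This is only an illustration: the class name PdfUnits and its method names are hypothetical helpers, not part of PDFBox. The sketch assumes nothing beyond what the message states, namely that one PDF unit is 1/72 inch and that y = 0 is at the bottom of the page.

```java
/** Hypothetical helper (not a PDFBox class): metric lengths to PDF units. */
final class PdfUnits {
    // 1 PDF unit is 1/72 inch, and 1 inch is 25.4 mm.
    static final double UNITS_PER_INCH = 72.0;
    static final double MM_PER_INCH = 25.4;

    /** Converts millimetres to PDF units. */
    static double mmToPoints(double mm) {
        return mm / MM_PER_INCH * UNITS_PER_INCH;
    }

    /** Converts centimetres to PDF units; same arithmetic as 1 / (10 * 2.54) * 72 * 15 for 1.5 cm. */
    static double cmToPoints(double cm) {
        return mmToPoints(cm * 10.0);
    }

    /**
     * Because y = 0 is at the bottom, a distance measured down from the
     * top edge has to be subtracted from the page height.
     */
    static double yFromTop(double pageHeight, double offsetFromTop) {
        return pageHeight - offsetFromTop;
    }

    public static void main(String[] args) {
        double leftMargin = cmToPoints(1.5);
        double a4Height = mmToPoints(297.0); // A4 is 297 mm tall
        double firstLineY = yFromTop(a4Height, cmToPoints(2.0));
        System.out.println("left margin x = " + leftMargin);
        System.out.println("first line y  = " + firstLineY);
    }
}
```

The resulting x and y values would then be used as the text start position, and line spacing would be handled by lowering y for each new line.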

Re: Preserve encryption after filling out the editable PDF.

2022-08-01 Thread Tilman Hausherr
ental() method that uses a list of dictionary objects, then you'll need it only once. Tilman On 2022/07/27 18:25:08 Tilman Hausherr wrote: You need to know what the original encryption was, and reuse the passwords (user and owner). An alternative would be to use incremental s

Re: Preserve encryption after filling out the editable PDF.

2022-08-01 Thread Tilman Hausherr
to use a password cracker. Don't do it if it is illegal in your country, and be aware that this type of software is sometimes malicious. Tilman Thanks Damaji On 2022/07/26 18:39:36 Tilman Hausherr wrote: Am 26.07.2022 um 12:22 schrieb Damaji Kalunge: Hi Team, We have encrypted Editable

Re: Hello, I'm recently trying to create a ".pdf" document with PDFBox version 3.0. But I couldn't set the font of a text to "Arial". Besides, the Turkish character "\u0131" is not displayed on the do

2022-07-30 Thread Tilman Hausherr
of code below. PDFont font2 = PDTrueTypeFont.loadTTF(doc, new File(getClass().getResource("/ttf/Arial.ttf").toURI())); Mehmet Fatih ÇİN -Original Message----- From: Tilman Hausherr Sent: Saturday, July 30, 2022 6:45 PM To: users@pdfbox.apache.org Subject: Re: Hello, I'm recen

Re: Hello, I'm recently trying to create a ".pdf" document with PDFBox version 3.0. But I couldn't set the font of a text to "Arial". Besides, the Turkish character "\u0131" is not displayed on the do

2022-07-30 Thread Tilman Hausherr
Hello Mehmet, Please post the code you used. Did it use the font with the call PDFont font = PDType0Font.load(doc, new File("c:/windows/fonts/arial.ttf")); ? Tilman Am 30.07.2022 um 17:25 schrieb Mehmet Fatih ÇİN: Hello, I'm recently trying to create a ".pdf" document with PDFBox version

Re: pdfbox Use problems

2022-07-29 Thread Tilman Hausherr
Could it be that the page is rotated? Call page.getRotation() to find out Tilman Am 29.07.2022 um 16:49 schrieb 刘啸: When I use pdfbox to append pictures, the width of the current page is greater than the height. When I use 0,0 coordinates to insert, the pictures are appended to the lower

Re: Pdf compress

2022-07-27 Thread Tilman Hausherr
Am 27.07.2022 um 16:23 schrieb Aiming Xu: Hi, Anyone knows how to compress a big pdf file? Appreciate very much if you can share the sample code or the link to the sample code. There is no general code for this. Sometimes there are PDFs that have all the same flaw and then we can suggest

Re: Preserve encryption after filling out the editable PDF.

2022-07-27 Thread Tilman Hausherr
You need to know what the original encryption was, and reuse the passwords (user and owner). An alternative would be to use incremental saving, but this is tricky as it requires some knowledge of the COS model. See the testSaveIncrementalAfterSign method in TestCreateSignature.java in the

Re: Preserve encryption after filling out the editable PDF.

2022-07-26 Thread Tilman Hausherr
Am 26.07.2022 um 12:22 schrieb Damaji Kalunge: Hi Team, We have encrypted Editable PDF and our requirement is to fill that editable pdf by preserving the same encryption. Approach : Step 1: - Read PDF File . - if encrypted remove encryption -

Re: finding/setting correct breakpoint in jdb debugging of pdfbox/pdfparser in tika?

2022-07-20 Thread Tilman Hausherr
Within pdfbox, set a breakpoint here: If execution stops at that point, then the file still exists. Tilman Am 20.07.2022 um 12:32 schrieb PGNet Dev: On 7/19/22 10:46 PM, Tilman Hausherr wrote: You can also put a breakpoint in PDFBox, then go to org.apache.pdfbox.pdfparser.PDFParser.parse

Re: finding/setting correct breakpoint in jdb debugging of pdfbox/pdfparser in tika?

2022-07-19 Thread Tilman Hausherr
I'm also here. You can also put a breakpoint in PDFBox: go to org.apache.pdfbox.pdfparser.PDFParser.parse(), and when execution stops there (it definitely passes that point!), look in your /tmp directory for the file that is mentioned in the Tika debug output and copy it

Re: pdfbox problem

2022-07-18 Thread Tilman Hausherr
If you add the page a second time then you'd have two pages. You can add stuff to the existing page, i.e. to the content stream. Or you can append an extra content stream. IMHO the two top code lines (after the comment) below the top red line are not needed. Your image didn't get through,

Re: incremental update

2022-07-12 Thread Tilman Hausherr
Hi, This is rather advanced stuff. You need to save the file, reload it, then modify the COSDictionary objects you want, and pass these objects as a list as the second parameter to "saveIncremental()". There is no example, but it is used in the example build tests. In the example

Re: Save Separation image

2022-07-11 Thread Tilman Hausherr
We don't keep intermediate images except for patterns and transparency groups. And these intermediate images are usually RGB so they are not really 100% correct. Or do you want an image of the color chart, like in PDFDebugger? Tilman Am 11.07.2022 um 15:13 schrieb Gianluca Sartori: Dear

Re: Possible bug with FunctionType3?

2022-06-14 Thread Tilman Hausherr
Am 15.06.2022 um 05:42 schrieb Tilman Hausherr: float[] functionResult = function.eval(functionValues); eval is an abstract method, but I don't see how any of its implementation would return null :-(   (but I just woke up) oops, the return of eval() is irrelevant here. Anyway, I fixed

Re: Possible bug with FunctionType3?

2022-06-14 Thread Tilman Hausherr
float[] functionResult = function.eval(functionValues); eval is an abstract method, but I don't see how any of its implementation would return null :-(   (but I just woke up) oh wait, there's this:     if (functionsArray.length == 1)     {     // This doesn't make sense but

Re: Setting the raw value of a COSStream

2022-05-25 Thread Tilman Hausherr
I tried myself with the trunk and got an NPE, and I now used setName() instead of setString() for the two types.

Re: Setting the raw value of a COSStream

2022-05-24 Thread Tilman Hausherr
xmp )     throws IOException     {     try (OutputStream os = createOutputStream())     {     os.write(xmp);     }     } } Maybe create a minimal fully (almost) working program (remove all that isn't relevant) and share your code. Tilman On Mon, May 23, 2022 at 9:03 PM Ti

Re: Setting the raw value of a COSStream

2022-05-23 Thread Tilman Hausherr
Am 23.05.2022 um 14:56 schrieb Gilad Denneboom: Hi all, I have a signed document where the Signature field's value contains a Metadata dictionary with some info in it. I'm trying to reproduce this using PDFBox but am having a hard time doing so. This is what the original structure of the file

Re: Option to ignore transparency when printing?

2022-04-28 Thread Tilman Hausherr
On 13.04.2022 at 07:04, Tres Finocchiaro wrote: Thanks, initial tests are going well, this is very helpful! Would PDFBOX consider adding this as part of the API? My first thought was to make this an example. I put it on my todo list but haven't done anything yet. Tilman Although at

Re: Broken font after rendering

2022-04-21 Thread Tilman Hausherr
This sounds like https://issues.apache.org/jira/browse/PDFBOX-5403 but it's impossible to tell without the file. What you could do is to look for the images in the file, below Root/Pages/Kids/[0]/Resources/XObject . Are the images like in the file

Re: Mirrored text when drawing in a loaded pdf

2022-04-09 Thread Tilman Hausherr
Use the PDPageContentStream constructor with five parameters Tilman Am 09.04.2022 um 15:00 schrieb Tom Eicher: Hello, in my application I use pdfbox 2.0.22 to create documents. I load a (customer supplied) template, then write my content onto it. That works very fine. But now one customer

Re: How read Document Restriction Summary

2022-04-04 Thread Tilman Hausherr
Sounds like the MDP stuff... here some code from the examples subproject     /** * Get the access permissions granted for this document in the DocMDP transform parameters * dictionary. Details are described in the table "Entries in the DocMDP transform parameters * dictionary"

Re: Option to ignore transparency when printing?

2022-03-30 Thread Tilman Hausherr
Hi, Both options are related to transparency groups. These are an extension of XObject forms. I thought about ignoring them, i.e. treat them like ordinary forms. So I looked at PDXobject.createXObject(); this one is called by PDResources.getXObject(), which is called by DrawObject.process().

Re: java.util.ConcurrentModificationException occurs when using pdfbox from multiple threads in Spring Boot

2022-03-23 Thread Tilman Hausherr
I suspect the problem is because you're accessing the initial document from multiple threads. Tilman Am 23.03.2022 um 09:52 schrieb 1540731...@qq.com.INVALID: @RequestMapping(value = "/test", method = RequestMethod.GET) public void test(HttpServletResponse response,

Re: Question

2022-03-23 Thread Tilman Hausherr
Hello Salman, Re 1: there is none. Ask or wait... but you're lucky, we're close to a release. Currently fixing regressions. Re 2: Stay with 2.0.* for now. Almost all changes are done there too. Use the alpha for tests / experiments only, unless you know what's in it. Tilman PS: please don't

Re: Corrupt PDF

2022-03-20 Thread Tilman Hausherr
upload to a sharehoster. And take care not to save to the file you were loading from. Tilman I've attached the corrupt file, which I just generated via PDDocument.save. On Sun, Mar 20, 2022 at 12:36 PM Tilman Hausherr wrote: How is it "corrupt"? What is that programming l

Re: Corrupt PDF

2022-03-20 Thread Tilman Hausherr
How is it "corrupt"? What is that programming language and does the "}" close the content stream? Tilman Am 20.03.2022 um 19:26 schrieb Andy Czerwonka: I pull the latest 2.x version and am just playing around with the API. I saw some examples, and was trying to add some text to each page.

Re: Weird new log outputs..

2022-03-17 Thread Tilman Hausherr
-Original Message- From: Tilman Hausherr Sent: Wednesday, March 16, 2022 4:12 PM To: users@pdfbox.apache.org Subject: Re: Weird new log outputs.. On 16.03.2022 at 19:19, Rauer, Kevin wrote: I haven't set any logging levels,

Re: Trailing Space and Final CRLF Added

2022-03-16 Thread Tilman Hausherr
Fixed in https://issues.apache.org/jira/browse/PDFBOX-5390 I didn't fix the final CR LF, this is probably part of the paragraph handling. I don't see a problem with that. Tilman Am 16.03.2022 um 19:38 schrieb Tilman Hausherr: Yeah, this is a (minor) bug in TextToPDF, so the extraction would

Re: Weird new log outputs..

2022-03-16 Thread Tilman Hausherr
-Original Message- From: Tilman Hausherr Sent: Wednesday, March 16, 2022 2:17 PM To: users@pdfbox.apache.org Subject: Re: Weird new log outputs..

Re: Trailing Space and Final CRLF Added

2022-03-16 Thread Tilman Hausherr
Yeah, this is a (minor) bug in TextToPDF, so the extraction would have to be postprocessed. But you already have the text anyway. I'll fix this soon. Tilman Am 16.03.2022 um 13:27 schrieb flywire: Can text be extracted without adding trailing space? *Text.txt* def hello_world():

Re: Weird new log outputs..

2022-03-16 Thread Tilman Hausherr
Hi, This is a debug output. Unless you want to debug something, you shouldn't use debug level for logs. Tilman Am 16.03.2022 um 19:13 schrieb Rauer, Kevin: Good day all.. I have just updated my jar to pdfbox-app-2.0.25.jar which I run in IBM Functional tester 10.2.2 ( which was also just

Re: Add new Named Dest to file

2022-03-16 Thread Tilman Hausherr
Tilman Hausherr wrote: Please upload the file "before" and "after" somewhere, and add load and save code so we can have a look what happens Tilman Am 15.03.2022 um 00:21 schrieb Gilad Denneboom: It's been quite a while since I posted this question, and this issue still

Re: Add new Named Dest to file

2022-03-14 Thread Tilman Hausherr
Please upload the file "before" and "after" somewhere, and add load and save code so we can have a look what happens Tilman Am 15.03.2022 um 00:21 schrieb Gilad Denneboom: It's been quite a while since I posted this question, and this issue still doesn't have a clear answer, as far as I can

Re: Ligature Substitutions, glyph reverse split reorder and gsub in PDFbox

2022-02-28 Thread Tilman Hausherr
That is an unfinished thing, sadly. https://issues.apache.org/jira/browse/PDFBOX-4189 It works for Bengali, but only visually, the text extraction doesn't work. I assume it might be possible to add Tamil, but the text extraction problem would still be there. Tilman Am 28.02.2022 um 19:51

Re: Problem with page rotation

2022-02-23 Thread Tilman Hausherr
include 90 and 180. This will likely require more code. Tilman On 24.02.2022 at 06:20, Dalibor Kálna wrote: hi, exactly. question is, is this the right way to do it? i ask, because i have to explain (and defend) that solution to our client. :) Dalibor On 24.02.2022 at 04:55, Tilman

Re: Problem with page rotation

2022-02-23 Thread Tilman Hausherr
Hi, I just tried your project, the final file has the marker on all three pages, i.e. vertically on the left side. Tilman Am 23.02.2022 um 14:37 schrieb dalibor.ka...@bluewin.ch: hi Tilman thank you for you response. to clarify things first, our workflow is: DOCX -> PDF (with some

Re: Problem with page rotation

2022-02-22 Thread Tilman Hausherr
Hi, I haven't tested your code, but I notice that you rotate at two different places, was that intended? You can see an example for a 90° rotated page in the CreateLandscapePDF.java example in the source code download. Tilman Am 22.02.2022 um 09:36 schrieb dalibor.ka...@bluewin.ch: hi

Re: setNonStrokingColor deprecation

2022-02-21 Thread Tilman Hausherr
; The code validates that r, g, b are in the interval [0.0, 1.0]. Cheers, John On Monday, February 21, 2022, 09:14:56 AM PST, Tilman Hausherr wrote: On 21.02.2022 at 10:23, Vassallo, Fabio wrote: Good morning. I noticed that I have a PDFBox deprecation warning in my code (I call setNonStroking

Re: Watermark does not get rendered

2022-02-21 Thread Tilman Hausherr
Hi, Without the file, one can only guess. There are two parts in the code that are suspicious. One is in PageDrawer.java (look for "if (pdImage.isStencil())"), the other one is the entire class TilingPaint.java. In both you can try to save the intermediate BufferedImage object(s) with

Re: setNonStrokingColor deprecation

2022-02-21 Thread Tilman Hausherr
On 21.02.2022 at 10:23, Vassallo, Fabio wrote: Good morning. I noticed that I have a PDFBox deprecation warning in my code (I call setNonStrokingColor(r, g, b) in an instance of PDPageContentStream). You should use the other call with float r, g, b. Divide your values by 255f. You should
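
The conversion suggested in the reply above (divide the 0..255 values by 255f) might look like the following sketch. The class RgbComponents and its method are hypothetical helpers written for illustration, not PDFBox API. The range check is an added assumption, motivated by the remark elsewhere in this thread that the float call validates r, g, b against the interval [0.0, 1.0].

```java
/** Hypothetical helper (not a PDFBox class): 0..255 RGB ints to 0..1 floats. */
final class RgbComponents {

    /** Returns {r, g, b} scaled into [0.0, 1.0] by dividing each value by 255f. */
    static float[] toUnitRange(int r, int g, int b) {
        return new float[] { scale(r), scale(g), scale(b) };
    }

    // Rejects values outside 0..255 so the result always stays inside [0.0, 1.0].
    private static float scale(int component) {
        if (component < 0 || component > 255) {
            throw new IllegalArgumentException("component out of range: " + component);
        }
        return component / 255f;
    }

    public static void main(String[] args) {
        float[] orange = toUnitRange(255, 128, 0);
        // These three floats are what would be handed to the float variant
        // of setNonStrokingColor on the content stream.
        System.out.println(orange[0] + ", " + orange[1] + ", " + orange[2]);
    }
}
```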

Re: Watermark does not get rendered

2022-02-17 Thread Tilman Hausherr
Am 17.02.2022 um 20:20 schrieb Tilman Hausherr: I looked at that content stream again, it seems you're using a pattern "color". IIRC there's a PDFBox weakness with patterns that are very small. See also the first file from https://github.com/mozilla/pdf.js/issues/9627 Tilman

Re: Watermark does not get rendered

2022-02-17 Thread Tilman Hausherr
I looked at that content stream again, it seems you're using a pattern "color". IIRC there's a PDFBox weakness with patterns that are very small. Tilman On 17.02.2022 at 13:04, Stefan Sauerer wrote: Hi Tilman, so I played around a little bit with the PDFDebugger tool (in 2.0.25 & 3.0 alpha).

Re: Watermark does not get rendered

2022-02-16 Thread Tilman Hausherr
Do you get any log messages? (PDFDebugger has a log window, click on the bottom right) Also the latest version is 2.0.25. Try also downloading the 3.0 alpha of PDFDebugger. Tilman Am 16.02.2022 um 09:47 schrieb Stefan Sauerer: Hello PDFBox-Team, some of our customers send me a PDF, which

Re: PDFBox 3 - Is it thread safe?

2022-02-13 Thread Tilman Hausherr
On 13.02.2022 at 10:19, Mo Maison wrote: On 08/02/2022 at 19:17, Tilman Hausherr wrote: On 08.02.2022 at 14:35, Marton Róbert wrote: Dear PDFBox developers, We are working on an application that is parsing and verifying the quality of PDF files. We wanted to extract the text of the pages

Re: Searchable Pdf

2022-02-10 Thread Tilman Hausherr
Am 10.02.2022 um 22:48 schrieb flywire: Can PDFBox make a scanned text document pdf text searchable? No. Use Tesseract, they can do this (IIRC you need to render the PDF first). However there is a minor bug https://github.com/tesseract-ocr/tesseract/issues/2879 Tilman

Re: PDFBox 3 - Is it thread safe?

2022-02-08 Thread Tilman Hausherr
Am 08.02.2022 um 14:35 schrieb Marton Róbert: Dear PDFBox developers, We are working on an application that is parsing and verifying the quality of PDF files. We wanted to extract the text of the pages using multiple threads to speed up the process but eventually we found that PDFBox 2 is not

Re: How to set PDF/A to an existing PDF

2022-02-04 Thread Tilman Hausherr
On 05.02.2022 at 01:13, Tommy Wu wrote: Would you send over the icc file that you were using? It is in the source code download, in the directory examples\src\main\resources\org\apache\pdfbox\resources\pdfa Tilman On Thu, Feb 3, 2022 at 10:45 PM, Tilman Hausherr wrote: I've also found out why

Re: How to set PDF/A to an existing PDF

2022-02-03 Thread Tilman Hausherr
I've also found out why the file produced with the trunk wasn't OK, this is because my pom.xml had used Apache FOP for some reason. Tilman Am 04.02.2022 um 04:37 schrieb Tilman Hausherr: Am 04.02.2022 um 01:48 schrieb Tommy Wu: Sorry I am bit confused. Do you mean you have a version

Re: How to set PDF/A to an existing PDF

2022-02-03 Thread Tilman Hausherr
. Tilman On Thu, Feb 3, 2022 at 1:29 PM, Tilman Hausherr wrote: I must correct myself. The file created by 2.0 is fine on both services that I mentioned. The problem is only in the trunk, and only in my local version. I never heard about "PDFen". The two other services I mentioned are well k

Re: How to set PDF/A to an existing PDF

2022-02-03 Thread Tilman Hausherr
est-output/PDFA.pdf tilman trunk xmp: target/test-output/PDFA.pdf  ) Tilman On 03.02.2022 at 11:15, Tommy Wu wrote: So how should we address this? On Wed, Feb 2, 2022 at 11:42 PM, Tilman Hausherr wrote: OK, now we have a problem. I tried https://www.pdf-online.com/osa/validate.aspx Validating file

Re: How to set PDF/A to an existing PDF

2022-02-02 Thread Tilman Hausherr
OK, now we have a problem. I tried https://www.pdf-online.com/osa/validate.aspx Validating file "PDFA.pdf" for conformance level pdfa-1b dc:title/*[0] :: Missing language qualifier. The document does not conform to the requested standard. The document's meta data is either missing

Re: How to set PDF/A to an existing PDF

2022-02-02 Thread Tilman Hausherr
-validator, it said image.png On Tue, Feb 1, 2022 at 11:04 PM, Tilman Hausherr wrote: Because you're not using the 2.0 CreatePDFA.java example. That one goes like this: // add XMP metadata XMPMetadata xmp = XMPMetadata.createXMPMetadata();

Re: How to set PDF/A to an existing PDF

2022-02-01 Thread Tilman Hausherr
PSchemaPDFAId location: class com.example.pdfboxtest.CreatePDFAjava: incompatible types: org.apache.xmpbox.XMPMetadata cannot be converted to byte[]* On Tue, Feb 1, 2022 at 3:03 PM, Tilman Hausherr wrote: Use xmpbox, not jempbox. Tilman On 01.02.2022 at 20:19, Tommy Wu wrote: Here's the dependency I

Re: How to set PDF/A to an existing PDF

2022-02-01 Thread Tilman Hausherr
types: org.apache.jempbox.xmp.XMPMetadata cannot be converted to byte[] line 40 is metadata.importXMPMetadata(xmp); On Tue, Feb 1, 2022 at 1:56 PM, Tilman Hausherr wrote: Then please tell what errors you get, and what libraries you're using (hopefully pdfbox, fontbox and xmpbox). Tilman On 01.02.2022 at

Re: How to set PDF/A to an existing PDF

2022-02-01 Thread Tilman Hausherr
Then please tell what errors you get, and what libraries you're using (hopefully pdfbox, fontbox and xmpbox). Tilman On 01.02.2022 at 19:42, Tommy Wu wrote: I can't even get it to compile On Tue, Feb 1, 2022 at 12:35 AM, Andreas Lehmkuehler wrote: Hi, On 31.01.22 at 22:03, Tommy Wu wrote: The

Re: Help with line needed

2022-01-28 Thread Tilman Hausherr
Am 29.01.2022 um 01:49 schrieb Tim Mann: content.setLineDashPattern (new float[]{1}, 0); // reset lines to solid // added 01262022 No that's a dash pattern of length 1. use "(new float[0], 0)".

Re: Problem with text extraction

2022-01-23 Thread Tilman Hausherr
Hi, Your screenshots didn't get through. There are so many things that can go wrong with PDF, so it's difficult to tell without the file. "then pasted the text into a text editor. The text pasted this way matches the extracted text" Then it means PDFBox is correct. It's possible that the

Re: Intermittent printing problem with images on Afinia L801 label printer

2022-01-19 Thread Tilman Hausherr
We've had troubles with label printers, but not this type of trouble. Does this also happen if the same document is printed again? If yes, is it reproducible 1) in PDFDebugger, 2) with a different printer? Also make sure you're using the latest version of PDFBox (2.0.25), of Java, and of

Re: Unable to set Page Size using setMediaBox on a PDF document created by PDFMergerUtility

2022-01-16 Thread Tilman Hausherr
Maybe you need to set the cropBox as well. Tilman Am 16.01.2022 um 14:29 schrieb Dip Narayan Sarkar: Hi,     Please see the sample code below: In the final Output A_3.pdf , the page size does not change  to A3. Please let me know if I am missing something. public static void

Re: Wrong datatype for OPM in PDExtendedGraphicsState

2022-01-14 Thread Tilman Hausherr
Fixed in https://issues.apache.org/jira/browse/PDFBOX-5361 Thanks for reporting this! Tilman On 13.01.2022 at 09:50, Hiller, Gerhard wrote: Hi everybody, we are using pdfbox to create PDFs for print production and encountered a problem using PDExtendedGraphicsState.

Re: Wrong datatype for OPM in PDExtendedGraphicsState

2022-01-13 Thread Tilman Hausherr
Yeah this is a bug. I'll fix it soon unless somebody else does it. Tilman On 13.01.2022 at 09:50, Hiller, Gerhard wrote: Hi everybody, we are using pdfbox to create PDFs for print production and encountered a problem using PDExtendedGraphicsState. PDExtendedGraphicsState offers the

Re: Mask is not applied.

2022-01-13 Thread Tilman Hausherr
instantiates the pdfBox objects has the following log4j config: http://www.w3.org/2001/XInclude; > I get logging from our code but not from pdfBox, although when tracing I can see that it is logging somewhere ... Could you give me some advice? Thanks again. Jeremy

Re: LegacyPDFStreamEngine

2022-01-13 Thread Tilman Hausherr
 Yeah this is somewhat scary language, I wasn't really deep into text extraction at the time and still am not. IIRC the problem is that the height is a heuristic. Tilman Am 13.01.2022 um 18:55 schrieb Vincent Letard: Hi, I have a question about showGlyph() from LegacyPDFStreamEngine: In

Re: Mask is not applied.

2022-01-12 Thread Tilman Hausherr
Are the JBIG2 and JPX plugins in your classpaths? Did you see any log messages? Did you activate logging? Tilman Am 12.01.2022 um 22:46 schrieb Jeremy Young: The attached pdf is the first page referred to in this link from Gunnar Brand.

Re: memory requirements when merging PDF files?

2022-01-07 Thread Tilman Hausherr
AM Tilman Hausherr wrote: Am 06.01.2022 um 18:26 schrieb John Lussmyer: I have a need to merge a couple thousand PDF's into one humongous PDF. The old tool we use for PDF manipulation runs out of memory as it builds the result PDF in memory, and only writes it out when done. Can PDFBox do

Re: memory requirements when merging PDF files?

2022-01-06 Thread Tilman Hausherr
Am 06.01.2022 um 18:26 schrieb John Lussmyer: I have a need to merge a couple thousand PDF's into one humongous PDF. The old tool we use for PDF manipulation runs out of memory as it builds the result PDF in memory, and only writes it out when done. Can PDFBox do something more like streaming

Re: Strange corrupted font in FontCache

2021-12-24 Thread Tilman Hausherr
Saturday, December 18, 2021 at 14:59:10 UTC+1, Tilman Hausherr wrote: Hi, Yes I suspect this is a parallel access problem, probably parallel initialization of a standard 14 font. This has made occasional troubles for years (despite that we tried to solve this) which is why it was modif

Re: Rendering a page with annotations, observing the "Print" flag

2021-12-23 Thread Tilman Hausherr
' with them! - Richard P. Feynman On Thu, Dec 23, 2021 at 6:57 PM Tilman Hausherr wrote: Behavior is not correct, we do catch this case: if (deviceType == GraphicsDevice.TYPE_PRINTER && !annotation.isPrinted()) { return; } Please create

Re: Rendering a page with annotations, observing the "Print" flag

2021-12-23 Thread Tilman Hausherr
Behavior is not correct, we do catch this case:     if (deviceType == GraphicsDevice.TYPE_PRINTER && !annotation.isPrinted())     {     return;     } Please create an issue in JIRA with the PDF file. Tilman Am 23.12.2021 um 15:02 schrieb Constantine Dokolas: I have a

Re: PDFBox Opening Additional Java Process

2021-12-22 Thread Tilman Hausherr
Very mysterious. I searched into the tabula sources for "swing" and "awt" and didn't find anything that opens a window. PDFBox build tests open a window. Tilman Am 23.12.2021 um 08:08 schrieb Andreas Lehmkuehler: Hi, Am 21.12.21 um 20:32 schrieb Charles Givre: Hello there, I have a small

Re: Question: Support specifications of PDF

2021-12-22 Thread Tilman Hausherr
Hello, None of these is supported out of the box. I did watch a presentation of PDF/Raster ( https://pdfraster.org/ ) and I think it is possible to create such PDFs with PDFBox, but one would need to read the specification

Re: Relationship between PDF and DPI

2021-12-20 Thread Tilman Hausherr
to write something) Resize with java imaging / java 2D methods. Tilman On 2021/12/16 19:13:06 Tilman Hausherr wrote: Am 16.12.2021 um 20:07 schrieb Peter Kronenberg: I'm using PdfRenderer.renderImageToDPI() and I'm trying to understand exactly how the DPI is computed. From looking at the code

Re: Blank page after conversion

2021-12-20 Thread Tilman Hausherr
Please do also make a clean build with the latest version 2.0.25. Delete all old jar files from the class path. Tilman - To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org For additional commands, e-mail:

Re: Blank page after conversion

2021-12-20 Thread Tilman Hausherr
Hi, I looked at your PDF again. That PDF is a scanned image with some invisible OCR, so this font thing is probably not relevant (although I can't explain it). The Arial0 font is used only for a dot. (That is invisible) You have PDFDebugger, please try displaying this: it should display the

Re: Strange corrupted font in FontCache

2021-12-18 Thread Tilman Hausherr
Hi, Yes I suspect this is a parallel access problem, probably parallel initialization of a standard 14 font. This has made occasional troubles for years (despite that we tried to solve this) which is why it was modified in 3.0. Please try this workaround: PDType1Font.COURIER.getPath("a");

Re: The signature is invalid because the content of pdf is changed

2021-12-17 Thread Tilman Hausherr
Hi, You have to load from a file or from a stream (loading from byte array should also work). You should not pass the same output stream as before. If it still doesn't work, please create a minimal project. Tilman Am 17.12.2021 um 14:28 schrieb Oleksandr Kiselev: Hello everyone. I

Re: FDFAnnotation not processing xfdf file with

2021-12-17 Thread Tilman Hausherr
Hi, It's not supported, and I don't know what we should do with it. The XFDF specification doesn't mention it. https://www.immagic.com/eLibrary/ARCHIVES/TECH/ADOBE/A070914X.pdf Tilman Am 17.12.2021 um 06:51 schrieb Kishore Gade: Hi, Our xfdf file has tag and I noticed that

Re: Relationship between PDF and DPI

2021-12-16 Thread Tilman Hausherr
Am 16.12.2021 um 20:07 schrieb Peter Kronenberg: I’m using PdfRenderer.renderImageToDPI() and I’m trying to understand exactly how the DPI is computed.  From looking at the code, it seems that 72 DPI is the default, meaning that it will have the same resolution as the original document.  Is
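
The relationship the question describes can be illustrated with a small calculation. This is a sketch under stated assumptions: it takes the premise from the question that 72 DPI corresponds to a scale of 1 (one pixel per PDF unit, since a unit is 1/72 inch), and the rounding down to whole pixels is an assumption as well. The class RenderSize and its methods are hypothetical and are not PDFBox API.

```java
/** Hypothetical helper (not a PDFBox class): page size in PDF units to pixels. */
final class RenderSize {

    /** Scale factor relative to 72 DPI, where one PDF unit maps to one pixel. */
    static double scaleForDpi(double dpi) {
        return dpi / 72.0;
    }

    /** Pixel length of a page side; rounding down is an assumption of this sketch. */
    static int pixels(double points, double dpi) {
        // Multiply before dividing so that exact cases such as 612 * 300 / 72 stay exact.
        return (int) Math.max(Math.floor(points * dpi / 72.0), 1);
    }

    public static void main(String[] args) {
        // US Letter is 612 x 792 PDF units (8.5 x 11 inches).
        System.out.println("scale at 300 DPI: " + scaleForDpi(300));
        System.out.println("width  in pixels: " + pixels(612, 300));
        System.out.println("height in pixels: " + pixels(792, 300));
    }
}
```

Under these assumptions, rendering at a higher DPI only multiplies the pixel dimensions; the page size in PDF units stays the same.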

Re: log4j vulnerability?

2021-12-16 Thread Tilman Hausherr
No Tilman On 16.12.2021 at 09:43, Thomas Möller wrote: Hello, is the use of the prebuilt pdfbox.jar in any manner affected by the log4j security problems? Best Regards, Thomas M.

Re: Blank page after conversion

2021-12-13 Thread Tilman Hausherr
Am 13.12.2021 um 10:58 schrieb Stefan Sauerer: Hi Tilman, sorry for my late response. I did not recognize your response in this mail portal. So with PDFDebugger you were not able to reproduce... Are you able to reproduce it with PDFBox under windows? Unfortunately, there are no additional

Re: Fwd: rotating pages and setting origin

2021-12-08 Thread Tilman Hausherr
I've added a response to your SO question. Please mention if it doesn't help. Tilman

Re: Blank page after conversion

2021-12-07 Thread Tilman Hausherr
Hi, I'm unable to reproduce this with PDFDebugger on windows. Is there more logging output? I wonder if PDFBox uses one of your local fonts and that one is broken. Tilman PS to anybody else: the "," is part of the password Am 07.12.2021 um 15:56 schrieb Stefan Sauerer: Hello PDFBox-Team,

Re: How to get MuPDF TextPage functionality with PDFBox?

2021-12-06 Thread Tilman Hausherr
Hi, Text extraction is available from the PDFTextStripper class. A subclass can create HTML. All the rest you'll have to write yourself. Tilman Am 06.12.2021 um 13:52 schrieb shah manon: For organizing books and article I need a light weight PDF viewer with copy, highlight, image snap

Re: PDAnnotationTextMarkup workaround to include action?

2021-12-04 Thread Tilman Hausherr
happens. this is my code PDActionURI action = new PDActionURI(); action.setURI("https://www.google.com"); txtMark.getCOSObject().setItem(COSName.AA, action); On Sun, Dec 5, 2021 at 12:09 PM Tilman Hausherr wrote: On 05.12.2021 at 04:57, chitgoks wrote: Hi tilman. the s

Re: PDAnnotationTextMarkup workaround to include action?

2021-12-04 Thread Tilman Hausherr
. Try setting it manually annotation.getCOSObject().setItem(COSName.AA, actions); the PDF specification is unclear, in "trigger events" it mentions that it works in annotations, but then it's only specifically mentioned in widget annotations. Tilman On Sun, Dec 5, 2021 at 11:39

Re: PDAnnotationTextMarkup workaround to include action?

2021-12-04 Thread Tilman Hausherr
On 05.12.2021 at 03:11, chitgoks wrote: are there ways to add functionality to PDAnnotationTextMarkup wherein a click can execute something? like opening a new url? The additional action /D might be what you need. And it turns out we have an example about this, FieldTriggers.java in the

Re: Managing logging

2021-11-25 Thread Tilman Hausherr
Use and configure your favorite logger, e.g. log4j 2. Tilman Am 25.11.2021 um 14:50 schrieb Bernhard Fey: Hello, We are in the process of integrating PDFBox into our PDF library, so we can use it to merge or encrypt the PDFs we create. This works very well, but we have one remaining issue:

Re: FontBox: Failing to get multiple encodings from cmap table

2021-11-20 Thread Tilman Hausherr
Fixed: https://issues.apache.org/jira/browse/PDFBOX-5328 Thank you for reporting this! Tilman

Re: FontBox: Failing to get multiple encodings from cmap table

2021-11-17 Thread Tilman Hausherr
Hi, First thank you for that research. My feeling is that you're right, I looked in the tables with DTL OTMaster 3.7 light and there are indeed two entries. I have also prepared a fix (it works) but this was just by looking at the code logic, I need to read the specification too to be sure,
