date:20140423

[jira] [Commented] (PDFBOX-2039) Class PDDocument should implement java.io.Closeable

2014-04-23 Thread Andrei Solntsev (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977879#comment-13977879
 ] 

Andrei Solntsev commented on PDFBOX-2039:
-

No problem: the java.io.Closeable interface has been available since Java 1.5.

 Class PDDocument should implement java.io.Closeable
 ---

 Key: PDFBOX-2039
 URL: https://issues.apache.org/jira/browse/PDFBOX-2039
 Project: PDFBox
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Andrei Solntsev
Priority: Minor
   Original Estimate: 1h
  Remaining Estimate: 1h

 It would make it possible to use the Java 7 try-with-resources feature:
 try (PDDocument doc = PDDocument.load(outputFile)) {
   // work with doc
   // no need to call doc.close() explicitly
 }
 P.S. Actually, all org.apache.pdfbox.* classes with a close() method could 
 implement java.io.Closeable.
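
To make the request concrete, here is a minimal sketch of what implementing java.io.Closeable buys the caller. DemoDocument and TryWithResourcesDemo are hypothetical stand-ins invented for this illustration, not PDFBox classes, and close() is narrowed to throw nothing only to keep the sketch short.

```java
import java.io.Closeable;

// Hypothetical stand-in for a document class that owns a resource.
// This is NOT PDFBox's PDDocument; it only illustrates the Closeable contract.
class DemoDocument implements Closeable {
    private boolean closed = false;

    static DemoDocument load(String name) {
        return new DemoDocument();
    }

    boolean isClosed() {
        return closed;
    }

    // Closeable.close() may throw IOException; this stand-in narrows it.
    @Override
    public void close() {
        closed = true;
    }
}

class TryWithResourcesDemo {
    // Java 7+: close() is called automatically when the block exits,
    // whether normally or because of an exception.
    static DemoDocument useWithTryWithResources() {
        DemoDocument seen;
        try (DemoDocument doc = DemoDocument.load("example.pdf")) {
            seen = doc;
            // work with doc here
        }
        return seen;
    }

    // Java 5/6 equivalent: the caller has to close in a finally block.
    static DemoDocument useWithFinally() {
        DemoDocument doc = DemoDocument.load("example.pdf");
        try {
            // work with doc here
        } finally {
            doc.close();
        }
        return doc;
    }
}
```

Both methods leave the document closed; the first simply cannot forget to do so.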



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Updated] (PDFBOX-2039) Class PDDocument should implement java.io.Closeable

2014-04-23 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2039:


Affects Version/s: 1.8.5
   1.8.4

 Class PDDocument should implement java.io.Closeable
 ---

 Key: PDFBOX-2039
 URL: https://issues.apache.org/jira/browse/PDFBOX-2039
 Project: PDFBox
  Issue Type: Improvement
Affects Versions: 1.8.4, 1.8.5, 2.0.0
Reporter: Andrei Solntsev
Priority: Minor
   Original Estimate: 1h
  Remaining Estimate: 1h

 It would make it possible to use Java 7 try-with-resources feature:
 try (PDDocument doc = PDDocument.load(outputFile)) {
   // bla-bla
   // no need to call doc.close(); explicitly
 }
 P.S. Actually all org.apache.pdfbox.* classes with method close() could 
 implement java.io.Closeable




[jira] [Comment Edited] (PDFBOX-2039) Class PDDocument should implement java.io.Closeable

2014-04-23 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977504#comment-13977504
 ] 

Tilman Hausherr edited comment on PDFBOX-2039 at 4/23/14 6:12 AM:
--

-The 1.8 version must support JDK5, so it is not possible.- The 2.0 version has 
COSDocument and PDDocument that are Closeable.


was (Author: tilman):
The 1.8 version must support JDK5, so it is not possible. The 2.0 version has 
COSDocument and PDDocument that are Closeable.

 Class PDDocument should implement java.io.Closeable
 ---

 Key: PDFBOX-2039
 URL: https://issues.apache.org/jira/browse/PDFBOX-2039
 Project: PDFBox
  Issue Type: Improvement
Affects Versions: 1.8.4, 1.8.5, 2.0.0
Reporter: Andrei Solntsev
Priority: Minor
   Original Estimate: 1h
  Remaining Estimate: 1h

 It would make it possible to use Java 7 try-with-resources feature:
 try (PDDocument doc = PDDocument.load(outputFile)) {
   // bla-bla
   // no need to call doc.close(); explicitly
 }
 P.S. Actually all org.apache.pdfbox.* classes with method close() could 
 implement java.io.Closeable




[jira] [Updated] (PDFBOX-2039) Class PDDocument should implement java.io.Closeable

2014-04-23 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2039:


Fix Version/s: 2.0.0
   1.8.5

 Class PDDocument should implement java.io.Closeable
 ---

 Key: PDFBOX-2039
 URL: https://issues.apache.org/jira/browse/PDFBOX-2039
 Project: PDFBox
  Issue Type: Improvement
Affects Versions: 1.8.4, 1.8.5, 2.0.0
Reporter: Andrei Solntsev
Priority: Minor
 Fix For: 1.8.5, 2.0.0

   Original Estimate: 1h
  Remaining Estimate: 1h

 It would make it possible to use Java 7 try-with-resources feature:
 try (PDDocument doc = PDDocument.load(outputFile)) {
   // bla-bla
   // no need to call doc.close(); explicitly
 }
 P.S. Actually all org.apache.pdfbox.* classes with method close() could 
 implement java.io.Closeable




[jira] [Commented] (PDFBOX-2039) Class PDDocument should implement java.io.Closeable

2014-04-23 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977890#comment-13977890
 ] 

Tilman Hausherr commented on PDFBOX-2039:
-

For a start, I added it for PDDocument and COSDocument in 1.8 in rev 1589346, 
because these are the classes where it is implemented in 2.0. I'll have a look 
at other classes later. You can get it immediately with svn if you want to test 
the improved code, or here in a few hours:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/1.8.5-SNAPSHOT/

 Class PDDocument should implement java.io.Closeable
 ---

 Key: PDFBOX-2039
 URL: https://issues.apache.org/jira/browse/PDFBOX-2039
 Project: PDFBox
  Issue Type: Improvement
Affects Versions: 1.8.4, 1.8.5, 2.0.0
Reporter: Andrei Solntsev
Priority: Minor
 Fix For: 1.8.5, 2.0.0

   Original Estimate: 1h
  Remaining Estimate: 1h

 It would make it possible to use Java 7 try-with-resources feature:
 try (PDDocument doc = PDDocument.load(outputFile)) {
   // bla-bla
   // no need to call doc.close(); explicitly
 }
 P.S. Actually all org.apache.pdfbox.* classes with method close() could 
 implement java.io.Closeable




[jira] [Updated] (PDFBOX-2038) Method VisualSignatureParser#parse does not close COSDocument

2014-04-23 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2038:


Affects Version/s: 2.0.0
   1.8.5

 Method VisualSignatureParser#parse does not close COSDocument
 -

 Key: PDFBOX-2038
 URL: https://issues.apache.org/jira/browse/PDFBOX-2038
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 1.8.4, 1.8.5, 2.0.0
Reporter: Andrei Solntsev
Priority: Minor
   Original Estimate: 1h
  Remaining Estimate: 1h

 I am adding a visual signature to my PDF.
 SignatureOptions options = new SignatureOptions();
 options.setVisualSignature( new FileInputStream("my.jpg") );
 After a while I get the following warning in the logs:
 Warning: COSDocument: You did not close a PDF Document
 The cause is probably the method 
 org.apache.pdfbox.pdfparser.VisualSignatureParser#parse, which creates an 
 instance of COSDocument but does not close it.




[jira] [Commented] (PDFBOX-2038) Method VisualSignatureParser#parse does not close COSDocument

2014-04-23 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977904#comment-13977904
 ] 

Tilman Hausherr commented on PDFBOX-2038:
-

setVisualSignature does create a COSDocument, which contains the visual 
signature that is to be used later. So it can't be closed immediately. 
Currently, all you can do is to call options.getVisualSignature() and close 
that object. An improvement might be to add a close() method to 
SignatureOptions, but I'm not one of the signature people here, so I'd rather 
wait for their opinion.
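
The suggested improvement could look roughly like the sketch below. DemoCosDocument and DemoSignatureOptions are hypothetical stand-ins made up for this illustration; they are not PDFBox's COSDocument or SignatureOptions, and the byte[] parameter is an assumption chosen only to keep the example self-contained.

```java
import java.io.Closeable;

// Hypothetical stand-in for the parsed visual-signature document.
class DemoCosDocument implements Closeable {
    private boolean closed = false;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical options holder; NOT PDFBox's SignatureOptions.
class DemoSignatureOptions implements Closeable {
    private DemoCosDocument visualSignature;

    // Setting the visual signature creates a document that must stay open
    // until signing is done, so it cannot be closed right here.
    void setVisualSignature(byte[] data) {
        this.visualSignature = new DemoCosDocument();
    }

    // Workaround available without close(): fetch the document and close it.
    DemoCosDocument getVisualSignature() {
        return visualSignature;
    }

    // Proposed improvement: the holder closes the document it created.
    @Override
    public void close() {
        if (visualSignature != null) {
            visualSignature.close();
        }
    }
}
```

With a close() method on the holder, the caller closes one object after signing and does not need to know that a document was created internally.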

 Method VisualSignatureParser#parse does not close COSDocument
 -

 Key: PDFBOX-2038
 URL: https://issues.apache.org/jira/browse/PDFBOX-2038
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 1.8.4, 1.8.5, 2.0.0
Reporter: Andrei Solntsev
Priority: Minor
   Original Estimate: 1h
  Remaining Estimate: 1h

 I am adding a visual signature to my PDF.
 SignatureOptions options = new SignatureOptions();
 options.setVisualSignature( new FileInputStream("my.jpg") );
 After a while I get the following warning in the logs:
 Warning: COSDocument: You did not close a PDF Document
 The cause is probably the method 
 org.apache.pdfbox.pdfparser.VisualSignatureParser#parse, which creates an 
 instance of COSDocument but does not close it.




[jira] [Created] (PDFBOX-2041) Convert PDF to Image (Strange Color)

2014-04-23 Thread ahfei (JIRA)

ahfei created PDFBOX-2041:
-

 Summary: Convert PDF to Image (Strange Color)
 Key: PDFBOX-2041
 URL: https://issues.apache.org/jira/browse/PDFBOX-2041
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 1.8.4
 Environment: Java(1.7.0_45),   OS (Ubuntu) 
Reporter: ahfei


Using PDFBox, I tried to convert a PDF to an image file (case1.pdf, case1.jpg).
Below is the code I'm using:

BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 200);
ImageIOUtil.writeImage(image, "jpg", imagePath, BufferedImage.TYPE_INT_RGB, 
200);

After conversion, the image doesn't look like the PDF. Half of the page has turned blue and 
black. Images and PDF attached.




[jira] [Updated] (PDFBOX-2041) Convert PDF to Image (Strange Color)

2014-04-23 Thread ahfei (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ahfei updated PDFBOX-2041:
--

Description: 
Using PDFBox, I tried to convert a PDF to an image file (case1.pdf, case1.jpg).
Below is the code I'm using:

BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 200);
ImageIOUtil.writeImage(image, "jpg", imagePath, BufferedImage.TYPE_INT_RGB, 
200);

After conversion, the image doesn't look like the PDF. Half of the page has turned blue and 
black. 

Images and PDF attached: https://www.dropbox.com/sh/jevegc8bh09km1o/5XkVwPUxri 

  was:
Using PDFBox, I tried to convert a PDF to an image file (case1.pdf, case1.jpg).
Below is the code I'm using:

BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 200);
ImageIOUtil.writeImage(image, "jpg", imagePath, BufferedImage.TYPE_INT_RGB, 
200);

After conversion, the image doesn't look like the PDF. Half of the page has turned blue and 
black. Images and PDF attached.


 Convert PDF to Image (Strange Color)
 

 Key: PDFBOX-2041
 URL: https://issues.apache.org/jira/browse/PDFBOX-2041
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 1.8.4
 Environment: Java(1.7.0_45),   OS (Ubuntu) 
Reporter: ahfei

 Using PDFBox, I tried to convert a PDF to an image file (case1.pdf, case1.jpg).
 Below is the code I'm using:
 BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 200);
 ImageIOUtil.writeImage(image, "jpg", imagePath, BufferedImage.TYPE_INT_RGB, 
 200);
 After conversion, the image doesn't look like the PDF. Half of the page has turned blue 
 and black. 
 Images and PDF attached: https://www.dropbox.com/sh/jevegc8bh09km1o/5XkVwPUxri 




[jira] [Commented] (PDFBOX-1241) Better handle of missing offset at the end of a file

2014-04-23 Thread Manuel Mahringer (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978008#comment-13978008
 ] 

Manuel Mahringer commented on PDFBOX-1241:
--

With the trunk version from 22.04.2014 the issue isn't reproducible anymore.

 Better handle of missing offset at the end of a file
 

 Key: PDFBOX-1241
 URL: https://issues.apache.org/jira/browse/PDFBOX-1241
 Project: PDFBox
  Issue Type: Improvement
  Components: Parsing, Text extraction
Affects Versions: 1.6.0
 Environment: All platforms affected
Reporter: Ernst Eibensteiner
 Attachments: On the Insert tab.pdf


 We came across PDF files that do not have an offset at the end of the file.
 This leads to the following exception:
 c:\tmp> java -jar pdfbox-app-1.6.0.jar ExtractText -endPage 1 "On the Insert tab.pdf"
 ExtractText failed with the following exception:
 java.io.IOException: Error: Expected an integer type, actual=''
     at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1384)
     at org.apache.pdfbox.pdfparser.PDFParser.parseStartXref(PDFParser.java:663)
     at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:464)
     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1088)
     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1053)
     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:978)
     at org.apache.pdfbox.ExtractText.startExtraction(ExtractText.java:196)
     at org.apache.pdfbox.ExtractText.main(ExtractText.java:76)
     at org.apache.pdfbox.PDFBox.main(PDFBox.java:42)
 While these PDFs are non-conforming, it'd be an improvement to allow them to 
 be read and processed.




[jira] [Created] (PDFBOX-2042) ColorSpace without Range

2014-04-23 Thread Juraj Lonc (JIRA)

Juraj Lonc created PDFBOX-2042:
--

 Summary: ColorSpace without Range
 Key: PDFBOX-2042
 URL: https://issues.apache.org/jira/browse/PDFBOX-2042
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 2.0.0
Reporter: Juraj Lonc


I have a PDF document in which I am modifying a PDPage content stream.
The saved document is invalid (Adobe Reader complains about it).

I have narrowed it down to the ColorSpace. 

Original document has colorspace:
/ColorSpace 
/Cs6 [/ICCBased 
/Alternate /DeviceRGB
/Filter /FlateDecode
/Length 2597
/N 3
]

Modified document has colorspace:
/ColorSpace 
/Cs6 [/ICCBased 
/Alternate /DeviceRGB
/Filter /FlateDecode
/Length 2597
/N 3
/Range []
]

When I manually remove /Range [] from the PDF, Adobe Reader opens it without 
an error.

Presumably that range is added by a call to PDICCBased.getRangeArray(0) somewhere.




[jira] [Updated] (PDFBOX-2042) ColorSpace without Range

2014-04-23 Thread Juraj Lonc (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juraj Lonc updated PDFBOX-2042:
---

Attachment: pdfbox18.pdf

Original (working) file.

 ColorSpace without Range
 

 Key: PDFBOX-2042
 URL: https://issues.apache.org/jira/browse/PDFBOX-2042
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 2.0.0
Reporter: Juraj Lonc
 Attachments: pdfbox18.pdf


 I have PDF document where I am modifying PDPage content stream.
 Saved document is invalid (Adobe reader complains about it).
 I have narrowed it down to ColorSpace. 
 Original document has colorspace:
 /ColorSpace 
 /Cs6 [/ICCBased 
 /Alternate /DeviceRGB
 /Filter /FlateDecode
 /Length 2597
 /N 3
 ]
 Modified document has colorspace:
 /ColorSpace 
 /Cs6 [/ICCBased 
 /Alternate /DeviceRGB
 /Filter /FlateDecode
 /Length 2597
 /N 3
 /Range []
 ]
 When I manually remove /Range [] from PDF then Adobe reader opens it 
 without an error.
 Obviously that range is added by calling PDICCBased.getRangeArray(0) 
 somewhere.




[jira] [Updated] (PDFBOX-2042) ColorSpace without Range

2014-04-23 Thread Juraj Lonc (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juraj Lonc updated PDFBOX-2042:
---

Attachment: pdfbox20.pdf

Modified file in pdfbox 2.0.0 (error in Adobe Reader)

 ColorSpace without Range
 

 Key: PDFBOX-2042
 URL: https://issues.apache.org/jira/browse/PDFBOX-2042
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 2.0.0
Reporter: Juraj Lonc
 Attachments: pdfbox18.pdf, pdfbox20.pdf


 I have PDF document where I am modifying PDPage content stream.
 Saved document is invalid (Adobe reader complains about it).
 I have narrowed it down to ColorSpace. 
 Original document has colorspace:
 /ColorSpace 
 /Cs6 [/ICCBased 
 /Alternate /DeviceRGB
 /Filter /FlateDecode
 /Length 2597
 /N 3
 ]
 Modified document has colorspace:
 /ColorSpace 
 /Cs6 [/ICCBased 
 /Alternate /DeviceRGB
 /Filter /FlateDecode
 /Length 2597
 /N 3
 /Range []
 ]
 When I manually remove /Range [] from PDF then Adobe reader opens it 
 without an error.
 Obviously that range is added by calling PDICCBased.getRangeArray(0) 
 somewhere.




community bonding period

2014-04-23 Thread Tilman Hausherr

Although I'm only mentoring Shaola, maybe some of it is useful for 
Dimuthu as well:


From the mentors list:
===
We now are in the community bonding period [1] which lasts until May 19. 
During this period students should learn about your project, your 
release processes, the Apache Way, how we do things around here, 
interact with the community and close any knowledge gaps they might 
have. [1] 
http://googlesummerofcode.blogspot.com/2007/04/so-what-is-this-community-bonding-all.html

===
Here's a FAQ about Apache:
https://www.apache.org/foundation/faq.html
IMHO most important are "What is Apache about?" and "What is Apache not 
about?". (My personal addendum to that is "Apache is not like 
Wikipedia". If you've ever edited in Wikipedia, you'll notice the 
difference after a few days.)


https://www.apache.org/foundation/how-it-works.html
The roles are simpler than in that text, all committers here are PMC 
members, and the PMC chair (Andreas) is also ASF member.


Only committers and above have write access to the official PDFBOX 
repository. So the best would be to set up a copy on an open source 
repository.

https://en.wikipedia.org/wiki/Comparison_of_open-source_software_hosting_facilities

We're trying to be transparent. So stuff that deals with the 
implementation of the project should probably be in the ticket. To see 
what I mean, have a look at 
https://issues.apache.org/jira/browse/PDFBOX-615 and the related issues. 
PDFBOX-615 started with "I will be trying to add this functionality this 
week" but it became a huge effort by several people that ended 4 years 
later :-) See also John's remarks about my code. It annoyed me somewhat 
at the beginning, but in the end it resulted in much better code.


Note that you can edit in JIRA. See an example here
https://issues.apache.org/jira/browse/PDFBOX-2039
i.e. you can modify previous posts.

Stuff that deals with PDFBOX in general is best in this (publicly 
readable) mailing list. The advantage is that others might answer you 
(if they want) when I'm working, sleeping, or not on the internet for 
whatever reason. Stuff that deals with java, svn and maven - e-mail me 
if you don't get the answer within a few minutes from google or from 
stackoverflow, i.e. don't waste time searching.


Using other libraries: this is OK as long as they have an Apache license 
or a compatible license (GPL is not). However, we don't use many 
libraries, and everything is already big, so if you want one, ask first. (Sorry 
if you already mentioned a library, I will reread your proposal 
later.) Of course it is always OK to temporarily use whatever you want to 
just test a theory / strategy / algorithm.
Using other code: the code should rather be your own, but you can use 
small excerpts from stackoverflow.com etc. as long as you indicate it in your code 
with a link. Always comment in the code if you were inspired by other 
people's code or algorithms or research papers; just look at the existing 
shading code for how I did it.


Don't forget the Apache header in new modules.

Your code should work on JDK5, so that we can use it in the 1.8 version 
too. So don't use diamond operators, lambda expressions or even 
String.isEmpty().
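
As a rough illustration of that constraint, the sketch below avoids the three features named above. The class and method names (Jdk5Style, isEmpty, nonEmpty) are made up for this example and do not come from the PDFBox code base.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: written with language features and library calls
// that already exist in Java 5.
class Jdk5Style {
    // String.isEmpty() was added in Java 6; length() == 0 works on Java 5.
    static boolean isEmpty(String s) {
        return s.length() == 0;
    }

    static List<String> nonEmpty(List<String> input) {
        // The diamond operator (new ArrayList<>()) needs Java 7,
        // so the type argument is spelled out on the right-hand side.
        List<String> result = new ArrayList<String>();
        // A lambda or stream would need Java 8; generics and the
        // enhanced for loop are both available in Java 5.
        for (String s : input) {
            if (!isEmpty(s)) {
                result.add(s);
            }
        }
        return result;
    }
}
```

Compiling with the oldest supported JDK, rather than relying on review, is the most reliable way to catch accidental use of newer APIs.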


IDE: I recommend netbeans but you're free to use your own. Just make 
sure that svn (and whatever the hoster will use) and maven are 
integrated in it, this will make your life easier.


A personal recommendation from my student days in the '80s: don't work 
all night. Such code was usually found to be poor/worthless after I had 
the much-needed sleep.


Andreas: correct me if I forgot something.

Tilman

Re: community bonding period

2014-04-23 Thread John Hewson

Great advice!

-- John

On 23 Apr 2014, at 15:49, Tilman Hausherr thaush...@t-online.de wrote:

Although I'm only mentoring Shaola, maybe some of it is useful for Dimuthu as
well:

From the mentors list:
===
We now are in the community bonding period [1] which lasts until May 19.
During this period students should learn about your project, your release
processes, the Apache Way, how we do things around here, interact with the
community and close any knowledge gaps they might have. [1]
http://googlesummerofcode.blogspot.com/2007/04/so-what-is-this-community-bonding-all.html
===
Here's a FAQ about Apache:
https://www.apache.org/foundation/faq.html
IMHO most important are What is Apache about? and What is Apache not
about?. (My personal addendum to that is Apache is not like Wikipedia. If
you've ever edited in wikipedia, you'll notice the difference after a few
days)

https://www.apache.org/foundation/how-it-works.html
The roles are simpler than in that text, all committers here are PMC members,
and the PMC chair (Andreas) is also ASF member.

Only committers and above have write access to the official PDFBOX
repository. So the best would be to set up a copy on an open source
repository.
https://en.wikipedia.org/wiki/Comparison_of_open-source_software_hosting_facilities

We're trying to be transparent. So stuff that deals with the implementation
of the project should probably be in the ticket. To see what I mean, have a
look at https://issues.apache.org/jira/browse/PDFBOX-615 and the related
issues. PDFBOX-615 started with I will be trying to add this functionality
this week but it became a huge effort by several people that ended 4 years
later :-) See also John's remarks about my code. It annoyed me somewhat at
the beginning, but at the end it resulted in much better code.

Note that you can edit in JIRA. See an example here
https://issues.apache.org/jira/browse/PDFBOX-2039
i.e. you can modify previous posts.

Stuff that deals with PDFBOX in general is best in this (publicly readable)
mailing list. The advantage is that others might answer you (if they want)
when I'm working, sleeping, or not on the internet for whatever reason. Stuff
that deals with java, svn and maven - e-mail me if you don't get the answer
within a few minutes from google or from stackoverflow, i.e. don't waste time
searching.

Using other libraries: this is OK as long as they have an Apache license or a
compatible license (GPL is not). However we don't use many libraries,
everything is already big, so if you want, ask first. (Sorry if you already
mentioned a library, will reread your proposal again later) Of course it is
always OK to temporary use whatever you want to just test a theory / strategy
/ algorithm.
Using other code: the code should rather be your own, but you can use small
excerpts from stackoverflow.com etc but indicate it in your code with a link.
Always comment in the code if you were inspired by other peoples code or
algorithms or research papers, just look at the existing shading code for how
I did it.

Don't forget the Apache header in new modules.

Your code should work on JDK5, so that we can use it in the 1.8 version too.
So don't use diamond operators, lambda expressions or even String.isEmpty().

IDE: I recommend netbeans but you're free to use your own. Just make sure
that svn (and whatever the hoster will use) and maven are integrated in it,
this will make your life easier.

A personal recommendation from my student days in the 80ies: don't work all
night. Such code was usually found to be poor/worthless after I had the much
needed sleep.

Andreas: correct me if I forgot something.

Tilman

[jira] [Closed] (PDFBOX-1241) Better handle of missing offset at the end of a file

2014-04-23 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-1241.
---

   Resolution: Fixed
Fix Version/s: 1.8.3

I tested with old versions; it failed in every version before 1.8.3. Since 1.8.3 has 
already been released, I assume I should close it and not just set it to 
resolved.

 Better handle of missing offset at the end of a file
 

 Key: PDFBOX-1241
 URL: https://issues.apache.org/jira/browse/PDFBOX-1241
 Project: PDFBox
  Issue Type: Improvement
  Components: Parsing, Text extraction
Affects Versions: 1.6.0
 Environment: All platforms affected
Reporter: Ernst Eibensteiner
 Fix For: 1.8.3

 Attachments: On the Insert tab.pdf


 We came across PDF files that do not have an offset at the end of the file.
 This leads to the following exception:
 c:\tmp> java -jar pdfbox-app-1.6.0.jar ExtractText -endPage 1 "On the Insert tab.pdf"
 ExtractText failed with the following exception:
 java.io.IOException: Error: Expected an integer type, actual=''
     at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1384)
     at org.apache.pdfbox.pdfparser.PDFParser.parseStartXref(PDFParser.java:663)
     at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:464)
     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1088)
     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1053)
     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:978)
     at org.apache.pdfbox.ExtractText.startExtraction(ExtractText.java:196)
     at org.apache.pdfbox.ExtractText.main(ExtractText.java:76)
     at org.apache.pdfbox.PDFBox.main(PDFBox.java:42)
 While these PDFs are non-conforming, it'd be an improvement to allow them to 
 be read and processed.




[jira] [Resolved] (PDFBOX-2039) Class PDDocument should implement java.io.Closeable

2014-04-23 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-2039.
-

Resolution: Fixed
  Assignee: Tilman Hausherr

Done in rev 1589459 and 1589467 in the trunk and rev 1589465 in the 1.8 branch. 
Thanks for pointing me to this!

 Class PDDocument should implement java.io.Closeable
 ---

 Key: PDFBOX-2039
 URL: https://issues.apache.org/jira/browse/PDFBOX-2039
 Project: PDFBox
  Issue Type: Improvement
Affects Versions: 1.8.4, 1.8.5, 2.0.0
Reporter: Andrei Solntsev
Assignee: Tilman Hausherr
Priority: Minor
 Fix For: 1.8.5, 2.0.0

   Original Estimate: 1h
  Remaining Estimate: 1h

 It would make it possible to use Java 7 try-with-resources feature:
 try (PDDocument doc = PDDocument.load(outputFile)) {
   // bla-bla
   // no need to call doc.close(); explicitly
 }
 P.S. Actually all org.apache.pdfbox.* classes with method close() could 
 implement java.io.Closeable




Re: community bonding period

2014-04-23 Thread DImuthu Upeksha

Hi Tilman,
Thanks for the information. That helped me a lot. I'll work accordingly.

On Wed, Apr 23, 2014 at 9:14 PM, John Hewson j...@jahewson.com wrote:
Great advice!

-- John

On 23 Apr 2014, at 15:49, Tilman Hausherr thaush...@t-online.de wrote:

Although I'm only mentoring Shaola, maybe some of it is useful for Dimuthu
as well:

https://www.apache.org/foundation/how-it-works.html
The roles are simpler than in that text, all committers here are PMC
members, and the PMC chair (Andreas) is also ASF member.

Note that you can edit in JIRA. See an example here
https://issues.apache.org/jira/browse/PDFBOX-2039
i.e. you can modify previous posts.

Stuff that deals with PDFBOX in general is best in this (publicly readable)
mailing list. The advantage is that others might answer you (if they want)
when I'm working, sleeping, or not on the internet for whatever reason.
Stuff that deals with java, svn and maven - e-mail me if you don't get the
answer within a few minutes from google or from stackoverflow, i.e. don't
waste time searching.

Using other libraries: this is OK as long as they have an Apache license or
a compatible license (GPL is not). However we don't use many libraries,
everything is already big, so if you want, ask first. (Sorry if you already
mentioned a library, will reread your proposal again later) Of course it is
always OK to temporary use whatever you want to just test a theory /
strategy / algorithm.
Using other code: the code should rather be your own, but you can use small
excerpts from stackoverflow.com etc but indicate it in your code with a
link. Always comment in the code if you were inspired by other peoples
code or algorithms or research papers, just look at the existing shading
code for how I did it.

Don't forget the Apache header in new modules.

Your code should work on JDK5, so that we can use it in the 1.8 version too.
So don't use diamond operators, lambda expressions or even String.isEmpty().

IDE: I recommend netbeans but you're free to use your own. Just make sure
that svn (and whatever the hoster will use) and maven are integrated in it,
this will make your life easier.

A personal recommendation from my student days in the 80ies: don't work all
night. Such code was usually found to be poor/worthless after I had the much
needed sleep.

Andreas: correct me if I forgot something.

Tilman

--
Regards

W.Dimuthu Upeksha
Undergraduate

Department of Computer Science And Engineering

University of Moratuwa, Sri Lanka

[jira] [Comment Edited] (PDFBOX-2041) Convert PDF to Image (Strange Color)

2014-04-23 Thread Tilman Hausherr (JIRA)

[
https://issues.apache.org/jira/browse/PDFBOX-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978953#comment-13978953
]

Tilman Hausherr edited comment on PDFBOX-2041 at 4/23/14 9:49 PM:
--

1. The PDF file is corrupt. A look at it with Notepad++ shows %%EOF followed by
trash characters. Deleting everything after the %%EOF marker makes the file much smaller,
518 KB instead of 4.85 MB. How did you get that file?!
2. I am able to render it. Your jpg file looks like it was cut off at some point.
3. The 2.0 version isn't able to open it with the non-sequential parser; the
sequential parser can open it.
4. The 1.8 version renders it fine; the 2.0 version has many glyphs missing,
maybe a duplicate of PDFBOX-2037. I was able to render it with a modified 2.0
version that I use for myself.
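
The manual cleanup described in point 1 can be sketched in code as follows. PdfTrailerTrim is a hypothetical helper written for this illustration and is not part of PDFBox. It assumes the junk follows the last %%EOF marker in the file and does not itself contain that byte sequence; the marker is kept and only the bytes after it are dropped.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox.
class PdfTrailerTrim {
    private static final byte[] EOF_MARKER = {'%', '%', 'E', 'O', 'F'};

    // Returns the index of the last occurrence of marker in data, or -1.
    static int lastIndexOf(byte[] data, byte[] marker) {
        for (int i = data.length - marker.length; i >= 0; i--) {
            boolean match = true;
            for (int j = 0; j < marker.length; j++) {
                if (data[i + j] != marker[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    // Keeps everything up to and including the last %%EOF marker.
    // If no marker is found, the input array is returned unchanged.
    static byte[] trimAfterEof(byte[] data) {
        int pos = lastIndexOf(data, EOF_MARKER);
        if (pos < 0) {
            return data;
        }
        return Arrays.copyOf(data, pos + EOF_MARKER.length);
    }
}
```

Searching from the end is a deliberate choice: a PDF that was saved incrementally contains several %%EOF markers, and cutting at the first one would discard valid updates.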

Convert PDF to Image (Strange Color)

Key: PDFBOX-2041
URL: https://issues.apache.org/jira/browse/PDFBOX-2041
Project: PDFBox
Issue Type: Bug
Components: PDModel
Affects Versions: 1.8.4
Environment: Java(1.7.0_45), OS (Ubuntu)
Reporter: ahfei
Attachments: PDFBOX-2041.pdf, PDFBOX-2041.pdf-1-bad.tif,
pdfbox-2041.pdf-1-good.png

Using PDFBox, tried to convert PDF to Image file (case1.pdf, case1.jpg)
Below is code i'm using :
BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 200);

ImageIOUtil.writeImage(image, jpg, imagePath, BufferedImage.TYPE_INT_RGB,
200);
After convert, this image isn't look like pdf. Half page of it become blue
and black color.
Attached images PDF : https://www.dropbox.com/sh/jevegc8bh09km1o/5XkVwPUxri


[jira] [Commented] (PDFBOX-2042) ColorSpace without Range

2014-04-23 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979011#comment-13979011
 ] 

Tilman Hausherr commented on PDFBOX-2042:
-

It would be helpful if you provided the code that modifies the content stream. I 
couldn't reproduce the problem by just opening the document and saving it. But 
your theory about PDICCBased.getRangeArray(0) makes sense: that call would 
return an empty range array for the 0 parameter.
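
To show how such a theory could play out, here is a purely hypothetical model of a getter with a side effect. DemoIccStream is invented for this illustration and is not PDFBox's PDICCBased; whether the real class behaves this way is exactly what still needs to be confirmed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a stream dictionary; NOT PDFBox's PDICCBased.
class DemoIccStream {
    final Map<String, List<Float>> dict = new HashMap<String, List<Float>>();

    // Lazily creates a default Range with one (min, max) pair per component
    // and stores it in the dictionary as a side effect.
    List<Float> getRangeArray(int n) {
        List<Float> range = dict.get("Range");
        if (range == null) {
            range = new ArrayList<Float>();
            for (int i = 0; i < n; i++) {
                range.add(0f);
                range.add(1f);
            }
            // Side effect: the key now exists, even when n is 0
            // and the stored array is empty.
            dict.put("Range", range);
        }
        return range;
    }
}
```

In this model a single call with n = 0 leaves an empty Range entry behind, which would then be written out on save as /Range [].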

 ColorSpace without Range
 

 Key: PDFBOX-2042
 URL: https://issues.apache.org/jira/browse/PDFBOX-2042
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 2.0.0
Reporter: Juraj Lonc
 Attachments: pdfbox18.pdf, pdfbox20.pdf


 I have PDF document where I am modifying PDPage content stream.
 Saved document is invalid (Adobe reader complains about it).
 I have narrowed it down to ColorSpace. 
 Original document has colorspace:
 /ColorSpace 
 /Cs6 [/ICCBased 
 /Alternate /DeviceRGB
 /Filter /FlateDecode
 /Length 2597
 /N 3
 ]
 Modified document has colorspace:
 /ColorSpace 
 /Cs6 [/ICCBased 
 /Alternate /DeviceRGB
 /Filter /FlateDecode
 /Length 2597
 /N 3
 /Range []
 ]
 When I manually remove /Range [] from PDF then Adobe reader opens it 
 without an error.
 Obviously that range is added by calling PDICCBased.getRangeArray(0) 
 somewhere.




[jira] [Commented] (PDFBOX-2041) Convert PDF to Image (Strange Color)

2014-04-23 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979239#comment-13979239
 ] 

Tilman Hausherr commented on PDFBOX-2041:
-

I didn't mean to remove %%EOF, just everything after it.

Could it be your Ubuntu disk is full?

I don't have Ubuntu, so someone else will have to answer that.

 Convert PDF to Image (Strange Color)
 

 Key: PDFBOX-2041
 URL: https://issues.apache.org/jira/browse/PDFBOX-2041
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 1.8.4
 Environment: Java(1.7.0_45),   OS (Ubuntu) 
Reporter: ahfei
 Attachments: PDFBOX-2041.pdf, PDFBOX-2041.pdf-1-bad.tif, 
 pdfbox-2041.pdf-1-good.png


 Using PDFBox, I tried to convert a PDF to an image file (case1.pdf, case1.jpg).
 Below is the code I'm using:
 BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 200);
 ImageIOUtil.writeImage(image, "jpg", imagePath, BufferedImage.TYPE_INT_RGB, 
 200);
 After conversion, the image doesn't look like the PDF. Half of the page has turned blue 
 and black. 
 Images and PDF attached: https://www.dropbox.com/sh/jevegc8bh09km1o/5XkVwPUxri 



