Re: Reopen PDFBOX-483?

2010-03-08 Thread Maruan Sahyoun
Hi Andreas,

I can run a test on our Windows test server (Windows 2003, 32-bit) and let you
know the results around lunchtime (German time), if that helps.

Maruan Sahyoun

On 09.03.2010 at 08:11, Andreas Lehmkuehler wrote:

> Hi,
>
> steve poling wrote:
>> Andreas Lehmkuehler wrote:
>>>> If you go to PDFBOX-490, you'll find the attached file filled.pdf, which
>>>> manifests this error, but I've been seeing this with a lot of different
>>>> PDFs: display looks good, print looks bad. I can attach another file to
>>>> PDFBOX-483 if you'd like.
>>> I've tried that pdf and it works like a charm except for some misplaced
>>> characters. I'm using Ubuntu Linux, Java 1.6.0_15 32-bit and an HP Laserjet
>>> 2550N.
>>> I've made another test on my MacBook (Mac OS X 10.6, JDK 1.6.0_17 64-bit,
>>> same printer) and it works well too.
>> I'd like to know if anyone has repeated the experiment on any Windows-based
>> platform, since Ubuntu and OSX are both Unix-based. If someone else can
>> reproduce the failure on Windows, I'll start trusting my sanity again.
> I've been a software developer for many years, and sometimes it leads to
> insanity, but we all have to do our best not to end up in the programmers'
> nuthouse ;-))
>
> I'll see if I can find some time to run that test on my rarely used Windows
> box.
>
> BR
> Andreas Lehmkühler
> 



Re: Reopen PDFBOX-483?

2010-03-08 Thread Andreas Lehmkuehler

Hi,

steve poling wrote:

> Andreas Lehmkuehler wrote:
>>> If you go to PDFBOX-490, you'll find the attached file filled.pdf, which
>>> manifests this error, but I've been seeing this with a lot of different
>>> PDFs: display looks good, print looks bad. I can attach another file to
>>> PDFBOX-483 if you'd like.
>> I've tried that pdf and it works like a charm except for some
>> misplaced characters. I'm using Ubuntu Linux, Java 1.6.0_15 32-bit and
>> an HP Laserjet 2550N.
>> I've made another test on my MacBook (Mac OS X 10.6, JDK 1.6.0_17
>> 64-bit, same printer) and it works well too.
>
> I'd like to know if anyone has repeated the experiment on any
> Windows-based platform, since Ubuntu and OSX are both Unix-based. If
> someone else can reproduce the failure on Windows, I'll start trusting
> my sanity again.

I've been a software developer for many years, and sometimes it leads to
insanity, but we all have to do our best not to end up in the programmers'
nuthouse ;-))

I'll see if I can find some time to run that test on my rarely used Windows box.

BR
Andreas Lehmkühler



Re: pdfbox develpment

2010-03-08 Thread Maruan Sahyoun
Hi,

We were planning to start by fixing some of the open issues, but we could
instead write some small tutorials for common tasks such as text extraction,
form handling and highlighting.

What do you think?

Kind regards

Maruan Sahyoun

On 09.03.2010 at 07:58, Andreas Lehmkuehler wrote:

> Hi,
>
> Michael Müller wrote:
>> Daniel,
>> Yes, I found some activity on the lists. But the project site lists
>> neither developers nor committers. Just missing documentation? ;-)
>> Great to hear that this project is alive.
>> I have big problems using it, due to missing or vague docs.
>> E.g. setTextMatrix:
>> public void setTextMatrix(double a, double b, double c, double d, double
>> e, double f)
>> What are a, b, c, d, e and f? I figured out that e and f are coordinates. It
>> would be much better to name them x and y, or to improve the documentation.
> These values correspond to the naming used in the PDF reference for a matrix.
>
>> Maybe improving the documentation is an entry point for me to support the
>> project? Or do any docs exist besides the published Javadocs?
> Be our guest; good and complete documentation is always useful, especially
> for beginners.
> 
> BR
> Andreas Lehmkühler



[jira] Assigned: (PDFBOX-651) Team list should be filled out or deleted ... it confuses users now

2010-03-08 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler reassigned PDFBOX-651:
-

Assignee: Andreas Lehmkühler

> Team list should be filled out or deleted ... it confuses users now
> ---
>
> Key: PDFBOX-651
> URL: https://issues.apache.org/jira/browse/PDFBOX-651
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Ted Dunning
>Assignee: Andreas Lehmkühler
>
> http://pdfbox.apache.org/team-list.html says that the project has no 
> developers nor committers.  This is very not true and should be fixed.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: pdfbox develpment

2010-03-08 Thread Andreas Lehmkuehler

Hi,

Michael Müller wrote:

> Daniel,
>
> Yes, I found some activity on the lists. But the project site lists
> neither developers nor committers. Just missing documentation? ;-)
>
> Great to hear that this project is alive.
>
> I have big problems using it, due to missing or vague docs.
>
> E.g. setTextMatrix:
> public void setTextMatrix(double a, double b, double c, double d, double
> e, double f)
> What are a, b, c, d, e and f? I figured out that e and f are coordinates. It
> would be much better to name them x and y, or to improve the documentation.

These values correspond to the naming used in the PDF reference for a matrix.
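
To make the a to f naming concrete: the PDF reference writes a transformation matrix as six numbers [a b c d e f], which map a point (x, y) to (a*x + c*y + e, b*x + d*y + f). The following is only a minimal sketch using the JDK's java.awt.geom.AffineTransform, whose six-argument constructor takes its values in the same order. The class name TextMatrixDemo, the helper apply and the sample numbers are assumptions made for illustration; none of this is PDFBox code.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class TextMatrixDemo {

    /** Applies the PDF-style matrix [a b c d e f] to the point (x, y). */
    public static Point2D apply(double a, double b, double c, double d,
                                double e, double f, double x, double y) {
        // AffineTransform(m00, m10, m01, m11, m02, m12) uses the same operand
        // order as PDF: x' = a*x + c*y + e and y' = b*x + d*y + f.
        AffineTransform matrix = new AffineTransform(a, b, c, d, e, f);
        return matrix.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        // a = d = 1, b = c = 0: no scaling or rotation, so e and f act as
        // the x and y offsets of the text origin.
        System.out.println(apply(1, 0, 0, 1, 100, 700, 0, 0));
        // a and d scale x and y; here both are doubled before translating.
        System.out.println(apply(2, 0, 0, 2, 100, 700, 10, 10));
        // b and c carry rotation/shear; this is a 90 degree rotation.
        System.out.println(apply(0, 1, -1, 0, 0, 0, 1, 0));
    }
}
```

So e and f are indeed coordinates (the translation), while a, b, c and d hold scaling, rotation and shear, which is why plain x and y names would not cover all six parameters.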


> Maybe improving the documentation is an entry point for me to support the
> project? Or do any docs exist besides the published Javadocs?

Be our guest; good and complete documentation is always useful, especially
for beginners.

BR
Andreas Lehmkühler


Re: pdfbox develpment

2010-03-08 Thread Ted Dunning
OK.  I filed a JIRA.  I would have put myself into the page, but I don't
know how and, more importantly, I am not a committer nor even really a
contributor just yet.

Real committers should speak up!

https://issues.apache.org/jira/browse/PDFBOX-651


On Mon, Mar 8, 2010 at 9:58 PM, Michael Müller <
michael.muel...@mueller-bruehl.de> wrote:

> http://pdfbox.apache.org/team-list.html
>
> On 09.03.2010 05:44, Ted Dunning wrote:
> > Which project site were you looking at?
> >
> > On Mon, Mar 8, 2010 at 1:51 PM, Michael Müller <
> > michael.muel...@mueller-bruehl.de> wrote:
> >
> >> But the project site lists
> >> neither developers nor committers. Just missing documentation? ;-)
> >>
> >
>


[jira] Created: (PDFBOX-651) Team list should be filled out or deleted ... it confuses users now

2010-03-08 Thread Ted Dunning (JIRA)
Team list should be filled out or deleted ... it confuses users now
---

 Key: PDFBOX-651
 URL: https://issues.apache.org/jira/browse/PDFBOX-651
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Ted Dunning



http://pdfbox.apache.org/team-list.html says that the project has no developers 
nor committers.  This is very not true and should be fixed.  




Re: pdfbox develpment

2010-03-08 Thread Michael Müller
Hmm, I know that one. But a hint on how to open a PDF is hardly full
documentation! True, documenting is hard. I, too, prefer coding to
writing docs...

Michael

On 08.03.2010 23:07, Daniel Wilson wrote:
> There is the user guide:
> http://pdfbox.apache.org/userguide/faq.html
> 
> Then there's the Javadoc stuff ... and there's the PDF reference that Adobe
> provides:
> http://www.adobe.com/devnet/pdf/
> 
> In many cases, function and parameter names have some reference to the PDF
> reference.  I don't know about setTextMatrix.
> 
> I'm sure some more documentation would be appreciated.
> 
> Daniel
> 
> On Mon, Mar 8, 2010 at 4:51 PM, Michael Müller <
> michael.muel...@mueller-bruehl.de> wrote:
> 
>> Daniel,
>>
>> Yes, I found some activity on the lists. But the project site lists
>> neither developers nor committers. Just missing documentation? ;-)
>>
>> Great to hear that this project is alive.
>>
>> I have big problems using it, due to missing or vague docs.
>>
>> E.g. setTextMatrix:
>> public void setTextMatrix(double a, double b, double c, double d, double
>> e, double f)
>> What are a, b, c, d, e and f? I figured out that e and f are coordinates. It
>> would be much better to name them x and y, or to improve the documentation.
>> Maybe improving the documentation is an entry point for me to support the
>> project? Or do any docs exist besides the published Javadocs?
>>
>> Best,
>> Michael
>>
>>
>>
>> On 08.03.2010 22:30, Daniel Wilson wrote:
>>> No, not dead at all!  If you'll have a look at
>>> http://issues.apache.org/jira/browse/PDFBOX you'll see that we are
>>> updating and resolving issues actively.
>>>
>>> Most of the committers have started by adding patches.
>>>
>>> So if you can solve some problems, enhance capabilities, etc., that's
>>> great!
>>>
>>> Daniel
>>>
>>> On Mon, Mar 8, 2010 at 4:13 PM, Michael Müller <
>>> michael.muel...@mueller-bruehl.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> I just started using PDFBox and would like to help develop it.
>>>> But the project team list shows neither members nor contributors.
>>>> Has development stopped? Is PDFBox dead? :(
>>>>
>>>> Michael

>>>
>>
> 


Re: pdfbox develpment

2010-03-08 Thread Michael Müller
http://pdfbox.apache.org/team-list.html

On 09.03.2010 05:44, Ted Dunning wrote:
> Which project site were you looking at?
> 
> On Mon, Mar 8, 2010 at 1:51 PM, Michael Müller <
> michael.muel...@mueller-bruehl.de> wrote:
> 
>> But the project site lists
>> neither developers nor committers. Just missing documentation? ;-)
>>
> 


Re: Reopen PDFBOX-483?

2010-03-08 Thread steve poling

Andreas Lehmkuehler wrote:
>> If you go to PDFBOX-490, you'll find the attached file filled.pdf, which
>> manifests this error, but I've been seeing this with a lot of different
>> PDFs: display looks good, print looks bad. I can attach another file to
>> PDFBOX-483 if you'd like.
> I've tried that pdf and it works like a charm except for some
> misplaced characters. I'm using Ubuntu Linux, Java 1.6.0_15 32-bit and
> an HP Laserjet 2550N.
> I've made another test on my MacBook (Mac OS X 10.6, JDK 1.6.0_17
> 64-bit, same printer) and it works well too.


I'd like to know if anyone has repeated the experiment on any
Windows-based platform, since Ubuntu and OSX are both Unix-based. If
someone else can reproduce the failure on Windows, I'll start trusting
my sanity again.


Re: pdfbox develpment

2010-03-08 Thread Ted Dunning
Which project site were you looking at?

On Mon, Mar 8, 2010 at 1:51 PM, Michael Müller <
michael.muel...@mueller-bruehl.de> wrote:

> But the project site lists
> neither developers nor committers. Just missing documentation? ;-)
>


[jira] Updated: (PDFBOX-553) writing pdf file in Japanese, garbled

2010-03-08 Thread Jacky Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Yang updated PDFBOX-553:
--

  Priority: Critical  (was: Major)
Issue Type: Improvement  (was: Bug)

> writing pdf file in Japanese, garbled
> -
>
> Key: PDFBOX-553
> URL: https://issues.apache.org/jira/browse/PDFBOX-553
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 0.8.0-incubator
> Environment: windows server2003
>Reporter: Jacky Yang
>Priority: Critical
> Attachments: helloFont.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Using PDFBox to write a PDF file in Japanese produces a garbled file.




Re: pdfbox develpment

2010-03-08 Thread Daniel Wilson
There is the user guide:
http://pdfbox.apache.org/userguide/faq.html

Then there's the Javadoc stuff ... and there's the PDF reference that Adobe
provides:
http://www.adobe.com/devnet/pdf/

In many cases, function and parameter names have some reference to the PDF
reference.  I don't know about setTextMatrix.

I'm sure some more documentation would be appreciated.

Daniel

On Mon, Mar 8, 2010 at 4:51 PM, Michael Müller <
michael.muel...@mueller-bruehl.de> wrote:

> Daniel,
>
> Yes, I found some activity on the lists. But the project site lists
> neither developers nor committers. Just missing documentation? ;-)
>
> Great to hear that this project is alive.
>
> I have big problems using it, due to missing or vague docs.
>
> E.g. setTextMatrix:
> public void setTextMatrix(double a, double b, double c, double d, double
> e, double f)
> What are a, b, c, d, e and f? I figured out that e and f are coordinates. It
> would be much better to name them x and y, or to improve the documentation.
> Maybe improving the documentation is an entry point for me to support the
> project? Or do any docs exist besides the published Javadocs?
>
> Best,
> Michael
>
>
>
> On 08.03.2010 22:30, Daniel Wilson wrote:
> > No, not dead at all!  If you'll have a look at
> > http://issues.apache.org/jira/browse/PDFBOX you'll see that we are
> > updating and resolving issues actively.
> >
> > Most of the committers have started by adding patches.
> >
> > So if you can solve some problems, enhance capabilities, etc., that's
> > great!
> >
> > Daniel
> >
> > On Mon, Mar 8, 2010 at 4:13 PM, Michael Müller <
> > michael.muel...@mueller-bruehl.de> wrote:
> >
> >> Hi,
> >>
> >> I just started using PDFBox and would like to help develop it.
> >> But the project team list shows neither members nor contributors.
> >> Has development stopped? Is PDFBox dead? :(
> >>
> >> Michael
> >>
> >
>


Re: pdfbox develpment

2010-03-08 Thread Michael Müller
Daniel,

Yes, I found some activity on the lists. But the project site lists
neither developers nor committers. Just missing documentation? ;-)

Great to hear that this project is alive.

I have big problems using it, due to missing or vague docs.

E.g. setTextMatrix:
public void setTextMatrix(double a, double b, double c, double d, double
e, double f)
What are a, b, c, d, e and f? I figured out that e and f are coordinates. It
would be much better to name them x and y, or to improve the documentation.
Maybe improving the documentation is an entry point for me to support the
project? Or do any docs exist besides the published Javadocs?

Best,
Michael



On 08.03.2010 22:30, Daniel Wilson wrote:
> No, not dead at all!  If you'll have a look at
> http://issues.apache.org/jira/browse/PDFBOX you'll see that we are updating
> and resolving issues actively.
> 
> Most of the committers have started by adding patches.
> 
> So if you can solve some problems, enhance capabilities, etc., that's great!
> 
> Daniel
> 
> On Mon, Mar 8, 2010 at 4:13 PM, Michael Müller <
> michael.muel...@mueller-bruehl.de> wrote:
> 
>> Hi,
>>
>> I just started using PDFBox and would like to help develop it.
>> But the project team list shows neither members nor contributors.
>> Has development stopped? Is PDFBox dead? :(
>>
>> Michael
>>
> 


Re: pdfbox develpment

2010-03-08 Thread Daniel Wilson
No, not dead at all!  If you'll have a look at
http://issues.apache.org/jira/browse/PDFBOX you'll see that we are updating
and resolving issues actively.

Most of the committers have started by adding patches.

So if you can solve some problems, enhance capabilities, etc., that's great!

Daniel

On Mon, Mar 8, 2010 at 4:13 PM, Michael Müller <
michael.muel...@mueller-bruehl.de> wrote:

> Hi,
>
> I just started using PDFBox and would like to help develop it.
> But the project team list shows neither members nor contributors.
> Has development stopped? Is PDFBox dead? :(
>
> Michael
>


pdfbox develpment

2010-03-08 Thread Michael Müller
Hi,

I just started using PDFBox and would like to help develop it.
But the project team list shows neither members nor contributors.
Has development stopped? Is PDFBox dead? :(

Michael


Re: Reopen PDFBOX-483?

2010-03-08 Thread Daniel Wilson
These are not in ascending difficulty ... but the files we currently test
for rendering are at trunk\src\test\resources\input\rendering

Daniel

On Sun, Mar 7, 2010 at 11:11 PM, steve poling wrote:

> Andreas,
>
>>> I archived all my code changes and retrieved the latest sources from svn.
>>> So, I should be running the latest and greatest code.
>>>
>>> If you go to PDFBOX-490, you'll find the attached file filled.pdf, which
>>> manifests this error, but I've been seeing this with a lot of different
>>> PDFs: display looks good, print looks bad. I can attach another file to
>>> PDFBOX-483 <https://issues.apache.org/jira/browse/PDFBOX-483> if you'd
>>> like.
>>>
>> I've tried that pdf and it works like a charm except for some misplaced
>> characters. I'm using Ubuntu Linux, Java 1.6.0_15 32-bit and an HP Laserjet
>> 2550N.
>>
>> Correct me if I'm wrong, but your Laserjet 4P is quite an old model, isn't
>> it? Did you update the driver? Does it have a PostScript option or is it a
>> PCL-only model? How much memory is installed in the printer? That could
>> be a potential bottleneck.
>>
>
> My printer is as old as the hills.  However, I've been seeing this
> issue when I print to other devices, such as the Canon iR 5570. I'll perform
> my experiment on all the different printers I can find. If memory serves
> (and sometimes it doesn't) I ran the experiment on the HP 990cxi, too (with
> the same effect). I am using Windows XP. So, that's a difference (between
> working for you and breaking for me). Keep in mind, the copy on the display
> looks grand, it's just the printed output that looks bad. I hate working
> bugs that require killing trees to reproduce.
>
>
>>> Can you point me to where I should look to see why text appears on screen,
>>> but not on paper?
>>>
>> Hmmm, good question. Perhaps you should try to find differences between
>> working documents and those which don't work. Or you may find something
>> that the documents which don't work have in common, e.g. font type
>> (TrueType, Type 1, etc.), whether fonts are embedded, font encoding (ANSI,
>> CID, Identity-H, etc.), whether an AcroForm is included...
>>
>
> I'll give that a go. Can you point me to a test repository of PDF files of
> ascending difficulty? Does such a thing exist or would it be worthwhile to
> create one?
>
> Thanks again,
>
> steve
>


[jira] Updated: (PDFBOX-515) The handle is invalid when merging 2 pdfs from different pdf generators

2010-03-08 Thread Ernst Eibensteiner (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ernst Eibensteiner updated PDFBOX-515:
--

Attachment: invalid handle.patch

Please find attached the patch mentioned above.

> The handle is invalid when merging 2 pdfs from different pdf generators
> ---
>
> Key: PDFBOX-515
> URL: https://issues.apache.org/jira/browse/PDFBOX-515
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 0.7.3
> Environment: Windows 2003 SP2; X86; java version 1.6.0_13
>Reporter: Ernst Eibensteiner
> Attachments: invalid handle.patch, pdfboxpdfs.zip
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> If I try to merge 2 PDFs using PDFMerger.java that have been created with 2
> different PDF generators, an exception is thrown:
> Exception in thread "main" org.pdfbox.exceptions.COSVisitorException: The handle is invalid
>     at org.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:953)
>     at org.pdfbox.cos.COSStream.accept(COSStream.java:215)
>     at org.pdfbox.cos.COSObject.accept(COSObject.java:220)
>     at org.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:444)
>     at org.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:375)
>     at org.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:782)
>     at org.pdfbox.cos.COSDocument.accept(COSDocument.java:388)
>     at org.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1084)
>     at org.pdfbox.pdmodel.PDDocument.save(PDDocument.java:740)
>     at org.pdfbox.pdmodel.PDDocument.save(PDDocument.java:721)
>     at org.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:158)
>     at org.pdfbox.PDFMerger.merge(PDFMerger.java:78)
>     at org.pdfbox.PDFMerger.main(PDFMerger.java:54)
> java.io.IOException: The handle is invalid
>     at java.io.RandomAccessFile.seek(Native Method)
>     at org.pdfbox.io.RandomAccessFile.seek(RandomAccessFile.java:73)
>     at org.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:110)
>     at java.io.BufferedInputStream.fill(Unknown Source)
>     at java.io.BufferedInputStream.read1(Unknown Source)
>     at java.io.BufferedInputStream.read(Unknown Source)
>     at org.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:940)
>     at org.pdfbox.cos.COSStream.accept(COSStream.java:215)
>     at org.pdfbox.cos.COSObject.accept(COSObject.java:220)
>     at org.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:444)
>     at org.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:375)
>     at org.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:782)
>     at org.pdfbox.cos.COSDocument.accept(COSDocument.java:388)
>     at org.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1084)
>     at org.pdfbox.pdmodel.PDDocument.save(PDDocument.java:740)
>     at org.pdfbox.pdmodel.PDDocument.save(PDDocument.java:721)
>     at org.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:158)
>     at org.pdfbox.PDFMerger.merge(PDFMerger.java:78)
>     at org.pdfbox.PDFMerger.main(PDFMerger.java:54)
> But if I merge PDFs from the same generator, everything works fine.
> I have uploaded 4 PDFs for testing purposes to:
> http://servicedesk.fabasoft.com/download/pdfboxpdfs.zip
> PDFMerger C:\Ghostscript1.pdf C:\Ghostscript2.pdf result.pdf works fine
> PDFMerger C:\ComSquare1.pdf C:\ComSquare2.pdf result.pdf works fine
> -
> PDFMerger C:\Ghostscript1.pdf C:\ComSquare1.pdf result.pdf does not work
> PDFMerger C:\Ghostscript2.pdf C:\ComSquare2.pdf result.pdf does not work
> PDFMerger C:\Ghostscript1.pdf C:\ComSquare2.pdf result.pdf does not work
> PDFMerger C:\Ghostscript2.pdf C:\ComSquare1.pdf result.pdf does not work




[jira] Updated: (PDFBOX-515) The handle is invalid when merging 2 pdfs from different pdf generators

2010-03-08 Thread Ernst Eibensteiner (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ernst Eibensteiner updated PDFBOX-515:
--


Following issue http://issues.apache.org/jira/browse/PDFBOX-497, I tried
closing the source files only after closing the destination file in
PDFMergerUtility.java.
To avoid unclosed file handles, I decided to add each source file to a vector
and close them all afterwards in a loop.
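
The idea described above can be sketched independently of PDFBox: keep every opened source in a list, write the destination while the sources are still open, and only then close them one by one. This is a minimal sketch and not the attached patch; the class DeferredCloseDemo, the method writeThenClose and the use of plain java.io.Closeable objects in place of PDF documents are assumptions made for illustration.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class DeferredCloseDemo {

    /**
     * Runs writeDestination while all sources are still open, then closes
     * every source in order, even if writing or an earlier close fails.
     */
    public static void writeThenClose(List<? extends Closeable> sources,
                                      Runnable writeDestination) {
        try {
            writeDestination.run(); // may still read from the open sources
        } finally {
            IOException first = null;
            for (Closeable source : sources) {
                try {
                    source.close();
                } catch (IOException e) {
                    if (first == null) {
                        first = e; // remember it, but keep closing the rest
                    }
                }
            }
            if (first != null) {
                throw new UncheckedIOException(first);
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Closeable> sources = new ArrayList<>();
        sources.add(() -> log.add("close source 1"));
        sources.add(() -> log.add("close source 2"));
        writeThenClose(sources, () -> log.add("write destination"));
        System.out.println(log);
    }
}
```

Closing in a finally block is a design choice of this sketch: it keeps handles from leaking when the write itself fails, which is the situation the vector-and-loop approach is meant to avoid.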

Please find the patch attached. I would appreciate it if you could validate it.
Thanks

> The handle is invalid when merging 2 pdfs from different pdf generators
> ---
>
> Key: PDFBOX-515
> URL: https://issues.apache.org/jira/browse/PDFBOX-515
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 0.7.3
> Environment: Windows 2003 SP2; X86; java version 1.6.0_13
>Reporter: Ernst Eibensteiner
> Attachments: pdfboxpdfs.zip
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h




[jira] Updated: (PDFBOX-7) extract information from tagged PDF

2010-03-08 Thread Johannes Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johannes Koch updated PDFBOX-7:
---

Attachment: PDFBOX-7_patch_04.txt

I created patch PDFBOX-7_patch_04.txt for the current trunk. I hope it works.

> extract information from tagged PDF
> ---
>
> Key: PDFBOX-7
> URL: https://issues.apache.org/jira/browse/PDFBOX-7
> Project: PDFBox
>  Issue Type: New Feature
>  Components: PDModel
> Attachments: PDFBOX-7_patch_00.txt, PDFBOX-7_patch_01.txt, 
> PDFBOX-7_patch_02.txt, PDFBOX-7_patch_03.txt, PDFBOX-7_patch_04.txt, 
> PDFMarkedContentExtractor.properties
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552835&aid=805623
> Originally submitted by benlitchfield on 2003-09-13 07:38.
> Add the ability to extract information from a tagged PDF 
> document.  See taggedPDF.pdf for an example.
> [comment on SourceForge]
> Originally sent by qumar.
> Logged In: YES 
> user_id=1468838
> Hi,
> we have to:
> - parse the PDF object structure tree; all structural elements are inside
> the object tree (see e.g. PDF Reference 1.4, chapter 9.6 "Logical
> Structure");
> - parse the PDF page streams to extract drawing and text operations; these
> contain the actual content of the structural elements. This content is
> surrounded by BMC/EMC tags, which identify the element object the enclosed
> content belongs to.
> This is what I got from the PDF reference.
> Regards,
> Qumar.
> [comment on SourceForge]
> Originally sent by benlitchfield.
> Logged In: YES 
> user_id=601708
> http://www.irs.gov/pub/irs-access/f1040ez_accessible.pdf
> would be a good form to start with.
> If you notice they are putting labels on the form fields.  
> these labels contain meta data critical to building tax 
> software in rapid fashion.  Without this meta data, the 
> name of the form field is meaningless. It would be nice to 
> extract this information so I can combine it with other 
> data about the field (name, type, location, etc).  I 
> already know PDFBox can extract the other information about 
> the fields.  I haven't done it with PDFBox, but I did it 
> with iText.
> [comment on SourceForge]
> Originally sent by benlitchfield.
> Logged In: YES 
> user_id=601708
> More comments from users
> Tagged PDF will be a big thing in government because 
> federal government procurement of Acrobat publishing 
> technology falls under Section 508.  States will likely 
> follow.
>  see:
> www.section508.gov
> http://www.irs.gov/pub/irs-access/
> or
> ftp://ftp.irs.gov/pub/irs-access/
> [comment on SourceForge]
> Originally sent by qumar.
> Logged In: YES 
> user_id=1468838
> Hi,
> I was reading the PDF specification and learned that the structure
> information of a PDF lives in the PDSEdit layer. The PDSEdit layer gives
> access to the structure tree within a PDF, and its methods and objects are
> prefixed with PDS. So how can we get access to the PDSEdit layer of a PDF?
> [comment on SourceForge]
> Originally sent by qumar.
> Logged In: YES
> user_id=1468838
> It would be nice if PDFBox could provide the ability to extract information
> from tagged PDFs. Just as Adobe Acrobat Reader shows the tags of a PDF,
> PDFBox should also be able to read them.
> For example, suppose we have a PDF file with para1 under header1, para2
> under header2, and a table with rows and columns, something like:
>
> Header1
> This is a para 1, it describes a disease.
> Header2
> This is a para2, it describes remedies of the disease.
> Table
> A B
> C D
>
> Now the tagged PDF looks like below in Adobe Acrobat Reader:
>
> Header1
> This is a para 1, it describes a disease.
> Header1
> This is a para2, it describes remedies of the disease.
> Table
> A
> B
> C
> D
>
> How can we extract Heading1, Heading2 and the tabular data using PDFBox?
> This is a good feature which should be added to PDFBox's armory.
> Please provide this feature.
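
The first SourceForge comment above describes content that is bracketed by BMC/EMC operators in the page stream. As a rough sketch of that idea, the following tracks the nesting of marked-content sequences in an already decoded content stream and reports which tag encloses each shown string. It splits on whitespace only, so it is not a real PDF tokenizer (strings containing spaces would break it), it also accepts BDC, the variant of BMC that carries a property list, and the names MarkedContentSketch and collect are hypothetical rather than PDFBox API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class MarkedContentSketch {

    /** Returns "tag: string" for every Tj operand shown inside marked content. */
    public static List<String> collect(String contentStream) {
        List<String> found = new ArrayList<>();
        Deque<String> openTags = new ArrayDeque<>(); // stack of open BMC/BDC tags
        List<String> operands = new ArrayList<>();   // operands since the last operator
        for (String token : contentStream.trim().split("\\s+")) {
            // Simplification: letters-only tokens are treated as operators,
            // everything else (names, numbers, strings, dictionaries) as operands.
            if (!token.matches("[A-Za-z'\"*]+")) {
                operands.add(token);
                continue;
            }
            if (token.equals("BMC") || token.equals("BDC")) {
                // The tag name is the first operand of both operators.
                openTags.push(operands.isEmpty() ? "?" : operands.get(0));
            } else if (token.equals("EMC")) {
                if (!openTags.isEmpty()) {
                    openTags.pop();
                }
            } else if (token.equals("Tj") && !openTags.isEmpty() && !operands.isEmpty()) {
                found.add(openTags.peek() + ": " + operands.get(0));
            }
            operands.clear();
        }
        return found;
    }

    public static void main(String[] args) {
        String stream = "/H1 BMC BT /F1 12 Tf (Header1) Tj ET EMC "
                + "/P <</MCID 0>> BDC BT (Para1) Tj ET EMC";
        System.out.println(collect(stream));
    }
}
```

Mapping the collected tags back to headings, paragraphs and table cells would additionally require the structure tree mentioned in the same comment; the sketch covers only the page-stream half.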




Re: How can we help

2010-03-08 Thread Jin Mingjian
Great news :) How about the Asian charset problems, as in bug
PDFBOX-420 [1]?

[1] http://issues.apache.org/jira/browse/PDFBOX-420


2010/3/8 Andreas Lehmkühler 

> Hi Maruan,
>
> Subject: How can we help
> Sent: Mon, 8 Mar 2010
> From: Maruan Sahyoun
>
> > Dear PDFBox developers,
> >
> > we are a small Germany-based consulting/implementation company working in
> > the area of electronic documents. PDF is a key technology in our projects.
> > We are an Adobe partner for their server products (Adobe LiveCycle) and
> > have worked in the past with libraries like iText, pdflib and pdfnet.sdk
> > in addition to PDFBox.
> >
> > We would like to commit some resources to help develop PDFBox further.
> > What are the areas we should look into?
> We appreciate your offer to help us. There are a lot of areas (font
> support, encoding, printing, performance etc.) to look into. I suggest
> starting at JIRA [1] to get an overview of all the issues already filed. As
> I saw on your website, you are familiar with PDFBox, so I guess you already
> know which features need to be improved or are simply missing.
>
> > With kind regards
> >
> > Maruan Sahyoun
> Thanks for your offer, we are looking forward to your contributions.
>
> BR
> Andreas Lehmkühler
>
> [1] https://issues.apache.org/jira/browse/PDFBOX
>


[jira] Closed: (PDFBOX-640) Add getter/setter for alternate field name (TU) to PDField

2010-03-08 Thread Johannes Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johannes Koch closed PDFBOX-640.



Thanks

> Add getter/setter for alternate field name (TU) to PDField
> --
>
> Key: PDFBOX-640
> URL: https://issues.apache.org/jira/browse/PDFBOX-640
> Project: PDFBox
>  Issue Type: New Feature
>  Components: PDModel.AcroForm
>Reporter: Johannes Koch
> Fix For: 1.1.0
>
> Attachments: PDFBox-640_patch_00.txt
>
>
> Add getter/setter for alternate field name (TU) to PDField. Patch coming soon.




[jira] Closed: (PDFBOX-628) Too many detours in COSDictionary convenience methods

2010-03-08 Thread Johannes Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johannes Koch closed PDFBOX-628.



Thanks

> Too many detours in COSDictionary convenience methods
> -
>
> Key: PDFBOX-628
> URL: https://issues.apache.org/jira/browse/PDFBOX-628
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Reporter: Johannes Koch
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: PDFBox-628_patch_00.txt
>
>
> I think there are too many detours in some of the COSDictionary convenience
> methods. E.g.
> getInt( COSName key )
>   -> getInt( COSName key, int defaultValue )
>        // create String from COSName
>     -> getInt( String key, int defaultValue )
>       -> getInt( String[] keyList, int defaultValue )
>         -> getDictionaryObject( String[] keyList )
>              // create COSName from String
>           -> getDictionaryObject( COSName key )
> Wouldn't it be easier to just do the following?
> getInt( COSName key )
>   -> getDictionaryObject( COSName key )
> Same with getLong(COSName):
> getLong( COSName key )
>   -> getLong( COSName key, long defaultValue )
>     -> getLong( String key, long defaultValue )
>       -> getLong( String[] keyList, long defaultValue )
>         -> getDictionaryObject( String[] keyList )
>           -> getDictionaryObject( COSName key )
> This could be reduced to:
> getLong( COSName key )
>   -> getDictionaryObject( COSName key )
> getFloat(COSName) has only one detour:
> getFloat( COSName key )
>   -> getFloat( COSName key, float defaultValue )
>     -> getDictionaryObject( COSName key )
> This could be reduced to:
> getFloat( COSName key )
>   -> getDictionaryObject( COSName key )
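
The difference between the delegating chain and the direct lookup proposed above can be sketched with a stand-in class. MiniDictionary, its String keys and the conversions counter are hypothetical and far simpler than the real COSDictionary; the sketch only illustrates that both paths return the same value while the direct one skips the intermediate key handling.

```java
import java.util.HashMap;
import java.util.Map;

public class MiniDictionary {
    private final Map<String, Object> items = new HashMap<>();

    /** Counts the extra key conversions performed on the detour path. */
    public int conversions = 0;

    public void setItem(String key, Object value) {
        items.put(key, value);
    }

    /** Detour path: wraps the key in a key list and walks it, as in the chain above. */
    public int getIntViaKeyList(String key, int defaultValue) {
        String[] keyList = { key };
        conversions++; // stands in for the COSName -> String -> COSName round trip
        for (String candidate : keyList) {
            Object value = items.get(candidate);
            if (value instanceof Number) {
                return ((Number) value).intValue();
            }
        }
        return defaultValue;
    }

    /** Direct path: a single map lookup, as proposed in the issue. */
    public int getIntDirect(String key, int defaultValue) {
        Object value = items.get(key);
        return value instanceof Number ? ((Number) value).intValue() : defaultValue;
    }

    public static void main(String[] args) {
        MiniDictionary dictionary = new MiniDictionary();
        dictionary.setItem("Length", 42);
        System.out.println(dictionary.getIntViaKeyList("Length", -1)
                == dictionary.getIntDirect("Length", -1));
    }
}
```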
