Re: [jira] Commented: (PDFBOX-688) Refactoring rendering-related classes/methods for extensibility

2010-04-12 Thread Daniel Wilson
What I have done, I consider a step in the right direction, but you may have
some better code.  I do not develop much in Java, so sometimes I do things
in ways that are not that elegant.

I skipped the Type3 fonts in what I did.  I saw what was going on & decided
I had no idea how to handle it!

As for drawString vs TextLayout.getOutline, I really don't know.
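
For anyone following along, here is roughly what the two options look like in plain Java2D. This is only a sketch under my own assumptions, not PDFBox code: the names TextModeSketch and drawText are made up, the mode numbers are the PDF text rendering modes 0 to 7, and it ignores Type3 fonts and the detail that in PDF the clipping modes only take effect at the end of the text object.

```java
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.font.TextLayout;
import java.awt.geom.AffineTransform;

/** Hypothetical helper, for illustration only; not part of PDFBox. */
public class TextModeSketch {

    // PDF text rendering modes: 0 fill, 1 stroke, 2 fill+stroke, 3 invisible,
    // 4..7 are the same four again, but also adding the glyphs to the clip path.
    public static boolean fills(int mode)   { return mode == 0 || mode == 2 || mode == 4 || mode == 6; }
    public static boolean strokes(int mode) { return mode == 1 || mode == 2 || mode == 5 || mode == 6; }
    public static boolean clips(int mode)   { return mode >= 4 && mode <= 7; }

    public static void drawText(Graphics2D g, Font font, String text,
                                float x, float y, int mode) {
        if (text.isEmpty()) {
            return; // TextLayout rejects zero-length strings
        }
        if (mode == 0) {
            // Fast path: plain filled text, stays a text object for the target device.
            g.setFont(font);
            g.drawString(text, x, y);
            return;
        }
        // Every other mode needs the glyph outlines as a Shape.
        TextLayout layout = new TextLayout(text, font, g.getFontRenderContext());
        Shape outline = layout.getOutline(AffineTransform.getTranslateInstance(x, y));
        if (fills(mode)) {
            g.fill(outline);
        }
        if (strokes(mode)) {
            g.draw(outline);
        }
        if (clips(mode)) {
            g.clip(outline); // simplification: intersects the clip immediately
        }
    }
}
```

Whether mode 0 should keep using drawString or also go through the outline is exactly the open question; the sketch just keeps both paths visible.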

Sorry I didn't read your comments in 678 well enough.  I would have
collaborated on the getAwtFont work!

Daniel

On Sun, Apr 11, 2010 at 3:04 AM, Maruan Sahyoun wrote:

> Hi Daniel,
>
> I think we are currently working on the same thing, as I also started
> implementing a getAwtFont method ;-) as outlined in my comments for
> PDFBOX-678, in order to get all drawing for the different text modes done in
> PageDrawer itself (I think we share the same general idea that there should
> be a clearer separation of concerns). I already have that working for
> TrueType fonts (I just copied the code in writeText into the new method) and
> the non-clipping text modes. The only difficulty I see is handling e.g.
> Type3 fonts, as they cannot be converted to a font so easily. Maybe we can
> share ideas on how to deal with these and then decide who implements what, in
> order to avoid duplication of effort. I'm happy to just rely on your
> getAwtFont implementation, as you might be further down the road.
>
> One question when drawing text in PageDrawer is how text handling should be
> done in general. Using drawString is faster and produces text objects that
> can be selected, for example when you print to a PDF printer, but outlines
> etc. are not possible that way. So either I use TextLayout.getOutline() to
> draw the outline (and combine that with drawString to get selectable text),
> or we decide that selectable text as a result of PageDrawer is not important
> at all. This will also affect possible applications in PDFReader, which
> currently is display only. What is the idea for that further down the road?
>
> Maybe we should also share some thoughts there, as you will have a much
> better idea about the longer-term plan for PDFBox; I'm new to the
> project.
>
> Kind regards
>
> Maruan Sahyoun
>
> On 11.04.2010 at 04:32, Daniel Wilson wrote:
>
> > Thanks, Maruan.
> >
> > The big thing to avoid is direct access to a graphics object in an object
> > other than PageDrawer.  I inherit from PageDrawer and override many of the
> > methods, and I believe anyone else who wishes to use PDFBox for rendering in
> > .Net would need to do the same.
> >
> > A big hint that direct access to a graphics object is coming is a line of
> > code like
> > Graphics2D graphics = ((PageDrawer)context).getGraphics();
> >
> > If that line tries to execute in .Net ... it will return a NULL ... and then
> > you get NullPointerExceptions.
> >
> > Better to keep the graphics code in PageDrawer.
> >
> > The refactoring of some of the Font stuff I'm about to commit doesn't
> > completely do this ... but it does provide a getawtFont routine that can be
> > called from .Net, permitting the actual graphics stuff down in PDSimpleFont
> > to be avoided.
> >
> > Daniel
> >
> > On Fri, Apr 9, 2010 at 2:44 PM, Maruan Sahyoun wrote:
> >
> >> Hi Daniel,
> >>
> >> As I'm currently looking at implementing support for some more text
> >> rendering modes in PageDrawer (PDFBOX-678), I would like to understand
> >> whether that might affect the .NET version. Although I don't have a
> >> completed version, this is a list of the operations I will potentially use.
> >>
> >> * generating a Shape based on TextLayout.getOutline()
> >> * filling, drawing and clipping using that Shape
> >> * possibly AlphaComposite
> >> * possibly GlyphVector
> >>
> >> Are there things I should avoid?
> >>
> >> Kind regards
> >>
> >>
> >> Maruan Sahyoun
> >>
> >> On 09.04.2010 at 18:18, Daniel Wilson (JIRA) wrote:
> >>
> >>>
> >>>   [ https://issues.apache.org/jira/browse/PDFBOX-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855454#action_12855454 ]
> >>>
> >>> Daniel Wilson commented on PDFBOX-688:
> >>> --
> >>>
> >>> 928957 -- Make page and pageSize available (again) for libraries/applications that inherit.
> >>> 931616 -- Stroke & line width/style modifications.
> >>> 931633 -- Invoke / drawImage
> >>> 932179 -- Don't fail to BLACK quite so quickly ... do some more intelligent guessing.
> >>>  Necessary when implementing in .Net as there are still some key things IKVM is missing.
> >>>
> >>>> Refactoring rendering-related classes/methods for extensibility
> >>>> ---
> >>>>
> >>>>        Key: PDFBOX-688
> >>>>        URL: https://issues.apache.org/jira/browse/PDFBOX-688
> >>>>    Project: PDFBox
> >>>> Issue Type: Improvement
> >>>>   Reporter: Daniel Wilson
> >>>>   Assignee: Daniel Wilson

Re: [jira] Commented: (PDFBOX-688) Refactoring rendering-related classes/methods for extensibility

2010-04-12 Thread Maruan Sahyoun
Hi Daniel,

I think it's a good first step. Having thought about it a little more, the
question for me is whether it makes sense to refactor even further, since the
font handling still contains some platform-specific things now that we have
getawtFont. What if the PD classes were purely PDF-related, and application-
or platform-specific code ended up in PageDrawer and some new classes (let's
call them platform classes for the moment)? PageDrawer would then use the
platform classes to get the font, and those would in turn use the PDxxx
classes to get the information needed to "generate" it. Currently PDxxx is a
mixture of PDF-specific classes and routines and non-PDF-specific things like
returning a Java font. Factoring the non-PDF-related code out would provide a
better separation and also the chance to provide additional implementations
for specific applications. For example, if we wanted to render PDF to HTML, we
could implement an HTML rendition without touching the "core" PDF code.
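
To make this a bit more concrete, here is a rough sketch of what I mean. All of the names (PdfFontInfo, PlatformFontProvider, AwtFontProvider, CssFontProvider) are made up for illustration and are not existing PDFBox classes; the real PDxxx classes obviously carry far more than a base font name.

```java
import java.awt.Font;

/** Illustration of the proposed layering; every type here is hypothetical. */
public class LayeringSketch {

    /** "Core" layer: stand-in for a PDxxx font class, knows only PDF-level facts. */
    public static final class PdfFontInfo {
        private final String baseFontName;

        public PdfFontInfo(String baseFontName) {
            this.baseFontName = baseFontName;
        }

        public String getBaseFontName() {
            return baseFontName;
        }
    }

    /** "Platform" layer: maps PDF font information to a platform font of type F. */
    public interface PlatformFontProvider<F> {
        F fontFor(PdfFontInfo info, float size);
    }

    /** Java/AWT flavour: the only place that touches java.awt.Font. */
    public static final class AwtFontProvider implements PlatformFontProvider<Font> {
        @Override
        public Font fontFor(PdfFontInfo info, float size) {
            // AWT substitutes a default font if the named one is not installed.
            return new Font(info.getBaseFontName(), Font.PLAIN, 1).deriveFont(size);
        }
    }

    /** HTML flavour: same core information, rendered as a CSS declaration. */
    public static final class CssFontProvider implements PlatformFontProvider<String> {
        @Override
        public String fontFor(PdfFontInfo info, float size) {
            return "font-family: '" + info.getBaseFontName() + "'; font-size: " + size + "pt;";
        }
    }
}
```

A PageDrawer for AWT would be handed an AwtFontProvider, a .NET or HTML renderer would bring its own provider, and the PD classes would not need to know about either.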

Whether that's a good idea really depends on where PDFBox should go. Looking
at the issues in JIRA and the users mailing list, I currently see a number of
different types of applications where people are trying to use PDFBox, from
text extraction to printing to a Reader-type application. In addition, some
"core" functionality is still missing. If there were layers like "core",
"platform" and "application", I think development would benefit.

As I'm new to PDFBox (and wouldn't claim that I have a) seen all the code or
b) understood how everything works together), this reflects my current
understanding.

And I'm also not a Java expert (although I do most of my development in Java).

Last comment ;-) I would call the method getAwtFont and not getawtFont.

Kind regards 

Maruan Sahyoun


Re: [jira] Commented: (PDFBOX-688) Refactoring rendering-related classes/methods for extensibility

2010-04-12 Thread Daniel Wilson
No objection to the capitalization change.  As I just submitted this last
night, I am probably the only one w/ anything depending on that name.

I think your view of the separation of platform classes from PDF classes
makes a lot of sense.

My use of PDFBox is fairly narrow (as is that of many users), so I would
like to hear from Andreas or Jukka before committing to anything too major,
though.

Daniel


Re: [jira] Commented: (PDFBOX-688) Refactoring rendering-related classes/methods for extensibility

2010-04-12 Thread Maruan Sahyoun
just wanted to share my thoughts ;-) 

Maruan Sahyoun

