Hello;

I am just going to jump in and ask about the following warning, which I get
with the default PDFTextStripper class:

WARNING: Count in xref table is 0 at offset 96825

Attached is the document that causes it. I thought it might have to do with the
properties file that Tilman Hausherr pointed out to me, but it didn't.

This isn't a big issue as the program still functions, but if I could get
rid of the warning so I don't have to look at it - more the merrier!
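
For reference, a minimal sketch of one way such a warning could be hidden. It
assumes that PDFBox's logging ends up in java.util.logging (the "WARNING:"
prefix suggests so) and that the message comes from a logger whose name starts
with org.apache.pdfbox; neither is confirmed here. The class name
QuietPdfBoxLogging is made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class QuietPdfBoxLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so the level setting could otherwise be lost after garbage collection.
    // Assumption: the warning is emitted by a logger below this name.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    /** Raises the threshold so that WARNING and below are dropped; SEVERE still gets through. */
    static Logger silenceWarnings() {
        PDFBOX_LOGGER.setLevel(Level.SEVERE);
        return PDFBOX_LOGGER;
    }

    public static void main(String[] args) {
        Logger parent = silenceWarnings();
        // A child logger without its own level inherits the parent's level.
        Logger child = Logger.getLogger("org.apache.pdfbox.pdfparser");
        System.out.println("WARNING loggable: " + child.isLoggable(Level.WARNING));
        System.out.println("SEVERE loggable:  " + parent.isLoggable(Level.SEVERE));
    }
}
```

This only hides the message; it does not change how the xref table is parsed.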

I am also getting into the PDF spec. If there is anything I can assist with
should the properties file become an active issue (even just testing), let me
know.


Thanks,
-Aaron





On Wed, Jul 30, 2014 at 11:10 AM, John Hewson <j...@jahewson.com> wrote:

> On 29 Jul 2014, at 23:12, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
>
> > +1 for removing the .properties file if the new mechanism is easier to
> understand and handle. The discussion doesn’t provide that proof or some
> information about that.
> >
> > What would a replacement look like?
>
> Basically like registerOperatorProcessor(), as used in
> PreflightStreamEngine.
>
> >
> > OTOH if it’s a documentation issue we could also add some more
> information to the javadocs to explain the dependencies.
> >
> > We could add a register/unregister method to allow adding/removing custom
> operator handling or provide a service discovery mechanism. This way we
> still have the old flexibility.
> >
>
> As Andreas notes, there’s a registerOperatorProcessor method which does
> this, so the mechanism is already in place. The problem is not that we
> don’t have the mechanism, it’s that we’re using .properties files at all.
> The list of operators can’t be controlled from both code and from
> .properties lists, one source has to be authoritative - otherwise we’d end
> up with a situation where we have an operator disabled in a .properties
> file and then re-enabled in code. Currently we have a situation where that
> could happen.
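
The conflict described above can be sketched with a small stand-alone model.
Everything in it is hypothetical (the Registry class, its methods, and the rule
that a bare key means "ignore"); it is not PDFBox code, only an illustration of
why two sources of truth make the result depend on the order in which they are
applied.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

class TwoSourcesDemo {
    /** Hypothetical registry of enabled operators; a stand-in, not a PDFBox class. */
    static class Registry {
        private final Set<String> enabled = new HashSet<>();

        // Source 1: a .properties list. A key with a class name enables the
        // operator; a bare key (empty value) is treated here as "ignore it".
        void loadFrom(String propertiesText) {
            Properties props = new Properties();
            try {
                props.load(new StringReader(propertiesText));
            } catch (IOException e) {
                throw new IllegalStateException(e); // not expected for an in-memory reader
            }
            for (String op : props.stringPropertyNames()) {
                if (props.getProperty(op).trim().isEmpty()) {
                    enabled.remove(op);
                } else {
                    enabled.add(op);
                }
            }
        }

        // Source 2: code, in the spirit of registerOperatorProcessor().
        void register(String op) {
            enabled.add(op);
        }

        boolean handles(String op) {
            return enabled.contains(op);
        }
    }

    /** Applies both sources in the given order and reports whether "rg" ends up enabled. */
    static boolean rgEnabled(boolean propertiesFirst) {
        String props = "Tj = org.apache.pdfbox.util.operator.ShowText\nrg\n";
        Registry registry = new Registry();
        if (propertiesFirst) {
            registry.loadFrom(props);
            registry.register("rg");
        } else {
            registry.register("rg");
            registry.loadFrom(props);
        }
        return registry.handles("rg");
    }

    public static void main(String[] args) {
        System.out.println("properties then code: " + rgEnabled(true));
        System.out.println("code then properties: " + rgEnabled(false));
    }
}
```

With a single authoritative source the question of ordering does not arise.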
>
> Therefore, removing the .properties is the only workable solution. It’s
> important to note that it’s very, very unlikely that anybody is using the
> .properties files in a use-case where they are not also making some code
> changes, so the supposed benefit of “not having to recompile” never
> existed. Adding an operator would always require compile-time changes to
> PDFBox so that the PDFStreamEngine subclasses actually do something with
> the new operator.
>
> -- John
>
> > BR
> > Maruan
> >
> > On 29.07.2014 at 21:48, John Hewson <j...@jahewson.com> wrote:
> >
> >> Right but we need to address the confusion and complexity that has been
> caused by .properties files which made PDFBOX-2246 so tricky to figure out.
> >>
> >> Let's remove this wart!
> >>
> >> -- John
> >>
> >> On 29 Jul 2014, at 10:44, Tilman Hausherr <thaush...@t-online.de>
> wrote:
> >>
> >>> Hi,
> >>>
> >>> At this time, the problem I see and wanted to solve (PDFBOX-2246)
> exists regardless whether we use a properties file or initialize directly
> in the code.
> >>>
> >>> Tilman
> >>>
> >>>
> >>> On 29.07.2014 19:41, John Hewson wrote:
> >>>> On 29 Jul 2014, at 03:44, Andreas Lehmkühler <andr...@lehmi.de>
> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> it's not a black and white issue (comments inline)
> >>>>>
> >>>>>> John Hewson <j...@jahewson.com> wrote on 29 July 2014 at 07:44:
> >>>>>>
> >>>>>>
> >>>>>> Yes, really I should have said subclasses of PDFStreamEngine -
>  that's where
> >>>>>> the .properties file originates. I'd propose replacing the
> properties
> >>>>>> mechanism with a simple method containing the mapping which can be
> overridden
> >>>>>> in subclasses. Ultimately, users expect to be able to subclass the
> behaviour
> >>>>>> of a class by just subclassing the class.
> >>>>> PDFStreamEngine doesn't configure any operator set itself. The
> subclasses are
> >>>>> supposed to configure their own set of operators depending on the
> particular
> >>>>> usecase. E.g. to extend the text extraction one has to subclass
> PDFTextStripper
> >>>>> and so on.
> >>>> It’s PDFStreamEngine which implements the .property mechanism though,
> via the
> >>>> PDFStreamEngine(Properties properties) constructor.
> >>>>
> >>>>> E.g. to extend the text extraction one has to subclass
> PDFTextStripper and so on.
> >>>> That’s true, but it’s only half the story, don’t forget that the
> .properties files need
> >>>> to be copied and pasted elsewhere and modified along with overriding
> which .property
> >>>> file is passed in the constructor if you want to truly override the
> class’ behaviour.
> >>>>
> >>>>>> We've seen a number of incidents of confusion on the mailing list
> due to the
> >>>>>> current design.
> >>>>> IMHO, most of the confusion is based on the lack of knowledge of the
> pdf spec.
> >>>>> One can't understand how pdfbox works under the hood by simply
> looking at the
> >>>>> code. One has to understand the pdf spec as well, at least the base
> concepts.
> >>>> I’m specifically talking about confusion surrounding how to override
> operators, and
> >>>> .properties files, this has come up before. This entire thread has
> been caused by
> >>>> PDFBox’s design and *not* the PDF spec.
> >>>>
> >>>>>> I'd say that to the modern Java developer having non-code runtime
> binding has
> >>>>>> become an anti-pattern, resulting in brittle code which can't
> easily be
> >>>>>> navigated in an IDE and which resists automated analysis and
> exhibits runtime
> >>>>>> failures despite compiling ok. This is one of those cases where the
> collective
> >>>>>> wisdom has just evolved over the years.
> >>>>> It depends on the given usecase. All solutions have advantages and
> >>>>> disadvantages. E.g. if someone wants to configure the
> PDFTextStripper without
> >>>>> recompiling the code, it is quite handy to keep the configuration in
> a text
> >>>>> file.
> >>>> Has anybody *ever* wanted to change the operators which
> PDFTextStripper is
> >>>> processing without recompiling the code? These are internal
> implementation
> >>>> details that shouldn’t be exposed in the first place - it’s not a
> “configuration” at
> >>>> all, especially as 99% of possible changes would just break
> PDFTextStripper.
> >>>>
> >>>>> In this case I'm neither pro nor con a text based config, but I tend
> to agree
> >>>>> with John to have the different configurations in some method within
> the
> >>>>> subclasses of PDFStreamEngine.
> >>>> As above, this isn’t “configuration” at all, it lacks even a basic
> use case. I don’t
> >>>> see any pros which aren’t fabricated for the sake of argument, but
> the cons are
> >>>> causing us significant problems right here, right now.
> >>>>
> >>>>> BR
> >>>>> Andreas Lehmkühler
> >>>>>
> >>>>>> -- John
> >>>>>>
> >>>>>>> On 28 Jul 2014, at 13:42, Tilman Hausherr <thaush...@t-online.de>
> wrote:
> >>>>>>>
> >>>>>>> I disagree - one doesn't *have* to pass a property file to
> PDFTextStripper
> >>>>>>> and PageDrawer. The properties file for PDFTextStripper is
> optional. The
> >>>>>>> property parameter was already there before it became an apache
> project.
> >>>>>>>
> >>>>>>>
> >>>>>>> Tilman
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 28.07.2014 22:08, John Hewson wrote:
> >>>>>>>> We need to get rid of these .properties files, they’re causing
> endless
> >>>>>>>> confusion, not to mention that they hide runtime dependencies in
> text
> >>>>>>>> files.
> >>>>>>>>
> >>>>>>>> We should make it so that overriding a TextStripper, PageDrawer,
> etc.
> >>>>>>>> doesn’t require external .properties files, currently Preflight
> works in
> >>>>>>>> this manner and it’s much clearer.
> >>>>>>>>
> >>>>>>>> I guess this is a legacy of the “old” ways of Java XML everything.
> >>>>>>>>
> >>>>>>>> -- John
> >>>>>>>>
> >>>>>>>>> On 27 Jul 2014, at 10:09, -A <aa...@hrtmn.net> wrote:
> >>>>>>>>>
> >>>>>>>>> Thank you, that works as promised and removes the warning. I'm
> still
> >>>>>>>>> hoping
> >>>>>>>>> to find a resource that better explains the pieces of PDFBox and
> how they
> >>>>>>>>> work together. Unfortunately most posts on the internet are
> solely how and
> >>>>>>>>> not why.
> >>>>>>>>>
> >>>>>>>>> Appreciate it!
> >>>>>>>>>
> >>>>>>>>> -Aaron
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Sun, Jul 27, 2014 at 8:00 AM, Tilman Hausherr <
> thaush...@t-online.de>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> That didn't happen to me, but maybe it did happen to you with
> another
> >>>>>>>>>> file.
> >>>>>>>>>>
> >>>>>>>>>> Another solution would be to pass your own properties file, and
> it should
> >>>>>>>>>> have this content:
> >>>>>>>>>>
> >>>>>>>>>> =======================
> >>>>>>>>>> # Licensed to the Apache Software Foundation (ASF) under one or
> more
> >>>>>>>>>> # contributor license agreements.  See the NOTICE file
> distributed with
> >>>>>>>>>> # this work for additional information regarding copyright
> ownership.
> >>>>>>>>>> # The ASF licenses this file to You under the Apache License,
> Version 2.0
> >>>>>>>>>> # (the "License"); you may not use this file except in
> compliance with
> >>>>>>>>>> # the License.  You may obtain a copy of the License at
> >>>>>>>>>> #
> >>>>>>>>>> #      http://www.apache.org/licenses/LICENSE-2.0
> >>>>>>>>>> #
> >>>>>>>>>> # Unless required by applicable law or agreed to in writing,
> software
> >>>>>>>>>> # distributed under the License is distributed on an "AS IS"
> BASIS,
> >>>>>>>>>> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
> or
> >>>>>>>>>> implied.
> >>>>>>>>>> # See the License for the specific language governing
> permissions and
> >>>>>>>>>> # limitations under the License.
> >>>>>>>>>>
> >>>>>>>>>> # This table maps PDF stream operators to concrete
> OperatorProcessor
> >>>>>>>>>> # subclasses that are used by the PDFStreamEngine class to
> interpret the
> >>>>>>>>>> # PDF document. The classes configured here allow the
> PDFTextStripper
> >>>>>>>>>> # subclass of PDFStreamEngine to extract text content of the
> document.
> >>>>>>>>>>
> >>>>>>>>>> BT = org.apache.pdfbox.util.operator.BeginText
> >>>>>>>>>> cm = org.apache.pdfbox.util.operator.Concatenate
> >>>>>>>>>> Do = org.apache.pdfbox.util.operator.Invoke
> >>>>>>>>>> ET = org.apache.pdfbox.util.operator.EndText
> >>>>>>>>>> gs = org.apache.pdfbox.util.operator.SetGraphicsStateParameters
> >>>>>>>>>> q  = org.apache.pdfbox.util.operator.GSave
> >>>>>>>>>> Q  = org.apache.pdfbox.util.operator.GRestore
> >>>>>>>>>> T* = org.apache.pdfbox.util.operator.NextLine
> >>>>>>>>>> Tc = org.apache.pdfbox.util.operator.SetCharSpacing
> >>>>>>>>>> Td = org.apache.pdfbox.util.operator.MoveText
> >>>>>>>>>> TD = org.apache.pdfbox.util.operator.MoveTextSetLeading
> >>>>>>>>>> Tf = org.apache.pdfbox.util.operator.SetTextFont
> >>>>>>>>>> Tj = org.apache.pdfbox.util.operator.ShowText
> >>>>>>>>>> TJ = org.apache.pdfbox.util.operator.ShowTextGlyph
> >>>>>>>>>> TL = org.apache.pdfbox.util.operator.SetTextLeading
> >>>>>>>>>> Tm = org.apache.pdfbox.util.operator.SetMatrix
> >>>>>>>>>> Tr = org.apache.pdfbox.util.operator.SetTextRenderingMode
> >>>>>>>>>> Ts = org.apache.pdfbox.util.operator.SetTextRise
> >>>>>>>>>> Tw = org.apache.pdfbox.util.operator.SetWordSpacing
> >>>>>>>>>> Tz = org.apache.pdfbox.util.operator.SetHorizontalTextScaling
> >>>>>>>>>> w  = org.apache.pdfbox.util.operator.SetLineWidth
> >>>>>>>>>> \' = org.apache.pdfbox.util.operator.MoveAndShow
> >>>>>>>>>> \" = org.apache.pdfbox.util.operator.SetMoveAndShow
> >>>>>>>>>>
> >>>>>>>>>> CS=org.apache.pdfbox.util.operator.SetStrokingColorSpace
> >>>>>>>>>> cs=org.apache.pdfbox.util.operator.SetNonStrokingColorSpace
> >>>>>>>>>> rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor
> >>>>>>>>>> G=org.apache.pdfbox.util.operator.SetStrokingGrayColor
> >>>>>>>>>> g=org.apache.pdfbox.util.operator.SetNonStrokingGrayColor
> >>>>>>>>>> K=org.apache.pdfbox.util.operator.SetStrokingCMYKColor
> >>>>>>>>>> k=org.apache.pdfbox.util.operator.SetNonStrokingCMYKColor
> >>>>>>>>>> RG=org.apache.pdfbox.util.operator.SetStrokingRGBColor
> >>>>>>>>>> rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor
> >>>>>>>>>> SC=org.apache.pdfbox.util.operator.SetStrokingColor
> >>>>>>>>>> sc=org.apache.pdfbox.util.operator.SetNonStrokingColor
> >>>>>>>>>> SCN=org.apache.pdfbox.util.operator.SetStrokingColor
> >>>>>>>>>> scn=org.apache.pdfbox.util.operator.SetNonStrokingColor
> >>>>>>>>>>
> >>>>>>>>>> # The following operators are not relevant to text extraction,
> >>>>>>>>>> # so we can silently ignore them.
> >>>>>>>>>>
> >>>>>>>>>> b
> >>>>>>>>>> B
> >>>>>>>>>> b*
> >>>>>>>>>> B*
> >>>>>>>>>> BDC
> >>>>>>>>>> BI
> >>>>>>>>>> BMC
> >>>>>>>>>> BX
> >>>>>>>>>> c
> >>>>>>>>>> d
> >>>>>>>>>> d0
> >>>>>>>>>> d1
> >>>>>>>>>> DP
> >>>>>>>>>> EI
> >>>>>>>>>> EMC
> >>>>>>>>>> EX
> >>>>>>>>>> f
> >>>>>>>>>> F
> >>>>>>>>>> f*
> >>>>>>>>>> h
> >>>>>>>>>> i
> >>>>>>>>>> ID
> >>>>>>>>>> j
> >>>>>>>>>> J
> >>>>>>>>>> l
> >>>>>>>>>> m
> >>>>>>>>>> M
> >>>>>>>>>> MP
> >>>>>>>>>> n
> >>>>>>>>>> re
> >>>>>>>>>> ri
> >>>>>>>>>> s
> >>>>>>>>>> S
> >>>>>>>>>> sh
> >>>>>>>>>> v
> >>>>>>>>>> W
> >>>>>>>>>> W*
> >>>>>>>>>> y
> >>>>>>>>>>
> >>>>>>>>>> =======================
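
A minimal sketch of how lines like the ones in this list are read, assuming the
file is loaded with java.util.Properties (the PDFStreamEngine(Properties
properties) constructor mentioned in this thread suggests so). The class name
OperatorPropertiesDemo and the shortened sample are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class OperatorPropertiesDemo {
    // A few representative lines: a mapping with spaces around '=', an
    // escaped quote key, a mapping without spaces, and two bare operators.
    static final String SAMPLE = String.join("\n",
            "# comment lines are skipped",
            "q  = org.apache.pdfbox.util.operator.GSave",
            "T* = org.apache.pdfbox.util.operator.NextLine",
            "\\' = org.apache.pdfbox.util.operator.MoveAndShow",
            "rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor",
            "b",
            "W*");

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        for (String operator : props.stringPropertyNames()) {
            String value = props.getProperty(operator);
            // A bare key is stored with an empty string as its value.
            System.out.println(operator + " -> " + (value.isEmpty() ? "(no class)" : value));
        }
    }
}
```

The backslash in front of the quote characters is dropped by the parser, so the
keys come out as the plain operators ' and ".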
> >>>>>>>>>>
> >>>>>>>>>> Tilman
> >>>>>>>>>>
> >>>>>>>>>> On 27.07.2014 15:54, -A wrote:
> >>>>>>>>>>
> >>>>>>>>>> Tilman;
> >>>>>>>>>>> That is somewhat embarrassing. At one point I brought this to
> the
> >>>>>>>>>>> mailing
> >>>>>>>>>>> list (because of the following warning) and was told to remove
> that line
> >>>>>>>>>>> because the TextStripper wasn't actually a PageDrawer. The
> functionality
> >>>>>>>>>>> still worked after that, however.
> >>>>>>>>>>>
> >>>>>>>>>>> Is there a way to do this without the warning, perhaps
> something within
> >>>>>>>>>>> PageDrawer?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Thank you,
> >>>>>>>>>>> -Aaron
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> WARNING: java.lang.ClassCastException: IncrementalPDFStripper
> cannot be
> >>>>>>>>>>> cast to org.apache.pdfbox.pdfviewer.PageDrawer
> >>>>>>>>>>> java.lang.ClassCastException: IncrementalPDFStripper cannot be
> cast to
> >>>>>>>>>>> org.apache.pdfbox.pdfviewer.PageDrawer
> >>>>>>>>>>> at
> >>>>>>>>>>>
> org.apache.pdfbox.util.operator.pagedrawer.AppendRectangleToPath.process(
> >>>>>>>>>>> AppendRectangleToPath.java:46)
> >>>>>>>>>>> at
> >>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processOperator(
> >>>>>>>>>>> PDFStreamEngine.java:557)
> >>>>>>>>>>> at
> >>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(
> >>>>>>>>>>> PDFStreamEngine.java:268)
> >>>>>>>>>>> at
> >>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(
> >>>>>>>>>>> PDFStreamEngine.java:235)
> >>>>>>>>>>> at
> >>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processStream(
> >>>>>>>>>>> PDFStreamEngine.java:215)
> >>>>>>>>>>> at
> IncrementalPDFStripper.containsRed(IncrementalPDFStripper.java:90)
> >>>>>>>>>>> at IncrementalPDFStripper.main(IncrementalPDFStripper.java:56)
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Sun, Jul 27, 2014 at 5:47 AM, Tilman Hausherr <
> thaush...@t-online.de>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> It is even easier than I thought - replace super() with this:
> >>>>>>>>>>>> super(ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties", true));
> >>>>>>>>>>>>
> >>>>>>>>>>>> Tilman
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 27.07.2014 13:03, Tilman Hausherr wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> After having written the text below, I tested by including
> the "rg"
> >>>>>>>>>>>>
> >>>>>>>>>>>>> operator in the properties list and now it worked. I also
> tested
> >>>>>>>>>>>>> deleting
> >>>>>>>>>>>>> your println and instead adding this if the text is red:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>    System.out.print (textPos.getCharacter());
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> and so I got this output:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 21_Key .1295 R~Wall Prof LinP 0.003             0.004
> 0.000 true
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> which is exactly what is red in the PDF.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Another way (probably better) to do it would be to not derive from
> >>>>>>>>>>>>> PDFTextStripper but from PDFStreamEngine, and construct it with
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties")
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> see also http://stackoverflow.com/a/9157714/535646
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Tilman
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 27.07.2014 12:14, Tilman Hausherr wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>> Do you still have the code that worked?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm not the text extraction specialist here, but what I did
> was to
> >>>>>>>>>>>>>> look
> >>>>>>>>>>>>>> in the uncompressed source of the PDF. The stream has code
> like this:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 0 0 0 rg
> >>>>>>>>>>>>>> 0 0.5019 0 rg
> >>>>>>>>>>>>>> 1 0 0 rg
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The first line sets to black, the second to green, the
> third to red.
> >>>>>>>>>>>>>> And
> >>>>>>>>>>>>>> from what I saw, it can't work at all, because the "rg"
> operator
> >>>>>>>>>>>>>> isn't
> >>>>>>>>>>>>>> processed when extracting text, because
> PDFTextStripper.properties
> >>>>>>>>>>>>>> doesn't
> >>>>>>>>>>>>>> contain the "rg" operator. (The operator is in another
> list, which is
> >>>>>>>>>>>>>> used
> >>>>>>>>>>>>>> when rendering)
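
The effect described above can be sketched with a toy interpreter. This is a
simplified model, not PDFBox's implementation: MiniEngine, OperatorHandler and
the string-based colour are all hypothetical. It only shows that an operator
without a registered handler is skipped, so the state it would have changed
keeps its default.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MiniStreamEngineDemo {
    /** Hypothetical handler type, loosely modelled on the idea of an operator processor. */
    interface OperatorHandler {
        void process(MiniEngine engine, List<String> operands);
    }

    static class MiniEngine {
        private final Map<String, OperatorHandler> handlers = new HashMap<>();
        String color = "DeviceGray";                  // default until a colour operator runs
        final List<String> shown = new ArrayList<>(); // text plus the colour seen at that time

        void register(String operator, OperatorHandler handler) {
            handlers.put(operator, handler);
        }

        void run(String stream) {
            List<String> operands = new ArrayList<>();
            for (String token : stream.trim().split("\\s+")) {
                // Numbers and (strings) are operands; anything else is an operator.
                if (token.startsWith("(") || token.matches("-?\\d+(\\.\\d+)?")) {
                    operands.add(token);
                    continue;
                }
                OperatorHandler handler = handlers.get(token);
                if (handler != null) {
                    handler.process(this, operands);
                }
                // Unregistered operators are skipped; their operands are discarded.
                operands = new ArrayList<>();
            }
        }
    }

    /** Runs a tiny stream with or without an "rg" handler and returns what "Tj" saw. */
    static List<String> extract(boolean registerRg) {
        MiniEngine engine = new MiniEngine();
        engine.register("Tj", (e, ops) -> e.shown.add(ops.get(0) + " " + e.color));
        if (registerRg) {
            engine.register("rg", (e, ops) -> e.color = "DeviceRGB[" + String.join(" ", ops) + "]");
        }
        engine.run("0 0 0 rg (black) Tj 1 0 0 rg (red) Tj");
        return engine.shown;
    }

    public static void main(String[] args) {
        System.out.println("without rg: " + extract(false));
        System.out.println("with rg:    " + extract(true));
    }
}
```

Without the "rg" handler every piece of text is reported with the default
colour, which matches the DeviceGray observation in the original report.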
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> So that is what puzzles me. I think it can't work at all.
> But you
> >>>>>>>>>>>>>> said
> >>>>>>>>>>>>>> it did work at a time.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Tilman
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 27.07.2014 07:43, Tilman Hausherr wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>> Please upload the PDF somewhere and post the URL, PDF
> files are
> >>>>>>>>>>>>>>> removed
> >>>>>>>>>>>>>>> from the mailing list.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Tilman
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 27.07.2014 02:35, -A wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hello again. I've been trying to figure out this issue
> that has come
> >>>>>>>>>>>>>>>> up for me and in my research I found someone posting on
> >>>>>>>>>>>>>>>> StackOverflow (
> >>>>>>>>>>>>>>>> http://stackoverflow.com/questions/10844271/how-to-get-
> >>>>>>>>>>>>>>>> font-color-using-pdfbox) a similar issue where they could
> not read
> >>>>>>>>>>>>>>>> any colors from a PDF. The user posted the code and
> someone else
> >>>>>>>>>>>>>>>> took it,
> >>>>>>>>>>>>>>>> ran it, and reported that it worked. The user's approach
> was
> >>>>>>>>>>>>>>>> different than
> >>>>>>>>>>>>>>>> mine, but alas.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'm not sure at this point what is going on. I have
> stepped through
> >>>>>>>>>>>>>>>> each individual character and checked the PDGraphicsState
> object,
> >>>>>>>>>>>>>>>> and even
> >>>>>>>>>>>>>>>> when I am looking at an open file with visibly red text
> (attached)
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> debugger only reports DeviceGray. If I print out the
> ColorSpace
> >>>>>>>>>>>>>>>> name
> >>>>>>>>>>>>>>>> from
> >>>>>>>>>>>>>>>> the PDGraphicsState this is what is printed - for every
> character.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I would appreciate if someone could perhaps run the
> attached text
> >>>>>>>>>>>>>>>> stripper with the attached PDF file and report back if it
> actually
> >>>>>>>>>>>>>>>> prints
> >>>>>>>>>>>>>>>> true instead of false, as it does for me. Since I saw this
> >>>>>>>>>>>>>>>> occurrence
> >>>>>>>>>>>>>>>> elsewhere I'd like to rule that out - in case an IDE
> setting of
> >>>>>>>>>>>>>>>> some
> >>>>>>>>>>>>>>>> sort
> >>>>>>>>>>>>>>>> may be causing this?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> It should be noted that I began using PDFBox with 1.8.5
> and had
> >>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>> code working fine. Still with 1.8.5 yesterday it was
> failing.
> >>>>>>>>>>>>>>>> Upgrading to
> >>>>>>>>>>>>>>>> 1.8.6 yielded the same results.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> If this is an actual issue I do not mind attempting to
> solve it if
> >>>>>>>>>>>>>>>> someone may have a general idea where to point me as to
> prevent
> >>>>>>>>>>>>>>>> needless
> >>>>>>>>>>>>>>>> meddling with graphics state objects. Or, if this should
> be
> >>>>>>>>>>>>>>>> reported
> >>>>>>>>>>>>>>>> I can
> >>>>>>>>>>>>>>>> do that as well.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks!
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -Aaron
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> *Previous Message:*
> >>>>>>>>>>>>>>>> I've attached an updated stripper file with the only
> addition being
> >>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>> main function to test the class specifically.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> When run with the PDF I have also attached, it indeed does
> not
> >>>>>>>>>>>>>>>> recognize the red text.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> At this point it seems that this issue is solely
> dependent on
> >>>>>>>>>>>>>>>> PDFBox.
> >>>>>>>>>>>>>>>> I'll stay tuned for some insight hopefully. If any other
> >>>>>>>>>>>>>>>> information
> >>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>> needed, let me know!
> >>>
> >>
> >
>
>
