On 29 Jul 2014, at 23:12, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:

> +1 for removing the .properties file if the new mechanism is easier to 
> understand and handle. The discussion doesn’t provide that proof or some 
> information about that.
> 
> What would a replacement look like?

Basically like registerOperatorProcessor(), as used in PreflightStreamEngine.
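
For anyone who hasn't looked at Preflight, a rough sketch of the idea follows. It is not PDFBox code: SketchProcessor, SketchEngine and TextSketchEngine are made-up stand-ins, and only the method name registerOperatorProcessor is borrowed from the real class (the signature here is an assumption).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an operator processor; not the PDFBox class.
interface SketchProcessor {
    void process(String operator, List<String> operands);
}

// Hypothetical stand-in for PDFStreamEngine: the operator table lives in code.
class SketchEngine {
    private final Map<String, SketchProcessor> operators = new HashMap<>();

    // Name borrowed from PDFBox; this signature is an assumption for the sketch.
    final void registerOperatorProcessor(String operator, SketchProcessor processor) {
        operators.put(operator, processor);
    }

    // Dispatches one operator; returns false when no processor is registered.
    boolean processOperator(String operator, List<String> operands) {
        SketchProcessor processor = operators.get(operator);
        if (processor == null) {
            return false;
        }
        processor.process(operator, operands);
        return true;
    }
}

// A subclass declares its own operator set in its constructor.
class TextSketchEngine extends SketchEngine {
    final List<String> seen = new ArrayList<>();

    TextSketchEngine() {
        registerOperatorProcessor("BT", (op, operands) -> seen.add(op));
        registerOperatorProcessor("Tj", (op, operands) -> seen.add(op + operands));
        registerOperatorProcessor("ET", (op, operands) -> seen.add(op));
    }
}

class RegistrationDemo {
    public static void main(String[] args) {
        TextSketchEngine engine = new TextSketchEngine();
        engine.processOperator("BT", List.of());
        engine.processOperator("Tj", List.of("(Hello)"));
        engine.processOperator("re", List.of("0", "0", "10", "10")); // not registered, skipped
        engine.processOperator("ET", List.of());
        System.out.println(engine.seen);
    }
}
```

With this shape the operator set is visible in the subclass itself, so an IDE can navigate from the operator name to its handler.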

> 
> OTOH if it’s a documentation issue we could also add some more information to 
> the javadocs to explain the dependencies. 
> 
> We could add a register/unregister method to allow adding/removing custom 
> operator handling or provide a service discovery mechanism. This way we still 
> have the old flexibility.
> 

As Andreas notes, there’s a registerOperatorProcessor method which does this, 
so the mechanism is already in place. The problem is not that we don’t have the 
mechanism, it’s that we’re using .properties files at all. The list of 
operators can’t be controlled from both code and from .properties lists, one 
source has to be authoritative - otherwise we’d end up with a situation where 
we have an operator disabled in a .properties file and then re-enabled in code. 
Currently we have a situation where that could happen.
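
A small illustration of that conflict, as a sketch only: TwoSourceEngine and registerOperator are hypothetical names invented for this example, not PDFBox API. The engine accepts operators from a Properties list and from code, and the list stops being a reliable description of what runs.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical engine that accepts operators from BOTH a Properties list and code.
class TwoSourceEngine {
    private final Set<String> enabled = new TreeSet<>();

    TwoSourceEngine(Properties properties) {
        // Source 1: every key in the .properties list enables an operator.
        enabled.addAll(properties.stringPropertyNames());
    }

    // Source 2: code can enable operators too.
    void registerOperator(String operator) {
        enabled.add(operator);
    }

    Set<String> enabledOperators() {
        return enabled;
    }
}

class TwoSourceDemo {
    // Parses an in-memory .properties text so the sketch needs no files.
    static Properties load(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        // "rg" was deliberately left out of the list, i.e. disabled.
        TwoSourceEngine engine = new TwoSourceEngine(load("BT=x\nET=x\nTj=x\n"));
        // ...but a subclass or caller registers it again in code.
        engine.registerOperator("rg");
        // Reading the .properties text no longer tells you what actually runs.
        System.out.println(engine.enabledOperators());
    }
}
```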

Therefore, removing the .properties files is the only workable solution. It’s 
important to note that it’s very, very unlikely that anybody is using the 
.properties files in a use-case where they are not also making some code 
changes, so the supposed benefit of “not having to recompile” never existed. 
Adding an operator would always require compile-time changes to PDFBox so that 
the PDFStreamEngine subclasses actually do something with the new operator.
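
To illustrate with the "rg" case from earlier in this thread, here is a sketch under assumptions: ColorSketchEngine and RedTextSketchEngine are invented for the example and do not mirror PDFBox's real graphics-state classes. Listing the operator only updates state; a subclass still has to be written and compiled to act on it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical engine: operators map to handlers that update graphics state.
class ColorSketchEngine {
    private final Map<String, Consumer<float[]>> operators = new HashMap<>();
    protected float[] nonStrokingRgb = {0f, 0f, 0f}; // black by default

    ColorSketchEngine() {
        // Listing "rg" only updates state; nothing observable happens yet.
        operators.put("rg", operands -> nonStrokingRgb = operands.clone());
    }

    // Unknown operators are silently ignored in this sketch.
    void processOperator(String operator, float... operands) {
        Consumer<float[]> handler = operators.get(operator);
        if (handler != null) {
            handler.accept(operands);
        }
    }
}

// The compile-time part: a subclass has to look at the state the operator set.
class RedTextSketchEngine extends ColorSketchEngine {
    boolean currentTextIsRed() {
        return nonStrokingRgb[0] == 1f && nonStrokingRgb[1] == 0f && nonStrokingRgb[2] == 0f;
    }
}

class RedTextDemo {
    public static void main(String[] args) {
        RedTextSketchEngine engine = new RedTextSketchEngine();
        engine.processOperator("rg", 0f, 0.5019f, 0f); // green, as in the sample stream
        System.out.println(engine.currentTextIsRed());
        engine.processOperator("rg", 1f, 0f, 0f);      // red
        System.out.println(engine.currentTextIsRed());
    }
}
```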

-- John

> BR
> Maruan
> 
> On 29.07.2014 at 21:48, John Hewson <j...@jahewson.com> wrote:
> 
>> Right but we need to address the confusion and complexity that has been 
>> caused by .properties files which made PDFBOX-2246 so tricky to figure out.
>> 
>> Let's remove this wart!
>> 
>> -- John
>> 
>> On 29 Jul 2014, at 10:44, Tilman Hausherr <thaush...@t-online.de> wrote:
>> 
>>> Hi,
>>> 
>>> At this time, the problem I see and wanted to solve (PDFBOX-2246) exists 
>>> regardless of whether we use a properties file or initialize directly in the 
>>> code.
>>> 
>>> Tilman
>>> 
>>> 
>>> On 29.07.2014 19:41, John Hewson wrote:
>>>> On 29 Jul 2014, at 03:44, Andreas Lehmkühler <andr...@lehmi.de> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> it's not a black and white issue (comments inline)
>>>>> 
>>>>>> John Hewson <j...@jahewson.com> wrote on 29 July 2014 at 07:44:
>>>>>> 
>>>>>> 
>>>>>> Yes, really I should have said subclasses of PDFStreamEngine -  that's 
>>>>>> where
>>>>>> the .properties file originates. I'd propose replacing the properties
>>>>>> mechanism with a simple method containing the mapping which can be 
>>>>>> overridden
>>>>>> in subclasses. Ultimately, users expect to be able to subclass the 
>>>>>> behaviour
>>>>>> of a class by just subclassing the class.
>>>>> PDFStreamEngine doesn't configure any operator set itself. The subclasses 
>>>>> are
>>>>> supposed to configure their own set of operators depending on the 
>>>>> particular
>>>>> usecase. E.g. to extend the text extraction one has to subclass 
>>>>> PDFTextStripper
>>>>> and so on.
>>>> It’s PDFStreamEngine which implements the .property mechanism though, via 
>>>> the
>>>> PDFStreamEngine(Properties properties) constructor.
>>>> 
>>>>> E.g. to extend the text extraction one has to subclass PDFTextStripper 
>>>>> and so on.
>>>> That’s true, but it’s only half the story, don’t forget that the 
>>>> .properties files need
>>>> to be copied and pasted elsewhere and modified along with overriding which 
>>>> .property
>>>> file is passed in the constructor if you want to truly override the class’ 
>>>> behaviour.
>>>> 
>>>>>> We've seen a number of incidents of confusion on the mailing list due to 
>>>>>> the
>>>>>> current design.
>>>>> IMHO, most of the confusion is based on the lack of knowledge of the pdf 
>>>>> spec.
>>>>> One can't understand how pdfbox works under the hood by simply looking at 
>>>>> the
>>>>> code. One has to understand the pdf spec as well, at least the base 
>>>>> concepts.
>>>> I’m specifically talking about confusion surrounding how to override 
>>>> operators, and
>>>> .properties files, this has come up before. This entire thread has been 
>>>> caused by
>>>> PDFBox’s design and *not* the PDF spec.
>>>> 
>>>>>> I'd say that to the modern Java developer having non-code runtime 
>>>>>> binding has
>>>>>> become an anti-pattern, resulting in brittle code which can't easily be
>>>>>> navigated in an IDE and which resists automated analysis and exhibits 
>>>>>> runtime
>>>>>> failures despite compiling ok. This is one of those cases where the 
>>>>>> collective
>>>>>> wisdom has just evolved over the years.
>>>>> It depends on the given usecase. All solutions have advantages and
>>>>> disadvantages. E.g. if someone wants to configure the PDFTextStripper 
>>>>> without
>>>>> recompiling the code, it is quite handy to keep the configuration in a 
>>>>> text
>>>>> file.
>>>> Has anybody *ever* wanted to change the operators which PDFTextStripper is
>>>> processing without recompiling the code? These are internal implementation
>>>> details that shouldn’t be exposed in the first place - it’s not a 
>>>> “configuration” at
>>>> all, especially as 99% of possible changes would just break 
>>>> PDFTextStripper.
>>>> 
>>>>> In this case I'm neither pro nor con a text based config, but I tend to 
>>>>> agree
>>>>> with John to have the different configurations in some method within the
>>>>> subclasses of PDFStreamEngine.
>>>> As above, this isn’t “configuration” at all, it lacks even a basic use 
>>>> case. I don’t
>>>> see any pros which aren’t fabricated for the sake of argument, but the 
>>>> cons are
>>>> causing us significant problems right here, right now.
>>>> 
>>>>> BR
>>>>> Andreas Lehmkühler
>>>>> 
>>>>>> -- John
>>>>>> 
>>>>>>> On 28 Jul 2014, at 13:42, Tilman Hausherr <thaush...@t-online.de> wrote:
>>>>>>> 
>>>>>>> I disagree - one doesn't *have* to pass a property file to 
>>>>>>> PDFTextStripper
>>>>>>> and PageDrawer. The properties file for PDFTextStripper is optional. The
>>>>>>> property parameter was already there before it became an apache project.
>>>>>>> 
>>>>>>> 
>>>>>>> Tilman
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 28.07.2014 22:08, John Hewson wrote:
>>>>>>>> We need to get rid of these .properties files, they’re causing endless
>>>>>>>> confusion, not to mention that they hide runtime dependencies in text
>>>>>>>> files.
>>>>>>>> 
>>>>>>>> We should make it so that overriding a TextStripper, PageDrawer, etc.
>>>>>>>> doesn’t require external .properties files, currently Preflight works 
>>>>>>>> in
>>>>>>>> this manner and it’s much clearer.
>>>>>>>> 
>>>>>>>> I guess this is a legacy of the “old” ways of Java XML everything.
>>>>>>>> 
>>>>>>>> -- John
>>>>>>>> 
>>>>>>>>> On 27 Jul 2014, at 10:09, -A <aa...@hrtmn.net> wrote:
>>>>>>>>> 
>>>>>>>>> Thank you, that works as promised and removes the warning. I'm still
>>>>>>>>> hoping
>>>>>>>>> to find a resource that better explains the pieces of PDFBox and how 
>>>>>>>>> they
>>>>>>>>> work together. Unfortunately most posts on the internet are solely 
>>>>>>>>> how and
>>>>>>>>> not why.
>>>>>>>>> 
>>>>>>>>> Appreciate it!
>>>>>>>>> 
>>>>>>>>> -Aaron
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Sun, Jul 27, 2014 at 8:00 AM, Tilman Hausherr 
>>>>>>>>> <thaush...@t-online.de>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> That didn't happen to me, but maybe it did happen to you with another
>>>>>>>>>> file.
>>>>>>>>>> 
>>>>>>>>>> Another solution would be to pass your own properties file, and it 
>>>>>>>>>> should
>>>>>>>>>> have this content:
>>>>>>>>>> 
>>>>>>>>>> =======================
>>>>>>>>>> # Licensed to the Apache Software Foundation (ASF) under one or more
>>>>>>>>>> # contributor license agreements.  See the NOTICE file distributed 
>>>>>>>>>> with
>>>>>>>>>> # this work for additional information regarding copyright ownership.
>>>>>>>>>> # The ASF licenses this file to You under the Apache License, 
>>>>>>>>>> Version 2.0
>>>>>>>>>> # (the "License"); you may not use this file except in compliance 
>>>>>>>>>> with
>>>>>>>>>> # the License.  You may obtain a copy of the License at
>>>>>>>>>> #
>>>>>>>>>> #      http://www.apache.org/licenses/LICENSE-2.0
>>>>>>>>>> #
>>>>>>>>>> # Unless required by applicable law or agreed to in writing, software
>>>>>>>>>> # distributed under the License is distributed on an "AS IS" BASIS,
>>>>>>>>>> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
>>>>>>>>>> implied.
>>>>>>>>>> # See the License for the specific language governing permissions and
>>>>>>>>>> # limitations under the License.
>>>>>>>>>> 
>>>>>>>>>> # This table maps PDF stream operators to concrete 
>>>>>>>>>> OperatorProcessor
>>>>>>>>>> # subclasses that are used by the PDFStreamEngine class to interpret 
>>>>>>>>>> the
>>>>>>>>>> # PDF document. The classes configured here allow the PDFTextStripper
>>>>>>>>>> # subclass of PDFStreamEngine to extract text content of the 
>>>>>>>>>> document.
>>>>>>>>>> 
>>>>>>>>>> BT = org.apache.pdfbox.util.operator.BeginText
>>>>>>>>>> cm = org.apache.pdfbox.util.operator.Concatenate
>>>>>>>>>> Do = org.apache.pdfbox.util.operator.Invoke
>>>>>>>>>> ET = org.apache.pdfbox.util.operator.EndText
>>>>>>>>>> gs = org.apache.pdfbox.util.operator.SetGraphicsStateParameters
>>>>>>>>>> q  = org.apache.pdfbox.util.operator.GSave
>>>>>>>>>> Q  = org.apache.pdfbox.util.operator.GRestore
>>>>>>>>>> T* = org.apache.pdfbox.util.operator.NextLine
>>>>>>>>>> Tc = org.apache.pdfbox.util.operator.SetCharSpacing
>>>>>>>>>> Td = org.apache.pdfbox.util.operator.MoveText
>>>>>>>>>> TD = org.apache.pdfbox.util.operator.MoveTextSetLeading
>>>>>>>>>> Tf = org.apache.pdfbox.util.operator.SetTextFont
>>>>>>>>>> Tj = org.apache.pdfbox.util.operator.ShowText
>>>>>>>>>> TJ = org.apache.pdfbox.util.operator.ShowTextGlyph
>>>>>>>>>> TL = org.apache.pdfbox.util.operator.SetTextLeading
>>>>>>>>>> Tm = org.apache.pdfbox.util.operator.SetMatrix
>>>>>>>>>> Tr = org.apache.pdfbox.util.operator.SetTextRenderingMode
>>>>>>>>>> Ts = org.apache.pdfbox.util.operator.SetTextRise
>>>>>>>>>> Tw = org.apache.pdfbox.util.operator.SetWordSpacing
>>>>>>>>>> Tz = org.apache.pdfbox.util.operator.SetHorizontalTextScaling
>>>>>>>>>> w  = org.apache.pdfbox.util.operator.SetLineWidth
>>>>>>>>>> \' = org.apache.pdfbox.util.operator.MoveAndShow
>>>>>>>>>> \" = org.apache.pdfbox.util.operator.SetMoveAndShow
>>>>>>>>>> 
>>>>>>>>>> CS=org.apache.pdfbox.util.operator.SetStrokingColorSpace
>>>>>>>>>> cs=org.apache.pdfbox.util.operator.SetNonStrokingColorSpace
>>>>>>>>>> rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor
>>>>>>>>>> G=org.apache.pdfbox.util.operator.SetStrokingGrayColor
>>>>>>>>>> g=org.apache.pdfbox.util.operator.SetNonStrokingGrayColor
>>>>>>>>>> K=org.apache.pdfbox.util.operator.SetStrokingCMYKColor
>>>>>>>>>> k=org.apache.pdfbox.util.operator.SetNonStrokingCMYKColor
>>>>>>>>>> RG=org.apache.pdfbox.util.operator.SetStrokingRGBColor
>>>>>>>>>> rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor
>>>>>>>>>> SC=org.apache.pdfbox.util.operator.SetStrokingColor
>>>>>>>>>> sc=org.apache.pdfbox.util.operator.SetNonStrokingColor
>>>>>>>>>> SCN=org.apache.pdfbox.util.operator.SetStrokingColor
>>>>>>>>>> scn=org.apache.pdfbox.util.operator.SetNonStrokingColor
>>>>>>>>>> 
>>>>>>>>>> # The following operators are not relevant to text extraction,
>>>>>>>>>> # so we can silently ignore them.
>>>>>>>>>> 
>>>>>>>>>> b
>>>>>>>>>> B
>>>>>>>>>> b*
>>>>>>>>>> B*
>>>>>>>>>> BDC
>>>>>>>>>> BI
>>>>>>>>>> BMC
>>>>>>>>>> BX
>>>>>>>>>> c
>>>>>>>>>> d
>>>>>>>>>> d0
>>>>>>>>>> d1
>>>>>>>>>> DP
>>>>>>>>>> EI
>>>>>>>>>> EMC
>>>>>>>>>> EX
>>>>>>>>>> f
>>>>>>>>>> F
>>>>>>>>>> f*
>>>>>>>>>> h
>>>>>>>>>> i
>>>>>>>>>> ID
>>>>>>>>>> j
>>>>>>>>>> J
>>>>>>>>>> l
>>>>>>>>>> m
>>>>>>>>>> M
>>>>>>>>>> MP
>>>>>>>>>> n
>>>>>>>>>> re
>>>>>>>>>> ri
>>>>>>>>>> s
>>>>>>>>>> S
>>>>>>>>>> sh
>>>>>>>>>> v
>>>>>>>>>> W
>>>>>>>>>> W*
>>>>>>>>>> y
>>>>>>>>>> 
>>>>>>>>>> =======================
>>>>>>>>>> 
>>>>>>>>>> Tilman
>>>>>>>>>> 
>>>>>>>>>> On 27.07.2014 15:54, -A wrote:
>>>>>>>>>> 
>>>>>>>>>> Tilman;
>>>>>>>>>>> That is somewhat embarrassing. At one point I brought this to the
>>>>>>>>>>> mailing
>>>>>>>>>>> list (because of the following warning) and was told to remove that 
>>>>>>>>>>> line
>>>>>>>>>>> because the TextStripper wasn't actually a PageDrawer. The 
>>>>>>>>>>> functionality
>>>>>>>>>>> still worked after that, however.
>>>>>>>>>>> 
>>>>>>>>>>> Is there a way to do this without the warning, perhaps something 
>>>>>>>>>>> within
>>>>>>>>>>> PageDrawer?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Thank you,
>>>>>>>>>>> -Aaron
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> WARNING: java.lang.ClassCastException: IncrementalPDFStripper 
>>>>>>>>>>> cannot be
>>>>>>>>>>> cast to org.apache.pdfbox.pdfviewer.PageDrawer
>>>>>>>>>>> java.lang.ClassCastException: IncrementalPDFStripper cannot be cast 
>>>>>>>>>>> to
>>>>>>>>>>> org.apache.pdfbox.pdfviewer.PageDrawer
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.pdfbox.util.operator.pagedrawer.AppendRectangleToPath.process(
>>>>>>>>>>> AppendRectangleToPath.java:46)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processOperator(
>>>>>>>>>>> PDFStreamEngine.java:557)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(
>>>>>>>>>>> PDFStreamEngine.java:268)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(
>>>>>>>>>>> PDFStreamEngine.java:235)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.pdfbox.util.PDFStreamEngine.processStream(
>>>>>>>>>>> PDFStreamEngine.java:215)
>>>>>>>>>>> at 
>>>>>>>>>>> IncrementalPDFStripper.containsRed(IncrementalPDFStripper.java:90)
>>>>>>>>>>> at IncrementalPDFStripper.main(IncrementalPDFStripper.java:56)
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Sun, Jul 27, 2014 at 5:47 AM, Tilman Hausherr 
>>>>>>>>>>> <thaush...@t-online.de>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> It is even easier than I thought - replace super() with this:
>>>>>>>>>>>> super(ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties", true));
>>>>>>>>>>>> 
>>>>>>>>>>>> Tilman
>>>>>>>>>>>> 
>>>>>>>>>>>> On 27.07.2014 13:03, Tilman Hausherr wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> After having written the text below, I tested by including the "rg"
>>>>>>>>>>>> 
>>>>>>>>>>>>> operator in the properties list and now it worked. I also tested
>>>>>>>>>>>>> deleting
>>>>>>>>>>>>> your println and instead adding this if the text is red:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>    System.out.print (textPos.getCharacter());
>>>>>>>>>>>>> 
>>>>>>>>>>>>> and so I got this output:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 21_Key .1295 R~Wall Prof LinP 0.003             0.004     0.000 
>>>>>>>>>>>>> true
>>>>>>>>>>>>> 
>>>>>>>>>>>>> which is exactly what is red in the PDF.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Another way (probably better) to do it would be not to derive from
>>>>>>>>>>>>> PDFTextStripper but from PDFStreamEngine, and to construct it with
>>>>>>>>>>>>> 
>>>>>>>>>>>>> ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties")
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> see also http://stackoverflow.com/a/9157714/535646
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Tilman
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 27.07.2014 12:14, Tilman Hausherr wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> Do you still have the code that worked?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'm not the text extraction specialist here, but what I did was 
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> look
>>>>>>>>>>>>>> in the uncompressed source of the PDF. The stream has code like 
>>>>>>>>>>>>>> this:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 0 0 0 rg
>>>>>>>>>>>>>> 0 0.5019 0 rg
>>>>>>>>>>>>>> 1 0 0 rg
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The first line sets to black, the second to green, the third to 
>>>>>>>>>>>>>> red.
>>>>>>>>>>>>>> And
>>>>>>>>>>>>>> from what I saw, it can't work at all, because the "rg" operator
>>>>>>>>>>>>>> isn't
>>>>>>>>>>>>>> processed when extracting text, because 
>>>>>>>>>>>>>> PDFTextStripper.properties
>>>>>>>>>>>>>> doesn't
>>>>>>>>>>>>>> contain the "rg" operator. (The operator is in another list, 
>>>>>>>>>>>>>> which is
>>>>>>>>>>>>>> used
>>>>>>>>>>>>>> when rendering)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So that is what puzzles me. I think it can't work at all. But you
>>>>>>>>>>>>>> said
>>>>>>>>>>>>>> it did work at a time.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Tilman
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 27.07.2014 07:43, Tilman Hausherr wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> Please upload the PDF somewhere and post the URL, PDF files are
>>>>>>>>>>>>>>> removed
>>>>>>>>>>>>>>> from the mailing list.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Tilman
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 27.07.2014 02:35, -A wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hello again. I've been trying to figure out this issue that has 
>>>>>>>>>>>>>>> come
>>>>>>>>>>>>>>>> up for me and in my research I found someone posting on
>>>>>>>>>>>>>>>> StackOverflow (
>>>>>>>>>>>>>>>> http://stackoverflow.com/questions/10844271/how-to-get-
>>>>>>>>>>>>>>>> font-color-using-pdfbox) a similar issue where they could not 
>>>>>>>>>>>>>>>> read
>>>>>>>>>>>>>>>> any colors from a PDF. The user posted the code and someone 
>>>>>>>>>>>>>>>> else
>>>>>>>>>>>>>>>> took it,
>>>>>>>>>>>>>>>> ran it, and reported that it worked. The user's approach was
>>>>>>>>>>>>>>>> different than
>>>>>>>>>>>>>>>> mine, but alas.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I'm not sure at this point what is going on. I have stepped 
>>>>>>>>>>>>>>>> through
>>>>>>>>>>>>>>>> each individual character and checked the PDGraphicsState 
>>>>>>>>>>>>>>>> object,
>>>>>>>>>>>>>>>> and even
>>>>>>>>>>>>>>>> when I am looking at an open file with visibly red text 
>>>>>>>>>>>>>>>> (attached)
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> debugger only reports DeviceGray. If I print out the ColorSpace
>>>>>>>>>>>>>>>> name
>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>> the PDGraphicsState this is what is printed - for every 
>>>>>>>>>>>>>>>> character.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I would appreciate if someone could perhaps run the attached 
>>>>>>>>>>>>>>>> text
>>>>>>>>>>>>>>>> stripper with the attached PDF file and report back if it 
>>>>>>>>>>>>>>>> actually
>>>>>>>>>>>>>>>> prints
>>>>>>>>>>>>>>>> true instead of false, as it does for me. Since I saw this
>>>>>>>>>>>>>>>> occurrence
>>>>>>>>>>>>>>>> elsewhere I'd like to rule that out - in case an IDE setting of
>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>> sort
>>>>>>>>>>>>>>>> may be causing this?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> It should be noted that I began using PDFBox with 1.8.5 and had
>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>> code working fine. Still with 1.8.5 yesterday it was failing.
>>>>>>>>>>>>>>>> Upgrading to
>>>>>>>>>>>>>>>> 1.8.6 yielded the same results.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> If this is an actual issue I do not mind attempting to solve 
>>>>>>>>>>>>>>>> it if
>>>>>>>>>>>>>>>> someone may have a general idea where to point me as to prevent
>>>>>>>>>>>>>>>> needless
>>>>>>>>>>>>>>>> meddling with graphics state objects. Or, if this should be
>>>>>>>>>>>>>>>> reported
>>>>>>>>>>>>>>>> I can
>>>>>>>>>>>>>>>> do that as well.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> -Aaron
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> *Previous Message:*
>>>>>>>>>>>>>>>> I've attached an updated stripper file with the only addition 
>>>>>>>>>>>>>>>> being
>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>> main function to test the class specifically.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> When run with the PDF I have also attached it indeed does not
>>>>>>>>>>>>>>>> recognize the red text.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> At this point it seems that this issue is solely dependent on
>>>>>>>>>>>>>>>> PDFBox.
>>>>>>>>>>>>>>>> I'll stay tuned for some insight hopefully. If any other
>>>>>>>>>>>>>>>> information
>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>> needed, let me know!
>>> 
>> 
> 
