thx for the hint.

Maruan Sahyoun

> On 30.07.2014 at 12:33, Andreas Lehmkühler <andr...@lehmi.de> wrote:
> 
> 
> 
>> On 30 July 2014 at 08:12, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
>> 
>> 
>> +1 for removing the .properties file if the new mechanism is easier to
>> understand and handle. The discussion doesn’t provide that proof or some
>> information about that.
>> 
>> What would a replacement look like?
>> 
>> OTOH if it’s a documentation issue we could also add some more information to
>> the javadocs to explain the dependencies.
>> 
>> We could add register/unregister methods to allow adding and removing custom
>> operator handling, or provide a service discovery mechanism. This way we would
>> still have the old flexibility.
> There is already the method registerOperatorProcessor in PDFStreamEngine to
> register operators. In most cases it's called when processing the property 
> file.
> In the case of preflight (see PreflightStreamEngine) those register calls are
> done directly within the constructor. There isn't any unregister method.
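To make the register/unregister idea concrete, here is a minimal sketch in plain Java. It deliberately does not use PDFBox: `SketchEngine`, `Handler` and `unregisterOperatorProcessor` are hypothetical stand-ins, and only the name `registerOperatorProcessor` comes from this thread (its real signature may differ). The subclass wires its operators in the constructor, in the spirit of what is described above for PreflightStreamEngine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RegisterSketch {
    /** Hypothetical stand-in for an operator handler; not the PDFBox type. */
    public interface Handler {
        void process(String operator, List<String> log);
    }

    /** Hypothetical stand-in for a stream engine that owns an operator table. */
    public static class SketchEngine {
        private final Map<String, Handler> handlers = new HashMap<>();

        public void registerOperatorProcessor(String operator, Handler handler) {
            handlers.put(operator, handler);
        }

        // Assumption: per the thread, no such method exists yet.
        public void unregisterOperatorProcessor(String operator) {
            handlers.remove(operator);
        }

        /** Runs the operators in order; operators without a handler are skipped. */
        public List<String> run(String... operators) {
            List<String> log = new ArrayList<>();
            for (String op : operators) {
                Handler handler = handlers.get(op);
                if (handler != null) {
                    handler.process(op, log);
                }
            }
            return log;
        }
    }

    /** Subclass that registers its operators directly in the constructor. */
    public static class ColorAwareEngine extends SketchEngine {
        public ColorAwareEngine() {
            registerOperatorProcessor("Tj", (op, log) -> log.add("text:" + op));
            registerOperatorProcessor("rg", (op, log) -> log.add("color:" + op));
        }
    }

    public static void main(String[] args) {
        ColorAwareEngine engine = new ColorAwareEngine();
        System.out.println(engine.run("rg", "Tj", "re")); // [color:rg, text:Tj]
        engine.unregisterOperatorProcessor("rg");
        System.out.println(engine.run("rg", "Tj", "re")); // [text:Tj]
    }
}
```

The point of the sketch is only that the operator set becomes visible in code and navigable in an IDE; whether an unregister method is worth adding is a separate question.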
> 
> BR
> Andreas Lehmkühler
> 
>> 
>> BR
>> Maruan
>> 
>>> On 29.07.2014 at 21:48, John Hewson <j...@jahewson.com> wrote:
>>> 
>>> Right but we need to address the confusion and complexity that has been
>>> caused by .properties files which made PDFBOX-2246 so tricky to figure out.
>>> 
>>> Let's remove this wart!
>>> 
>>> -- John
>>> 
>>>> On 29 Jul 2014, at 10:44, Tilman Hausherr <thaush...@t-online.de> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> At this time, the problem I see and wanted to solve (PDFBOX-2246) exists
>>>> regardless whether we use a properties file or initialize directly in the
>>>> code.
>>>> 
>>>> Tilman
>>>> 
>>>> 
>>>> On 29.07.2014 at 19:41, John Hewson wrote:
>>>>> On 29 Jul 2014, at 03:44, Andreas Lehmkühler <andr...@lehmi.de> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> it's not a black and white issue (comments inline)
>>>>>> 
>>>>>>> On 29 July 2014 at 07:44, John Hewson <j...@jahewson.com> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> Yes, really I should have said subclasses of PDFStreamEngine - that's
>>>>>>> where the .properties file originates. I'd propose replacing the
>>>>>>> properties mechanism with a simple method containing the mapping which
>>>>>>> can be overridden in subclasses. Ultimately, users expect to be able to
>>>>>>> change the behaviour of a class by just subclassing the class.
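A rough sketch of what "a simple method containing the mapping which can be overridden in subclasses" could look like. Everything here is hypothetical (`TextEngine`, `ColorTextEngine`, `operators()`, and the use of plain strings instead of handler objects); it only illustrates the shape of the proposal, not an actual PDFBox design.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OverrideSketch {
    /** Hypothetical base class: the mapping lives in code, not in a .properties file. */
    public static class TextEngine {
        /** Operator name -> handler name. Subclasses override this to extend the table. */
        protected Map<String, String> operators() {
            Map<String, String> ops = new LinkedHashMap<>();
            ops.put("BT", "BeginText");
            ops.put("Tj", "ShowText");
            ops.put("ET", "EndText");
            return ops;
        }

        /** Looks up an operator; rebuilding the map per call keeps the sketch short. */
        public String describe(String operator) {
            return operators().getOrDefault(operator, "ignored");
        }
    }

    /** Hypothetical subclass that adds one color operator to the inherited table. */
    public static class ColorTextEngine extends TextEngine {
        @Override
        protected Map<String, String> operators() {
            Map<String, String> ops = super.operators();
            ops.put("rg", "SetNonStrokingRGBColor");
            return ops;
        }
    }

    public static void main(String[] args) {
        System.out.println(new TextEngine().describe("rg"));      // ignored
        System.out.println(new ColorTextEngine().describe("rg")); // SetNonStrokingRGBColor
    }
}
```

With this shape, a typo in a handler name fails at compile time once real classes are used instead of strings, which is the main argument made against the text-file binding.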
>>>>>> PDFStreamEngine doesn't configure any operator set itself. The subclasses
>>>>>> are supposed to configure their own set of operators depending on the
>>>>>> particular use case. E.g. to extend the text extraction one has to
>>>>>> subclass PDFTextStripper and so on.
>>>>> It’s PDFStreamEngine which implements the .properties mechanism though,
>>>>> via the PDFStreamEngine(Properties properties) constructor.
>>>>> 
>>>>>> E.g. to extend the text extraction one has to subclass PDFTextStripper
>>>>>> and so on.
>>>>> That’s true, but it’s only half the story. Don’t forget that the
>>>>> .properties files need to be copied and pasted elsewhere and modified,
>>>>> along with overriding which .properties file is passed in the constructor,
>>>>> if you want to truly override the class’ behaviour.
>>>>> 
>>>>>>> We've seen a number of incidents of confusion on the mailing list due to
>>>>>>> the
>>>>>>> current design.
>>>>>> IMHO, most of the confusion is based on the lack of knowledge of the PDF
>>>>>> spec. One can't understand how PDFBox works under the hood by simply
>>>>>> looking at the code. One has to understand the PDF spec as well, at least
>>>>>> the base concepts.
>>>>> I’m specifically talking about confusion surrounding how to override
>>>>> operators and .properties files; this has come up before. This entire
>>>>> thread has been caused by PDFBox’s design and *not* the PDF spec.
>>>>> 
>>>>>>> I'd say that to the modern Java developer having non-code runtime
>>>>>>> binding has become an anti-pattern, resulting in brittle code which
>>>>>>> can't easily be navigated in an IDE and which resists automated analysis
>>>>>>> and exhibits runtime failures despite compiling ok. This is one of those
>>>>>>> cases where the collective wisdom has just evolved over the years.
>>>>>> It depends on the given use case. All solutions have advantages and
>>>>>> disadvantages. E.g. if someone wants to configure the PDFTextStripper
>>>>>> without recompiling the code, it is quite handy to keep the configuration
>>>>>> in a text file.
>>>>> Has anybody *ever* wanted to change the operators which PDFTextStripper is
>>>>> processing without recompiling the code? These are internal implementation
>>>>> details that shouldn’t be exposed in the first place - it’s not a
>>>>> “configuration” at all, especially as 99% of possible changes would just
>>>>> break PDFTextStripper.
>>>>> 
>>>>>> In this case I'm neither pro nor con a text based config, but I tend to
>>>>>> agree with John to have the different configurations in some method
>>>>>> within the subclasses of PDFStreamEngine.
>>>>> As above, this isn’t “configuration” at all, it lacks even a basic use
>>>>> case. I don’t see any pros which aren’t fabricated for the sake of
>>>>> argument, but the cons are causing us significant problems right here,
>>>>> right now.
>>>>> 
>>>>>> BR
>>>>>> Andreas Lehmkühler
>>>>>> 
>>>>>>> -- John
>>>>>>> 
>>>>>>>> On 28 Jul 2014, at 13:42, Tilman Hausherr <thaush...@t-online.de> wrote:
>>>>>>>> 
>>>>>>>> I disagree - one doesn't *have* to pass a property file to
>>>>>>>> PDFTextStripper and PageDrawer. The properties file for PDFTextStripper
>>>>>>>> is optional. The property parameter was already there before it became
>>>>>>>> an Apache project.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Tilman
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 28.07.2014 at 22:08, John Hewson wrote:
>>>>>>>>> We need to get rid of these .properties files. They’re causing endless
>>>>>>>>> confusion, not to mention that they hide runtime dependencies in text
>>>>>>>>> files.
>>>>>>>>>
>>>>>>>>> We should make it so that overriding a TextStripper, PageDrawer, etc.
>>>>>>>>> doesn’t require external .properties files. Currently Preflight works
>>>>>>>>> in this manner and it’s much clearer.
>>>>>>>>> 
>>>>>>>>> I guess this is a legacy of the “old” ways of Java XML everything.
>>>>>>>>> 
>>>>>>>>> -- John
>>>>>>>>> 
>>>>>>>>>> On 27 Jul 2014, at 10:09, -A <aa...@hrtmn.net> wrote:
>>>>>>>>>> 
>>>>>>>>>> Thank you, that works as promised and removes the warning. I'm still
>>>>>>>>>> hoping to find a resource that better explains the pieces of PDFBox
>>>>>>>>>> and how they work together. Unfortunately most posts on the internet
>>>>>>>>>> are solely how and not why.
>>>>>>>>>> 
>>>>>>>>>> Appreciate it!
>>>>>>>>>> 
>>>>>>>>>> -Aaron
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Sun, Jul 27, 2014 at 8:00 AM, Tilman Hausherr <thaush...@t-online.de> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> That didn't happen to me, but maybe it did happen to you with
>>>>>>>>>>> another file.
>>>>>>>>>>>
>>>>>>>>>>> Another solution would be to pass your own properties file, and it
>>>>>>>>>>> should have this content:
>>>>>>>>>>> 
>>>>>>>>>>> =======================
>>>>>>>>>>> # Licensed to the Apache Software Foundation (ASF) under one or more
>>>>>>>>>>> # contributor license agreements.  See the NOTICE file distributed with
>>>>>>>>>>> # this work for additional information regarding copyright ownership.
>>>>>>>>>>> # The ASF licenses this file to You under the Apache License, Version 2.0
>>>>>>>>>>> # (the "License"); you may not use this file except in compliance with
>>>>>>>>>>> # the License.  You may obtain a copy of the License at
>>>>>>>>>>> #
>>>>>>>>>>> #      http://www.apache.org/licenses/LICENSE-2.0
>>>>>>>>>>> #
>>>>>>>>>>> # Unless required by applicable law or agreed to in writing, software
>>>>>>>>>>> # distributed under the License is distributed on an "AS IS" BASIS,
>>>>>>>>>>> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>>>>>>>>>>> # See the License for the specific language governing permissions and
>>>>>>>>>>> # limitations under the License.
>>>>>>>>>>>
>>>>>>>>>>> # This table maps PDF stream operators to concrete OperatorProcessor
>>>>>>>>>>> # subclasses that are used by the PDFStreamEngine class to interpret the
>>>>>>>>>>> # PDF document. The classes configured here allow the PDFTextStripper
>>>>>>>>>>> # subclass of PDFStreamEngine to extract text content of the document.
>>>>>>>>>>> 
>>>>>>>>>>> BT = org.apache.pdfbox.util.operator.BeginText
>>>>>>>>>>> cm = org.apache.pdfbox.util.operator.Concatenate
>>>>>>>>>>> Do = org.apache.pdfbox.util.operator.Invoke
>>>>>>>>>>> ET = org.apache.pdfbox.util.operator.EndText
>>>>>>>>>>> gs = org.apache.pdfbox.util.operator.SetGraphicsStateParameters
>>>>>>>>>>> q  = org.apache.pdfbox.util.operator.GSave
>>>>>>>>>>> Q  = org.apache.pdfbox.util.operator.GRestore
>>>>>>>>>>> T* = org.apache.pdfbox.util.operator.NextLine
>>>>>>>>>>> Tc = org.apache.pdfbox.util.operator.SetCharSpacing
>>>>>>>>>>> Td = org.apache.pdfbox.util.operator.MoveText
>>>>>>>>>>> TD = org.apache.pdfbox.util.operator.MoveTextSetLeading
>>>>>>>>>>> Tf = org.apache.pdfbox.util.operator.SetTextFont
>>>>>>>>>>> Tj = org.apache.pdfbox.util.operator.ShowText
>>>>>>>>>>> TJ = org.apache.pdfbox.util.operator.ShowTextGlyph
>>>>>>>>>>> TL = org.apache.pdfbox.util.operator.SetTextLeading
>>>>>>>>>>> Tm = org.apache.pdfbox.util.operator.SetMatrix
>>>>>>>>>>> Tr = org.apache.pdfbox.util.operator.SetTextRenderingMode
>>>>>>>>>>> Ts = org.apache.pdfbox.util.operator.SetTextRise
>>>>>>>>>>> Tw = org.apache.pdfbox.util.operator.SetWordSpacing
>>>>>>>>>>> Tz = org.apache.pdfbox.util.operator.SetHorizontalTextScaling
>>>>>>>>>>> w  = org.apache.pdfbox.util.operator.SetLineWidth
>>>>>>>>>>> \' = org.apache.pdfbox.util.operator.MoveAndShow
>>>>>>>>>>> \" = org.apache.pdfbox.util.operator.SetMoveAndShow
>>>>>>>>>>> 
>>>>>>>>>>> CS=org.apache.pdfbox.util.operator.SetStrokingColorSpace
>>>>>>>>>>> cs=org.apache.pdfbox.util.operator.SetNonStrokingColorSpace
>>>>>>>>>>> rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor
>>>>>>>>>>> G=org.apache.pdfbox.util.operator.SetStrokingGrayColor
>>>>>>>>>>> g=org.apache.pdfbox.util.operator.SetNonStrokingGrayColor
>>>>>>>>>>> K=org.apache.pdfbox.util.operator.SetStrokingCMYKColor
>>>>>>>>>>> k=org.apache.pdfbox.util.operator.SetNonStrokingCMYKColor
>>>>>>>>>>> RG=org.apache.pdfbox.util.operator.SetStrokingRGBColor
>>>>>>>>>>> rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor
>>>>>>>>>>> SC=org.apache.pdfbox.util.operator.SetStrokingColor
>>>>>>>>>>> sc=org.apache.pdfbox.util.operator.SetNonStrokingColor
>>>>>>>>>>> SCN=org.apache.pdfbox.util.operator.SetStrokingColor
>>>>>>>>>>> scn=org.apache.pdfbox.util.operator.SetNonStrokingColor
>>>>>>>>>>> 
>>>>>>>>>>> # The following operators are not relevant to text extraction,
>>>>>>>>>>> # so we can silently ignore them.
>>>>>>>>>>> 
>>>>>>>>>>> b
>>>>>>>>>>> B
>>>>>>>>>>> b*
>>>>>>>>>>> B*
>>>>>>>>>>> BDC
>>>>>>>>>>> BI
>>>>>>>>>>> BMC
>>>>>>>>>>> BX
>>>>>>>>>>> c
>>>>>>>>>>> d
>>>>>>>>>>> d0
>>>>>>>>>>> d1
>>>>>>>>>>> DP
>>>>>>>>>>> EI
>>>>>>>>>>> EMC
>>>>>>>>>>> EX
>>>>>>>>>>> f
>>>>>>>>>>> F
>>>>>>>>>>> f*
>>>>>>>>>>> h
>>>>>>>>>>> i
>>>>>>>>>>> ID
>>>>>>>>>>> j
>>>>>>>>>>> J
>>>>>>>>>>> l
>>>>>>>>>>> m
>>>>>>>>>>> M
>>>>>>>>>>> MP
>>>>>>>>>>> n
>>>>>>>>>>> re
>>>>>>>>>>> ri
>>>>>>>>>>> s
>>>>>>>>>>> S
>>>>>>>>>>> sh
>>>>>>>>>>> v
>>>>>>>>>>> W
>>>>>>>>>>> W*
>>>>>>>>>>> y
>>>>>>>>>>> 
>>>>>>>>>>> =======================
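For anyone wondering how a file like the one above is read: the standard `java.util.Properties` loader gives a bare key such as `re` an empty string as its value, trims the whitespace around `=`, and drops the backslash in `\'`. The sketch below shows just that behaviour on a shortened sample. `OperatorTableSketch`, `SAMPLE` and `classify` are hypothetical names, and treating an empty value as "ignore this operator" is my reading of the file's own comment, not a statement about the engine's code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class OperatorTableSketch {
    /** Shortened sample in the same format as the file quoted above. */
    public static final String SAMPLE = String.join("\n",
            "# mapped operators",
            "BT = org.apache.pdfbox.util.operator.BeginText",
            "q  = org.apache.pdfbox.util.operator.GSave",
            "rg=org.apache.pdfbox.util.operator.SetNonStrokingRGBColor",
            "\\' = org.apache.pdfbox.util.operator.MoveAndShow",
            "# operators listed without a value",
            "re",
            "W*");

    /** Parses the text with the standard Properties loader. */
    public static Properties load(String text) {
        Properties table = new Properties();
        try {
            table.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return table;
    }

    /** Assumption: empty value means "silently ignore", missing key means "unknown". */
    public static String classify(Properties table, String operator) {
        String value = table.getProperty(operator);
        if (value == null) {
            return "unknown";
        }
        return value.isEmpty() ? "ignored" : "handled by " + value;
    }

    public static void main(String[] args) {
        Properties table = load(SAMPLE);
        for (String op : new String[] {"q", "rg", "'", "re", "Do"}) {
            System.out.println(op + " -> " + classify(table, op));
        }
    }
}
```

Note that any mail-wrapped comment fragment pasted into such a file (a line with a single word and no leading `#`) would be parsed as one more key with an empty value, so the comment lines have to stay on one line each.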
>>>>>>>>>>> 
>>>>>>>>>>> Tilman
>>>>>>>>>>> 
>>>>>>>>>>> On 27.07.2014 at 15:54, -A wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Tilman;
>>>>>>>>>>>> That is somewhat embarrassing. At one point I brought this to the
>>>>>>>>>>>> mailing list (because of the following warning) and was told to
>>>>>>>>>>>> remove that line because the TextStripper wasn't actually a
>>>>>>>>>>>> PageDrawer. The functionality still worked after that, however.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there a way to do this without the warning, perhaps something
>>>>>>>>>>>> within PageDrawer?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Thank you,
>>>>>>>>>>>> -Aaron
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> WARNING: java.lang.ClassCastException: IncrementalPDFStripper cannot be cast to org.apache.pdfbox.pdfviewer.PageDrawer
>>>>>>>>>>>> java.lang.ClassCastException: IncrementalPDFStripper cannot be cast to org.apache.pdfbox.pdfviewer.PageDrawer
>>>>>>>>>>>>   at org.apache.pdfbox.util.operator.pagedrawer.AppendRectangleToPath.process(AppendRectangleToPath.java:46)
>>>>>>>>>>>>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:557)
>>>>>>>>>>>>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
>>>>>>>>>>>>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
>>>>>>>>>>>>   at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
>>>>>>>>>>>>   at IncrementalPDFStripper.containsRed(IncrementalPDFStripper.java:90)
>>>>>>>>>>>>   at IncrementalPDFStripper.main(IncrementalPDFStripper.java:56)
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Sun, Jul 27, 2014 at 5:47 AM, Tilman Hausherr <thaush...@t-online.de> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> It is even easier than I thought - replace super() with this:
>>>>>>>>>>>>> super(ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties", true));
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Tilman
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 27.07.2014 at 13:03, Tilman Hausherr wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> After having written the text below, I tested by including the "rg"
>>>>>>>>>>>>>> operator in the properties list and now it worked. I also tested
>>>>>>>>>>>>>> deleting your println and instead adding this if the text is red:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>      System.out.print (textPos.getCharacter());
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> and so I got this output:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 21_Key .1295 R~Wall Prof LinP 0.003             0.004     0.000
>>>>>>>>>>>>>> true
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> which is exactly what is red in the PDF.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Another way (probably better) to do it would be to derive not from
>>>>>>>>>>>>>> PDFTextStripper but from PDFStreamEngine, and construct it with
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties")
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> see also http://stackoverflow.com/a/9157714/535646
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Tilman
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 27.07.2014 at 12:14, Tilman Hausherr wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> Do you still have the code that worked?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I'm not the text extraction specialist here, but what I did was
>>>>>>>>>>>>>>> to look in the uncompressed source of the PDF. The stream has
>>>>>>>>>>>>>>> code like this:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 0 0 0 rg
>>>>>>>>>>>>>>> 0 0.5019 0 rg
>>>>>>>>>>>>>>> 1 0 0 rg
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The first line sets the color to black, the second to green, the
>>>>>>>>>>>>>>> third to red. And from what I saw, it can't work at all, because
>>>>>>>>>>>>>>> the "rg" operator isn't processed when extracting text, because
>>>>>>>>>>>>>>> PDFTextStripper.properties doesn't contain the "rg" operator.
>>>>>>>>>>>>>>> (The operator is in another list, which is used when rendering.)
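As a small illustration of what those three lines mean: `rg` takes three operands (red, green, blue, each between 0 and 1) and sets the non-stroking color. The sketch below reads one such line and checks for pure red. It is a simplification with hypothetical names (`RgSketch`, `parseRg`, `isPureRed`); a real content stream is tokenized rather than read line by line, and this is not how PDFBox processes the operator.

```java
public class RgSketch {
    /** Returns {r, g, b} if the line has the form "<r> <g> <b> rg", otherwise null. */
    public static double[] parseRg(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != 4 || !tokens[3].equals("rg")) {
            return null;
        }
        try {
            return new double[] {
                Double.parseDouble(tokens[0]),
                Double.parseDouble(tokens[1]),
                Double.parseDouble(tokens[2])
            };
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** True only for the exact operands 1 0 0, i.e. the "red" line from the stream. */
    public static boolean isPureRed(double[] rgb) {
        return rgb != null && rgb[0] == 1.0 && rgb[1] == 0.0 && rgb[2] == 0.0;
    }

    public static void main(String[] args) {
        String[] lines = {"0 0 0 rg", "0 0.5019 0 rg", "1 0 0 rg"};
        for (String line : lines) {
            System.out.println(line + " -> red? " + isPureRed(parseRg(line)));
        }
    }
}
```

If no handler is registered for `rg`, nothing like this ever runs, so the graphics state would presumably just keep its initial color, which would fit the DeviceGray observation in the original report.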
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> So that is what puzzles me. I think it can't work at all. But you
>>>>>>>>>>>>>>> said it did work at one time.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Tilman
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 27.07.2014 at 07:43, Tilman Hausherr wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>> Please upload the PDF somewhere and post the URL; PDF files are
>>>>>>>>>>>>>>>> removed from the mailing list.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Tilman
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 27.07.2014 at 02:35, -A wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hello again. I've been trying to figure out this issue that has
>>>>>>>>>>>>>>>>> come up for me and in my research I found someone posting on
>>>>>>>>>>>>>>>>> StackOverflow
>>>>>>>>>>>>>>>>> (http://stackoverflow.com/questions/10844271/how-to-get-font-color-using-pdfbox)
>>>>>>>>>>>>>>>>> a similar issue where they could not read any colors from a PDF.
>>>>>>>>>>>>>>>>> The user posted the code and someone else took it, ran it, and
>>>>>>>>>>>>>>>>> reported that it worked. The user's approach was different than
>>>>>>>>>>>>>>>>> mine, but alas.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I'm not sure at this point what is going on. I have stepped
>>>>>>>>>>>>>>>>> through each individual character and checked the
>>>>>>>>>>>>>>>>> PDGraphicsState object, and even when I am looking at an open
>>>>>>>>>>>>>>>>> file with visibly red text (attached) the debugger only reports
>>>>>>>>>>>>>>>>> DeviceGray. If I print out the ColorSpace name from the
>>>>>>>>>>>>>>>>> PDGraphicsState this is what is printed - for every character.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I would appreciate if someone could perhaps run the attached
>>>>>>>>>>>>>>>>> text stripper with the attached PDF file and report back if it
>>>>>>>>>>>>>>>>> actually prints true instead of false, as it does for me. Since
>>>>>>>>>>>>>>>>> I saw this occurrence elsewhere I'd like to rule that out - in
>>>>>>>>>>>>>>>>> case an IDE setting of some sort may be causing this?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> It should be noted that I began using PDFBox with 1.8.5 and had
>>>>>>>>>>>>>>>>> this code working fine. Still with 1.8.5 yesterday it was
>>>>>>>>>>>>>>>>> failing. Upgrading to 1.8.6 yielded the same results.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> If this is an actual issue I do not mind attempting to solve it
>>>>>>>>>>>>>>>>> if someone may have a general idea where to point me, so as to
>>>>>>>>>>>>>>>>> prevent needless meddling with graphics state objects. Or, if
>>>>>>>>>>>>>>>>> this should be reported I can do that as well.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> -Aaron
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> *Previous Message:*
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've attached an updated stripper file with the only addition
>>>>>>>>>>>>>>>>> being a main function to test the class specifically.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When run with the PDF I have also attached it indeed does not
>>>>>>>>>>>>>>>>> recognize the red text.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At this point it seems that this issue is solely dependent on
>>>>>>>>>>>>>>>>> PDFBox. I'll stay tuned for some insight hopefully. If any other
>>>>>>>>>>>>>>>>> information is needed, let me know!
>> 
