> I understand what you mean, but I still think it is better to have this fix
> than to keep the old piece of code.
>
> I will let the administrators decide what to do.
>

I am not so sure whether this is a good idea. PDFs with EI not preceded by whitespace may actually be more prevalent than PDFs that contain a false-positive "EI " as part of the image data. Imagine PDF generators that strictly follow the PDF reference and do not put such extra whitespace before EI.
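To illustrate the difference, here is a minimal sketch in Python. It is not PoDoFo code: the function names are made up, and it only handles unfiltered images with abbreviated keys and device colour spaces, where the data length follows directly from /W, /H, /CS and /BPC. It compares a search for whitespace-delimited EI with skipping the computed number of bytes first, using the 4x4 RGB example quoted further down:

```python
import re

# Components per pixel for the device colour spaces (abbreviated and full names).
COMPONENTS = {"/G": 1, "/DeviceGray": 1, "/RGB": 3, "/DeviceRGB": 3,
              "/CMYK": 4, "/DeviceCMYK": 4}
WHITESPACE = b"\x00\t\n\x0c\r "  # PDF white-space characters


def expected_length(params):
    """Byte count of unfiltered image data; each row is padded to a whole byte."""
    bits_per_row = int(params["/W"]) * COMPONENTS[params["/CS"]] * int(params["/BPC"])
    return (bits_per_row + 7) // 8 * int(params["/H"])


def read_inline_image(stream):
    """Return (image_data, offset_of_EI). Assumes abbreviated keys and no /F filter."""
    bi = stream.index(b"BI")
    id_pos = stream.index(b"ID", bi)
    tokens = stream[bi + 2:id_pos].decode("latin-1").split()
    params = dict(zip(tokens[0::2], tokens[1::2]))  # flat "/Key value" pairs only
    start = id_pos + 3  # skip "ID" and the single white-space byte after it
    end = start + expected_length(params)
    pos = end
    while pos < len(stream) and stream[pos] in WHITESPACE:
        pos += 1  # white space before EI is tolerated but not required
    if stream[pos:pos + 2] != b"EI":
        raise ValueError("EI not found where the dictionary says the data ends")
    return stream[start:end], pos


example = (b"BI /W 4 /H 4 /CS /RGB /BPC 8\nID\n"
           b"00000z0z00zzz00z0zzz0zz EI aazazaazzzaazazzzazzz\nEI\n")

data, ei_offset = read_inline_image(example)
naive = re.search(rb"\sEI\s", example)  # the "<whitespace>EI<whitespace>" rule
data_start = example.index(b"ID") + 3
print(len(data))                   # 48 bytes: the whole image
print(naive.start() - data_start)  # 23 bytes: cut short at the " EI " inside the data
```

The length-based reader also accepts data that is followed directly by EI with no whitespace in between. For filtered images the same idea should apply, but the decoder would have to run until it has produced the expected number of bytes or reached its own end-of-data marker.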
I think the correct way would be to follow the encoded data (decode until the length is known, and then look for EI).

> Christophe
>
>
> On Fri, Feb 19, 2021 at 14:48, Michal Sudolsky <sudols...@gmail.com> wrote:
>
>>
>>> I've tried to write the EI just after the stream and it does not work in
>>> Acrobat. So, I think the whitespace before EI is really mandatory.
>>>
>>> Please find my example attached if you want to verify it on your own.
>>> Remove the CR before EI at line 40.
>>>
>>
>> Your image data is longer than it should be. It ends before "7A 7A 7A 0D 0A
>> 45 49" (zzz\r\nEI), so this PDF is not quite correct. A PDF viewer must be able
>> to recover from this, but it could not if the data ended in "zzzEI".
>>
>> Try the attached PDF.
>>
>>
>>> Christophe
>>>
>>>
>>> On Fri, Feb 19, 2021 at 12:34, Michal Sudolsky <sudols...@gmail.com> wrote:
>>>
>>>>
>>>>> I understand you are trying to find the corner case where it will fail again.
>>>>>
>>>>> So, let's just consider for the moment that this code is better than
>>>>> the previous one.
>>>>>
>>>>
>>>> I think it would be better, but it may not parse valid PDFs because it
>>>> seems that whitespace before EI is not required.
>>>>
>>>>
>>>>> Christophe
>>>>>
>>>>>
>>>>> On Fri, Feb 19, 2021 at 12:13, Michal Sudolsky <sudols...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>> I agree with you but this is a rare case as the PDF generators never
>>>>>>> generate such a sequence.
>>>>>>
>>>>>> It is just that the probability of something like " EI " is small
>>>>>> (smaller than "EI" at the end) but it can happen. I doubt that generators
>>>>>> are actively trying to avoid that.
>>>>>>
>>>>>> Also I see nothing in the PDF reference about required whitespace
>>>>>> before EI.
>>>>>>
>>>>>>
>>>>>>> Christophe
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 19, 2021 at 10:50, Michal Sudolsky <sudols...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This should be better than it was.
>>>>>>>> But I have another example:
>>>>>>>>
>>>>>>>> BI /W 4 /H 4 /CS /RGB /BPC 8
>>>>>>>> ID
>>>>>>>> 00000z0z00zzz00z0zzz0zz EI aazazaazzzaazazzzazzz
>>>>>>>> EI
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 19, 2021 at 10:28 AM Christophe <xto...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Please find attached a patch to better read inline images.
>>>>>>>>>
>>>>>>>>> The image was read until "EI<whitespace>", but this sequence can also
>>>>>>>>> occur inside the image stream. That is the case in one of my PDF files:
>>>>>>>>> ===
>>>>>>>>> ...
>>>>>>>>> ID
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> qZ$Tls8Vrqs8)cqqZ$Tls8Vops7u]pq>^Kjs8Vops7u]pq>^Kjs8Vlns7lTnq#:<grr;cms7cKlp\k-drVuops8V]bn,N:Ls7QEI
>>>>>>>>> ...
>>>>>>>>> EI
>>>>>>>>> ...
>>>>>>>>> ===
>>>>>>>>>
>>>>>>>>> So, the patch looks for "<whitespace>EI<whitespace>".
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christophe
>>>>>>>>> _______________________________________________
>>>>>>>>> Podofo-users mailing list
>>>>>>>>> Podofo-users@lists.sourceforge.net
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>>>>>>>>
>>>>>>>>