Hi, I think I’ve got to the root cause of the reported problems in PdfString.
PdfStrings can be in 2 states, valid or invalid:

    /** The string is valid if no error in the constructor has occurred.
     *  If it is valid it is safe to call all the other member functions.
     *  \returns true if this is a valid initialized PdfString
     */
    inline bool IsValid() const;

The default PdfString constructor deliberately constructs an invalid string - this is used for things like PdfString::StringNull (which is different from an empty string) and is returned by various methods like PdfInfo::GetStringFromInfoDict, PdfField::GetFieldName, PdfField::GetAlternateName. There are other PdfString constructors that also create an invalid string: PdfString( (char*)NULL ) for example.

When IsValid() returns false various undefined behaviours occur if an invalid PdfString is used:

- GetLength / GetUnicodeLength / GetCharacterLength return -1 or -2
- ToUnicode faults accessing a NULL pointer
- PdfEncoding::ConvertToUnicode tries to allocate (SIZE_MAX-1)/2 or (SIZE_MAX-2)/2 bytes and throws ePdfError_OutOfMemory if GetCharacterLength < 0
- PdfSimpleEncoding::ConvertToUnicode tries to allocate SIZE_MAX-1 bytes and throws ePdfError_OutOfMemory if GetCharacterLength < 0
- PdfIdentityEncoding::ConvertToEncoding tries to allocate SIZE_MAX-1 bytes and throws ePdfError_OutOfMemory if GetCharacterLength < 0
- PdfDifferenceEncoding::ConvertToUnicode tries to allocate SIZE_MAX-1 or SIZE_MAX-2 bytes and throws ePdfError_OutOfMemory if GetCharacterLength < 0
- PdfDifferenceEncoding::ConvertToEncoding tries to allocate SIZE_MAX-1 or SIZE_MAX-2 bytes and throws ePdfError_OutOfMemory if GetCharacterLength < 0
- PdfString comparison and equality operators may fault or throw ePdfError_OutOfMemory if they try to convert the encoding of one of the operands

I think the problems happen because none of the PoDoFo code checks PdfString::IsValid, apart from PdfString::GetStringUtf8. I would guess the same is true of most PoDoFo client code.
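As a rough illustration of why a negative length ends up as a near-SIZE_MAX allocation request, here is a minimal standalone sketch. It is not PoDoFo code: AllocationSizeFor and GuardedAllocationSizeFor are hypothetical helpers, and the exact arithmetic inside the real encoding classes may differ. It only shows the general signed-to-unsigned wraparound, and what a caller-side validity check would look like.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of PoDoFo): mimics code that passes a signed
// length straight through as an allocation size without checking it.
inline std::size_t AllocationSizeFor(long signedLength) {
    // Signed-to-unsigned conversion is modulo 2^N, so -1 becomes SIZE_MAX.
    return static_cast<std::size_t>(signedLength);
}

// Hypothetical guarded variant: the caller checks validity first, which is
// the check the message above says is missing in most places.
inline std::size_t GuardedAllocationSizeFor(bool isValid, long signedLength) {
    if (!isValid || signedLength < 0) {
        return 0; // nothing to allocate for an invalid string
    }
    return static_cast<std::size_t>(signedLength);
}

// Small self-check that documents the wraparound.
inline void CheckWraparound() {
    assert(AllocationSizeFor(-1) == SIZE_MAX);
    assert(AllocationSizeFor(-2) == SIZE_MAX - 1);
    assert(GuardedAllocationSizeFor(false, -1) == 0);
}
```

Calling CheckWraparound() in a debug build exercises the three cases; any request of that size would fail to allocate on a real system, which matches the out of memory errors listed above.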
The patch gives PdfString methods documented, well-defined, safe behaviour if IsValid() returns false:

- PdfString::GetLength / PdfString::GetUnicodeLength / PdfString::GetCharacterLength return 0 (this prevents allocations of SIZE_MAX-1 or SIZE_MAX-2)
- PdfString::ToUnicode returns an invalid string if it’s called on an invalid string
- the < and > operators return false if LHS and/or RHS are invalid
- the == operator returns false if exactly one of LHS and RHS is invalid
- the == operator returns true if both LHS and RHS are invalid

The patch is designed to only change behaviour when the current behaviour is bad (i.e. access faults or out of memory errors). Where the current behaviour is reasonable there are no changes other than documenting the behaviour.

Best Regards
Mark

Mark Rogers - mark.rog...@powermapper.com
PowerMapper Software Ltd - www.powermapper.com
Registered in Scotland No 362274 Quartermile 2 Edinburgh EH3 9GL
patch-pdfstring-20160528.diff
_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users