On Saturday, 05.03.2022 at 16:30 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> I'm not sure if we discussed that topic in the past or if I simply
> mixed it up 
> with a discussion about "equals" and "="

I'm not sure we discussed that in the context of primitives, but I have
worked on the primitives a bit myself, and they already had
hashCode/equals methods at that point. In addition, as you pointed out,
the static instances have also been there for quite a while.

I also had to revert changes around equals/hashCode in the past, as
the current implementations of the COS type classes and (mainly)
COSWriter depend on a specific handling being kept.

So to me the question really is what one would expect from a PDF
perspective. If that means that we need to treat equals as being
identical then I'm fine going down that route. This would also resolve
the mutability question we had a while ago. E.g. what happens if the
COSInteger value of COSObject 100 changes but COSObject 200 points to
the same (Java) object?
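
To make the mutability question concrete, here is a minimal sketch. MutableInt
and Ref are hypothetical stand-ins, not the PDFBox classes, and the sketch
assumes the value can be changed after creation. It only shows what sharing one
Java instance between two indirect objects would mean in that case.

```java
// Hypothetical stand-ins for illustration only, not the PDFBox COS classes.
class SharedValueSketch {

    /** Stand-in for a COSInteger-like value that is assumed to be mutable. */
    static final class MutableInt {
        long value;
        MutableInt(long value) { this.value = value; }
    }

    /** Stand-in for an indirect object such as "100 0 obj" pointing at a value. */
    static final class Ref {
        final int number;
        final MutableInt target;
        Ref(int number, MutableInt target) {
            this.number = number;
            this.target = target;
        }
    }

    /** Changes the value through object 100 and returns what object 200 sees. */
    static long changeThrough100AndRead200(boolean shareInstance) {
        MutableInt first = new MutableInt(512);
        MutableInt second = shareInstance ? first : new MutableInt(512);
        Ref obj100 = new Ref(100, first);
        Ref obj200 = new Ref(200, second);
        obj100.target.value = 1024;
        return obj200.target.value;
    }

    public static void main(String[] args) {
        System.out.println(changeThrough100AndRead200(true));  // shared instance: 1024
        System.out.println(changeThrough100AndRead200(false)); // separate instances: 512
    }
}
```

With a shared instance, the change made through object 100 silently shows up
in object 200 as well. With separate instances it does not.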

Currently we have a mix which is inconsistent and the source of
surprises.
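
As an illustration of the kind of surprise I mean, here is a small sketch.
IntValue is a hypothetical class with value-based equals/hashCode, not
COSInteger itself. Two distinct instances holding 512 collapse into one key in
a HashMap, while an IdentityHashMap keeps them apart.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only, not the PDFBox COSInteger.
class HashMixSketch {

    /** Value type whose equals/hashCode depend only on the wrapped value. */
    static final class IntValue {
        final long value;
        IntValue(long value) { this.value = value; }

        @Override
        public boolean equals(Object o) {
            return o instanceof IntValue && ((IntValue) o).value == value;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(value);
        }
    }

    /** Registers two distinct instances holding 512 and returns the map size. */
    static int countEntries(Map<IntValue, String> map) {
        map.put(new IntValue(512), "100 0");
        map.put(new IntValue(512), "200 0");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(countEntries(new HashMap<>()));         // value-based keys: 1
        System.out.println(countEntries(new IdentityHashMap<>())); // identity-based keys: 2
    }
}
```

Which of the two results is "right" depends on what we decide equals should
mean from a PDF perspective.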

In the long run that also means that we need to look at a more
intelligent COSWriter, as when writing a PDF we should be able to
benefit from storing "same" content only once and referencing it.
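
A rough sketch of that idea, under the assumption that content can be compared
by its serialized form. DedupWriterSketch and assignObjectNumbers are
hypothetical names and have nothing to do with how COSWriter works today.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for illustration only, not the COSWriter implementation.
class DedupWriterSketch {

    /** Assigns one object number per distinct content string, in first-seen order. */
    static List<Integer> assignObjectNumbers(List<String> contents) {
        Map<String, Integer> numberByContent = new LinkedHashMap<>();
        List<Integer> references = new ArrayList<>();
        for (String content : contents) {
            Integer number = numberByContent.get(content);
            if (number == null) {
                // First time this content is seen: give it a new object number.
                number = numberByContent.size() + 1;
                numberByContent.put(content, number);
            }
            references.add(number);
        }
        return references;
    }

    public static void main(String[] args) {
        // The two "512" entries end up referencing the same object number.
        System.out.println(assignObjectNumbers(Arrays.asList("512", "/Type /Page", "512")));
    }
}
```

Something like this is only safe if "equal" really means interchangeable,
which ties back to the equals and mutability questions above.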

BR
Maruan

> 
> However, PDFBOX-5286 shows that we have an issue with objects which
> aren't the 
> same but are treated as the same because of the same hash. This is
> true for all 
> simple objects such as COSInteger, COSFloat, COSBoolean and COSName.
> 
> Think about the following two indirect /Length objects
> 
> 100 0 obj
> 512
> endobj
> 
> 
> 200 0 obj
> 512
> endobj
> 
> * there are two different COSObjects "100 0" and "200 0"
> * both COSObjects have different hashes
> * both COSObjects are referencing a COSInteger holding the same value
> "512"
> * both COSIntegers are different objects
> * both COSIntegers have the SAME hash, as the current implementation
> of hashCode 
> is based on the value of the COSInteger
> 
> Or some pseudo code
> 
> COSObject(100,0) != COSObject(200,0)
> COSInteger(100,0) != COSInteger(200,0)
> COSObject(100,0).hashCode != COSObject(200,0).hashCode
> COSInteger(100,0).hashCode == COSInteger(200,0).hashCode
> COSInteger(100,0).equals(COSInteger(200,0)) == true
> 
> IMHO we should change the implementation of hashCode so that
> different objects 
> will have different hashCodes.
> 
> I expect some side effects
> * we are using a lot of hash-based collections and I'm afraid there
> may be some 
> cases where the fact of having the same hash for different objects is
> wanted 
> (knowingly or not)
> * we have to remove the static instances for COSInteger values in a
> range from 
> -100 to 256 which will result in an increased number of COSInteger
> instances
> * there are just two static instances of COSBoolean ("true" and
> "false") which 
> have to be replaced too
> * COSName is caching a lot of values as static instances as well,
> which should 
> be removed as well
> * looks like COSFloat shouldn't be a problem
> 
> WDYT? Should we simply start with COSFloat and COSInteger and see how
> it ends up?
> 
> Andreas
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: dev-h...@pdfbox.apache.org
> 

-- 
Maruan Sahyoun

FileAffairs GmbH
Josef-Schappe-Straße 21
40882 Ratingen

Tel: +49 (2102) 89497 88
Fax: +49 (2102) 89497 91
sahy...@fileaffairs.de
www.fileaffairs.de

Managing Director: Maruan Sahyoun
Commercial Register: AG Düsseldorf, HRB 53837
VAT ID: DE248275827

