Re: serialization of Derby DataSources
I had to look into this when I was playing around with a classloader for code sharing.

Basically, by setting the serialVersionUID, you are telling the VM that you guarantee that the newer version of the class is compatible with the old version (in terms of serialization). If you don't set this, then you will get an exception saying the class is not compatible if the VM determines that the version UID (basically a hash) is different. There is documentation explaining how this UID is determined, and I struggled to get it right, but finally I had to set the serialVersionUID.

Note that you have to set the serial version UID on the *second* and subsequent versions of the class; it's not required for the first version of the class. Basically, you run serialver on the first version of the class, and then use its output to set serialVersionUID in the second version.

I wrote some tests to verify serialization compatibility between versions of classes but never got to the point of checking them in. They may be valuable, and could be added to our compatibility tests, so if you'd like I can poke around and find them.

One bug I uncovered in my tests was that for one of the data sources the serialVersionUID was not public, so I was getting failures. Now I can't remember if I checked in that fix or not.

David

Rick Hillegas wrote:
> I'm confused about the presence of serialVersionUIDs in the DataSources exposed by our network client (e.g., ClientConnectionPoolDataSource). I think I understand why these classes are serializable (JNDI wants to serialize them). But I don't understand why we are forcibly setting the serialization id. I don't see any documentation explaining the serialization problem this addresses, stating the implications for engineers editing these classes, or describing our expectations at version upgrade. Can someone shed some light on this?
>
> Thanks,
> -Rick
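As a rough illustration of the serialver step described above, the same value can be read programmatically through java.io.ObjectStreamClass. This is only a sketch: PinnedSource and UnpinnedSource are hypothetical stand-ins, not Derby classes, and the value 42L is arbitrary. The java.io.Serializable javadoc requires the field to be static, final, and of type long, and allows any access modifier, so the sketch uses a private field.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-in for a data source class; not Derby's actual code.
class PinnedSource implements Serializable {
    // Must be static, final, and of type long; any access modifier is allowed.
    private static final long serialVersionUID = 42L;
    String databaseName;
}

// Same shape, but with no explicit serialVersionUID: a value is computed
// from the class details (name, fields, methods, and so on).
class UnpinnedSource implements Serializable {
    String databaseName;
}

class SerialUidDemo {
    // Returns the UID that serialization associates with the given class.
    static long uidOf(Class<?> cls) {
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("pinned:   " + uidOf(PinnedSource.class));
        System.out.println("computed: " + uidOf(UnpinnedSource.class));
    }
}
```

The computed value for UnpinnedSource is the one that would have to be copied into a later version of the class to keep it compatible with already serialized instances.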
Re: serialization of Derby DataSources
Thanks, David. I'm afraid I'm still muddled.

I think I understand the basic purpose of serialVersionUID: It's a compiler-generated checksum of the source which serialization uses as a sanity check. By explicitly setting this field, the engineer promises to keep the following contract: Although the class behavior may change between versions, the non-transient fields won't.

But I'm still not grasping the serialization issue we're addressing here. How do we get into a situation where there are two different versions of one of these classes? Is anyone persisting these classes across upgrades of the Derby code?

Perhaps all that's being addressed here is the following recommendation from the javadoc of java.io.Serializable:

    However, it is /strongly recommended/ that all serializable classes explicitly declare serialVersionUID values, since the default serialVersionUID computation is highly sensitive to class details that may vary depending on compiler implementations...

I don't think we have this problem, though: at release time we produce a standard, vetted version of Derby for which the compiler is constant.

Thanks for helping me puzzle through this.

Regards,
-Rick
Re: serialization of Derby DataSources
Hi Rick,

Once the serialVersionUID is there, you should not remove it, as chaos can break out if the IDs start to differ. IMHO I would leave them alone.

One example: say someone is using Derby version x, where the class has an ID of 1, and has persisted the object. Now you remove the ID in Derby version y and the compiler generates, say, -2 for the ID. You will encounter problems when you try to grab the persisted version, as the IDs no longer match.
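The mismatch scenario in this message can be approximated in a single program. The sketch below is an assumption-laden simulation, not a real upgrade: StoredSource is a hypothetical class, and instead of compiling two releases it flips one byte of the serialVersionUID inside the serialized bytes, relying on the stream layout from the Java Object Serialization Specification (4-byte header, TC_OBJECT, TC_CLASSDESC, class name with a 2-byte length, then the 8-byte UID).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class standing in for a data source persisted by an old release.
class StoredSource implements Serializable {
    private static final long serialVersionUID = 1L;
    String databaseName = "sample";
}

class UidMismatchDemo {
    // Serializes an object to a byte array.
    static byte[] serialize(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if reading the bytes fails with InvalidClassException.
    static boolean failsWithInvalidClass(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (InvalidClassException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Simulates a stream written by a class version with a different UID by
    // changing the low byte of the UID stored in the stream (1 becomes 3).
    static byte[] withDifferentUid(byte[] data) {
        byte[] copy = data.clone();
        int nameLength = StoredSource.class.getName().length(); // ASCII name assumed
        int uidStart = 4 + 1 + 1 + 2 + nameLength; // header, TC_OBJECT, TC_CLASSDESC, name length, name
        copy[uidStart + 7] ^= 0x02;
        return copy;
    }

    public static void main(String[] args) {
        byte[] original = serialize(new StoredSource());
        System.out.println("same UID fails:      " + failsWithInvalidClass(original));
        System.out.println("different UID fails: " + failsWithInvalidClass(withDifferentUid(original)));
    }
}
```

The unmodified bytes deserialize normally, while the patched bytes are rejected, which is the failure a stored object would hit if the ID changed between releases.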
Re: serialization of Derby DataSources
David W. Van Couvering wrote:
> My understanding was that they may persist across upgrades because the data source objects are serialized into a JNDI store. In general we can *add* non-transient fields but we can't remove or change them.

Thanks for that warning about the JNDI store. It would be better if we could flush the old object from the JNDI store. Sigh.

According to an experiment I just ran, the de-serialization silently fails to populate the added field with a meaningful value, even if you specify a default in the field declaration or in a no-arg constructor. The added field is forced to the Java default for that type. I think this is tricky enough to warrant comments in these classes.

Thanks again,
-Rick

> I think also since we support the Referenceable interface, the object is reconstructed in a compatible way using our own code, rather than depending upon serialization's default mechanism. But that's where I'm still a little muddled.
>
> By the way, using the *exact* same compiler, I tried to gently modify a DataSource following all the rules I could imagine, and because I didn't know the serialVersionUID was accidentally made private, I kept getting an incompatible class error or whatever it's called. I was doing everything perfectly, and it was still breaking. Once I set the serialVersionUID to be public, peace reigned.
>
> David
>
> Rick Hillegas wrote:
> > Thanks, Lance. I agree. We seem to have a muddle if someone adds a new non-transient field to one of these classes: either a) the engineer changes the serialVersionUID, giving rise to the problem you mention or b) the serialVersionUID isn't changed and deserialization fails because the new field is missing from the persisted stream. Hopefully we don't mean for these objects to persist across Derby upgrades. Hard to tell from the code.
> >
> > Regards,
> > -Rick
Re: serialization of Derby DataSources
Rick Hillegas wrote:
> According to an experiment I just ran, the de-serialization silently fails to populate the added field with a meaningful value, even if you specify a default in the field declaration or in a no-arg constructor. The added field is forced to the Java default for that type. I think this is tricky enough to warrant comments in these classes.

If you add fields, you need to code it so that they get initialized to a reasonable value when deserialized from an older copy of the object.
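The behaviour reported in the experiment can be reproduced in a small, self-contained way. The sketch below is an analogy rather than a true two-version test: ProbeSource is a hypothetical class, and its loginTimeout field is marked transient so that, like a field added after the object was stored, it has no value in the stream. Neither the field initializer nor the constructor of a Serializable class runs during deserialization, so the field comes back as the Java default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; not Derby's actual code.
class ProbeSource implements Serializable {
    private static final long serialVersionUID = 1L;

    String databaseName = "sample";

    // 'transient' keeps this field out of the stream. It stands in for a
    // field that did not exist when the object was originally serialized.
    transient int loginTimeout = 30;

    ProbeSource() {
        loginTimeout = 30; // also set in the constructor, as in the experiment
    }
}

class DefaultValueDemo {
    // Serializes the object to memory and reads it straight back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ProbeSource copy = roundTrip(new ProbeSource());
        System.out.println("databaseName = " + copy.databaseName);
        System.out.println("loginTimeout = " + copy.loginTimeout);
    }
}
```

The stored field survives the round trip, while the field with no value in the stream ends up as 0 despite both the initializer and the constructor assigning 30.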
Re: serialization of Derby DataSources
Thanks, Lance. This is a very sensible suggestion.

Regards,
-Rick

Lance J. Andersen wrote:
> People do strange things. We in Java SE land have tests which save serialized objects and then test that they can be deserialized with the next revision. I would suggest adding some tests to the test suite for any serializable objects.
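A test of the kind suggested here might look roughly like the following. It is only a sketch under stated assumptions: SavedSource and GoldenFileCheck are hypothetical names, and the file is written and read in the same run. In a real compatibility test the saved file would be produced by an older release and kept with the test suite, so that each new revision has to read it.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical serializable class under test; not Derby's actual code.
class SavedSource implements Serializable {
    private static final long serialVersionUID = 1L;
    final String databaseName;

    SavedSource(String databaseName) {
        this.databaseName = databaseName;
    }
}

class GoldenFileCheck {
    // Writes the serialized form of an object to a file.
    static void save(Object obj, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads an object back from a previously saved file.
    static Object load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Saves an instance to a temporary file, loads it, and returns the restored name.
    static String savedThenLoadedName() {
        try {
            Path file = Files.createTempFile("saved-source", ".ser");
            try {
                save(new SavedSource("sample"), file);
                return ((SavedSource) load(file)).databaseName;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("restored name: " + savedThenLoadedName());
    }
}
```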
Re: serialization of Derby DataSources
Right, can't you override the readObject method or whatever it's called? (Sorry, too lazy to look up the javadoc.)

I have some tests, Rick. If you'd like I can send them to you. Alternatively, log a JIRA and I can attach the source to the JIRA. I can't actually spend the time to fully implement and check in the tests right now; maybe later.

David

Lance J. Andersen wrote:
> If you add fields, you need to code it so that they get initialized to a reasonable value when deserialized from an older copy of the object.
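The readObject idea mentioned in this message could be sketched as below. Everything here is illustrative: TunedSource, its loginTimeout field, and the default of 30 are hypothetical, and the sketch assumes that 0 is never a legitimate stored value, since it treats 0 as "no value was in the stream". Because a genuinely old stream cannot be produced in a single program, the demo mimics one by serializing an instance whose field is explicitly 0.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; loginTimeout plays the role of a field added in a later version.
class TunedSource implements Serializable {
    private static final long serialVersionUID = 1L;
    static final int DEFAULT_LOGIN_TIMEOUT = 30;

    private String databaseName = "sample";
    private int loginTimeout = DEFAULT_LOGIN_TIMEOUT;

    void setLoginTimeout(int seconds) { loginTimeout = seconds; }
    int getLoginTimeout() { return loginTimeout; }

    // Called by the serialization runtime instead of the default field restore.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restore whatever fields the stream contains
        if (loginTimeout == 0) {
            // Assumption: 0 means the stream carried no usable value for this field.
            loginTimeout = DEFAULT_LOGIN_TIMEOUT;
        }
    }
}

class ReadObjectDemo {
    // Serializes the object to memory and reads it straight back.
    static TunedSource roundTrip(TunedSource source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TunedSource) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TunedSource old = new TunedSource();
        old.setLoginTimeout(0); // mimics a stream written before the field existed
        System.out.println("restored timeout: " + roundTrip(old).getLoginTimeout());
    }
}
```

When 0 is a valid value, ObjectInputStream.readFields() and GetField.defaulted(name) offer a way to ask whether the stream actually carried the field, at the cost of restoring every field by hand.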