I believe that if we add a fix to remove the __g__dirty field from code generation, it won't break code generated by older versions of the Gora compiler, as long as we check for the field before executing client operations.
Would that be an upgradable fix if we later decide to support schema evolution one way or another?

- Henry

On Sat, May 3, 2014 at 10:21 AM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:

> Hi Alparslan,
>
> On Sat, May 3, 2014 at 1:02 AM, <dev-digest-h...@gora.apache.org> wrote:
>
>> In auto-generated persistent classes, we create an array field called
>> _ALL_FIELDS as you know.
>
> Yes this code was not originally planned for inclusion in GORA-94 but was
> instead implemented later on... as a kind of 'work around'.
>
>> But this array also contains __g__dirty field,
>> which is not a stored field at all. Maybe we should remove __g__dirty field
>> from the array, since the array is used for getting all fields in the
>> stored table. We can also remove it from Field enum, so the users do not
>> know about the __g__dirty field.
>
> This field should only be visible on our (the Gora) side... you are
> absolutely right!
> The larger issue here relates to writer schema and class (current model of
> Persistent class extends org.apache.gora.persistency.impl.PersistentBase
> implements org.apache.avro.specific.SpecificRecord,
> org.apache.gora.persistency.Persistent) and reader schema/class which would
> not necessarily need to be same as writer schema/class.
> Right now in Gora we DO NOT have a standardized approach to supporting
> schema evolution other than taking the chance that a schema change
> _hopefully_doesn't_ break things.
> We have no method for ensuring backwards compatibility with data which is
> written and then read using a different schema...
> IMHO this is the larger issue we need to consider. Removing __g__ dirty
> bytes field is another work around.
> I've spoken over on user@avro with Martin Kleppmann about this and his
> suggestions are very sensible.
> http://s.apache.org/7QY
> http://s.apache.org/biI
> I've also raised this topic on this list.
> We need to support dynamic schema evolution or else data can be redundant
> very quickly depending on the use case.
> Right now I am struggling to envisage how we approach this... should we be
> working on Avro code? Should we work on a Gora specific implementation, if
> for example we wish to have a pluggable serialization layer in Gora?
> Right now, whenever anyone uses Persistent classes generated by
> GoraCompiler packaged in 0.4 release, they will ALWAYS be exposed to __g__
> dirty bytes field... this is not an ideal situation however and AFAIK the
> only work around is to remove this field on the client side prior to doing
> operations such as Query... this is far from perfect but it DOES work.
> You may also be interested in AVRO-1124
> There is still work to be done with Persistency API in Gora for sure.
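
For reference, the client-side workaround mentioned above (dropping __g__dirty from the field list before doing operations such as Query) could look roughly like the sketch below. This is only an illustration in plain Java: the class DirtyFieldFilter, the method storedFields and the sample field names are hypothetical and not part of the Gora API, and the array passed in merely stands in for a generated class's _ALL_FIELDS. Because it filters by name, it should behave the same whether or not the generated class still carries the field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical client-side helper; not part of the Gora API. */
final class DirtyFieldFilter {

    // Name of the bookkeeping field discussed in this thread.
    static final String DIRTY_FIELD = "__g__dirty";

    /** Returns a copy of allFields without the dirty-bytes entry, preserving order. */
    static String[] storedFields(String[] allFields) {
        List<String> kept = new ArrayList<>();
        for (String name : allFields) {
            if (!DIRTY_FIELD.equals(name)) {
                kept.add(name);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Stand-in for a generated class's _ALL_FIELDS array; the field names are made up.
        String[] allFields = {"__g__dirty", "url", "content"};
        System.out.println(Arrays.toString(storedFields(allFields)));
    }
}
```

The filtered array would then be what the client hands to its query instead of the raw _ALL_FIELDS, so classes generated by an older compiler and classes generated after a fix can be treated the same way.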