On Tue, Jul 3, 2012 at 12:42 AM, Gabriel Reid <[email protected]> wrote: > Hi guys, > > While implementing the map side joins, I needed to make the PType > interface extend Serializable, and that caused me to stumble upon an > issue in AvrosTest. The testNestedTables method creates a nested table > schema (unsurprisingly), with the call in question being equivalent to > this: > > Avros.tableOf(Avros.strings(), Avros.tableOf(Avros.ints(), > Avros.doubles())); > > This results in an invalid schema being created due to the same > namespace and name (org.apache.avro.mapred.Pair) being used twice in > the schema. > > The error with the the invalid schema occurs in the Schema#toString > method -- it should probably result in an exception during the > creation of the Schema itself, but the toString method is used > everywhere, so this will fail if it's used no matter what. > > Does anyone know nested Avro tables is a real use case that needs to > be supported (seeing as they won't actually work right now)? Am I > right in assuming that pretty much the same thing could be > accomplished by just doing the following call? > > Avros.tableOf(Avros.strings(), Avros.pairs(Avros.ints(), > Avros.doubles())); > > I know that unique names are created for Avro Tuple schemas in Crunch, > and I assume that this is done to avoid name collisions like this > case. I'm thinking that we could do some kind of similar trick to > allow the nested tables to work, but I don't think that this is worth > the effort (unless someone says that nested tables are a real use case > that needs to be supported). > > I'll report the late catching of the invalid Schema to the Avro > project, but unless anyone objects, I think we can probably just > remove this test case. Anyone against that idea?
I added it as a convenience method for a project I was working on-- my thought was that if you tried to put a PTableType inside of another PTableType, we should detect it and automatically convert the nested PTableType to a pairs(keyType, valueType), as you indicated was the right thing to do. I'm surprised that it didn't work properly, since I thought that I was doing the conversion from PTableType to pairs(keyType, valueType) when the PType was being constructed. Do you mind if I take a look at it first and try to fix it? > > > - Gabriel
