Makes sense. I think I found such an artifact in Schema.getPigSchema: https://issues.apache.org/jira/browse/PIG-2379
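
To make "ignore it everywhere (including equality for schemas)" concrete, here is a rough, self-contained sketch of a comparison that skips only the alias of the tuple inside a bag while still comparing every other alias. Note that `Field`, `personBag` and `sameIgnoringBagTupleAlias` are hypothetical names invented for illustration; this is a toy model, not Pig's `Schema` or `FieldSchema` classes, and it is not the fix attached to the JIRA.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class BagTupleAliasDemo {

    /** Hypothetical stand-in for a schema field; NOT Pig's FieldSchema. */
    static final class Field {
        final String alias;      // may be null when the field is unnamed
        final String type;       // e.g. "bag", "tuple", "chararray", "int"
        final List<Field> inner; // empty for scalar types

        Field(String alias, String type, Field... inner) {
            this.alias = alias;
            this.type = type;
            this.inner = Arrays.asList(inner);
        }
    }

    /** Compares two fields, ignoring only the alias of a tuple directly inside a bag. */
    static boolean sameIgnoringBagTupleAlias(Field a, Field b) {
        return same(a, b, false);
    }

    private static boolean same(Field a, Field b, boolean skipAlias) {
        if (!a.type.equals(b.type)) {
            return false;
        }
        if (!skipAlias && !Objects.equals(a.alias, b.alias)) {
            return false;
        }
        if (a.inner.size() != b.inner.size()) {
            return false;
        }
        boolean parentIsBag = a.type.equals("bag");
        for (int i = 0; i < a.inner.size(); i++) {
            Field childA = a.inner.get(i);
            Field childB = b.inner.get(i);
            // The alias is skipped only for the single tuple held by a bag.
            boolean skipChildAlias = parentIsBag && childA.type.equals("tuple");
            if (!same(childA, childB, skipChildAlias)) {
                return false;
            }
        }
        return true;
    }

    /** Builds a toy b:bag{<tupleAlias>:tuple(<firstField>:chararray,age:int)}. */
    static Field personBag(String tupleAlias, String firstField) {
        return new Field("b", "bag",
                new Field(tupleAlias, "tuple",
                        new Field(firstField, "chararray"),
                        new Field("age", "int")));
    }

    public static void main(String[] args) {
        Field named = personBag("t", "name");
        Field unnamed = personBag(null, "name");
        Field otherField = personBag("t", "nickname");

        // Differ only in the bag's tuple alias.
        System.out.println(sameIgnoringBagTupleAlias(named, unnamed));
        // Differ in a real field name inside the tuple.
        System.out.println(sameIgnoringBagTupleAlias(named, otherField));
    }
}
```

The point of the design is that the check stays strict about field names, unlike comparing with all aliases relaxed, and only treats the bag's tuple name as noise.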

2011/11/20 Santhosh Srinivasan <[email protected]>

> It's an implementation artifact of the old JavaCC parser in releases prior
> to and including 0.8. The new parser, as Alan points out, should not
> require this.
>
> Santhosh
>
> -----Original Message-----
> From: Alan Gates [mailto:[email protected]]
> Sent: Friday, November 18, 2011 9:00 AM
> To: [email protected]
> Subject: Re: Does the name of the tuple that a bag has to have matter?
>
> The name doesn't matter. We mostly left it there for backward
> compatibility, for both specifying schemas and for UDFs. I do think we
> should make sure we ignore it everywhere (including equality for schemas).
>
> Alan.
>
> On Nov 16, 2011, at 7:17 PM, Jonathan Coveney wrote:
>
> > This is related to an issue I'll probably be emailing about once I
> > isolate it, but I was curious what the philosophy is around the name
> > of the tuple that is in a bag.
> >
> > example:
> > Schema s1 = Utils.getSchemaFromString("b:bag{t:tuple(name:chararray,age:int)}");
> >
> > In pig8, you had the whole two level access nonsense, so let's ignore that.
> > In pig9, the tuple name seemed to be preserved, and would print with
> > toString.
> > In trunk, the schema object throws away that name, and it doesn't print.
> >
> > I'm curious if there is any reason to keep it around, esp. given you
> > can just do Schema.equals(s1,s2,false,true) for equality without field
> > names, not to mention the fact that the name never really is going to
> > matter since a bag only has one element and it is a tuple.
> >
> > Thanks!
> > Jon
