Re: Does the name of the tuple that a bag has to have matter?

Jonathan Coveney Sun, 20 Nov 2011 23:43:47 -0800

Makes sense. I think I found such an artifact in Schema.getPigSchema

https://issues.apache.org/jira/browse/PIG-2379


2011/11/20 Santhosh Srinivasan <[email protected]>

> It's an implementation artifact of the old JavaCC parser in releases prior
> to and including 0.8. The new parser, as Alan points out, should not
> require this.
>
> Santhosh
>
> -----Original Message-----
> From: Alan Gates [mailto:[email protected]]
> Sent: Friday, November 18, 2011 9:00 AM
> To: [email protected]
> Subject: Re: Does the name of the tuple that a bag has to have matter?
>
> The name doesn't matter.  We mostly left it there for backward
> compatibility, for both specifying schemas and for UDFs.  I do think we
> should make sure we ignore it everywhere (including equality for schemas).
>
> Alan.
>
> On Nov 16, 2011, at 7:17 PM, Jonathan Coveney wrote:
>
> > This is related to an issue I'll probably be emailing about once I
> > isolate it, but I was curious what the philosophy is around the name
> > of the tuple that is in a bag.
> >
> > example:
> > Schema s1 =
> > Utils.getSchemaFromString("b:bag{t:tuple(name:chararray,age:int)}");
> >
> > In pig8, you had the whole two level access nonsense, so let's ignore
> > that.
> > In pig9, the tuple name seemed to be preserved, and would print with
> > toString.
> > In trunk, the schema object throws away that name, and it doesn't print.
> >
> > I'm curious if there is any reason to keep it around, esp. given you
> > can just do Schema.equals(s1,s2,false,true) for equality without field
> > names, not to mention the fact that the name never really is going to
> > matter since a bag only has one element and it is a tuple.
> >
> > Thanks!
> > Jon
>
>
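
To make the "ignore the tuple name everywhere, including equality" idea from the thread concrete, here is a minimal sketch in plain Java. It does not use Pig's real Schema class: Field and SchemaCompare are hypothetical stand-ins invented for illustration, and the type names are plain strings. The only point it shows is that the alias of the single tuple inside a bag is skipped during comparison, while every other alias still counts.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified stand-in for a field schema.
// This is NOT Pig's Schema/FieldSchema API; it only models alias, type and nesting.
final class Field {
    final String alias;      // may be null when no name was given
    final String type;       // e.g. "int", "chararray", "tuple", "bag"
    final List<Field> inner; // nested fields for tuples and bags, empty otherwise

    Field(String alias, String type, Field... inner) {
        this.alias = alias;
        this.type = type;
        this.inner = Arrays.asList(inner);
    }
}

final class SchemaCompare {
    // Compares two fields recursively. Aliases must match, except for the
    // tuple that sits directly inside a bag, whose alias is ignored.
    static boolean sameIgnoringBagTupleName(Field a, Field b) {
        return same(a, b, false);
    }

    private static boolean same(Field a, Field b, boolean skipAlias) {
        if (!a.type.equals(b.type)) {
            return false;
        }
        if (!skipAlias && !Objects.equals(a.alias, b.alias)) {
            return false;
        }
        if (a.inner.size() != b.inner.size()) {
            return false;
        }
        // Children of a bag are the bag's tuple, so their alias is skipped.
        boolean childIsBagTuple = a.type.equals("bag");
        for (int i = 0; i < a.inner.size(); i++) {
            if (!same(a.inner.get(i), b.inner.get(i), childIsBagTuple)) {
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch, a bag written as b:bag{t:tuple(name:chararray,age:int)} and the same bag with an unnamed tuple compare as equal, while renaming the inner field age to something else still makes them differ.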
