this seems to work: Utils.getSchemaFromString("(b:bag{f1: chararray, f2: int})");
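
For tests that need both bag layouts discussed in this thread (fields directly inside the bag, and fields wrapped in a tuple), a small string-building helper may save some typing. This is only a sketch: BagSchemaStrings, bagWithoutTuple and bagWithTuple are hypothetical names and not part of Pig. The tuple-wrapped form assumes the usual bag{t:tuple(...)} schema syntax and has not been checked against a particular Pig release.

```java
// Hypothetical helper, not part of Pig: builds schema description strings
// that a test could hand to Utils.getSchemaFromString().
class BagSchemaStrings {

    // Fields sit directly inside the bag, as in the string above.
    static String bagWithoutTuple(String bagAlias, String... fields) {
        return "(" + bagAlias + ":bag{" + String.join(", ", fields) + "})";
    }

    // Fields are wrapped in an explicit tuple inside the bag.
    // Assumption: the parser in use accepts bag{alias:tuple(...)}.
    static String bagWithTuple(String bagAlias, String tupleAlias, String... fields) {
        return "(" + bagAlias + ":bag{" + tupleAlias + ":tuple("
                + String.join(", ", fields) + ")})";
    }

    public static void main(String[] args) {
        System.out.println(bagWithoutTuple("b", "f1: chararray", "f2: int"));
        System.out.println(bagWithTuple("b", "t", "f1: chararray", "f2: int"));
    }
}
```

In a unit test the result would be passed straight through, e.g. Utils.getSchemaFromString(BagSchemaStrings.bagWithoutTuple("b", "f1: chararray", "f2: int")), and compared with what outputSchema receives.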

On Tue, Oct 4, 2011 at 6:01 AM, Andrew Clegg <andrew.clegg+mah...@gmail.com> wrote:

> Yep, getSchemaFromString is what I was looking for, but I can't get it
> to generate a schema (for unit test purposes) that matches what I get
> inside my script during a real run.
>
> As an example, say I have a file like this:
>
> foo\t2
> bar\t3
> baz\t3
> marge\t4
> homer\t4
>
> and I load it like this:
>
> infile = load 'test.txt' as (name:chararray, weight:int);
> grouped = group infile all;
> bucketed = foreach grouped generate flatten(Buckets(infile));
>
> the outputSchema method of my UDF (Buckets) gets called with a schema
> that stringifies like so:
>
> {infile: {name: chararray,weight: int}}
>
> i.e. it has a single field, which is a bag, containing two elements
> directly (no wrapping tuple, presumably because this is Pig 0.8.1?).
>
> (sidenote, I guess the outermost {}s are a display convention, as
> there's only one bag there)
>
> When I'm unit-testing the UDF's outputSchema method, I'd like to
> generate exactly that schema.
>
> But if I call getSchemaFromString like this:
>
> Utils.getSchemaFromString("B: {f1: chararray, f2: int}")
>
> It throws a parser error:
>
> Encountered " "{" "{ "" at line 1, column 4.
> Was expecting one of:
> "int" ...
> "long" ...
> "float" ...
> "double" ...
> "chararray" ...
> "bytearray" ...
> "int" ...
> "long" ...
> "float" ...
> "double" ...
> "chararray" ...
> "bytearray" ...
>
> Two questions I guess.
>
> (1) Is there a way of generating a schema like that via Utils?
>
> (2) ... or is this schema actually wrong, and I'm looking at a symptom
> of https://issues.apache.org/jira/browse/PIG-767 that would behave
> differently if I was in Pig 0.9.0?
>
> Many thanks,
>
> Andrew.
>
>
> On 4 October 2011 00:14, Raghu Angadi <rang...@apache.org> wrote:
> > Utils.getSchemaFromString() seems like exactly what you want (
> > from org.apache.pig.impl.util ).
> >
> > Raghu.
> >
> > [btw. my two previous attempts to send to the list got rejected as spam ]
> >
> > On Mon, Oct 3, 2011 at 3:41 PM, Andrew Clegg
> > <andrew.clegg+mah...@gmail.com> wrote:
> >
> >> Thanks Raghu (and Dmitry).
> >>
> >> Could this maybe get added to the docs page on UDFs? (Apologies if
> >> it's there already and I missed it.)
> >>
> >> Also -- it's a bit cumbersome writing all these nested Schema and
> >> FieldSchema constructors, especially when you're writing tests for
> >> UDFs with flexible schema support.
> >>
> >> I was wondering if it would be practical to reuse whatever code the
> >> front-end uses to parse schema descriptions from load statements in
> >> scripts. Is this a silly idea? If it isn't silly, does anyone know
> >> where I need to look for that code?
> >>
> >>
> >> On 3 October 2011 22:56, Raghu Angadi <ang...@gmail.com> wrote:
> >> > my understanding is that Pig 0.8 expects the first form and Pig 0.9
> >> > requires the second.
> >> >
> >> > Raghu.
> >> >
> >> > On Mon, Oct 3, 2011 at 8:27 AM, Andrew Clegg
> >> > <andrew.clegg+mah...@gmail.com> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> When you have a UDF that returns a bag, and you're writing the
> >> >> outputSchema method, do you have to explicitly include the mandatory
> >> >> 'container' tuple within the bag, or is this implicit?
> >> >>
> >> >> i.e. if I'm returning a bag of ints, do I have to do:
> >> >>
> >> >> return new Schema(
> >> >>     new FieldSchema(null,
> >> >>         new Schema(
> >> >>             new FieldSchema(null, DataType.INTEGER)), DataType.BAG));
> >> >>
> >> >> Or do I have to explicitly define a tuple like so:
> >> >>
> >> >> return new Schema(
> >> >>     new FieldSchema(null,
> >> >>         new Schema(
> >> >>             new FieldSchema(null,
> >> >>                 new Schema(
> >> >>                     new FieldSchema(null, DataType.INTEGER)), DataType.TUPLE)),
> >> >>         DataType.BAG));
> >> >>
> >> >> The docs seem pretty vague on this, and you're allowed to do either.
> >> >> My feeling would be that if the first form was illegal, you wouldn't
> >> >> be allowed to create a schema like that, but this may be wishful
> >> >> thinking.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Andrew.
> >> >>
> >> >> --
> >> >>
> >> >> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
> >>
> >> --
> >>
> >> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
>
> --
>
> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg