Re: Best Practice: LOAD returns null

2012-04-10 Thread Bill Graham
Typically, file pattern globbing is very strict and LOADs fail if not all glob variants are met. This makes sense when you consider that someone might pass a glob path with each of the 24 hours in a day. If one of those hours doesn't exist, you want the LOAD to fail. thanks, Bill On Tue, Apr 10, 2012

Re: Type mismatch in key from map

2012-04-10 Thread shan s
When I load my data I define all fields as chararray in the schema. I can afford to treat everything as chararray. rid could be chararray (I have no real expectations on my side; it's a GUID coming from the db). AA and BB do come from a UDF; the UDF does some string processing and returns substring

Re: Type mismatch in key from map

2012-04-10 Thread Dmitriy Ryaboy
What type do you expect rid to be? Where did AA and BB come from? D On Tue, Apr 10, 2012 at 12:03 PM, shan s wrote: > I am currently getting “Type mismatch in key from map: expected > org.apache.pig.impl.io.NullableBytesWritable, recieved > org.apache.pig.impl.io.NullableText” > > > I looked u

Re: LIMIT operator doesn't work with variables

2012-04-10 Thread Dmitriy Ryaboy
Yeah, that would work. 0.10 should be released within weeks. D On Tue, Apr 10, 2012 at 2:42 PM, James Newhaven wrote: > Thanks for the responses. Until 0.10 is released what alternatives do > I have if limit can only take constants? > > I suspect I could use TOP if that supports variables? > > >

Re: LIMIT operator doesn't work with variables

2012-04-10 Thread James Newhaven
Thanks for the responses. Until 0.10 is released what alternatives do I have if limit can only take constants? I suspect I could use TOP if that supports variables? On 10 Apr 2012, at 10:20 PM, Dmitriy Ryaboy wrote: > Fixed in 0.10 actually > > https://issues.apache.org/jira/browse/PIG-1926 >

Re: LIMIT operator doesn't work with variables

2012-04-10 Thread Dmitriy Ryaboy
Fixed in 0.10 actually https://issues.apache.org/jira/browse/PIG-1926 But if you are using the scalar feature, you should cast explicitly. D On Tue, Apr 10, 2012 at 2:11 PM, Stan Rosenberg wrote: > I believe the syntax of LIMIT does not admit an arbitrary expression; > it only admits constants

Re: LIMIT operator doesn't work with variables

2012-04-10 Thread Stan Rosenberg
I believe the syntax of LIMIT does not admit an arbitrary expression; it only admits constants. At least this is what the documentation says. stan On Tue, Apr 10, 2012 at 4:33 PM, James Newhaven wrote: > Hi, > > I am trying to limit the output size using LIMIT. I want the limit > size to

LIMIT operator doesn't work with variables

2012-04-10 Thread James Newhaven
Hi, I am trying to limit the output size using LIMIT. I want the limit size to be 5 percent of the total output size, like this: -- Put all the inids in a bag so we can count them. G = GROUP F ALL; -- Count everything in the bag H = FOREACH G GENERATE COUNT_STAR(F) AS total; -- Limit out t

ClassNotFoundException: org.antlr.runtime.tree.Tree

2012-04-10 Thread Paolo Castagna
Hi, I am using Pig version 0.9.2 and I have this in my pom.xml file: <groupId>org.apache.pig</groupId> <artifactId>pig</artifactId> <version>0.9.2</version> I am running a trivial example: PigServer pig = new PigServer("local"); pig.registerJar("./target/jena-grande-0.1-SNAPSHOT.jar"); pig.registerQuery("q

Type mismatch in key from map

2012-04-10 Thread shan s
I am currently getting “Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText”. I looked up PIG-919 and the related comments, but could not understand the reason or the workaround for this problem. Could you please kin

Re: Load XML file using PIG

2012-04-10 Thread Jagat
Hello Krishnan, please see http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/XMLLoader.html This is the XML loader, and the page also has an example of how to use it. Thanks Jagat Singh On Tue, Apr 10, 2012 at 10:53 PM, krishnan N wrote: > Hi All, > > I am new to Hadoop and trying to learn PIG

Re: PigDump not available anymore ?

2012-04-10 Thread Norbert Burger
Have you taken a look already at CSVExcelStorage from the piggybank? PigDump may work ok for basic datatypes, but seems like you'd quickly run into issues with quoted strings and more complex types. http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html Norbe

PigDump not available anymore ?

2012-04-10 Thread Kevin LION
Hi, I would like to load content from somewhere and store the result in a CSV file. According to the doc here (http://pig.apache.org/docs/r0.9.2/func.html#pigdump), I can use PigDump(). I also have this script: test = LOAD 'test.csv' USING org.apache.pig.builtin.PigStorage(',') AS (key:chararray, valu

Best Practice: LOAD returns null

2012-04-10 Thread Markus Resch
Hey everyone, I have a new question about how best to handle a very common issue: We have a LOAD statement loading AVRO files using globbing by a given regex. For some weird reason this might return null as there is no file matching the regex. There are two conceivable cases where this can happe

Re: "duplicate uid in schema" feature or bug?

2012-04-10 Thread Norbert Burger
Not sure if this will work in your use-case, but adding a FLATTEN to strip the outer tuple before the FOREACHs seems to detour Pig enough to work around the bug: B = FOREACH A GENERATE FLATTEN(a); B1 = FOREACH B GENERATE x, y; B2 = FOREACH B GENERATE x, y; Norbert On Tue, Apr 10, 2012 at 2:42 AM

Fwd: AvroStorage/Avro Schema Question

2012-04-10 Thread Russell Jurney
I am having trouble with ARRAY_ELEM getting injected into my pig data, when I store. Scott Carey had good insight into how to address the issue. -- Forwarded message -- From: Scott Carey Date: Mon, Apr 2, 2012 at 9:13 AM Subject: Re: AvroStorage/Avro Schema Question To: u...@avro