Seconded for PigUnit. As for a faster debugging procedure, I've gone modular. First, I JUnit-test individual UDFs against their functional requirements and use cases up front. Then I mock up my whiteboard workflow as multiple logical blocks of Pig script (multiple Pig files to test), start a pig -x local session, and try each aliased line one by one within each logical block, with a DESCRIBE after each. This ensures that the syntax, schemas, desired re-aliasing, etc. are correct, and you can merge the logical blocks back together for optimization once they are completed.
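As a rough sketch of what that line-by-line pass looks like for one logical block in the grunt shell (the input path, field names, and aliases below are made up for illustration, not taken from a real job):

```pig
-- Entered interactively after starting: pig -x local
-- Hypothetical input: tab-separated file with user, url, hits columns.
raw = LOAD 'input/sample.tsv' USING PigStorage('\t')
      AS (user:chararray, url:chararray, hits:int);
DESCRIBE raw;       -- confirm the declared schema was picked up

filtered = FILTER raw BY hits > 0;
DESCRIBE filtered;  -- schema should be unchanged by the filter

grouped = GROUP filtered BY user;
DESCRIBE grouped;   -- expect a 'group' key plus a bag named 'filtered'

totals = FOREACH grouped GENERATE group AS user,
                                  SUM(filtered.hits) AS total_hits;
DESCRIBE totals;    -- check the re-aliased output fields
```

DESCRIBE only prints the schema and does not launch a job, so each check comes back quickly and you can catch a bad alias or type before anything actually runs.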
Once a block is completed, you can also run ILLUSTRATE on it to spot-check results. Be forewarned, though: I've had larger scripts fail prematurely under ILLUSTRATE due to their complexity.

Hope this helps,
-Dan

On Tue, May 20, 2014 at 3:26 PM, Suraj Nayak <snay...@gmail.com> wrote:

> Also, Pig is data flow language where the statements gets converted to
> java and then run. In case of python, its native. Thus runs faster.
> On 21-May-2014 12:52 AM, "Suraj Nayak" <snay...@gmail.com> wrote:
>
> > Why not consider PigUnit? PigUnit gives flexibility to test locally. Also
> > debugging is pretty simple, almost similar to JUnit.
> >
> > --
> > Suraj
> > On 21-May-2014 12:47 AM, "Paul Houle" <ontolo...@gmail.com> wrote:
> >
> >> Slow iteration is a problem with Pig.
> >>
> >> I still write MR jobs mainly in Java because (1) I control the
> >> execution plan, (2) can do things nearly zero-copy, and (3) I can
> >> get a quick iteration cycle by using JUnit to test mappers, reducers,
> >> and other components.
> >>
> >> On Tue, May 20, 2014 at 3:02 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> >> > I've noticed that while working with pig my stress level and frustration
> >> > with the system is higher than other systems I've worked with.
> >> >
> >> > I think it's because the iteration cycle is longer.
> >> >
> >> > Even pig -x local takes a while to execute.
> >> >
> >> > Is this just me?
> >> >
> >> > If you're trying to learn and debug python lists, dictionaries, etc.
> >> > It's almost instant response time.
> >> >
> >> > But with pig literally everything takes 30-60 seconds to play with.
> >> >
> >> > --
> >> >
> >> > Founder/CEO Spinn3r.com
> >> > Location: *San Francisco, CA*
> >> > Skype: *burtonator*
> >> > blog: http://burtonator.wordpress.com
> >> > … or check out my Google+ profile<https://plus.google.com/102718274791889610666/posts>
> >> > <http://spinn3r.com>
> >> > War is peace. Freedom is slavery. Ignorance is strength. Corporations are
> >> > people.
> >>
> >> --
> >> Paul Houle
> >> Expert on Freebase, DBpedia, Hadoop and RDF
> >> (607) 539 6254 paul.houle on Skype ontolo...@gmail.com