Re: Grunt Shell hangs on Cygwin.

2013-08-08 Thread Alan Gates
Yes, no Cygwin tools are required. You will need the Hadoop version from branch-1-win as well as Pig trunk to make this work. Alan. On Aug 5, 2013, at 10:03 PM, Darpan R wrote: Thanks Alan, I am not sure I quite understand this. Do you mean directly from the Windows command prompt?

schema definition and subschema

2013-08-08 Thread Keren Ouaknine
Hi, A schema in Pig (LogicalSchema.java) is defined as an array list of LogicalFieldSchema, whose class members are: - String alias - byte type - long uid - LogicalSchema schema. I am wondering why LogicalFieldSchema contains a LogicalSchema member. My guess so far is that perhaps there's a

Best practices for handling dependencies in Java UDFs?

2013-08-08 Thread Paul Houle
I'm building a system for processing large RDF data sets with Hadoop. https://github.com/paulhoule/infovore/wiki The first stages are written in Java and perform the function of normalizing, validating and cleaning up the data. The stage that comes after this is going to subdivide Freebase

Re: Best practices for handling dependencies in Java UDFs?

2013-08-08 Thread Ryan Compton
I often do this, and then just register one giant .jar: <!-- Plugin to create a single jar that includes all dependencies --> <plugin> <artifactId>maven-assembly-plugin</artifactId> <version>2.4</version>

field name reference - alias

2013-08-08 Thread Keren Ouaknine
Hello, Can one refer to a field name with no ambiguity by its full name (A::x instead of x)? Below are two contradictory behaviors. First example: A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray); B = load '1_ext.txt' using PigStorage(' ') as (a:int,

I think an example in the docs is wrong

2013-08-08 Thread Paul Houle
I recently wrote a load function and to get started I cut-n-pasted from the SimpleTextLoader example on the page http://pig.apache.org/docs/r0.11.1/udf.html#load-store-functions This contains the following code: boolean notDone = in.nextKeyValue(); if (notDone) {

Re: field name reference - alias

2013-08-08 Thread Pradeep Gollakota
This is expected behavior. The disambiguation comes only after two or more relations are brought together. As per the docs at http://pig.apache.org/docs/r0.11.1/basic.html#disambiguate, the disambiguate operator can only be used to identify field names after JOIN, COGROUP, CROSS, or FLATTEN

Re: I think an example in the docs is wrong

2013-08-08 Thread Pradeep Gollakota
I believe the procedure is to file a bug report on JIRA and set the component field to 'documentation'. Pig veterans, please correct me if I'm wrong. On Thu, Aug 8, 2013 at 10:19 PM, Paul Houle ontolo...@gmail.com wrote: I recently wrote a load function and to get started I cut-n-pasted from