Setting pig job name

2013-04-15 Thread Jeff Yuan
Hi guys, This probably has a quick answer, but how do I set the Pig job name? I'm generating Pig jobs in Java, and each job has the name "PigLatin:DefaultJobName" in the Hadoop tracker. How can I change it? Or is there an easy way to generate a more useful name (in Hive, the default job name i

Re: Pig configuration for embedded programs

2013-03-29 Thread Jeff Yuan
ng as the conf dir/files are on > the classpath. > > Sent from my iPhone > > On Mar 29, 2013, at 12:50 PM, Jeff Yuan wrote: > >> Hi guys, >> >> I have a quick question about configuring Pig correctly when used in an >> embedded Java program, i.e. my code instanti

Pig configuration for embedded programs

2013-03-29 Thread Jeff Yuan
Hi guys, I have a quick question about configuring Pig correctly when used in an embedded Java program, i.e. my code instantiates PigServer and registers queries to it. How do I set the directory to load the Hadoop configuration and Pig properties? Is it just a matter of setting the path in the PIG_

Re: Pig jobs - get stdout and stderr

2013-03-22 Thread Jeff Yuan
ould. >> >> Thanks, >> Cheolsoo >> >> >> On Wed, Mar 20, 2013 at 2:00 PM, Jeff Yuan wrote: >> >> > Is there an interface to get the standard out and standard error >> > streams for a Pig execution? I'm using the Java interface and dire

Pig jobs - get stdout and stderr

2013-03-20 Thread Jeff Yuan
Is there an interface to get the standard out and standard error streams for a Pig execution? I'm using the Java interface and directly calling PigServer.executeBatch(), for example, and getting back List<ExecJob>. The ExecJob interface has methods for getSTDOut and getSTDError, but any calls to these

Re: Loader partitioning on field

2013-03-14 Thread Jeff Yuan
hu, Mar 14, 2013 at 3:15 PM, Jonathan Coveney wrote: > No, it is not. But if it knew that, how would that filter be meaningful? > What do you have in mind? > > > 2013/3/14 Jeff Yuan > >> Rohini, I see your point. >> >> One followup question: it's possibl

Re: Loader partitioning on field

2013-03-14 Thread Jeff Yuan
posed. Refer > https://issues.apache.org/jira/browse/PIG-3199 > > Regards, > Rohini > > > On Thu, Mar 14, 2013 at 2:00 PM, Jeff Yuan wrote: > >> Thanks! Regarding 1), where there is a UDF in the filter step on a >> partition field. The UDF is not first evaluated b

Re: Loader partitioning on field

2013-03-14 Thread Jeff Yuan
14, 2013 at 1:51 PM, Rohini Palaniswamy wrote: > Jeff, > > 1) It should not. If it does push, then it is a bug in pig. > > 2) I think it should be fine. > > 3) Look at PColFilterExtractor and PartitionFilterOptimizer > > Regards, > > Rohini > > > On Thu,

Loader partitioning on field

2013-03-14 Thread Jeff Yuan
I am writing a loader for a storage format, which partitions by a particular field in the record. So I would like to implement something which can push down filters on the partitioned field so that the record reader does not need to read files that are outside the filtered range. In the interface "

Re: Pig job result output and schema

2013-03-05 Thread Jeff Yuan
astAlias()); ... Thanks, Jeff On Tue, Mar 5, 2013 at 11:30 AM, Johnny Zhang wrote: > Hi, Jeff: > Reply inline. > > > On Tue, Mar 5, 2013 at 11:18 AM, Jeff Yuan wrote: > >> I have a couple of questions regarding job result and schema. The >> context is that I

Pig job result output and schema

2013-03-05 Thread Jeff Yuan
I have a couple of questions regarding job result and schema. The context is that I'm trying to create a custom entry point for Pig that takes a script, executes it, and always stores the last declared alias/variable in a file. I would appreciate any insights into the two questions I have below or any ad

Pig default properties/configuration

2013-03-04 Thread Jeff Yuan
I'm calling Pig (with entry point from Main.java) from another Java program. When it runs, I get a ton of warnings like these:
13/03/03 15:38:55 WARN conf.Configuration: dfs.max.objects is deprecated. Instead, use dfs.namenode.max.objects
13/03/03 15:38:55 WARN conf.Configuration: mapred.task.id i

Re: Error while parsing commandline properties

2013-03-02 Thread Jeff Yuan
e > to parameter in pig script in runtime. However, the value should be passed > in by "pig -param =" instead of "pig -p", Can you try that? > > Johnny > > > > > On Fri, Mar 1, 2013 at 4:58 PM, Jeff Yuan wrote: > >> Hi Johnny, >> Actually

Re: Error while parsing commandline properties

2013-03-01 Thread Jeff Yuan
head what kind of functionality does this class provide? Would I be correct in assuming it's normally not needed? Jeff On Fri, Mar 1, 2013 at 4:11 PM, Johnny Zhang wrote: > Hi, Jeff: > It works for me though. Can you paste the whole command? > > Johnny > > > On Fri, Ma

Error while parsing commandline properties

2013-03-01 Thread Jeff Yuan
Hi guys, I'm running Pig from the command line in local mode, and trying to pass in some properties, for example: pig -x local ... -p mapred.map.tasks=2 -p mapred.reduce.tasks=1 ... I'm getting errors: INFO parameters.ParameterSubstitutionPreprocessor: Encountered " ".map.tasks=2 "" at line 1, c

Re: Pig 0.11: new features and improvements

2013-02-25 Thread Jeff Yuan
Thanks! I'm still new to Pig, but it's very informative. On Fri, Feb 22, 2013 at 6:35 PM, Dmitriy Ryaboy wrote: > I pulled together some of the highlights of the pig 0.11 release on the > Apache Pig blog (which now officially exists!): > > https://blogs.apache.org/pig/ > > D

Re: Question about properties for Loader

2013-02-24 Thread Jeff Yuan
ut user being aware of it? > Can you pass it in as a constructor argument instead? > > UDFContext could be used, like you said to set/retrieve properties. You > might want to take a look at PigStorage that does something very similar > (look for the method applySchema(Tuple tup) ) >

Question about properties for Loader

2013-02-24 Thread Jeff Yuan
I'm trying to write a loader, extending LoadFunc, to read a specific file format. My question is: how do I pass properties to it (for example, the schema of the file type I'm loading)? Would it be using the -p parameter from the command line when issuing the query? The second part of the question is, how

Question about setting default loader

2013-02-21 Thread Jeff Yuan
Hi, I'm a new user of Pig, so I apologize if my question seems simplistic. Is there a way to specify (via configuration or command-line input) a different loader to be used as the default? What I mean is, if you don't specify one explicitly in your load statement, PigStorage is used as the loader. Is there a way