Re: different mapred.min.split.size within one pig script?

2012-06-13 Thread Yang
Thanks, I tried, but it does not seem to work. Even after I put the second set split.size= at the very end of the script, it is the second SET that takes effect in both places where I used SET. Yang On Tue, Jun 12, 2012 at 3:56 PM, Alex Rovner wrote: > Yes. Use the "set" keyword right before t

Re: How to redirect the Pig script summary to a file

2012-06-13 Thread Prasanth J
1>a.txt will redirect anything written to System.out to the file. All diagnostic operators and the dump statement write to System.out, so their output ends up in a.txt. Some of the [INFO] messages you see in the console come from log4j, which is configured to print to System.err. Following is
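
The stdout/stderr split described above can be reproduced without Pig. This is a minimal sketch, not the poster's own example: the `emit` function and the temporary file names are hypothetical stand-ins for a program that prints results to System.out and log messages to System.err. It uses the portable `>file 2>&1` form, which in bash is what `&>file` abbreviates.

```shell
# Hypothetical stand-in for a program that writes results to stdout
# and log messages to stderr (as log4j is configured to do here).
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

tmp=$(mktemp -d)

# 1>file captures only stdout; stderr is discarded here to keep the demo quiet.
emit 1>"$tmp/out_only.txt" 2>/dev/null

# >file 2>&1 sends stderr to the same place as stdout, so both are captured.
emit >"$tmp/both.txt" 2>&1

echo "stdout only:"; cat "$tmp/out_only.txt"
echo "stdout and stderr:"; cat "$tmp/both.txt"
```

With only `1>a.txt`, anything the program sends to stderr still goes to the terminal, which is why the log4j messages do not land in the file.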

Re: How to redirect the Pig script summary to a file

2012-06-13 Thread Jonathan Coveney
Shan, while Prashant's solution works, why not just have a STORE statement to the local filesystem? This is the much cleaner way to do it. 2012/6/13 shan s > It works, thanks. > I looked up http://tldp.org/LDP/abs/html/io-redirection.html but still > could not figure why your suggestion works. >

Re: How to redirect the Pig script summary to a file

2012-06-13 Thread shan s
It works, thanks. I looked up http://tldp.org/LDP/abs/html/io-redirection.html but still could not figure out why your suggestion works. If & is inclusive of 1 & 2, either 1 or 2 should have worked... But 1>a.txt ignores it. Curious. Could you please explain? Thanks! On Thu, Jun 14, 2012 at 3:51 AM,

Re: How to redirect the Pig script summary to a file

2012-06-13 Thread Prashant Kommireddi
Try pig -x mapred -l logs -param $xyz=1000 pqr.pig &>a.txt On Wed, Jun 13, 2012 at 9:05 AM, shan s wrote: > How do I store the pig console output to a file. > pig -x mapred -l logs -param $xyz=1000 pqr.pig >> a.txt does not work for > me. Are there any tricks to make this work? > Or is it avai

RE: Job setup for a pig run takes ages

2012-06-13 Thread Danfeng Li
We also run into the long setup time issue, but our problem is different: 1. The setup takes about 20 minutes, and we can't see anything on the jobtracker during this time. 2. Our data is saved in flat files, uncompressed. 3. Our code consists of many small pig files, they are used in the fol

Pig not picking up configuration?

2012-06-13 Thread Mario Lassnig
Hi, I'm getting a lot of deprecation messages when running Pig (version 0.9.2-cdh4.0.0). It seems to me it's not picking up the configuration of yarn/mr2 properly. For example: 2012-06-13 11:48:03,074 [main] WARN org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead,

Re: Re: Re: How pig get hadoop and hbase configuration?

2012-06-13 Thread shashwat shriparv
You need to connect hbase to hadoop not hadoop to hbase On Wed, Jun 13, 2012 at 2:45 PM, lulynn_2008 wrote: > I found that there is no hbase configuration in tasktracker classpath. > After we add hbase conf directory into tasktracker hadoop classpath, the > test case passed. But I think the

Re: Job setup for a pig run takes ages

2012-06-13 Thread Markus Resch
Hey Alex, On one side I think you're right but we need to keep in mind that the schema could change within some files of a glob (e.g. schema extension update) the Avro Storage should check at least some hash of the schema to verify all schemas of all input files are the same and/or to split them i

Re:Re: Re: How pig get hadoop and hbase configuration?

2012-06-13 Thread lulynn_2008
I found that there is no hbase configuration in the tasktracker classpath. After we added the hbase conf directory to the tasktracker hadoop classpath, the test case passed. But I think the hbase configuration should be passed by the jobtracker node, and I can find correct hbase configuration in jobtracker nod
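
The workaround described above (adding the hbase conf directory to the tasktracker's hadoop classpath) might look like the following lines in `hadoop-env.sh` on the tasktracker node. This is a sketch under assumptions: the message does not say how the classpath was changed, and `/etc/hbase/conf` is a hypothetical path that must be replaced with the actual location of `hbase-site.xml`.

```shell
# Hypothetical location of the HBase conf directory; adjust to your install.
HBASE_CONF_DIR="${HBASE_CONF_DIR:-/etc/hbase/conf}"

# Append it to HADOOP_CLASSPATH; the ${VAR:+...} form avoids a leading
# colon when HADOOP_CLASSPATH is empty or unset.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$HBASE_CONF_DIR"

echo "$HADOOP_CLASSPATH"
```

The tasktracker would need a restart to pick up the change.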

Re: How can I use load function to load bag field?

2012-06-13 Thread yonghu
Thanks, guys. I tried the code and found the right pattern for a bag that can be loaded. Regards! Yong On Mon, Jun 11, 2012 at 10:32 PM, Russell Jurney wrote: > my_data = LOAD 'location' AS (name:chararray, val1:int, val2:int); > by_name = foreach (group my_data by name

Re: Re: How pig get hadoop and hbase configuration?

2012-06-13 Thread Mohammad Tariq
Could you send me your hadoop and hbase config files? Regards, Mohammad Tariq On Wed, Jun 13, 2012 at 1:18 PM, Mohammad Tariq wrote: > "HBase is able to connect to ZooKeeper but the connection closes > immediately." - This error means that your HMaster is not able to talk > to your Nameno

Re: Re: How pig get hadoop and hbase configuration?

2012-06-13 Thread Mohammad Tariq
"HBase is able to connect to ZooKeeper but the connection closes immediately." - This error means that your HMaster is not able to talk to your Namenode. Regards, Mohammad Tariq On Wed, Jun 13, 2012 at 1:12 PM, lulynn_2008 wrote: > Hello, > hadoop-core-*.jar and commons-configuration-1.6.ja

Re:Re: How pig get hadoop and hbase configuration?

2012-06-13 Thread lulynn_2008
Hello, hadoop-core-*.jar and commons-configuration-1.6.jar are already in the hbase lib directory. The jobtracker node can get the correct hbase configuration, but the tasktracker node cannot. At 2012-06-13 15:35:21,"Mohammad Tariq" wrote: >Hello, > > Copy the hadoop-core-*.jar from your hadoop folder to t

Re: How pig get hadoop and hbase configuration?

2012-06-13 Thread Mohammad Tariq
Hello, Copy the hadoop-core-*.jar from your hadoop folder to the hbase/lib folder. Also copy commons-configuration-1.6.jar from the hadoop/lib folder to the hbase/lib folder. Sometimes this happens because of incompatible jars. Try it and see if it works for you. Regards, Mohammad Tariq On Wed, J

How pig get hadoop and hbase configuration?

2012-06-13 Thread lulynn_2008
Hi everyone, Following is my test environment: node 1: namenode, secondarynamenode, jobtracker, hbase master; node 2: datanode, tasktracker. On node 1, I ran the following commands in the pig shell, but I found that the map task failed on the tasktracker node with error "HBase is able to connect to ZooKeeper but the c