Hi hc,

Sorry that I didn't mention it, but load works OK. Here is a portion of the output of dump W:

(2162,4111,yellow,a)
(4652,1317,yep,interjection)
(157,60592,yes,interjection)
(533,19459,yesterday,adv)
(265,35058,yet,adv)
(4040,1626,yield,n)
(3339,2139,yield,v)

Only the store command is not working...

Alex

On Tue, Sep 21, 2010 at 2:48 PM, hc busy <hc.b...@gmail.com> wrote:

> Probably because load failed.
>
> W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int, name:chararray, type:chararray);
> T = group W all;
> U = foreach T generate COUNT(W);
> dump U;
>
> will probably say that the wordbag contained nothing. Debug the loading
> portion to fix this problem.
>
> On Tue, Sep 21, 2010 at 1:50 PM, Alex Wang <wanga...@gmail.com> wrote:
>
> > Hi,
> >
> > I am using pig 0.7.0 in hadoop mapreduce mode.
> >
> > The problem I have is that I simply can't use
> >
> > STORE alias INTO 'directory' USING PigStorage();
> >
> > I can load a dataset and write UDFs to manipulate it, but I can't
> > store it. The output is a directory in HDFS with 0 bytes.
> >
> > As an example, I've been testing with a simple script:
> >
> > W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int, name:chararray, type:chararray);
> >
> > store W into 'wordtesting' using PigStorage(' ');
> >
> > I run the code in grunt, and the output of hadoop fs -ls is:
> >
> > drwxr-xr-x - awang supergroup 0 2010-09-21 13:45 /user/awang/wordtesting
> >
> > The grunt messages are:
> >
> > grunt> store filteredW into 'wordtesting' using PigStorage(' ');
> > 2010-09-21 13:45:35,210 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for W
> > 2010-09-21 13:45:35,210 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for W
> > 2010-09-21 13:45:35,440 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(hdfs://pineal:9000/user/awang/wordtesting:PigStorage(' ')) - 1-46 Operator Key: 1-46)
> > 2010-09-21 13:45:35,498 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
> > 2010-09-21 13:45:35,498 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
> > 2010-09-21 13:45:35,549 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> > 2010-09-21 13:45:38,100 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> > 2010-09-21 13:45:38,166 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> > 2010-09-21 13:45:38,173 [Thread-15] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> > 2010-09-21 13:45:38,307 [Thread-15] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> > 2010-09-21 13:45:38,307 [Thread-15] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> > 2010-09-21 13:45:38,670 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201009211320_0002
> > 2010-09-21 13:45:38,670 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://pineal:50030/jobdetails.jsp?jobid=job_201009211320_0002
> > 2010-09-21 13:45:38,673 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> > 2010-09-21 13:45:48,755 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> > 2010-09-21 13:45:53,835 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> > 2010-09-21 13:45:53,835 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Successfully stored result in: "hdfs://pineal:9000/user/awang/wordtesting"
> > 2010-09-21 13:45:53,846 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Records written : 1
> > 2010-09-21 13:45:53,846 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Bytes written : 20
> > 2010-09-21 13:45:53,846 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Spillable Memory Manager spill count : 0
> > 2010-09-21 13:45:53,847 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Proactive spill count : 0
> > 2010-09-21 13:45:53,847 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> >
> > I've been struggling with this for a long time... It works if I have a
> > single bytearray in my tuple, but once I define my schema, it no longer works.
> >
> > Does anyone have any idea? Please help!! Thanks!
> >
> > Best regards,
> >
> > Alex