RE: How to Generate single output file(part-m-0001) instead of multiple files

Marek Miglinski Tue, 06 Sep 2011 02:32:41 -0700

"When writing to a file system processed will be a directory with part files 
rather than a single file. But how many part files will be created? That 
depends on the parallelism of the last job before the store. If it has reduces, 
it will be determined by the parallel level set for that job (1). If it is a 
map-only job then it will be determined by the number of maps, which is 
controlled by Hadoop and not Pig."
(r) Alan Gates


1:
--defaultparallel.pig
set default_parallel 10;


For details, see http://wiki.apache.org/pig/PigLatin#Increasing_the_parallelism
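
Following the explanation above, one way to end up with a single part file is to make sure the last job before the STORE has a reduce phase and to run it with a parallel level of 1. The script below is only a minimal sketch under assumptions: the input path 'input/data', the alias names, and sorting on the first field ($0) are made up for illustration and are not taken from the original script.

```pig
-- single_output.pig (hypothetical file name)
-- Assumption: comma-delimited input at the made-up path 'input/data'.
param = LOAD 'input/data' USING PigStorage(',');

-- ORDER forces a reduce phase; PARALLEL 1 asks for a single reducer,
-- so the job that feeds the STORE writes one part file.
sorted = ORDER param BY $0 PARALLEL 1;

-- The output is still a directory, but it should contain a single
-- part file (typically named part-r-00000).
STORE sorted INTO 'output/result' USING PigStorage(',');
```

Keep in mind that a single reducer processes all of the data by itself, so this can be slow for large outputs. If the last job is map-only, the parallel level has no effect, because the number of maps is controlled by Hadoop.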



-----Original Message-----
From: kiranprasad [mailto:kiranprasa...@imimobile.com] 
Sent: Tuesday, September 06, 2011 12:19 PM
To: user@pig.apache.org
Subject: Re: How to Generate single output file(part-m-0001) instead of 
multiple files

Hi Marek,

Thanks for the quick response.
I have tried it, but after using the statement mentioned below, multiple files 
are still generated inside the result folder ('output/result').
I would like to know how to generate only a single output file. The size of 
each output file generated is 3000k.

Regards
Kiran.G

IMImobile Plot 770, Rd. 44 Jubilee Hills, Hyderabad - 500033 M +91 9000170909 T 
+91 40 2355 5945 - Ext: 229 www.imimobile.com

-----Original Message-----
From: Marek Miglinski
Sent: Tuesday, September 06, 2011 2:40 PM
To: user@pig.apache.org
Subject: RE: How to Generate single output file(part-m-0001) instead of 
multiple files

Hi,

STORE param INTO 'output/result' USING PigStorage(',');

Use this if your data is comma-delimited.


Marek M.

-----Original Message-----
From: kiranprasad [mailto:kiranprasa...@imimobile.com]
Sent: Tuesday, September 06, 2011 12:02 PM
To: user@pig.apache.org
Cc: kiranprasad
Subject: How to Generate single output file(part-m-0001) instead of multiple 
files

Hi

I am new to Pig and would like to know how to generate only a single output 
file by using STORE.

Regards
Kiran.G
