Re: load files

Jeff Zhang Mon, 28 Jun 2010 07:07:44 -0700

part-xxxxx is the naming used by the old Hadoop mapred API, while
part-m-xxxxx and part-r-xxxxx are used by the new Hadoop mapreduce API.
You can use Hadoop's FileSystem.globStatus with the pattern "part-*" to
handle both cases.
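
To illustrate why one pattern covers both naming schemes, here is a minimal
sketch in plain Python. It does not call Hadoop at all: it uses the standard
library's fnmatch as a stand-in for Hadoop's glob matching, which I am
assuming behaves the same way for a simple trailing "*". The file names and
the select_parts helper are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical contents of a job output directory (illustration only).
names = ["part-00000", "part-m-00000", "part-r-00001", "_logs"]

def select_parts(file_names, pattern="part-*"):
    """Return the names that match the glob pattern, keeping input order."""
    return [n for n in file_names if fnmatchcase(n, pattern)]

# Old-style and new-style part files are both selected; _logs is not.
print(select_parts(names))
```

The same pattern should work when passed to globStatus, or used directly in
the path given to load.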




2010/6/28 Gang Luo <[email protected]>:
> Thanks, Jeff.
> In Pig, the file names look like this: part-m-xxxxx (for map output) or
> part-r-xxxxx (for reduce output), which differ from the Hadoop style
> (part-xxxxx). So, can we control the name of each generated file? How?
>
> Thanks,
> -Gang
>
>
>
> ----- Original Message ----
> From: Jeff Zhang <[email protected]>
> To: [email protected]
> Sent: 2010/6/27 (Sun) 9:22:30 PM
> Subject: Re: load files
>
> Hi Gang,
>
> The path specified in load can be either a file or a directory. You
> can also use glob patterns, as supported by Hadoop's globStatus. The
> path specified in store is a directory.
>
>
>
> On Mon, Jun 28, 2010 at 4:44 AM, Gang Luo <[email protected]> wrote:
>> Hi all,
>> When we specify the input path for a load operator, is it a file or a
>> directory? Similarly, when we use store-load to connect two MR operators,
>> are the paths specified in the store and the load directories?
>>
>> Thanks,
>> -Gang
>>
>>
>>
>>
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
>
>
>
>



-- 
Best Regards

Jeff Zhang
