Re: Error Executing a Fragment Replicated Join

Renato Marroquín Mogrovejo Wed, 27 Apr 2011 17:27:14 -0700

Now that the Apache server is OK with me again, I can write back to
the list. I wrote to the Apache Infra team and they told me to send
messages in plain text only, with any HTML disabled (not that I ever
sent HTML, but oh well). I guess that worked :)
Well, first, thanks for answering. I am using Pig 0.7 and my Pig script
is as follows:


{code}
sr = LOAD 'pigData/sr.dat' using PigStorage('|') AS
(sr_ret_date_sk:int, sr_ret_tim_sk:int, sr_ite_sk:int, sr_cus_sk:int,
sr_cde_sk:int, sr_hde_sk:int, sr_add_sk:int, sr_sto_sk:int,
sr_rea_sk:int, sr_tic_num:int, sr_ret_qua:int, sr_ret_amt:double,
sr_ret_tax:double, sr_ret_amt_inc_tax:double, sr_fee:double,
sr_ret_sh_cst:double, sr_ref_csh:double, sr_rev_cha:double,
sr_sto_cred:double, sr_net_lss:double);

cd = LOAD 'pigData/cd.dat' using PigStorage('|') AS (cd_dem_sk:int,
cd_gnd:chararray, cd_mrt_sts:chararray, cd_edt_sts:chararray,
cd_pur_est:int, cd_cred_rtg:chararray, cd_dep_cnt:int,
cd_dep_emp_cnt:int, cd_dep_col_count:int);

proy_sR = FOREACH sr GENERATE sr_cde_sk;
proy_cD = FOREACH cd GENERATE cd_dem_sk;

join_sR_cD = JOIN proy_sR BY sr_cde_sk, proy_cD BY cd_dem_sk USING 'replicated';

STORE join_sR_cD INTO 'queryResults/query.11.sr.cd.5.1' using PigStorage('|');
{/code}

Here "cd" is the 77MB relation and "sr" is the 32MB one. I had some
other similar queries in which the 32MB relation was joined with
smaller relations (<10MB), and they gave the same problem. I modified
those so that the relations under 10MB would be the ones replicated.
Thanks again.
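
For reference, here is a minimal sketch of the same join with the operands
swapped. It assumes the documented behavior of Pig's replicated join: the
first relation listed is streamed through the map tasks, and every relation
listed after it is loaded into memory, so the relation meant to be replicated
goes last. In the script above, proy_cD (from the 77MB file) is listed last;
the version below lists proy_sR (from the 32MB file) last instead. The alias
join_cD_sR and the output path are made up for illustration, and whether this
avoids ERROR 2017 on this cluster is untested.

{code}
-- Sketch only: same projections as above, operands swapped so that the
-- relation from the 32MB file (proy_sR) is the one loaded into memory.
-- join_cD_sR and the output path below are hypothetical names.
join_cD_sR = JOIN proy_cD BY cd_dem_sk, proy_sR BY sr_cde_sk USING 'replicated';

STORE join_cD_sR INTO 'queryResults/query.11.cd.sr.swapped' using PigStorage('|');
{/code}

Because both inputs are projected down to a single int column before the
join, the in-memory side should be much smaller than the size of the raw
file suggests.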

Renato M.

2011/4/27 Thejas M Nair <te...@yahoo-inc.com>:
> The exception indicates that the hadoop job creation failed. Are you able to
> run simple MR queries using each of the inputs?
> It could also be caused by some problem pig is having with copying the file
> being replicated to distributed cache.
> -Thejas
>
>
> On 4/27/11 3:42 PM, "Renato Marroquín Mogrovejo"
> <renatoj.marroq...@gmail.com> wrote:
>
> Does anybody have any suggestions? Please???
> Thanks again.
>
> Renato M.
>
> 2011/4/26 Alan Gates <ga...@yahoo-inc.com>
>>
>> Sent for Renato, since Apache's mail system has decided it doesn't like
>> him.
>>
>> Alan.
>>
>> I am getting an error while trying to execute a simple fragment replicated
>> join on two files (one of 77MB and the other one of 32MB). I am using the
>> 32MB file as the small one to be replicated, but I keep getting this
>> error.
>> Does anybody know how this check is done? I mean, how does Pig determine
>> that the small file is not small enough, and how could I modify this?
>> I am executing these on four PCs with 3GB of RAM running Debian Lenny.
>> Thanks in advance.
>>
>>
>> Renato M.
>>
>> Pig Stack Trace
>> ---------------
>> ERROR 2017: Internal error creating job configuration.
>>
>> org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
>>      at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:332)
>>      at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:835)
>>      at org.apache.pig.PigServer.execute(PigServer.java:828)
>>      at org.apache.pig.PigServer.access$100(PigServer.java:105)
>>      at org.apache.pig.PigServer$Graph.execute(PigServer.java:1080)
>>      at org.apache.pig.PigServer.executeBatch(PigServer.java:288)
>>      at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:109)
>>      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
>>      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
>>      at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
>>      at org.apache.pig.Main.main(Main.java:391)
>> Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:624)
>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:246)
>>      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.
