Issue in access static object in MapReduce
Hi,

I have a configuration JSON file that my MR job needs for every input record. I created a class with a static block that loads the JSON file into a static instance variable, so that whenever my mapper or reducer needs the configuration it can use this variable. But on a single-node cluster, when I run this as a jar, the mapper executes fine while in the reducer the static instance variable returns "null".

I think this happens mainly because the mapper and reducer run in separate JVMs. How can I handle this situation gracefully? There are a few ways I can think of:

1. Serialize the loaded JSON object, store it on HDFS, then deserialize it and use it in the code. The problem is that the object is not very large, so storing it on HDFS does not seem worthwhile.
2. Load the JSON for every input line in each map/reduce call. This is highly inefficient, as I will be loading the JSON again and again for so many lines.

How can I resolve this issue? Any inputs are welcome.

Note: the same code works well in Eclipse. :(

Thanks
Stuti

Regards,
Stuti Awasthi
HCL Comnet Systems and Services Ltd

::DISCLAIMER::
The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents (with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates. Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of authorized representative of HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately.
Before opening any email and/or attachments, please check them for viruses and other defects.
Re: Issue in access static object in MapReduce
Hi Stuti

You can pass the JSON object as a configuration property from your main class, then initialize the static JSON object in the configure() method. Every map or reduce task instance executes configure() once, before any call to map()/reduce(), so every map()/reduce() call on each record can look up the JSON object.

public void configure(JobConf job)

Regards
Bejoy KS

On Tue, Sep 11, 2012 at 8:23 PM, Stuti Awasthi wrote:
> But in reducer static Instance variable is returning "null".
> I get this mainly because mapper and reducer runs on separate jvms. How
> can I handle this situation gracefully.
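A minimal sketch of the pattern Bejoy describes, assuming the old org.apache.hadoop.mapred API in which configure(JobConf job) runs once per task. To keep it runnable without Hadoop on the classpath, a plain Map<String, String> stands in for JobConf; the class names and the property name "my.job.config.json" are hypothetical. In a real job the driver would call job.set(key, jsonText) on the JobConf and the task would read it back with job.get(key).

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: a Map<String, String> plays the role of Hadoop's JobConf
// so the example runs without Hadoop on the classpath.
class ConfigureSketch {

    // Hypothetical property name; any unique key would do.
    static final String JSON_KEY = "my.job.config.json";

    // Plays the role of a mapper or reducer task instance.
    static class SketchTask {
        String jsonConfig;       // loaded once per task, not once per record
        int configureCalls = 0;  // only here to show configure() runs once

        // Corresponds to `public void configure(JobConf job)`; with the real
        // API the lookup would be job.get(JSON_KEY).
        void configure(Map<String, String> job) {
            jsonConfig = job.get(JSON_KEY);
            configureCalls++;
            // A real task would parse jsonConfig with its JSON library here.
        }

        // Corresponds to map()/reduce(): uses the already loaded config.
        String process(String record) {
            return record + "|" + jsonConfig;
        }
    }

    public static void main(String[] args) {
        // Driver side: with the real API this would be job.set(JSON_KEY, json).
        Map<String, String> job = new HashMap<>();
        job.put(JSON_KEY, "{\"threshold\": 5}");

        // Task side: configure() is called once, then the per-record method.
        SketchTask task = new SketchTask();
        task.configure(job);
        System.out.println(task.process("line-1"));
        System.out.println(task.process("line-2"));
    }
}
```

Because the JSON text travels inside the job configuration, each task gets its own copy regardless of which JVM it runs in, which is why this suits a small configuration object like the one described in the question.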
RE: Issue in access static object in MapReduce
Thanks Bejoy,

I will try to implement this and let you know if I face any issues.

Thanks
Stuti

From: Bejoy Ks [mailto:bejoy.had...@gmail.com]
Sent: Tuesday, September 11, 2012 8:39 PM
To: user@hadoop.apache.org
Subject: Re: Issue in access static object in MapReduce

Hi Stuti

You can pass the JSON object as a configuration property from your main class, then initialize the static JSON object in the configure() method.
Re: Issue in access static object in MapReduce
Have you looked at Terracotta or any other distributed caching system?

Kunal

On Sep 11, 2012, at 9:30 PM, Stuti Awasthi wrote:
> Thanks Bejoy,
> I will try to implement this and let you know if I face any issues.