Re: Local jobtracker in test env?

2012-08-07 Thread Harsh J
Yes, singular JVM (The test JVM itself) and the latter approach (no TT/JT daemons).

On Wed, Aug 8, 2012 at 4:50 AM, Mohit Anchlia wrote:
> On Tue, Aug 7, 2012 at 2:08 PM, Harsh J wrote:
>
>> It used the local mode of operation:
>> org.apache.hadoop.mapred.LocalJobRunner
>>
>>
> In localmode ever

Re: [ANNOUNCE] - New user@ mailing list for hadoop users in-lieu of (common,hdfs,mapreduce)-user@

2012-08-07 Thread Arun C Murthy
Apologies (again) for the cross-post, I've filed https://issues.apache.org/jira/browse/INFRA-5123 to close down (common, hdfs, mapreduce)-user@ since user@ is functional now.

thanks,
Arun

On Aug 4, 2012, at 9:59 PM, Arun C Murthy wrote:
> All,
>
> Given our recent discussion (http://s.apache

Re: Local jobtracker in test env?

2012-08-07 Thread Mohit Anchlia
On Tue, Aug 7, 2012 at 2:08 PM, Harsh J wrote:
> It used the local mode of operation:
> org.apache.hadoop.mapred.LocalJobRunner
>
>
In localmode everything is done inside the same JVM i.e. tasktracker, jobtracker etc. all run in the same JVM. Or does it mean that none of those processes run everyt

Re: Local jobtracker in test env?

2012-08-07 Thread Harsh J
It used the local mode of operation: org.apache.hadoop.mapred.LocalJobRunner

A JobTracker (via MiniMRCluster) is only required for simulating distributed tests.

On Wed, Aug 8, 2012 at 2:27 AM, Mohit Anchlia wrote:
> I just wrote a test where fs.default.name is file:/// and
> mapred.job.tracker i

Re: Setting Configuration for local file:///

2012-08-07 Thread Harsh J
If you instantiate the JobConf with your existing conf object, then you needn't have that fear.

On Wed, Aug 8, 2012 at 1:40 AM, Mohit Anchlia wrote:
> On Tue, Aug 7, 2012 at 12:50 PM, Harsh J wrote:
>
>> What is GeoLookupConfigRunner and how do you utilize the setConf(conf)
>> object within it?

Local jobtracker in test env?

2012-08-07 Thread Mohit Anchlia
I just wrote a test where fs.default.name is file:/// and mapred.job.tracker is set to local. The test ran fine, and I can see that the mapper and reducer were invoked, but what I am trying to understand is how this ran without specifying the job tracker port, and which port task tracker connected with

Re: Setting Configuration for local file:///

2012-08-07 Thread Mohit Anchlia
On Tue, Aug 7, 2012 at 12:50 PM, Harsh J wrote:
> What is GeoLookupConfigRunner and how do you utilize the setConf(conf)
> object within it?

Thanks for the pointer. I wasn't setting my JobConf object with the conf that I passed. Just one more related question: if I use JobConf conf = new JobConf

Re: Setting Configuration for local file:///

2012-08-07 Thread Harsh J
What is GeoLookupConfigRunner and how do you utilize the setConf(conf) object within it?

On Wed, Aug 8, 2012 at 1:10 AM, Mohit Anchlia wrote:
> I am trying to write a test on local file system but this test keeps taking
> xml files in the path even though I am setting a different Configuration
>

Setting Configuration for local file:///

2012-08-07 Thread Mohit Anchlia
I am trying to write a test on the local file system, but the test keeps picking up the xml files in the path even though I am setting a different Configuration object. Is there a way for me to override it? I thought the way I am doing it overwrites the configuration, but it doesn't seem to be working: @Test publi

Re: Basic Question

2012-08-07 Thread Mohit Anchlia
On Tue, Aug 7, 2012 at 11:33 AM, Harsh J wrote:
> Each write call registers (writes) a KV pair to the output. The output
> collector does not look for similarities nor does it try to de-dupe
> it, and even if the object is the same, its value is copied so that
> doesn't matter.
>
> So you will ge

Re: Basic Question

2012-08-07 Thread Harsh J
Each write call registers (writes) a KV pair to the output. The output collector does not look for similarities nor does it try to de-dupe it, and even if the object is the same, its value is copied so that doesn't matter.

So you will get two KV pairs in your output - since duplication is allowed
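
A toy model may help picture the behaviour described in the reply above. It is only a sketch in plain Java and does not use Hadoop's own classes: ToyCollector and MutableText are hypothetical stand-ins for an output collector and a reusable key/value object. The one assumption carried over from the reply is that the contents are copied at write time, so every call adds a record even when the same object is passed again.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical and are NOT Hadoop APIs.
class ToyCollectorDemo {

    /** Mutable holder, standing in for a reusable key or value object. */
    static final class MutableText {
        private String value = "";
        void set(String v) { value = v; }
        String get() { return value; }
    }

    /** Toy collector: copies the current contents on every write, no de-duplication. */
    static final class ToyCollector {
        private final List<String[]> records = new ArrayList<>();

        void write(MutableText key, MutableText value) {
            // Copy the contents now, so later changes to the objects
            // cannot alter records that were already written.
            records.add(new String[] { key.get(), value.get() });
        }

        List<String[]> records() { return records; }
    }

    /** Writes the same objects repeatedly and returns what was collected. */
    static List<String[]> run() {
        ToyCollector out = new ToyCollector();
        MutableText key = new MutableText();
        MutableText value = new MutableText();

        key.set("k");
        value.set("v1");
        out.write(key, value);
        out.write(key, value); // same objects, same contents: still a second record

        value.set("v2");       // reuse the object; earlier records keep "v1"
        out.write(key, value);
        return out.records();
    }

    public static void main(String[] args) {
        for (String[] r : run()) {
            System.out.println(r[0] + "\t" + r[1]);
        }
    }
}
```

In this sketch three write calls produce three records, two of them identical, and changing the value object afterwards leaves the earlier records untouched.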

Re: migrate cluster to different datacenter

2012-08-07 Thread Michael Segel
The OP hasn't provided enough information to even start trying to make a real recommendation on how to solve this problem.

On Aug 4, 2012, at 7:32 AM, Nitin Kesarwani wrote:
> Given the size of data, there can be several approaches here:
>
> 1. Moving the boxes
>
> Not possible, as I suppose

Re: migrate cluster to different datacenter

2012-08-07 Thread Patrick Angeles
It would help to know your data ingest and processing patterns (and any applicable SLAs). In most cases, you'd only need to move the raw ingested data, then you can derive the rest in the other cluster. Assuming that you have some sort of date-based partitioning on the ingest, then it's easy to de