Java programming Conditional select

2010-01-13 Thread Rob Stewart
Hi folks, I am trying to provide a comparative study on the various Hadoop languages. I have all but completed the implementation in JAQL, Pig and Hive for the tasks, leaving Java MR to last. The applications are simple, and they include: 1. Conditional Select (A and B would be read from files A

Unable to exit safe mode

2010-01-14 Thread Rob Stewart
Hi, I'm having a slight issue with my Hadoop cluster. There are 32 nodes. I have: /usr/lib/hadoop/bin/stop-mapred.sh /usr/lib/hadoop/bin/stop-dfs.sh /usr/lib/hadoop/bin/start-dfs.sh /usr/lib/hadoop/bin/start-mapred.sh All worked perfectly, no errors. I try and remove a file: hadoop dfs -rmr word

Re: Unable to exit safe mode

2010-01-14 Thread Rob Stewart
fsadmin. > > On Thu, Jan 14, 2010 at 9:41 PM, Rob Stewart > wrote: > > Hi, > > > > I'm having a slight issue with my Hadoop cluster. There are 32 nodes. I > > have: > > /usr/lib/hadoop/bin/stop-mapred.sh > > /usr/lib/hadoop/bin/stop-dfs.sh > > /

Re: Unable to exit safe mode

2010-01-14 Thread Rob Stewart
tion files, and can > you find NameNode process with jps? > > On Thu, Jan 14, 2010 at 10:16 PM, Rob Stewart > wrote: > > Hi Wang, > > > >> hadoop dfsadmin -report > > > > returns nothing at all. > > > > I have tried: > >> hadoop dfsadm

Quick Clarification of sort mechanism

2010-01-15 Thread Rob Stewart
't see in the reduce method in the above example where exactly the key/values get specified to order by key alphabetically? Or how I can override this to sort by the value of the final reduce (i.e. by the frequency). Thanks, Rob Stewart
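
For context on the question above: Hadoop sorts map output by key during the shuffle, before reduce() is called, which is why no ordering appears in the reduce method itself. One common way to get frequency order is a second pass that sorts on the count. The sketch below is plain Java, not Hadoop code; the class and method names are hypothetical and it only illustrates the ordering being asked about.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortByFrequencySketch {
    // Returns the word-count entries ordered by count, highest first.
    // Ties are broken alphabetically so the result is deterministic.
    static List<Map.Entry<String, Integer>> byFrequency(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        // Made-up counts standing in for the output of a word-count job.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("alpha", 2);
        counts.put("beta", 5);
        counts.put("gamma", 2);
        for (Map.Entry<String, Integer> e : byFrequency(counts)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```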

Re: Apriori and association rules with Hadoop

2010-01-18 Thread Rob Stewart
world for your Apriori implementation? It would be of huge help to me, but most probably others too. Many regards, Rob Stewart 2010/1/19 Enis Soztutar > We have successfully implemented sequential apriori (a variant of GSP) on > MapReduce. Candidate generation, counting and candidate

Join Hadoop Example problem

2010-01-25 Thread Rob Stewart
at appear in both the input lists. Is it possible to adapt the Join example packaged with Hadoop to implement this? Many thanks, Rob Stewart

Re: Join Hadoop Example problem

2010-01-25 Thread Rob Stewart
he exact command that you are giving when submitting the > jobs? I did not see it in your e-mail. > > Abhishek > > On Mon, Jan 25, 2010 at 5:43 PM, Rob Stewart > wrote: > > Hi there, I'm using Hadoop 0.20.1 and I'm trying to use the Join > application > >

Hybrid Hadoop with fork/join ?

2012-01-31 Thread Rob Stewart
s that I should consider? Any thoughts on the proposed multi-threaded distributed-shared memory architecture are much appreciated from the Hadoop community! -- Rob Stewart

Combining MultithreadedMapper threadpool size & map.tasks.maximum

2012-02-10 Thread Rob Stewart
I'm looking to clarify the relationship between MultithreadedMapper.setNumberOfThreads(i) and mapreduce.tasktracker.map.tasks.maximum . If I set: - MultithreadedMapper.setNumberOfThreads( 4 ) - mapreduce.tasktracker.map.tasks.maximum = 1 Will 4 map tasks be executed in four separate threads withi

Re: Combining MultithreadedMapper threadpool size & map.tasks.maximum

2012-02-10 Thread Rob Stewart
Hi Harsh, On 10 February 2012 12:42, Harsh J wrote: > 4 JVMs if you have 4 tasks in your Job (# of map tasks of a job is > dependent on its input). > > Each JVM will then run the MultithreadedMapper code, which will then > run 4 threads to call your map() inside of it cause you've asked that >

Re: Combining MultithreadedMapper threadpool size & map.tasks.maximum

2012-02-10 Thread Rob Stewart
Harsh, On 10 February 2012 13:33, Harsh J wrote: > What you're missing to see here is that the multithreaded mapper is > something that runs as part of one single map task. > With just one JVM slot, you'd end up processing only one input-chunk > at a time, though with 4 threads doing map() co
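
The model Harsh describes (one map task owns one input chunk, and the threads only parallelise the map() calls inside that task) can be sketched in plain Java. This is not the MultithreadedMapper implementation and uses no Hadoop classes; the class name, the method name and the in-memory "chunks" are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MultithreadedTaskSketch {
    // Processes ONE input chunk (standing in for one map task) with a pool of
    // worker threads; each submitted job plays the role of one map() call.
    static Map<String, Integer> runOneTask(List<String> chunk, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String line : chunk) {
            pool.submit(() -> {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        counts.merge(word, 1, Integer::sum); // atomic per key
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("worker threads timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> chunks = Arrays.asList(
            Arrays.asList("alpha beta", "beta gamma"),
            Arrays.asList("alpha alpha"));
        // With a single slot the chunks are handled one after another;
        // only the work inside each chunk is spread over 4 threads.
        for (List<String> chunk : chunks) {
            System.out.println(runOneTask(chunk, 4));
        }
    }
}
```

In this sketch, raising the thread count never makes two chunks run at once; that would correspond to having more than one map slot.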

Re: Combining MultithreadedMapper threadpool size & map.tasks.maximum

2012-02-10 Thread Rob Stewart
Harsh... Oddly, this blog post has appeared within the last hour or so http://kickstarthadoop.blogspot.com/2012/02/enable-multiple-threads-in-mapper-aka.html -- Rob On 10 February 2012 14:20, Harsh J wrote: > Hello again, > > On Fri, Feb 10, 2012 at 7:31 PM, Rob Stewart wro

Re: Combining MultithreadedMapper threadpool size & map.tasks.maximum

2012-02-10 Thread Rob Stewart
Thanks, this is a lot clearer. One final question... On 10 February 2012 14:20, Harsh J wrote: > Hello again, > > On Fri, Feb 10, 2012 at 7:31 PM, Rob Stewart wrote: >> OK, take word count. The to the map is > beta">. The canonical Hadoop program would tokenize this

Concordance Challenge

2010-11-10 Thread Rob Stewart
MapReduce. thanks, Rob Stewart

Re: Concordance Challenge

2010-11-11 Thread Rob Stewart
r1 and again in mapper2. Have I misunderstood your example? cheers Christian. > > Cheers, > Christian Søttrup > > > Rob Stewart wrote: >> >> Hi all. >> I have a query about a application I have been asked to implement in >> Hadoop MapReduce. It is the conc

Slow final few reducers

2010-12-11 Thread Rob Stewart
number of reducers... 200 reducers: 180/190 will complete in 5 minutes, and the last 10/20 will take 10 minutes. Is this normal Hadoop behavior? I know that the output of the Reducer function is not sorted, so I can't figure out why performance declines at the tail end of the job. thanks,

Multicore Nodes

2010-12-11 Thread Rob Stewart
ly quicker on 4 cores, rather than 2" ? thanks, Rob Stewart

Re: Slow final few reducers

2010-12-11 Thread Rob Stewart
Hi, many thanks for your response. A few observations: - I know for a fact that my key distribution is quite radically skewed (some keys with *many* values, most keys with few). - I have overlooked the fact that I need a partitioner. I suspect that this will help dramatically. I realize that the n
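
To illustrate why a skewed key distribution can leave a few reducers running long after the rest, here is a minimal plain-Java sketch. partitionFor mirrors the formula used by Hadoop's default HashPartitioner, (hashCode & Integer.MAX_VALUE) % numReduceTasks, but String keys stand in for Hadoop's Writable keys, and the class name, key names and counts are invented for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionSkewSketch {
    // Hash partitioning: every value for a given key goes to the same reducer.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Adds up how many values each reducer would receive.
    static long[] loadPerReducer(Map<String, Long> valuesPerKey, int numReducers) {
        long[] load = new long[numReducers];
        for (Map.Entry<String, Long> e : valuesPerKey.entrySet()) {
            load[partitionFor(e.getKey(), numReducers)] += e.getValue();
        }
        return load;
    }

    public static void main(String[] args) {
        // Invented distribution: one hot key and several rare ones.
        Map<String, Long> valuesPerKey = new LinkedHashMap<>();
        valuesPerKey.put("the", 1_000_000L);
        valuesPerKey.put("hadoop", 40L);
        valuesPerKey.put("reducer", 25L);
        valuesPerKey.put("skew", 3L);
        System.out.println(Arrays.toString(loadPerReducer(valuesPerKey, 4)));
    }
}
```

Whichever reducer receives the hot key carries almost the whole load. A custom partitioner can change which reducer gets which keys, but it cannot divide the values of a single key between reducers unless the key itself is changed.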

Re: Multicore Nodes

2010-12-11 Thread Rob Stewart
11 December 2010 11:28, Harsh J wrote: > Hi, > > On Sat, Dec 11, 2010 at 4:39 PM, Rob Stewart > wrote: >> Hi, >> >> When trying to compare Hadoop against other parallel paradigms, it is >> important to consider heterogeneous systems. Some may have 100 nodes, >

Re: Slow final few reducers

2010-12-11 Thread Rob Stewart
thanks, Rob On 11 December 2010 11:38, Rob Stewart wrote: > Hi, many thanks for your response. > > A few observations: > - I know for a fact that my key distribution is quite radically skewed > (some keys with *many* values, most keys with few). > - I have overlooked the fact that I nee

Re: Slow final few reducers

2010-12-11 Thread Rob Stewart
t is "setdest", and what is it actually doing? And why is it taking so long? cheers, Rob Stewart On 11 December 2010 12:26, Harsh J wrote: > On Sat, Dec 11, 2010 at 5:25 PM, Rob Stewart > wrote: >> Oh, >> >> I should add, of the Java processes running on t

Re: Slow final few reducers

2010-12-11 Thread Rob Stewart
Sorry, my fault - it's someone running a network simulator on the cluster! Rob On 11 December 2010 14:09, Rob Stewart wrote: > OK, slight update: > > Immediately underneath public void reduce(), I have added a: > System.out.println("Key: " + key.toString()); > >

Hadoop Scalability - A Case Study: Concordance

2010-12-19 Thread Rob Stewart
Loidl, who's named at the bottom of this page: http://www.sicsa.ac.uk/news/sicsa-multicore-challenge Regards, Rob Stewart