Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

2012-09-26 Thread sudha sadhasivam
Sir, we have also tried the option of putting the JCublas jar in the Hadoop jar; still it is not recognised. We would be thankful if you could provide us with a sample exercise on the same, with steps for execution. I am herewith attaching the error file. Thanking you, with warm regards, Dr G Sudha Sadasivam

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Bertrand Dechoux
The difficulty with data transfer between tasks is handling synchronisation and failure. You may want to look at graph processing done on top of Hadoop (like Giraph). That's one way to do it but whether it is relevant or not to you will depend on your context. Regards Bertrand On Wed, Sep 26,

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Jonathan Bishop
Yes, Giraph seems like the best way to go - it is mainly vertex evaluation with message passing between vertices. Synchronization is handled for you. On Wed, Sep 26, 2012 at 8:36 AM, Jane Wayne jane.wayne2...@gmail.com wrote: hi, i know that some algorithms cannot be parallelized and adapted

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Jane Wayne
my problem is more general (than graph problems) and doesn't need to have logic built around synchronization or failure. for example, when a mapper is finished successfully, it just writes/persists to a storage location (could be disk, could be database, could be memory, etc...). when the next

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Harsh J
Apache Giraph is a framework for graph processing; it currently runs over MR (but is getting its own coordination via YARN soon): http://giraph.apache.org. You may also check out the generic BSP system, Apache Hama (Giraph uses BSP too, if I am not wrong, but doesn't use Hama - it works over MR instead):

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Jay Vyas
The reason this is so rare is that the nature of map/reduce tasks is that they are orthogonal, i.e. word count, batch image recognition, terasort -- all the things hadoop is famous for are largely orthogonal tasks. It's much rarer (I think) to see people using hadoop to do traffic

Re: Passing Command-line Parameters to the Job Submit Command

2012-09-26 Thread Varad Meru
Thanks Hemanth, yes, the Java variables are passed as -Dkey=value. But for the arguments passed to the main method (i.e. String[] args) I cannot find any other way to pass them apart from hadoop jar CLASSNAME arguments. So if I have a job file, I'll compulsorily have to use the java
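For context, when a job implements Tool and is launched through ToolRunner, GenericOptionsParser consumes the generic -Dkey=value options first and hands only the remaining arguments to the job's run(String[] args), so properties and positional arguments can be mixed on one command line. A sketch of such an invocation (the jar name, class name and paths here are hypothetical):

```
hadoop jar myjob.jar com.example.MyJob \
    -D mapred.reduce.tasks=4 \
    /user/varad/input /user/varad/output
```

Inside run(), args[0] and args[1] would then be the two paths, with the -D property already applied to the job's Configuration.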

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Bertrand Dechoux
I wouldn't be so surprised. It takes time, energy and money to solve problems and make solutions that would be prod-ready. A few people would consider that the namenode/secondary SPOF is a limit for Hadoop itself in order to go into a critical production environment. (I am only quoting it and

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Harsh J
Also read: http://arxiv.org/abs/1209.2191 ;-) On Thu, Sep 27, 2012 at 12:24 AM, Bertrand Dechoux decho...@gmail.com wrote: I wouldn't be so surprised. It takes time, energy and money to solve problems and make solutions that would be prod-ready. A few people would consider that the

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Jane Wayne
thanks. those issues pointed out do cover the pain points i'm experiencing. On Wed, Sep 26, 2012 at 3:11 PM, Harsh J ha...@cloudera.com wrote: Also read: http://arxiv.org/abs/1209.2191 ;-) On Thu, Sep 27, 2012 at 12:24 AM, Bertrand Dechoux decho...@gmail.com wrote: I wouldn't so surprised. It

Re: Programming Question / Joining Dataset

2012-09-26 Thread Bejoy Ks
Hi Oliver, I have scribbled a small post on reduce-side joins; the implementation matches your requirement: http://kickstarthadoop.blogspot.in/2011/09/joins-with-plain-map-reduce.html Regards Bejoy KS
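The reduce-side join pattern that post describes can be sketched without the MapReduce framework itself: the "map" phase tags each record with its source, the shuffle groups tagged records by join key, and the "reduce" phase pairs them up. A minimal simulation in plain Java (the users/orders schema is a made-up illustration, not from the thread):

```java
import java.util.*;

// Reduce-side join sketch: tag records by source in the "map" phase,
// group by join key (standing in for the shuffle), and merge matching
// records per key in the "reduce" phase.
public class ReduceSideJoinSketch {
    public static Map<String, List<String>> join(
            Map<String, String> users,   // userId -> name
            List<String[]> orders) {     // each entry: [userId, orderId]
        // "Map" phase: emit (joinKey, taggedValue) pairs from both sources
        Map<String, List<String>> grouped = new HashMap<>();
        for (Map.Entry<String, String> u : users.entrySet())
            grouped.computeIfAbsent(u.getKey(), k -> new ArrayList<>())
                   .add("U:" + u.getValue());
        for (String[] o : orders)
            grouped.computeIfAbsent(o[0], k -> new ArrayList<>())
                   .add("O:" + o[1]);
        // "Reduce" phase: per key, pair the user record with each order
        Map<String, List<String>> joined = new HashMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            String name = null;
            List<String> orderIds = new ArrayList<>();
            for (String v : e.getValue()) {
                if (v.startsWith("U:")) name = v.substring(2);
                else orderIds.add(v.substring(2));
            }
            if (name != null)
                for (String oid : orderIds)
                    joined.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                          .add(name + "," + oid);
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> users = Map.of("u1", "alice", "u2", "bob");
        List<String[]> orders = List.of(new String[]{"u1", "o9"},
                                        new String[]{"u1", "o7"},
                                        new String[]{"u2", "o3"});
        System.out.println(join(users, orders));
    }
}
```

In a real job the tag would live in a composite value type and the grouping would be done by the framework; the per-key merge logic is the part the linked post implements in the reducer.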

Re: Amateur doubt about Terasort

2012-09-26 Thread Harsh J
Please do not mail general@ with user/dev questions; use the user@ alias in future. TeraSort uses the IdentityMapper and IdentityReducer (nothing more is needed - Hadoop sorts by default, using the default mapper/reducer). On Wed, Sep 26, 2012 at 10:08 PM, Nitin Khandelwal
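The point about identity map/reduce can be sketched concretely: the framework's shuffle phase sorts keys between map and reduce, so passing records through unchanged yields sorted output. A small plain-Java simulation of that behavior (the TreeMap stands in for the framework's key-sorted shuffle; this is an illustration, not TeraSort's actual code):

```java
import java.util.*;

// Why identity map/reduce suffices for sorting: the shuffle between map
// and reduce delivers keys in sorted order, so identity functions on
// both sides produce globally sorted output.
public class IdentitySortSketch {
    public static List<String> terasortLike(List<String> records) {
        // "Map": identity - each record becomes its own key
        // "Shuffle": the framework sorts keys (TreeMap keeps keys ordered)
        TreeMap<String, List<String>> shuffled = new TreeMap<>();
        for (String r : records)
            shuffled.computeIfAbsent(r, k -> new ArrayList<>()).add(r);
        // "Reduce": identity - emit values in key order
        List<String> out = new ArrayList<>();
        for (List<String> vals : shuffled.values()) out.addAll(vals);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(terasortLike(List.of("pear", "apple", "mango")));
    }
}
```

The real TeraSort adds a custom total-order partitioner so that output is sorted across reducers, not just within one, but the map and reduce functions themselves stay identity.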

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Jane Wayne
jay, thanks. i just needed a sanity check. i hope and expect that one day, hadoop will mature towards supporting a shared-something approach. the web service call is not a bad idea at all. that way, we can abstract what that ultimate data store really is. i'm just a little surprised that we are

Unit tests for Map and Reduce functions.

2012-09-26 Thread Ravi P
Is it possible to write unit tests for the mapper's map and the reducer's reduce functions? -Ravi

Re: Unit tests for Map and Reduce functions.

2012-09-26 Thread Kai Voigt
Hello, yes, http://mrunit.apache.org is your reference. MRUnit is a framework on top of JUnit which emulates the MapReduce framework to test your mappers and reducers. Kai On 26.09.2012 at 22:18, Ravi P hadoo...@outlook.com wrote: Is it possible to write unit test for mapper Map , and

RE: Unit tests for Map and Reduce functions.

2012-09-26 Thread Ravi P
Thanks Kai, I am exploring MRUnit. Are there any other options/ways to write unit tests for map and reduce functions? I would like to evaluate all options. -Ravi From: hadoo...@outlook.com To: k...@123.org Subject: RE: Unit tests for Map and Reduce functions. Date: Wed, 26 Sep 2012 13:35:57
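One framework-free option alongside MRUnit and Mockito: factor the per-record logic out of the mapper into a pure function and test that directly, with no cluster, Context, or Hadoop dependency. A sketch under that assumption (WordCountLogic and its mapLine method are hypothetical names, not Hadoop API):

```java
import java.util.*;

// Plain-JDK alternative for unit-testing map logic: keep the work a
// mapper would do per input line in a pure function that returns its
// (word, 1) pairs instead of writing them to a Context.
public class WordCountLogic {
    public static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\s+"))
            if (!token.isEmpty())
                out.add(new AbstractMap.SimpleEntry<>(token, 1));
        return out;
    }

    public static void main(String[] args) {
        // Any test framework (JUnit, TestNG) can now assert on the
        // returned pairs; the thin Mapper subclass just delegates here.
        System.out.println(mapLine("Hello hadoop hello"));
    }
}
```

The trade-off versus MRUnit is that this tests only your logic, not the interaction with the framework (counters, Context writes, serialization), which is what MRUnit's drivers emulate.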

Re: Unit tests for Map and Reduce functions.

2012-09-26 Thread Bejoy Ks
Hi Ravi, you can take a look at Mockito: http://books.google.co.in/books?id=Nff49D7vnJcC&pg=PA138&lpg=PA138&dq=mockito+%2B+hadoop&source=bl&ots=IifyVu7yXp&sig=Q1LoxqAKO0nqRquus8jOW5CBiWY&hl=en&sa=X&ei=b2pjULHSOIPJrAeGsIHwAg&ved=0CC0Q6AEwAg#v=onepage&q=mockito%20%2B%20hadoop&f=false On Thu, Sep 27, 2012 at

Re: splitting jobtracker and namenode

2012-09-26 Thread Ted Dunning
Why are you changing the TTL on DNS if you aren't moving the name? If you are just changing the config to a new name, then caching won't matter. On Wed, Sep 26, 2012 at 1:46 PM, Patai Sangbutsarakum silvianhad...@gmail.com wrote: Hi Hadoopers, My production Hadoop 0.20.2 cluster has been

Re: splitting jobtracker and namenode

2012-09-26 Thread Patai Sangbutsarakum
Thanks Ted, that's true. On Wed, Sep 26, 2012 at 3:07 PM, Ted Dunning tdunn...@maprtech.com wrote: Why are you changing the TTL on DNS if you aren't moving the name? If you are just changing the config to a new name, then caching won't matter. On Wed, Sep 26, 2012 at 1:46 PM, Patai

Re: Advice on Migrating to hadoop + hive

2012-09-26 Thread Michael Segel
You can get rid of Postgres and go with Hive. You may want to consider setting up an external table so you can just drop your logs into place. (Define it once in Hive's metadata store, and then just drop data within the space/partitions.) Tools? Karmasphere and others. Sorry for the terse
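The external-table pattern described here can be sketched in HiveQL; the table name, columns, delimiter and LOCATION below are assumptions for illustration, not details from the thread:

```sql
-- External table over a log directory: Hive reads files in place and
-- does not delete them if the table is dropped.
CREATE EXTERNAL TABLE web_logs (
  ts STRING,
  host STRING,
  request STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/logs/web_logs';

-- After dropping a day's files into place, register the partition:
ALTER TABLE web_logs ADD PARTITION (dt='2012-09-26')
  LOCATION '/data/logs/web_logs/dt=2012-09-26';
```

With this setup the ingestion step is just a file copy into the partition directory plus the ADD PARTITION statement; no load or conversion job is needed.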