Hadoop 1.2.1 or 2.2.0 on Windows - XP-SP2 Not using Cygwin

2014-01-19 Thread Anand Murali
Dear All:

I have been on a discovery-cum-learning process for the last 2 months, trying
to install and use the above Hadoop packages with Cygwin on the Windows XP
platform. I took this route after reading a textbook (Hadoop in Action by
Chuck Lam), which suggests using Cygwin to get a Unix-like environment.

Much to my dismay, I have run into many installation and runtime problems
with both Cygwin and Hadoop, and things have been very unstable. I posted this
on the issue tracker and was told that Cygwin is not supported on either
Hadoop release, but that I could build Hadoop for a Windows environment. I am
not sure how that is done and need assistance and advice. I would be thankful
for a response and direction, and I look forward to an early reply.

Thanks,


 Anand Murali  
11/7, 'Anand Vihar', Kandasamy St, Mylapore
Chennai - 600 004, India
Ph: (044)- 28474593/ 43526162 (voicemail)

Re: Question about Yarn

2014-01-19 Thread sudhakara st
Hello Chandler,

YARN separates resource management, scheduling and MapReduce into distinct
layers. Only the resource management and scheduling layers were moved into
new daemons, changed and extended in YARN. The MapReduce functionality and
the parallel data-processing (execution) framework remain the same as in MR1.

I suggest you read more about the YARN architecture:
http://hortonworks.com/hadoop/yarn/
http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-operators/
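
To make the container questions quoted below more concrete, here is a toy
C++ model. It is not the YARN API: every type and function name in it is made
up for illustration, and the "scheduler" is deliberately naive. It only shows
one idea, namely that a container is a grant of resources on a single node,
and that the node is fixed when the container is allocated.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not part of YARN.
struct Node {
    std::string name;
    int freeMemoryMb;
};

// A container is a grant of resources on one specific node.
struct Container {
    int id;
    std::string node;  // chosen at allocation time, never changed afterwards
    int memoryMb;
};

// Naive scheduler: grant the request on the first node with enough free
// memory. Returns false if no node can host the request right now.
bool allocate(std::vector<Node>& cluster, int memoryMb, int id, Container& out) {
    assert(memoryMb > 0);
    for (Node& n : cluster) {
        if (n.freeMemoryMb >= memoryMb) {
            n.freeMemoryMb -= memoryMb;  // the node's capacity is now reserved
            out = Container{id, n.name, memoryMb};
            return true;
        }
    }
    return false;
}
```

In this model, with nodes A (512 MB free), B and C (2048 MB free each), a
1024 MB request is granted on B and stays on B. A MapReduce job on YARN runs
each map or reduce task inside one such container, so the container is the
unit of resources and the task is the unit of work.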


On Mon, Jan 20, 2014 at 6:42 AM, chandler song wrote:

>
> http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>
> this one,
>
>
> 2014/1/19 Marco Shaw 
>
>> Can you clarify?
>>
>> What tutorial and specific sections are you referring to?
>>
>> Marco
>>
>> > On Jan 19, 2014, at 9:26 AM, chandler song wrote:
>> >
>> > Hi all,
>> >
>> > I have some questions about YARN that came up while reading the
>> > tutorial on the website.
>> >
>> > 1) Is a container physical or logical? For example, suppose there are
>> > three PCs (A, B, C) in the cluster and I allocate one container. Will
>> > it run on one PC the whole time, or will it run on different PCs at
>> > different times?
>> >
>> > 2) Can I think of a container as a virtual PC that can run a Java
>> > application? Is that correct?
>> >
>> > 3) About MapReduce: how does MapReduce run on YARN? After reading the
>> > tutorial, I think YARN and MapReduce are totally different things. The
>> > basic unit of YARN is the container, while the basic units of MapReduce
>> > are map and reduce.
>> >
>> > Also, how does YARN handle concurrency? In MapReduce I don't need to
>> > think much about concurrency, because MapReduce does it for me: it
>> > splits the data into small units and I just do my processing. I don't
>> > see the same thing in YARN.
>> >
>>
>
>


-- 

Regards,
...Sudhakara.st


Re: Question about Yarn

2014-01-19 Thread chandler song
http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

this one,


2014/1/19 Marco Shaw 

> Can you clarify?
>
> What tutorial and specific sections are you referring to?
>
> Marco
>
> > On Jan 19, 2014, at 9:26 AM, chandler song wrote:
> >
> > Hi all,
> >
> > I have some questions about YARN that came up while reading the
> > tutorial on the website.
> >
> > 1) Is a container physical or logical? For example, suppose there are
> > three PCs (A, B, C) in the cluster and I allocate one container. Will
> > it run on one PC the whole time, or will it run on different PCs at
> > different times?
> >
> > 2) Can I think of a container as a virtual PC that can run a Java
> > application? Is that correct?
> >
> > 3) About MapReduce: how does MapReduce run on YARN? After reading the
> > tutorial, I think YARN and MapReduce are totally different things. The
> > basic unit of YARN is the container, while the basic units of MapReduce
> > are map and reduce.
> >
> > Also, how does YARN handle concurrency? In MapReduce I don't need to
> > think much about concurrency, because MapReduce does it for me: it
> > splits the data into small units and I just do my processing. I don't
> > see the same thing in YARN.
> >
>


Re: Question about Yarn

2014-01-19 Thread Marco Shaw
Can you clarify?

What tutorial and specific sections are you referring to?

Marco

> On Jan 19, 2014, at 9:26 AM, chandler song wrote:
>
> Hi all,
>
> I have some questions about YARN that came up while reading the tutorial
> on the website.
>
> 1) Is a container physical or logical? For example, suppose there are
> three PCs (A, B, C) in the cluster and I allocate one container. Will it
> run on one PC the whole time, or will it run on different PCs at
> different times?
>
> 2) Can I think of a container as a virtual PC that can run a Java
> application? Is that correct?
>
> 3) About MapReduce: how does MapReduce run on YARN? After reading the
> tutorial, I think YARN and MapReduce are totally different things. The
> basic unit of YARN is the container, while the basic units of MapReduce
> are map and reduce.
>
> Also, how does YARN handle concurrency? In MapReduce I don't need to
> think much about concurrency, because MapReduce does it for me: it splits
> the data into small units and I just do my processing. I don't see the
> same thing in YARN.
> 


Question about Yarn

2014-01-19 Thread chandler song
Hi all,

I have some questions about YARN that came up while reading the tutorial on
the website.

1) Is a container physical or logical? For example, suppose there are three
PCs (A, B, C) in the cluster and I allocate one container. Will it run on one
PC the whole time, or will it run on different PCs at different times?

2) Can I think of a container as a virtual PC that can run a Java
application? Is that correct?

3) About MapReduce: how does MapReduce run on YARN? After reading the
tutorial, I think YARN and MapReduce are totally different things. The basic
unit of YARN is the container, while the basic units of MapReduce are map and
reduce.

Also, how does YARN handle concurrency? In MapReduce I don't need to think
much about concurrency, because MapReduce does it for me: it splits the data
into small units and I just do my processing. I don't see the same thing in
YARN.


Re: doubt

2014-01-19 Thread Justin Black
I've installed a single-node Hadoop cluster on a VirtualBox machine running
Ubuntu 12.04 LTS (64-bit) with 512MB RAM and an 8GB HD. I haven't seen any
errors in my testing yet. Is 1GB RAM required? Will I run into issues when
I expand the cluster?


On Sat, Jan 18, 2014 at 11:24 PM, Alexander Pivovarov
wrote:

> It's enough. Hadoop uses only 1GB RAM by default.
>
>
> On Sat, Jan 18, 2014 at 10:11 PM, sri harsha  wrote:
>
>> Hi,
>> I want to install a 4-node cluster on 64-bit Linux. Is 4GB RAM 500HD
>> enough for this, or do I need to expand?
>> Please advise.
>>
>> Thanks
>>
>> --
>> amiable harsha
>>
>
>


-- 
-jblack


Re: HADOOP & CUDA

2014-01-19 Thread Massimo Simoniello
Thank you Viacheslav!
It works, and it's simpler than I thought.
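
A minimal sketch of the suggestion quoted below, under loud assumptions. Both
function names are hypothetical. `squareOnDevice` stands for "some function
that uses CUDA": in a real build its body would live in a `.cu` file compiled
with nvcc and would launch a kernel, whereas here it is a plain CPU loop so
the sketch compiles without the CUDA toolkit. `mapRecord` stands in for the
body of a Pipes mapper's `map()` method, where the input would normally come
from `context.getInputValue()` and each result would go out through
`context.emit(key, value)`.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical GPU entry point. A real version would copy `values` to the
// device, launch a kernel and copy the results back. This CPU loop is only
// a stand-in with the same signature.
std::vector<float> squareOnDevice(const std::vector<float>& values) {
    std::vector<float> out;
    out.reserve(values.size());
    for (float v : values) {
        out.push_back(v * v);
    }
    return out;
}

// Hypothetical stand-in for the body of a mapper's map() method.
// `line` plays the role of the input value (whitespace-separated numbers);
// the returned pairs play the role of the emitted key/value records.
std::vector<std::pair<std::string, std::string>> mapRecord(const std::string& line) {
    // Parse the record into a batch of floats.
    std::vector<float> batch;
    std::istringstream in(line);
    float v;
    while (in >> v) {
        batch.push_back(v);
    }

    // Hand the whole batch to the GPU-backed function in a single call.
    std::vector<float> squared = squareOnDevice(batch);
    assert(squared.size() == batch.size());

    // Convert the results back into string key/value pairs.
    std::vector<std::pair<std::string, std::string>> emitted;
    for (std::size_t i = 0; i < squared.size(); ++i) {
        std::ostringstream key, value;
        key << batch[i];
        value << squared[i];
        emitted.emplace_back(key.str(), value.str());
    }
    return emitted;
}
```

The point of the split is that the mapper only sees an ordinary C++ function.
Nothing in the Hadoop-facing code needs to know whether the work runs on a
GPU, as long as every node that runs the task has a working CUDA setup.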

Regards,

Massimo


2014/1/15 Slava Rodionov 

> Hi Massimo,
>
> I don't see why anyone would need a special interface between Hadoop and
> CUDA if you're using C++.
> Why not just make sure that all of your data nodes can run CUDA locally,
> write a function that uses CUDA, and then call that function inside your
> mapper or reducer?
>
> Best regards,
> Viacheslav
>
>
> 2014/1/15 Massimo Simoniello 
>
>> Hi, I'm working on Hadoop 2.2.0 and I'm writing a C++ application with
>> Pipes. I would also like to use CUDA in my mapper/reducer, but I'm not
>> able to find enough information.
>> I have found this interesting slide deck:
>> http://www.slideshare.net/airbots/cuda-29330283 and this page in the
>> wiki http://wiki.apache.org/hadoop/CUDA%20On%20Hadoop.
>>
>> I don't understand where these functions are defined:
>>
>>    - void processHadoopData(string& input);
>>    - void cudaCompute(std::map& output);
>>
>>
>> I don't understand where I have to write the code, or whether it is
>> really possible.
>>
>> Is there any tutorial that shows how to configure Hadoop + CUDA, write
>> the code, compile it and execute it?
>>
>> Is there any example?
>>
>
>