tools.DistCp: Invalid arguments

2015-02-02 Thread xeonmailinglist
Hi, I am trying to copy data using |distcp| but I get this error. Both Hadoop runtimes are working properly. Why is this happening? | vagrant@hadoop-coc-1:~/Programs/hadoop$ hadoop distcp hdfs://hadoop-coc-1:50070/input1 hdfs://hadoop-coc-2:50070/ 15/02/02 19:46:37 ERROR tools.DistCp: Invalid

Re: tools.DistCp: Invalid arguments

2015-02-02 Thread Alexander Alten-Lorenz
Have a closer look: hdfs://hadoop-coc-2:50070/ has no path given. On 02 Feb 2015, at 20:52, xeonmailinglist xeonmailingl...@gmail.com wrote: Hi, I am trying to copy data using distcp but I get this error. Both Hadoop runtimes are working properly. Why is this happening?
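Alexander's diagnosis can be checked mechanically: the destination URI ends at host:port with no target path. A small shell sketch of such a check (the `has_path` helper is mine, not from the thread; note also that 50070 is typically the NameNode web UI port, while distcp usually targets the NameNode RPC port, often 8020 — the URIs here are kept as the thread gives them):

```shell
# Reject hdfs:// URIs that have no path component after host:port.
has_path() {
  local uri="$1"
  local rest="${uri#hdfs://}"
  [ "$rest" != "$uri" ] || return 1              # not an hdfs:// URI
  case "$rest" in */*) ;; *) return 1 ;; esac    # no "/" after the authority
  [ -n "${rest#*/}" ]                            # non-empty path remains
}

has_path "hdfs://hadoop-coc-2:50070/" || echo "no target path given"
```

Running this against the original destination URI prints the warning, matching Alexander's observation.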

Re: Multiple separate Hadoop clusters on same physical machines

2015-02-02 Thread daemeon reiydelle
Fantastic! I was delighted to have recently worked for a large search engine company that has moved significant components of their Hadoop to Docker containers on Ubuntu, seeing amazing performance/density improvements. And yes, the build process is really picky. Thanks SO much! *...*

Re: Copy data between clusters during the job execution.

2015-02-02 Thread Artem Ervits
Take a look at Oozie; once the first job completes you can distcp to another server. Artem Ervits On Feb 2, 2015 5:46 AM, Daniel Haviv danielru...@gmail.com wrote: It should run after your job finishes. You can create the flow using a simple bash script Daniel On 2 Feb 2015, at 12:31,
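Artem's suggestion can be sketched as an Oozie workflow with a DistCp action. This is an illustrative fragment only: the action-schema versions, output paths, and workflow name are assumptions, so check the distcp-action schema shipped with your Oozie release.

```xml
<workflow-app name="job-then-distcp" xmlns="uri:oozie:workflow:0.4">
  <start to="copy-output"/>
  <action name="copy-output">
    <distcp xmlns="uri:oozie:distcp-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <arg>hdfs://hadoop-coc-1:50070/job-output</arg>
      <arg>hdfs://hadoop-coc-2:50070/job-output</arg>
    </distcp>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>distcp failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

In a fuller workflow, the copy action would be chained after the MapReduce action for the job itself, so the copy only runs once the job has succeeded.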

RE: Multiple separate Hadoop clusters on same physical machines

2015-02-02 Thread Ashish Kumar9
Is there any good reference material available to follow for testing Docker and Hadoop integration? From: hadoop.supp...@visolve.com To: 'Harun Reşit Zafer' harun.za...@tubitak.gov.tr, user@hadoop.apache.org Date: 02/02/2015 02:57 PM Subject: RE: Multiple separate Hadoop clusters

Re: Multiple separate Hadoop clusters on same physical machines

2015-02-02 Thread Alexander Alten-Lorenz
http://blog.sequenceiq.com/blog/2014/06/19/multinode-hadoop-cluster-on-docker/ Ambari based, but works quite well. On 02 Feb 2015, at 10:33, Ashish Kumar9 ashis...@in.ibm.com wrote: Is there any good reference

Re: Copy data between clusters during the job execution.

2015-02-02 Thread Daniel Haviv
You can use distcp Daniel On 2 Feb 2015, at 11:12, xeon Mailinglist xeonmailingl...@gmail.com wrote: Hi, I want to have a job that copies the map output, or the reduce output, to another HDFS. Is it possible? E.g., the job runs in cluster 1 and takes the input from this cluster.

Re: Copy data between clusters during the job execution.

2015-02-02 Thread Daniel Haviv
It should run after your job finishes. You can create the flow using a simple bash script. Daniel On 2 Feb 2015, at 12:31, xeonmailinglist xeonmailingl...@gmail.com wrote: But can I use distcp inside my job, or do I need to program something that executes distcp after executing my job?
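The "simple bash script" flow Daniel describes can be sketched like this: run the job, and only if it succeeds, distcp its output to the second cluster. The function name, jar/class names, hosts, and paths below are illustrative, not from the thread.

```shell
#!/usr/bin/env bash
# Run a job command, then copy its output between clusters only on success.
run_flow() {
  local job_cmd="$1" src="$2" dst="$3"
  $job_cmd || { echo "job failed, skipping distcp" >&2; return 1; }
  hadoop distcp "$src" "$dst"
}

# Example invocation (assumed jar and class names):
# run_flow "hadoop jar myjob.jar com.example.MyJob /input1 /out" \
#          "hdfs://hadoop-coc-1:50070/out" "hdfs://hadoop-coc-2:50070/out"
```

Because the copy is gated on the job's exit status, a failed job never triggers a partial copy.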

Re: Copy data between clusters during the job execution.

2015-02-02 Thread xeonmailinglist
But can I use distcp inside my job, or do I need to program something that executes distcp after executing my job? On 02-02-2015 10:20, Daniel Haviv wrote: You can use distcp Daniel On 2 Feb 2015, at 11:12,

RE: Multiple separate Hadoop clusters on same physical machines

2015-02-02 Thread Hadoop Support
Hello Ashish, Alexander's reference is great. Adding to that, you can also find the latest Hadoop Docker image at https://registry.hub.docker.com/u/sequenceiq/hadoop-docker/ Setup is as simple as: 1. Install Docker 2. Build and pull the above image 3.
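Steps 2-3 above can be sketched as commands. The image name comes from the linked registry page; the `2.7.0` tag and the bootstrap entrypoint are assumptions based on that image's published usage, so check the registry page for current tags.

```shell
# Pull the prebuilt Hadoop image and start an interactive container.
start_hadoop_container() {
  local image="sequenceiq/hadoop-docker:2.7.0"
  docker pull "$image" &&
    docker run -it "$image" /etc/bootstrap.sh -bash
}
```

The `&&` ensures the container is only started if the pull succeeded.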

RE: Copy data between clusters during the job execution.

2015-02-02 Thread hadoop.support
It seems that in your first error message, you missed the target path argument. One common usage of distcp (the solution to your problem): hadoop distcp hdfs://hadoop-coc-1:50070/input1 hdfs://hadoop-coc-2:50070/some1 It is also wise to use the latest tool: distcp2

Re: set default queue for user

2015-02-02 Thread Vikas Parashar
Thanks Naga! Let me try it out. On Mon, Feb 2, 2015 at 1:06 PM, Naganarasimha G R (Naga) garlanaganarasi...@huawei.com wrote: Hi Vikas, a small-scale user-to-queue mapping was supported as part of YARN-2411. Please refer to the following configuration: property
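The YARN-2411 mapping Naga references is configured in capacity-scheduler.xml. A hedged sketch follows; the property names are from the CapacityScheduler feature that ticket added, but the user, group, and queue names here are examples, not from the thread.

```xml
<!-- Map user "vikas" to queue "analytics" and group "developers" to queue "dev". -->
<property>
  <name>yarn.scheduler.capacity.queue-mappings</name>
  <value>u:vikas:analytics,g:developers:dev</value>
</property>
<!-- When false, a queue named explicitly at submit time wins over the mapping. -->
<property>
  <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
  <value>false</value>
</property>
```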

Compilation problem of hadoop

2015-02-02 Thread Bo Fu
Hi all, I'm trying to compile the hadoop-1.0.3 source code using the command: ant package -Djava5.home=$JAVA_HOME -Dforrest.home=$FORREST_HOME It failed; the failure log is as follows. Can anyone tell me how to solve it? Thanks. [exec] depbase=`echo impl/task-controller.o | sed

Cloudera Manager Question

2015-02-02 Thread SP
Hi All, I have installed Hadoop CDH4 on my cluster and am now trying to install and configure Cloudera Manager. I don't want to install the Hadoop packages again with CM. How can I make CM identify my existing cluster and set up monitoring on it? Thanks SP

Re: About HDFS's single-writer, multiple-reader model, any use case?

2015-02-02 Thread Dongzhe Ma
Hi, Thanks for your quick reply. But the point that puzzles me is why HDFS chose such a model, not how this model works. Recently I read the append design doc, and one issue in implementing append correctly is maintaining read consistency, which is a problem only when someone reads a file

Re: HDFS balancer ending with error

2015-02-02 Thread Juraj jiv
Hello, restarting the Cloudera cluster helped; after that the balancer started working. Not sure why, but all is fine now. JV On Fri, Jan 30, 2015 at 2:41 AM, Anurag Tangri anurag_tan...@yahoo.com wrote: Is your cluster fully upgraded to CDH5? Thanks, Anurag Tangri On Jan 29, 2015, at 6:33 AM, Juraj jiv

Copy data between clusters during the job execution.

2015-02-02 Thread xeon Mailinglist
Hi, I want to have a job that copies the map output, or the reduce output, to another HDFS. Is it possible? E.g., the job runs in cluster 1 and takes the input from this cluster. Then, before the job finishes, it copies the map output or the reduce output to the HDFS in cluster 2. Thanks,

RE: Multiple separate Hadoop clusters on same physical machines

2015-02-02 Thread hadoop.support
Hello Harun, Your question is very interesting and will be useful for future Hadoop setups for startups/individuals too. Normally for testing purposes, we suggest using a pseudo-distributed environment (i.e. all cluster daemons installed on a single node). You can refer to a few links