Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
You can use node.js for this.

On Tue, Jun 23, 2015 at 8:15 PM, Divya Gehlot divya.htco...@gmail.com wrote:
Can you please elaborate on it more?

On 20 Jun 2015 2:46 pm, SF Hadoop sfhad...@gmail.com wrote:
Really depends on your requirements for the format of the data. The easiest way I can think of is to stream batches of data into a pub/sub system that the target system can access and then consume. Verify each batch and then ditch it. You can throttle the size of the intermediary infrastructure based on your batches. Seems the most efficient approach.

On Thursday, June 18, 2015, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya
Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
Can you please elaborate on it more?

On 20 Jun 2015 2:46 pm, SF Hadoop sfhad...@gmail.com wrote:
Really depends on your requirements for the format of the data. The easiest way I can think of is to stream batches of data into a pub/sub system that the target system can access and then consume. Verify each batch and then ditch it. You can throttle the size of the intermediary infrastructure based on your batches. Seems the most efficient approach.

On Thursday, June 18, 2015, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya
Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
Really depends on your requirements for the format of the data. The easiest way I can think of is to stream batches of data into a pub/sub system that the target system can access and then consume. Verify each batch and then ditch it. You can throttle the size of the intermediary infrastructure based on your batches. Seems the most efficient approach.

On Thursday, June 18, 2015, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya
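The batch-and-verify idea in the reply above could look roughly like the sketch below. It is only an illustration: a bounded in-memory `queue.Queue` stands in for whatever pub/sub system both zones can reach, and the names `producer`, `consumer`, `transfer`, `batch_size` and `max_in_flight` are made up for this example. The source side tags each batch with a SHA-256 digest, the target side verifies it before keeping the bytes, and the queue bound throttles how much data sits in the intermediary at once.

```python
import hashlib
import queue
import threading

SENTINEL = None  # marks the end of the stream (an assumption of this sketch)


def producer(data: bytes, batch_size: int, broker: "queue.Queue") -> None:
    """Source side: publish numbered batches, each with its SHA-256 digest."""
    for seq, start in enumerate(range(0, len(data), batch_size)):
        chunk = data[start:start + batch_size]
        # put() blocks while the broker is full, which throttles the sender
        broker.put((seq, chunk, hashlib.sha256(chunk).hexdigest()))
    broker.put(SENTINEL)


def consumer(broker: "queue.Queue") -> bytes:
    """Target side: verify each batch, keep it, then drop it from the broker."""
    received = bytearray()
    expected_seq = 0
    while True:
        item = broker.get()
        if item is SENTINEL:
            break
        seq, chunk, digest = item
        if seq != expected_seq or hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError(f"batch {seq} failed verification")
        received.extend(chunk)
        expected_seq += 1
    return bytes(received)


def transfer(data: bytes, batch_size: int = 4, max_in_flight: int = 2) -> bytes:
    """Run producer and consumer against a bounded stand-in broker."""
    broker: "queue.Queue" = queue.Queue(maxsize=max_in_flight)
    sender = threading.Thread(
        target=producer, args=(data, batch_size, broker), daemon=True
    )
    sender.start()
    result = consumer(broker)
    sender.join()
    return result
```

In a real setup the producer would read from HDFS on the secured cluster and the consumer would write to HDFS on the other one; only the broker would need to be reachable from both sides.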
Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
Not to hijack this post, but how would you deal with data that is maintained by Hive (ORC format files, Hive-created tables, etc.)? Would we copy the Hive metastore (MySQL) and move that over to the new cluster?

On Friday, June 19, 2015, Joep Rottinghuis jrottingh...@gmail.com wrote:
You can't set up a proxy? You probably want to avoid writing to the local file system because, aside from that being slow, it limits the size of your file to the free space on your local disk. If you do need to go commando and go through a single client machine that can see both clusters, you probably want to pipe a get to a put. Any kind of serious data volume pulled through a straw is going to be rather slow, though.
Cheers,
Joep
Sent from my iPhone

On Jun 19, 2015, at 12:09 AM, Nitin Pawar nitinpawar...@gmail.com wrote:
yes

On Fri, Jun 19, 2015 at 11:36 AM, Divya Gehlot divya.htco...@gmail.com wrote:
In that case it will be like a three-step process:
1. First cluster (secure zone) HDFS - copyToLocal - user local file system
2. User local space - copy data - second cluster user local file system
3. Second cluster user local file system - copyFromLocal - second cluster HDFS
Am I on the right track?

On 19 June 2015 at 12:38, Nitin Pawar nitinpawar...@gmail.com wrote:
What's the size of the data? If you cannot do distcp between the clusters, then the other way is doing an hdfs get on the data and then an hdfs put on the other cluster.

On 19-Jun-2015 9:56 am, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya

--
Nitin Pawar
Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
yes

On Fri, Jun 19, 2015 at 11:36 AM, Divya Gehlot divya.htco...@gmail.com wrote:
In that case it will be like a three-step process:
1. First cluster (secure zone) HDFS - copyToLocal - user local file system
2. User local space - copy data - second cluster user local file system
3. Second cluster user local file system - copyFromLocal - second cluster HDFS
Am I on the right track?

On 19 June 2015 at 12:38, Nitin Pawar nitinpawar...@gmail.com wrote:
What's the size of the data? If you cannot do distcp between the clusters, then the other way is doing an hdfs get on the data and then an hdfs put on the other cluster.

On 19-Jun-2015 9:56 am, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya

--
Nitin Pawar
Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
You can't set up a proxy? You probably want to avoid writing to the local file system because, aside from that being slow, it limits the size of your file to the free space on your local disk. If you do need to go commando and go through a single client machine that can see both clusters, you probably want to pipe a get to a put. Any kind of serious data volume pulled through a straw is going to be rather slow, though.
Cheers,
Joep
Sent from my iPhone

On Jun 19, 2015, at 12:09 AM, Nitin Pawar nitinpawar...@gmail.com wrote:
yes

On Fri, Jun 19, 2015 at 11:36 AM, Divya Gehlot divya.htco...@gmail.com wrote:
In that case it will be like a three-step process:
1. First cluster (secure zone) HDFS - copyToLocal - user local file system
2. User local space - copy data - second cluster user local file system
3. Second cluster user local file system - copyFromLocal - second cluster HDFS
Am I on the right track?

On 19 June 2015 at 12:38, Nitin Pawar nitinpawar...@gmail.com wrote:
What's the size of the data? If you cannot do distcp between the clusters, then the other way is doing an hdfs get on the data and then an hdfs put on the other cluster.

On 19-Jun-2015 9:56 am, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya

--
Nitin Pawar
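For the "pipe a get to a put" suggestion above, a minimal sketch follows. It assumes a client machine that can reach both clusters and has the `hdfs` CLI configured for each. It relies on `hdfs dfs -cat` writing a file to stdout and `hdfs dfs -put - <dst>` reading from stdin, so nothing is staged on local disk. The helper names and the example URIs are hypothetical.

```python
import subprocess
from typing import List, Sequence, Tuple


def hdfs_stream_copy_commands(src_uri: str, dst_uri: str) -> Tuple[List[str], List[str]]:
    """Build the reader/writer commands for a streamed HDFS-to-HDFS copy."""
    reader = ["hdfs", "dfs", "-cat", src_uri]       # stream the source file to stdout
    writer = ["hdfs", "dfs", "-put", "-", dst_uri]  # "-" makes put read from stdin
    return reader, writer


def pipe_commands(read_cmd: Sequence[str], write_cmd: Sequence[str]) -> None:
    """Run read_cmd | write_cmd and raise if either side fails."""
    reader = subprocess.Popen(read_cmd, stdout=subprocess.PIPE)
    writer = subprocess.Popen(write_cmd, stdin=reader.stdout)
    # Close our copy of the pipe so the reader sees a broken pipe
    # if the writer exits early.
    reader.stdout.close()
    writer_rc = writer.wait()
    reader_rc = reader.wait()
    if reader_rc != 0 or writer_rc != 0:
        raise RuntimeError(
            f"pipe failed: reader exit {reader_rc}, writer exit {writer_rc}"
        )


# Example call (hypothetical namenode hosts and paths, not run here):
# pipe_commands(*hdfs_stream_copy_commands(
#     "hdfs://nn-secure.example.com/data/part-00000",
#     "hdfs://nn-open.example.com/data/part-00000"))
```

This copies one file per call through a single client, so as the reply warns, large volumes will be slow compared with a distributed copy.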
Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
In that case it will be like a three-step process:
1. First cluster (secure zone) HDFS - copyToLocal - user local file system
2. User local space - copy data - second cluster user local file system
3. Second cluster user local file system - copyFromLocal - second cluster HDFS
Am I on the right track?

On 19 June 2015 at 12:38, Nitin Pawar nitinpawar...@gmail.com wrote:
What's the size of the data? If you cannot do distcp between the clusters, then the other way is doing an hdfs get on the data and then an hdfs put on the other cluster.

On 19-Jun-2015 9:56 am, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya
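The three-step process described above can be written out as concrete commands. The sketch below only builds the command lines and does not run them. Using `scp -r` for step 2 is an assumption, since the thread does not say how the local copy would be moved, and every path and host name is a placeholder. Each stage needs enough free local disk to hold the data.

```python
import shlex
from typing import List, Tuple


def staged_copy_commands(
    src_hdfs_path: str,
    local_staging: str,
    remote_host: str,
    remote_staging: str,
    dst_hdfs_path: str,
) -> List[Tuple[str, List[str]]]:
    """Return (where to run, command) pairs for the three-step staged copy."""
    return [
        # 1. On a client of the first (secure) cluster: HDFS -> local disk
        ("cluster1-client",
         ["hdfs", "dfs", "-copyToLocal", src_hdfs_path, local_staging]),
        # 2. Same machine: local disk -> local disk of a second-cluster client
        #    (scp is an assumption; any file transfer tool would do)
        ("cluster1-client",
         ["scp", "-r", local_staging, f"{remote_host}:{remote_staging}"]),
        # 3. On the second cluster's client: local disk -> HDFS
        ("cluster2-client",
         ["hdfs", "dfs", "-copyFromLocal", remote_staging, dst_hdfs_path]),
    ]


def render(steps: List[Tuple[str, List[str]]]) -> List[str]:
    """Format each step as a shell-quoted line for review before running."""
    return [f"[{where}] {shlex.join(cmd)}" for where, cmd in steps]
```

Printing `render(staged_copy_commands(...))` gives three lines that can be reviewed and then run by hand on the machine named in brackets.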
copy data from one hadoop cluster to another hadoop cluster + cant use distcp
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya
Re: copy data from one hadoop cluster to another hadoop cluster + cant use distcp
What's the size of the data? If you cannot do distcp between the clusters, then the other way is doing an hdfs get on the data and then an hdfs put on the other cluster.

On 19-Jun-2015 9:56 am, Divya Gehlot divya.htco...@gmail.com wrote:
Hi,
I need to copy data from the first Hadoop cluster to a second Hadoop cluster. I can't access the second Hadoop cluster from the first Hadoop cluster due to a security issue. Can anyone point me to how I can do this apart from the distcp command?
For instance: Cluster 1 (secured zone) - copy HDFS data to - Cluster 2 (non-secured zone)
Thanks,
Divya