In order to do anything other than a tar transfer (which is a kludge, of
course), you'll need to open up the relevant ports between the client and
the hadoop cluster. I may be missing a few here, but I believe these would
include port 50010 for the datanodes and whatever port the namenode is
listening on.
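
To make that concrete, here is a minimal sketch of the direct-access path,
with hostnames, ports, and paths as placeholders rather than values from this
thread: install a Hadoop client on the data server, point it at the remote
namenode in the site configuration, and then push the files with the same put
command.

On the data server, in conf/hadoop-site.xml (core-site.xml on newer releases):

  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value>
  </property>

Then copy the data straight into the cluster:

  hadoop dfs -put /contents-path /dfs-path
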
Jeff,
Thanks for the help. I want to clarify several details:
1. I know this way of importing files to HDFS, but it requires the user to
access the HDFS nodes directly.
Is there another way to export all the data files from the data server to the
remote HDFS nodes without invoking tar?
2. I've set up repl
Victor:
I think in your use case the best way to move the data into Hadoop would be
to tar it up, move it to the same network the HDFS machines are on, untar it,
and then run...
hadoop dfs -put /contents-path /dfs-path
If you only want a replication factor of 2 (the default is 3), open
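
A rough sketch of what that usually involves, assuming the standard site
configuration (the property name is the stock Hadoop one; paths are
placeholders): set dfs.replication before loading the data, e.g. in
conf/hadoop-site.xml (hdfs-site.xml on newer releases):

  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>

Files already written to HDFS can have their replication changed afterwards
with something like:

  hadoop dfs -setrep -w 2 /dfs-path
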
Hi,
I want to use HDFS as a distributed file system to store files. I have one
data server with 50 GB of data, and I plan to use 3 new machines with HDFS
installed to replicate this data.
These 3 machines are: 1 name node and 2 data nodes. The replication factor
for all files is 2.
My questions are:
1. How could I create 50