I plan to use Hadoop to do some log processing, and I'm working on a
method to load the files (probably nightly) into HDFS. My plan is to
have a web server on each machine with logs that serves up the log
directories. Then I would give distcp a list of HTTP URLs of the log
files and have it
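
For illustration, the nightly list of log URLs described above could be generated with a small script. This is only a sketch under assumptions: the host names, the `logs` directory, and the `access-<date>.log` naming scheme are all hypothetical, and whether distcp can actually read `http://` sources depends on the Hadoop version having a FileSystem registered for that scheme, which this thread does not confirm.

```python
from datetime import date


def build_url_list(hosts, day, log_dir="logs", pattern="access-{day}.log"):
    """Return one HTTP URL per host for the given day's log file.

    The directory name and file-name pattern are assumptions made for
    illustration; adjust them to the real layout on the web servers.
    """
    name = pattern.format(day=day.isoformat())
    return [f"http://{host}/{log_dir}/{name}" for host in hosts]


def write_url_list(urls, path):
    """Write the URLs to a text file, one per line."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(urls) + "\n")


if __name__ == "__main__":
    # Hypothetical host names, used only as an example.
    hosts = ["web01.example.com", "web02.example.com"]
    urls = build_url_list(hosts, date(2009, 1, 21))
    write_url_list(urls, "log_urls.txt")
```

The resulting file holds one source URL per line, which keeps the nightly job's input easy to inspect before any copy is attempted.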
(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:871)
----- Original Message -----
From: Derek Young dyo...@...
To: core-u...@...
Sent: Wednesday, January 21, 2009 1:23:56 PM
Subject: using distcp for http source files
I plan to use hadoop to do some log processing