Hi,
You should definitely take a look at Apache Sqoop, as previously mentioned.
If your file is large enough and you have several map jobs running and
hitting your database concurrently, you will experience issues at the
database level.
In terms of speculative jobs (redundant jobs) running to deal with
Hi,
I am going to try to respond to your message inline. I am not a Hadoop
expert, but we are facing the same kind of problem (dealing with files that are
external to HDFS) in our project, and we use Hadoop.
-Original Message-
From: Per Steffensen [
Hi
We are considering using MapReduce for a project. I am participating in
an "investigation" phase in which we are trying to determine whether we would
benefit from using the MapReduce framework.
A little bit about the project:
We will be receiving data from the "outside world" in files via FTP. It
will be