Hello Masoud,
Did you look at Sqoop (http://sqoop.apache.org)? Maybe it can help you.
I think there is also a specific InputFormat designed for databases; I
don't remember the name. If you use it, you only need to write a mapper.
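If memory serves, it may be DBInputFormat (org.apache.hadoop.mapreduce.lib.db).
A rough sketch of a map-only job built on it, assuming a hypothetical
users(id, name) table and purely illustrative JDBC details:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class DbTableDump {
  // One record per table row; fields mirror the table columns.
  public static class UserRecord implements Writable, DBWritable {
    long id;
    String name;
    public void readFields(ResultSet rs) throws SQLException {
      id = rs.getLong("id");
      name = rs.getString("name");
    }
    public void write(PreparedStatement ps) throws SQLException {
      ps.setLong(1, id);
      ps.setString(2, name);
    }
    public void readFields(DataInput in) throws IOException {
      id = in.readLong();
      name = in.readUTF();
    }
    public void write(DataOutput out) throws IOException {
      out.writeLong(id);
      out.writeUTF(name);
    }
  }

  // The only piece you write yourself: a mapper over database rows.
  public static class DumpMapper
      extends Mapper<LongWritable, UserRecord, LongWritable, Text> {
    protected void map(LongWritable key, UserRecord row, Context ctx)
        throws IOException, InterruptedException {
      ctx.write(new LongWritable(row.id), new Text(row.name));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical JDBC connection details.
    DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
        "jdbc:mysql://dbhost/mydb", "dbuser", "dbpass");
    Job job = new Job(conf, "db-table-dump");
    job.setJarByClass(DbTableDump.class);
    job.setMapperClass(DumpMapper.class);
    job.setNumReduceTasks(0); // map-only dump of the table
    job.setInputFormatClass(DBInputFormat.class);
    DBInputFormat.setInput(job, UserRecord.class,
        "users", null /* conditions */, "id" /* orderBy */, "id", "name");
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path("/tmp/db-dump"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

As far as I know, Sqoop builds on the same mechanism internally, so for
anything non-trivial it is probably the easier route.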
Guillaume Polaert | Cyrès Conseil
Hi David,
Thanks for your reply.
Some more questions about installing MRV1 using tarball.
For the link '
https://ccp.cloudera.com/display/SUPPORT/CDH4+Downloadable+Tarballs',
should I install MRV1 using 'hadoop-0.20-mapreduce-0.20.2+1270' or
'hadoop-2.0.0+556' or both of them (
If you install CDH4
2013/2/20 Alex Luya alexander.l...@gmail.com
I can't find it with Google; has
Hi,
Does anyone know how to configure the JobHistory server in Hadoop 2.0.2
to have detailed statistics about each job in a single node cluster?
In my case, when I access the HTTP interface of the JobHistory server,
http://localhost:19888/jobhistory, the message "No data available in
table" appears
I'd recommend picking up branch-trunk-win (a vote to merge it into trunk
is going to happen soon, once the work is complete and the precommit build
is clean; then these changes will be in trunk!).
This removes the Cygwin dependency for running Hadoop on Windows. This work is
being
Hi Nagarjuna,
Is it a test cluster? Do you have another cluster running close-by?
Also, is it your first try?
It seems there is some previous data in the dfs directory which is not
in sync with the last installation.
Maybe you can remove the content of
Hi Jean Marc,
Yes, this is the cluster I am trying to create and then will scale up.
As per your suggestion I deleted the folder /Users/nagarjunak/Documents/
hadoop-install/hadoop-2.0.3-alpha/tmp_20 and formatted the cluster.
Now I get the following error.
2013-02-20 21:17:25,668 FATAL
But be careful.
hadoop fs -cat will retrieve the entire file, and will only finish once it
has streamed all the way to the last bytes you are looking for.
If your file is many GB in size, this command will take a long time to
complete and will put some pressure on your network.
JM
2013/2/19, jamal sasha
Hello All,
I’m sending this email because I think it may be interesting for Hadoop
users, as this project makes heavy use of the Hadoop platform.
We are strongly considering open-sourcing our DMP (Data Management
Platform) if it proves to be technically interesting to other developers /
Hi JM,
I am not sure how dangerous it is, since we're using a pipe here:
as you note yourself, it only runs until the last wanted bytes
have been read, and then terminates.
The -cat process will terminate because the
process we're piping to will terminate first after it reaches its goal
of
Hi Harsh,
My bad.
I read the example quickly and I don't know why I thought you used tail
and not head.
head will work perfectly. But tail will not, since it would need to read
the entire file. My comment was about tail, not head, and is therefore
not applicable to the example you gave.
hadoop
Hi Nagarjuna,
What is in your /etc/hosts file? I think the log line that says
DataNodeRegistration(0.0.0.0 [..] should show the hostname or IP of the
datanode (124.123.215.187, since you said it's a pseudo-distributed setup)
and not 0.0.0.0.
By the way are you using the dfs.hosts
No problem JM, I was confused as well.
AFAIK, there's no shell utility that can let you specify an offset #
of bytes to start off with (similar to skip in dd?), but that can be
done from the FS API.
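Something like the following works (a rough sketch against the Java
FileSystem API; the path and offset are illustrative):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CatFromOffset {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataInputStream in = fs.open(new Path("/user/example/big.log"));
    try {
      in.seek(1024L * 1024L);            // jump straight to the 1 MB mark
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) > 0) {   // stream the remainder to stdout
        System.out.write(buf, 0, n);
      }
    } finally {
      in.close();
    }
  }
}

The seek() is resolved client-side against the right block, so the skipped
prefix is never pulled over the network.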
On Thu, Feb 21, 2013 at 1:14 AM, Jean-Marc Spaggiari
jean-m...@spaggiari.org wrote:
Hi Harsh,
NN refers to the rack topology script/class when a new node joins
(i.e. it doesn't have the node's IP already in cache), when it starts
up, and (I think) also when you issue -refreshNodes.
The ideal way to add a node to the rack right now is to first update
the rack config at the NN, then boot
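For reference, the NN picks the script up from core-site.xml; a minimal
sketch, using the Hadoop 1.x property name (2.x renames it to
net.topology.script.file.name) and an illustrative script path:

<property>
  <name>topology.script.file.name</name>
  <value>/etc/hadoop/conf/topology.sh</value>
</property>

The script is handed one or more IPs/hostnames as arguments and must print
one rack path (e.g. /rack1) per argument; anything it can't map falls back
to /default-rack.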
Hadoop 1.0.4
Java JDK 6u37
CentOS 6.3
I am having a strange issue where the TTs are slow to rejoin the cluster
after a restart.
I issued a stop-all / start-all on the cluster. The DNs came up
immediately. All of the DNs reported in the NN UI as alive within 5/10
seconds of restart. Once the
Have you installed Hadoop on this node before? If so, did you clean out
all of your old data dirs?
On Wed, Feb 20, 2013 at 4:41 PM, nagarjuna kanamarlapudi
nagarjuna.kanamarlap...@gmail.com wrote:
/etc/hosts
127.0.0.1 nagarjuna
255.255.255.255 broadcasthost
::1
In mapred-site.xml:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>YOUR_HOST:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>YOUR_HOST:19888</value>
</property>
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
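(The history server daemon itself also needs to be running; with a Hadoop
2.x tarball it is typically started with
sbin/mr-jobhistory-daemon.sh start historyserver.)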
Jokingly, I want to say the problem is that you selected Ubuntu (or any
other Debian-based Linux) as your platform.
On a more serious note, if you are new to both Linux and Hadoop, you might
be much better off selecting CentOS for your Linux, as that is the base
development platform for most
Try the debs from the Apache Bigtop project 0.3 release; it's a bit of
an older 1.x release, but the debs should work well:
http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil ram.nath241...@gmail.com wrote:
Hello,
I am
The error is truncated; check the actual failed task's logs for the complete info:
Caused by: com.sap… what?
This seems more like a SAP-side fault than a Hadoop-side one, and you should
ask on their forums with the stack trace posted.
On Thu, Feb 21, 2013 at 11:58 AM, samir das mohapatra
Hi Sameer
The query
SELECT t.* FROM hgopalan.hana_training AS t WHERE 1=0
is first executed by Sqoop to fetch the metadata: WHERE 1=0 matches no
rows, but the result set still describes every column.
The actual data fetch then happens as individual queries from each task,
each of which is a sub-query of the whole input query.
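A quick way to see why that metadata query is cheap (a rough JDBC sketch;
driver, URL and credentials are illustrative, not SAP HANA specific):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class MetadataProbe {
  public static void main(String[] args) throws Exception {
    // Hypothetical connection details.
    Connection conn = DriverManager.getConnection(
        "jdbc:mysql://dbhost/mydb", "dbuser", "dbpass");
    Statement stmt = conn.createStatement();
    // WHERE 1=0 matches no rows, so this returns almost instantly...
    ResultSet rs = stmt.executeQuery(
        "SELECT t.* FROM hana_training AS t WHERE 1=0");
    // ...but the metadata still lists every column name and type.
    ResultSetMetaData md = rs.getMetaData();
    for (int i = 1; i <= md.getColumnCount(); i++) {
      System.out.println(md.getColumnName(i) + " : "
          + md.getColumnTypeName(i));
    }
    conn.close();
  }
}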
Regards
Bejoy KS
Sent from remote