RE: Database insertion by Hadoop

2013-02-20 Thread Guillaume Polaert
Hello Masoud, Did you look at Sqoop (http://sqoop.apache.org)? Maybe it can help you. I think there is also a specific FileInputFormat designed for databases; I don't remember the name. If you use it, you just need to write a mapper. Guillaume Polaert | Cyrès Conseil -Message
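[Editor's note: the database classes Guillaume is likely thinking of are DBInputFormat/DBOutputFormat in org.apache.hadoop.mapreduce.lib.db; for insertion, DBOutputFormat is the relevant one. A minimal sketch of the job setup, assuming a map-only job and a hypothetical MySQL table "records" with columns (id, value) -- driver, URL, table, and columns are all placeholders, and the mapper/input wiring is elided:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;

public class DbInsertJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // JDBC driver class and connection URL are placeholders.
    DBConfiguration.configureDB(conf,
        "com.mysql.jdbc.Driver", "jdbc:mysql://dbhost:3306/mydb");
    Job job = Job.getInstance(conf, "db-insert");
    job.setJarByClass(DbInsertJob.class);
    job.setOutputFormatClass(DBOutputFormat.class);
    job.setNumReduceTasks(0);  // map-only: the mapper emits rows directly
    // Rows go into hypothetical table "records" with columns (id, value);
    // the mapper's output key class must implement DBWritable.
    DBOutputFormat.setOutput(job, "records", "id", "value");
    // job.setMapperClass(...); job.setInputFormatClass(...); etc.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}]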

Re: CDH4 version questions

2013-02-20 Thread jing wang
Hi David, Thanks for your reply. Some more questions about installing MRv1 using tarballs. For the link 'https://ccp.cloudera.com/display/SUPPORT/CDH4+Downloadable+Tarballs', should I install MRv1 using 'hadoop-0.20-mapreduce-0.20.2+1270' or 'hadoop-2.0.0+556', or both of them ( If you install CDH4

Re: how to unsubscribe from this list?

2013-02-20 Thread Vitalii Tymchyshyn
From e-mail headers:
List-Help: mailto:hdfs-user-h...@hadoop.apache.org
List-Unsubscribe: mailto:hdfs-user-unsubscr...@hadoop.apache.org
List-Post: mailto:hdfs-u...@hadoop.apache.org
List-Id: hdfs-user.hadoop.apache.org
2013/2/20 Alex Luya alexander.l...@gmail.com I can't Google it out, has

Yarn JobHistory service

2013-02-20 Thread Damián Serrano
Hi, Does anyone know how to configure the JobHistory server in Hadoop 2.0.2 to get detailed statistics about each job on a single-node cluster? In my case, when I access the JobHistory web interface, http://localhost:19888/jobhistory, the message No data available in table appears

RE: Not able to start JobTracker in Cygwin environment

2013-02-20 Thread Brad Sarsfield
I'd recommend picking up branch-trunk-win (a call for a vote to merge into trunk will happen soon, once the work is complete and the precommit build is clean; then these changes will be in trunk!). This removes the Cygwin dependency for running Hadoop on Windows. This work is being

Re: Incompatible clusterIDs

2013-02-20 Thread Jean-Marc Spaggiari
Hi Nagarjuna, Is it a test cluster? Do you have another cluster running close by? Also, is it your first try? It seems there is some previous data in the dfs directory which is not in sync with the latest installation. Maybe you can remove the content of

Re: Incompatible clusterIDs

2013-02-20 Thread nagarjuna kanamarlapudi
Hi Jean-Marc, Yes, this is the cluster I am trying to create and will then scale up. As per your suggestion I deleted the folder /Users/nagarjunak/Documents/hadoop-install/hadoop-2.0.3-alpha/tmp_20 and formatted the cluster. Now I get the following error. 2013-02-20 21:17:25,668 FATAL

Re: copy chunk of hadoop output

2013-02-20 Thread Jean-Marc Spaggiari
But be careful: hadoop fs -cat will retrieve the entire file and only finish once it has retrieved the last bytes you are looking for. If your file is many GB big, the command will take a long time to complete and will put some pressure on your network. JM 2013/2/19, jamal sasha

If we Open Source our platform, would it be interesting to you?

2013-02-20 Thread Marcelo Elias Del Valle
Hello All, I'm sending this email because I think it may be interesting for Hadoop users, as this project makes heavy use of the Hadoop platform. We are strongly considering opening the source of our DMP (Data Management Platform), if it proves to be technically interesting to other developers /

Re: copy chunk of hadoop output

2013-02-20 Thread Harsh J
Hi JM, I am not sure how dangerous it is, since we're using a pipe here, and as you yourself note, it will only run until the last bytes have been received and then terminate. The -cat process will terminate because the process we're piping to will terminate first, after it reaches its goal of

Re: copy chunk of hadoop output

2013-02-20 Thread Jean-Marc Spaggiari
Hi Harsh, My bad. I read the example quickly and I don't know why I thought you used tail and not head. head will work perfectly. But tail will not, since it will need to read the entire file. My comment was for tail, not for head, and therefore not applicable to the example you gave. hadoop

RE: Incompatible clusterIDs

2013-02-20 Thread Vijay Thakorlal
Hi Nagarjuna, What's in your /etc/hosts file? I think in the log line that says DataNodeRegistration(0.0.0.0 [..], it should be the hostname or IP of the datanode (124.123.215.187, since you said it's a pseudo-distributed setup) and not 0.0.0.0. By the way, are you using the dfs.hosts

Re: copy chunk of hadoop output

2013-02-20 Thread Harsh J
No problem JM, I was confused as well. AFAIK, there's no shell utility that lets you specify an offset (a number of bytes to skip, similar to skip in dd?), but that can be done from the FS API. On Thu, Feb 21, 2013 at 1:14 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Harsh,
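[Editor's note: a minimal sketch of the FS API approach Harsh mentions. FSDataInputStream is seekable, so you can jump straight to a byte offset without streaming the preceding data; the path, offset, and chunk size below are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadChunk {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    long offset = 1024L * 1024L;          // skip the first 1 MB (like dd's skip)
    long remaining = 10L * 1024 * 1024;   // then copy a 10 MB chunk to stdout
    FSDataInputStream in = fs.open(new Path("/user/jamal/output/part-00000"));
    try {
      in.seek(offset);  // jump straight to the offset, no bytes transferred before it
      byte[] buf = new byte[4096];
      int n;
      while (remaining > 0
          && (n = in.read(buf, 0, (int) Math.min(buf.length, remaining))) > 0) {
        System.out.write(buf, 0, n);
        remaining -= n;
      }
    } finally {
      in.close();
    }
  }
}]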

Re: Topology script frequency

2013-02-20 Thread Harsh J
The NN refers to the rack topology script/class when a new node joins (i.e. when it doesn't already have the node's IP in its cache), when it starts up, and (I think) also when you issue -refreshNodes. The ideal way to add a node to a rack right now is to first update the rack config at the NN, then boot
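[Editor's note: for reference, a minimal sketch of the "class" variant Harsh mentions: a custom DNSToSwitchMapping the NN consults, configured via topology.node.switch.mapping.impl (net.topology.node.switch.mapping.impl on 2.x). The IP-prefix convention here is purely illustrative:

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.net.DNSToSwitchMapping;

public class StaticRackMapping implements DNSToSwitchMapping {
  // Map every node the NN asks about to a rack path. A real mapping
  // would consult a config file or service; this toy version keys off
  // a hypothetical addressing convention.
  public List<String> resolve(List<String> names) {
    List<String> racks = new ArrayList<String>(names.size());
    for (String name : names) {
      racks.add(name.startsWith("192.168.1.") ? "/rack1" : "/default-rack");
    }
    return racks;
  }

  // Newer Hadoop versions add cache-reload hooks to the interface;
  // these are harmless extra methods on older ones.
  public void reloadCachedMappings() { }
  public void reloadCachedMappings(List<String> names) { }
}]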

Tasktrackers slow to subscribe

2013-02-20 Thread Alex Current
Hadoop 1.0.4, Java JDK 6u37, CentOS 6.3. I am having a strange issue where the TTs are slow to rejoin the cluster after a restart. I issued a stop-all / start-all on the cluster. The DNs came up immediately; all of the DNs reported as alive in the NN UI within 5-10 seconds of the restart. Once the

Re: Incompatible clusterIDs

2013-02-20 Thread Alex Current
Have you installed Hadoop on this node before? If so, did you clean out all of your old data dirs? On Wed, Feb 20, 2013 at 4:41 PM, nagarjuna kanamarlapudi nagarjuna.kanamarlap...@gmail.com wrote: /etc/hosts
127.0.0.1 nagarjuna
255.255.255.255 broadcasthost
::1

Re: Yarn JobHistory service

2013-02-20 Thread Azuryy Yu
In mapred-site.xml:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>YOUR_HOST:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>YOUR_HOST:19888</value>
</property>
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
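[Editor's note: for the web UI on port 19888 to show data, the JobHistory server process itself must also be running; in Hadoop 2.x tarball installs it is typically started with sbin/mr-jobhistory-daemon.sh start historyserver (script name assumed from the stock 2.x layout).]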

Re: About Hadoop Deb file

2013-02-20 Thread Chris Embree
Jokingly, I want to say the problem is that you selected Ubuntu (or any other Debian-based Linux) as your platform. On a more serious note, if you are new to both Linux and Hadoop, you might be much better off selecting CentOS for your Linux, as that is the base development platform for most

Re: About Hadoop Deb file

2013-02-20 Thread Harsh J
Try the debs from the Apache Bigtop project 0.3 release; it's a bit of an older 1.x release but the debs would work well: http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/ On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil ram.nath241...@gmail.com wrote: Hello, I am

Re: ISSUE: Hadoop with HANA using Sqoop

2013-02-20 Thread Harsh J
The error is truncated; check the actual failed task's logs for the complete info: Caused by: com.sap… what? This seems more like an SAP-side fault than a Hadoop-side one, and you should ask on their forums with the stack trace posted. On Thu, Feb 21, 2013 at 11:58 AM, samir das mohapatra

Re: ISSUE: Hadoop with HANA using Sqoop

2013-02-20 Thread bejoy . hadoop
Hi Samir, The query SELECT t.* FROM hgopalan.hana_training AS t WHERE 1=0 is first executed by Sqoop to fetch the metadata. The actual data fetch happens as part of individual queries from each task, each of which is a sub-query of the whole input query. Regards Bejoy KS Sent from remote
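[Editor's note: a minimal sketch of the WHERE 1=0 metadata trick Bejoy describes, in plain JDBC: the predicate matches no rows, so only the ResultSet metadata (column names and types) comes back. The driver class, URL, and credentials below are placeholders for a HANA instance:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class MetadataProbe {
  public static void main(String[] args) throws Exception {
    // Driver class and connection details are hypothetical placeholders.
    Class.forName("com.sap.db.jdbc.Driver");
    Connection conn = DriverManager.getConnection(
        "jdbc:sap://hanahost:30015", "user", "password");
    Statement stmt = conn.createStatement();
    // WHERE 1=0 returns zero rows, so this query costs almost nothing
    // yet still describes the table's columns.
    ResultSet rs = stmt.executeQuery(
        "SELECT t.* FROM hgopalan.hana_training AS t WHERE 1=0");
    ResultSetMetaData md = rs.getMetaData();
    for (int i = 1; i <= md.getColumnCount(); i++) {
      System.out.println(md.getColumnName(i) + " : " + md.getColumnTypeName(i));
    }
    conn.close();
  }
}]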