how to access data directly in a web application?

2008-05-23 Thread ma qiang
Hi all, I have developed a web application using Tomcat. In my app, a client submits a request, the server reads data from HBase, and returns the data as the response. But so far I can only use the shell and the Eclipse plugin to invoke programs on Hadoop. Who can tell me how to access data in HBase
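
For later readers, a minimal sketch of the direct route: the HBase Java client can be called from any JVM process, including a Tomcat servlet, as long as the HBase jars and configuration are on the webapp classpath; no shell or Eclipse plugin is involved. This is written against the client API of later HBase releases (the 0.1-era API differs), and the table, family, and qualifier names are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseLookup {
        // Fetches one cell for a web request; returns null if the row
        // or cell is missing.
        public static String fetch(String rowKey) throws IOException {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
            HTable table = new HTable(conf, "webdata");       // hypothetical table
            try {
                Get get = new Get(Bytes.toBytes(rowKey));
                Result result = table.get(get);
                byte[] cell = result.getValue(Bytes.toBytes("content"),
                                              Bytes.toBytes("body"));
                return cell == null ? null : Bytes.toString(cell);
            } finally {
                table.close();
            }
        }
    }

A servlet would call HBaseLookup.fetch(rowKey) in its doGet() and write the returned string into the response.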

Re: Can you run multiple simultaneous Hadoop jobs?

2008-05-23 Thread Brice Arnould
Kayla Jay wrote: I'm trying to figure out why I need to use HOD vs. trying to run multiple jobs at the same time on the same set of resources. Is it possible to run multiple Hadoop jobs at the same time on the same set of input data? I tried to run different jobs on the same set of data
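
A minimal sketch of the non-HOD route, against the 0.16-era mapred API: JobClient.submitJob() returns without blocking, so one driver can put several jobs reading the same input into the system at once (whether their tasks actually overlap depends on the JobTracker's scheduling). The paths and the identity job are placeholders:

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;
    import org.apache.hadoop.mapred.lib.IdentityMapper;
    import org.apache.hadoop.mapred.lib.IdentityReducer;

    public class ParallelJobs {
        // Builds a trivial identity job over a shared input directory.
        private static JobConf makeJob(String name, String outDir) {
            JobConf conf = new JobConf(ParallelJobs.class);
            conf.setJobName(name);
            conf.setMapperClass(IdentityMapper.class);
            conf.setReducerClass(IdentityReducer.class);
            conf.setInputPath(new Path("/shared/input"));  // hypothetical path
            conf.setOutputPath(new Path(outDir));
            return conf;
        }

        public static void main(String[] args)
                throws IOException, InterruptedException {
            // submitJob() returns a handle immediately, instead of
            // blocking until completion the way runJob() does.
            RunningJob j1 = JobClient.submitJob(makeJob("job-one", "/out/one"));
            RunningJob j2 = JobClient.submitJob(makeJob("job-two", "/out/two"));
            while (!j1.isComplete() || !j2.isComplete()) {
                Thread.sleep(5000);
            }
        }
    }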

Re: Questions on how to use DistributedCache

2008-05-23 Thread Taeho Kang
Thank you for your clarification! One more question: the API doc says DistributedCache is a facility provided by the Map-Reduce framework to cache files (text, archives, jars etc.) needed by applications. My question is: is it also possible to distribute some binary files (to be
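
For reference, a minimal sketch of the two halves of the mechanism, using the old org.apache.hadoop.filecache API (the HDFS path is hypothetical). Nothing in the API restricts the cached file to text; jars, which the doc lists, are already binary:

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    // At job-submission time: register a binary file already in HDFS.
    JobConf conf = new JobConf();
    DistributedCache.addCacheFile(new Path("/models/scoring.bin").toUri(), conf);

    // Inside a task, e.g. in Mapper.configure(JobConf job): the framework
    // has copied the file onto the task node's local disk, so it can be
    // read with ordinary java.io streams.
    Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
    Path binary = localFiles[0];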

upgrade from 0.14.4 to 0.16.4?

2008-05-23 Thread Frédéric Bertin
Hi all, we plan to upgrade our cluster from Hadoop 0.14.4 directly to 0.16.4 (skipping the 0.15 step). Is there anything that prevents upgrading from 0.14 to 0.16? Is the simplified procedure described on the wiki page applicable in that case? Regarding 0.16.4, I couldn't find any

Re: Serialization format for structured data

2008-05-23 Thread Stuart Sierra
On 5/22/08 1:54 PM, Stuart Sierra [EMAIL PROTECTED] wrote: I've tried using JSON to store structured data in TextOutputFormat, which works but is not very efficient. Any better suggestions? On Thu, May 22, 2008 at 5:21 PM, Ted Dunning [EMAIL PROTECTED] wrote: What is it that makes you not
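
For concreteness, a minimal sketch of the approach under discussion, against the old mapred API: each reducer output value is one hand-serialized JSON object, which TextOutputFormat writes out as a plain line. The record fields are hypothetical, and a real job would use a JSON library rather than string concatenation:

    import java.io.IOException;
    import java.util.Iterator;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    public class JsonReducer extends MapReduceBase
            implements Reducer<Text, LongWritable, Text, Text> {
        public void reduce(Text key, Iterator<LongWritable> values,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            long sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            // One JSON object per record; TextOutputFormat writes
            // "key <TAB> value" as a plain text line.
            String json = "{\"key\":\"" + key + "\",\"count\":" + sum + "}";
            output.collect(key, new Text(json));
        }
    }

The inefficiency Stuart mentions is visible here: every field name and every number is spelled out as text in each record, which a binary format would avoid.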

Re: Can you run multiple simultaneous Hadoop jobs?

2008-05-23 Thread Kayla Jay
Thank you guys for the feedback. I will try all the options you have both suggested and see how it goes. I will post if I run into any problems. Thanks again - Original Message From: Brice Arnould [EMAIL PROTECTED] To: core-user@hadoop.apache.org Sent: Friday, May 23, 2008 3:53:12 AM

Re: 0.16.4 DFS dropping blocks, then won't restart...

2008-05-23 Thread Raghu Angadi
Can you attach the initialization part of the NameNode log? Thanks, Raghu. C G wrote: We've recently upgraded from 0.15.0 to 0.16.4. Two nights ago we had a problem where DFS nodes could not communicate. After not finding anything obviously wrong, we decided to shut down DFS and restart. Following

Re: 0.16.4 DFS dropping blocks, then won't restart...

2008-05-23 Thread C G
2008-05-23 11:53:25,377 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = primary/10.2.13.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.16.4-dev
STARTUP_MSG:   build =

Re: Serialization format for structured data

2008-05-23 Thread Bryan Duxbury
On May 23, 2008, at 9:51 AM, Ted Dunning wrote: Relative to Thrift, JSON has the advantage of not requiring a schema as well as the disadvantage of not having a schema. The advantage is that the data is more fluid and I don't have to generate code to handle the records. The disadvantage
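
To make the schema-in-code tradeoff concrete, here is a minimal sketch of Hadoop's own native option, a hand-written Writable: the "schema" is simply the field order baked into write() and readFields(), which makes records compact like Thrift's but just as rigid. The record type and fields are hypothetical:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    public class PageView implements Writable {
        private String url;
        private long timestamp;

        // The field order here *is* the schema: both methods must
        // read and write the same fields in the same order.
        public void write(DataOutput out) throws IOException {
            out.writeUTF(url);
            out.writeLong(timestamp);
        }

        public void readFields(DataInput in) throws IOException {
            url = in.readUTF();
            timestamp = in.readLong();
        }
    }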

Remote Job Submission

2008-05-23 Thread Natarajan, Senthil
Hi, I was wondering whether it is possible to submit a MapReduce job to a remote Hadoop cluster, i.e., submitting the job from a machine that doesn't have Hadoop installed to a different machine where Hadoop is installed. Is it possible to do this? I guess at least data can be uploaded to HDFS

Re: Remote Job Submission

2008-05-23 Thread Ted Dunning
Both are possible. You may have to have access to the data and task nodes for some operations. If you can see all of the nodes in your cluster, you should be able to do everything. On 5/23/08 1:46 PM, Natarajan, Senthil [EMAIL PROTECTED] wrote: Hi, I was wondering is it possible to submit

Re: Remote Job Submission

2008-05-23 Thread Ted Dunning
To write the data, you have a few choices:
- put some kind of proxy in that can see the cluster and write to it using DAV or HTTP post. It would then do a normal HDFS write. There was a DAV client for HDFS at one time.
- make the cluster visible and install the hadoop.jar and configuration

Re: Remote Job Submission

2008-05-23 Thread Doug Cutting
Ted Dunning wrote: - in order to submit the job, I think you only need to see the job-tracker. Somebody should correct me if I am wrong. No, you also need to be able to write the job.xml, job.jar, and job.split into HDFS. Someday perhaps we'll pass these via RPC to the jobtracker and have
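
Putting Ted's second option and Doug's correction together, a minimal sketch of a client-side driver using 0.16-era configuration keys (host names and ports are hypothetical): the client machine carries the hadoop jar and configuration and must be able to reach both the NameNode, so the job files can be written into HDFS, and the JobTracker.

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    JobConf conf = new JobConf();
    // Point the client at the remote cluster.
    conf.set("fs.default.name", "namenode.example.com:9000");
    conf.set("mapred.job.tracker", "jobtracker.example.com:9001");
    // ... set mapper, reducer, input and output paths as usual ...
    // JobClient first copies job.xml, job.jar and job.split into HDFS
    // (Doug's point above), then hands the job to the JobTracker.
    JobClient.runJob(conf);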

Re: Remote Job Submission

2008-05-23 Thread Michael Bieniosek
You could set up an RPC server on a machine that does have Hadoop installed. Then your clients could submit RPC requests to this machine, and your RPC server would resubmit the job to Hadoop. -Michael On 5/23/08 2:10 PM, Natarajan, Senthil [EMAIL PROTECTED] wrote: The client machine doesn't

Re: JAVA_HOME Cygwin problem (solution doesn't work)

2008-05-23 Thread s29752-hadoopuser
The following works for me: set JAVA_HOME=/cygdrive/c/Progra~1/Java/jdk1.5.0_14 Nicholas - Original Message From: vatsan [EMAIL PROTECTED] To: core-user@hadoop.apache.org Sent: Friday, May 23, 2008 5:41:05 PM Subject: JAVA_HOME Cygwin problem (solution doesn't work) I have installed

Re: 0.16.4 DFS dropping blocks, then won't restart...

2008-05-23 Thread C G
Ugh, that solved the problem. Thanks Dhruba! Thanks, C G Dhruba Borthakur [EMAIL PROTECTED] wrote: If you look at the log message starting with STARTUP_MSG: build =... you will see that the namenode and the good datanode were built by CG whereas the bad datanodes were compiled by hadoopqa!