Re: Does hadoop support append option?

2011-10-18 Thread kartheek muthyala
I am just concerned about the use case of appends in Hadoop. I know that they have provided support for appends in hadoop. But how frequently are files getting appended? There is this version concept too that is maintained in the block report; according to my guess this version number is

Re: Does hadoop support append option?

2011-10-18 Thread Uma Maheswara Rao G 72686
- Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Tuesday, October 18, 2011 11:54 am Subject: Re: Does hadoop support append option? To: common-user@hadoop.apache.org I am just concerned about the use case of appends in Hadoop. I know that they have provided
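For readers searching the archive, a minimal sketch (not from this thread) of what the client-side append API looks like, assuming dfs.support.append is enabled on a release that supports it such as 0.20.205; the path and record below are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // hypothetical path; append() fails if the file does not already exist
    Path file = new Path("/data/events.log");
    FSDataOutputStream out = fs.append(file);
    out.write("one more record\n".getBytes("UTF-8"));
    out.close();
  }
}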

Regarding using pig version 0.7 with cygwin

2011-10-18 Thread Gayatri Rao
Hi All, I have been trying to use pig version 0.7 with a windows+cygwin setup. Somehow when I try to run a pig script, it gives me an error saying it's not able to copy the script to some temporary location. The error is as follows: Cannot copy myScript.pig to c:/cgywin/tmp/pig-1234956677

Re: Regarding using pig version 0.7 with cygwin

2011-10-18 Thread Daniel Dai
Hi, Gayatri, There has been no active work on making Pig work under Cygwin for a while. It's no surprise to me if a recent version of Pig does not run under Cygwin. Sorry about this, but we just don't have enough resources to work on it right now. You are welcome to submit a patch if interested. Regards,

Re: Does hadoop support append option?

2011-10-18 Thread kartheek muthyala
Thanks Uma for the clarification of the append functionality. My second question is about the version number concept used in the block map. Why does it maintain this version number? ~Kartheek On Tue, Oct 18, 2011 at 12:14 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: -

automatic node discovery

2011-10-18 Thread Petru Dimulescu
Hello, I wonder how you guys see the problem of automatic node discovery: having, for instance, a couple of hadoops, with no configuration explicitly set whatsoever, simply discover each other and work together, like Gridgain does: just fire up two instances of the product, on the same

could not complete file...

2011-10-18 Thread bourne1900
Hi, There are 20 threads which put files into HDFS ceaselessly; every file is 2 KB. When 1 million files have finished, the client begins to throw 'could not complete file' exceptions ceaselessly. At that time, the datanode has hung. I think maybe the heartbeat is lost, so the namenode does not know the state of

Re: could not complete file...

2011-10-18 Thread Uma Maheswara Rao G 72686
- Original Message - From: bourne1900 bourne1...@yahoo.cn Date: Tuesday, October 18, 2011 3:21 pm Subject: could not complete file... To: common-user common-user@hadoop.apache.org Hi, There are 20 threads which put files into HDFS ceaselessly; every file is 2 KB. When 1 million files

Re: Does hadoop support append option?

2011-10-18 Thread Uma Maheswara Rao G 72686
- Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Tuesday, October 18, 2011 1:31 pm Subject: Re: Does hadoop support append option? To: common-user@hadoop.apache.org Thanks Uma for the clarification of the append functionality. My second question is about the

Re: Re: could not complete file...

2011-10-18 Thread bourne1900
Thank you for your reply. There is a PIPE ERROR in the datanode log, and nothing else. The client only shows 'Could not complete file' ceaselessly. From namenodeIP:50070/dfshealth.jsp, I found the datanode has hung, and there is only one datanode in my cluster :) BTW, the number of retries is unlimited I think,

execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
Hi, what is the way to execute a hadoop job on a remote cluster? I want to execute my hadoop job from a remote web application, but I didn't find any hadoop client (remote API) to do it. Please advise. Oleg

Re: automatic node discovery

2011-10-18 Thread Steve Loughran
On 18/10/11 10:48, Petru Dimulescu wrote: Hello, I wonder how you guys see the problem of automatic node discovery: having, for instance, a couple of hadoops, with no configuration explicitly set whatsoever, simply discover each other and work together, like Gridgain does: just fire up two

Re: execute hadoop job from remote web application

2011-10-18 Thread Steve Loughran
On 18/10/11 11:40, Oleg Ruchovets wrote: Hi, what is the way to execute a hadoop job on a remote cluster? I want to execute my hadoop job from a remote web application, but I didn't find any hadoop client (remote API) to do it. Please advise. Oleg The Job class lets you build up and submit jobs
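A minimal sketch of the remote-submission side of this, assuming a 0.20-era cluster; it is a fragment using org.apache.hadoop.conf.Configuration and org.apache.hadoop.mapreduce.Job, and the host names and ports are hypothetical (the client must also be able to ship the job jar, see the setJarByClass discussion later in this thread):

Configuration conf = new Configuration();
// hypothetical addresses of the remote NameNode and JobTracker
conf.set("fs.default.name", "hdfs://namenode.example.com:9000");
conf.set("mapred.job.tracker", "jobtracker.example.com:9001");
Job job = new Job(conf, "my remote job");   // build the job against the remote cluster
// ... set mapper/reducer/paths, then job.submit() or job.waitForCompletion(true)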

Re: execute hadoop job from remote web application

2011-10-18 Thread Uma Maheswara Rao G 72686
- Original Message - From: Oleg Ruchovets oruchov...@gmail.com Date: Tuesday, October 18, 2011 4:11 pm Subject: execute hadoop job from remote web application To: common-user@hadoop.apache.org Hi, what is the way to execute a hadoop job on a remote cluster? I want to execute my hadoop

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
Excellent. Can you give a small example of code? On Tue, Oct 18, 2011 at 1:13 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: - Original Message - From: Oleg Ruchovets oruchov...@gmail.com Date: Tuesday, October 18, 2011 4:11 pm Subject: execute hadoop job from remote

RE: execute hadoop job from remote web application

2011-10-18 Thread Devaraj K
The job submission code can be written this way. // Create a new Job Job job = new Job(new Configuration()); job.setJarByClass(MyJob.class); // Specify various job-specific parameters job.setJobName("myjob"); FileInputFormat.addInputPath(job, new Path("in"));
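A hedged completion of that snippet (MyJob, MyMapper, MyReducer and the in/out paths are placeholders); note that with the org.apache.hadoop.mapreduce API the input and output paths are set through FileInputFormat/FileOutputFormat rather than on the Job itself:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyJob {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration());
    job.setJarByClass(MyJob.class);          // ship the jar containing the job classes
    job.setJobName("myjob");
    job.setMapperClass(MyMapper.class);      // placeholder mapper class
    job.setReducerClass(MyReducer.class);    // placeholder reducer class
    FileInputFormat.addInputPath(job, new Path("in"));     // placeholder paths
    FileOutputFormat.setOutputPath(job, new Path("out"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);      // submit and wait
  }
}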

Re: execute hadoop job from remote web application

2011-10-18 Thread Bejoy KS
Oleg If you are looking at how to submit your jobs using JobClient then the below sample can give you a start. //get the configuration parameters and assigns a job name JobConf conf = new JobConf(getConf(), MyClass.class); conf.setJobName("SMS Reports"); //setting key
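A hedged sketch of how such an old-API (org.apache.hadoop.mapred) driver typically continues; the key/value classes, mapper/reducer and paths are placeholders:

JobConf conf = new JobConf(getConf(), MyClass.class);
conf.setJobName("SMS Reports");
conf.setOutputKeyClass(Text.class);            // placeholder key class
conf.setOutputValueClass(IntWritable.class);   // placeholder value class
conf.setMapperClass(MyMapper.class);           // placeholder old-API mapper
conf.setReducerClass(MyReducer.class);         // placeholder old-API reducer
FileInputFormat.setInputPaths(conf, new Path("in"));   // placeholder paths
FileOutputFormat.setOutputPath(conf, new Path("out"));
JobClient.runJob(conf);                        // submits and waits for completion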

Re: execute hadoop job from remote web application

2011-10-18 Thread Uma Maheswara Rao G 72686
- Original Message - From: Bejoy KS bejoy.had...@gmail.com Date: Tuesday, October 18, 2011 5:25 pm Subject: Re: execute hadoop job from remote web application To: common-user@hadoop.apache.org Oleg If you are looking at how to submit your jobs using JobClient then the below

Input path does not exist error

2011-10-18 Thread Thamizh
Hi All, I am invoking a WordCount map/reduce job from a Java class. It ended with the below error, though I have the input file at /data/wordcount. What could be the reason for this error? Hadoop version: 0.20.1 2011-10-18 16:24:00,478 ERROR [thread1] workflow.WorkerPool$Worker(121): Error while

Re: Input path does not exist error

2011-10-18 Thread Harsh J
Hey Thamizh, For some reason, your FS is being detected as a local FS instead of HDFS. Hence, your program fails to find /data/wordcount on the local filesystem. It could be because your configuration directory is not on the classpath of the program that executes this driver code, or you haven't
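A small sketch of the two usual fixes Harsh describes, for reference; the file location and NameNode address are hypothetical, and fs.default.name is the 0.20-era property name:

Configuration conf = new Configuration();
// Option 1: load the cluster's site configuration explicitly
conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));    // hypothetical location
// Option 2: point the client at HDFS directly
conf.set("fs.default.name", "hdfs://namenode.example.com:9000"); // hypothetical address
FileSystem fs = FileSystem.get(conf);   // should now resolve to HDFS, not the local FS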

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
Thank you all for your answers, but I still have a question: Currently we run our jobs using shell scripts which are located on the hadoop master machine. Here is an example of a command line: /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar -inputPath /opt/inputs/ -outputPath

Re: Input path does not exist error

2011-10-18 Thread Thamizh
Thanks a lot. It helped me.   Regards, Thamizhannal P From: Harsh J ha...@cloudera.com To: common-user@hadoop.apache.org; Thamizh tceg...@yahoo.co.in Sent: Tuesday, 18 October 2011 6:16 PM Subject: Re: Input path does not exist error Hey Thamizh, For some

Re: execute hadoop job from remote web application

2011-10-18 Thread Bejoy KS
Hi Oleg I haven't tried out a scenario like you mentioned. But I think there shouldn't be any issue in submitting a job that has some dependent classes which hold the business logic referred from the mapper, reducer or combiner. You should be able to do the job submission remotely the same

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
I'll try to be more specific. It is not a dependent jar; it is a jar which contains map/reduce/combine classes and some business logic. Executing our job from the command line, the class which parses parameters and submits a job has a line of code: job.setJarByClass(HadoopJobExecutor.class); we execute

Re: Building and adding new Datanode

2011-10-18 Thread Ivan.Novick
Hi Harsh, That brings up an interesting question I wanted to ask. The connection between the data nodes and the name node is initiated by the data node and not by the name node, based on the config file on the data node machine. Correct? Cheers, Ivan On 10/17/11 9:28 PM, Harsh J

Re: Is there a good way to see how full hdfs is

2011-10-18 Thread Ivan.Novick
Cool, is there any documentation on how to use the JMX stuff to get monitoring data? Cheers, Ivan On 10/17/11 6:04 PM, Rajiv Chittajallu raj...@yahoo-inc.com wrote: If you are running 0.20.204 http://phanpy-nn1.hadoop.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
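A small sketch of reading that JMX servlet from Java (the NameNode host below is hypothetical); the servlet answers plain HTTP GET requests with the requested beans as JSON:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class JmxProbe {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://namenode.example.com:50070/jmx"
        + "?qry=Hadoop:service=NameNode,name=NameNodeInfo");
    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line);   // raw JSON: capacity, used space, live/dead nodes, ...
    }
    in.close();
  }
}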

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
So you mean that in case I am going to submit a job remotely and my_hadoop_job.jar is in the classpath of my web application, it will submit the job with my_hadoop_job.jar to the remote hadoop machine (cluster)? On Tue, Oct 18, 2011 at 6:13 PM, Harsh J ha...@cloudera.com wrote: Oleg, Steve already

Re: Building and adding new Datanode

2011-10-18 Thread Harsh J
Ivan, Yes, that is correct, it's the DataNode that initiates the heartbeating process. On Tue, Oct 18, 2011 at 9:48 PM, ivan.nov...@emc.com wrote: Hi Harsh, That brings up an interesting question I wanted to ask. The connection between the data nodes and the name node is initiated by the
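A minimal sketch of the datanode-side setting this relies on (host and port are hypothetical): the DataNode reads fs.default.name from its own core-site.xml and connects to that NameNode, which is why nothing per-datanode needs to be listed on the NameNode side beyond optional include/exclude files:

<!-- core-site.xml on the datanode; host and port are hypothetical -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>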

Re: execute hadoop job from remote web application

2011-10-18 Thread Harsh J
Oleg, It will pack up the jar that contains the class specified by setJarByClass into its submission jar and send it up. That's the function of that particular API method. So, your deduction is almost right there :) On Tue, Oct 18, 2011 at 10:20 PM, Oleg Ruchovets oruchov...@gmail.com wrote: So
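For reference, a short sketch of the two ways to name the job jar on the client; setJarByClass locates the jar containing the given class on the client's classpath and ships it with the submission, while the old JobConf API also lets you name the jar path directly (the path below is hypothetical):

job.setJarByClass(HadoopJobExecutor.class);   // find and ship the jar containing this class
// old-API alternative: name the jar explicitly on the JobConf
// jobConf.setJar("/opt/hadoop/hadoop-jobs/my_hadoop_job.jar");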

Re: Does hadoop support append option?

2011-10-18 Thread kartheek muthyala
Hey Uma, yes, the version number I was referring to is the generation timestamp info. I am sorry for mixing up the nomenclature by calling it the version number. I was actually referring to this http://www.cloudera.com/blog/2009/07/file-appends-in-hdfs/ where he mentioned this as version. But

[ANNOUNCEMENT] Hadoop 0.20.205.0 release

2011-10-18 Thread Matt Foley
On Friday 14 Oct, the Hadoop community voted ten to zero (including four PMC members voting in favor) to accept the release of Hadoop 0.20.205.0. The biggest feature of this release is that it merges the append/hsync/hflush features of branch-0.20-append, and security features of

Sudoku Example Program Inputs

2011-10-18 Thread Adam
Does anyone know the syntax for the sudoku example program input, and where I can find some datasets for it? Thanks, Adam

Re: Jira Assignment

2011-10-18 Thread Jon Allen
Arun, That doesn't seem to have solved anything, I still can't assign an issue to myself. Just to confirm, my username is jonallen. Thanks, Jon On Mon, Oct 17, 2011 at 11:08 PM, Arun C Murthy a...@hortonworks.com wrote: Done. On Oct 16, 2011, at 10:00 AM, Mahadev Konar wrote: Arun,

Re: Jira Assignment

2011-10-18 Thread Arun C Murthy
Try now? On Oct 18, 2011, at 2:04 PM, Jon Allen wrote: Arun, That doesn't seem to have solved anything, I still can't assign an issue to myself. Just to confirm, my username is jonallen. Thanks, Jon On Mon, Oct 17, 2011 at 11:08 PM, Arun C Murthy a...@hortonworks.com wrote:

Re: Jira Assignment

2011-10-18 Thread Jon Allen
No, still no luck. I'm trying to assign https://issues.apache.org/jira/browse/HADOOP-7713 if you want to see if you can assign it to me. I don't even have an option to change the assignee. On Tue, Oct 18, 2011 at 10:07 PM, Arun C Murthy a...@hortonworks.com wrote: Try now? On Oct 18, 2011,

Re: Jira Assignment

2011-10-18 Thread Tim Williams
Try again... Thanks, --tim On Tue, Oct 18, 2011 at 5:12 PM, Jon Allen jayaye...@gmail.com wrote: No, still no luck. I'm trying to assign https://issues.apache.org/jira/browse/HADOOP-7713 if you want to see if you can assign it to me.  I don't even have an option to change the assignee. On

Re: Jira Assignment

2011-10-18 Thread Jon Allen
That's done it, now sorted. Thanks, Jon On Tue, Oct 18, 2011 at 10:20 PM, Tim Williams william...@gmail.com wrote: Try again... Thanks, --tim On Tue, Oct 18, 2011 at 5:12 PM, Jon Allen jayaye...@gmail.com wrote: No, still no luck. I'm trying to assign

Re: Sudoku Example Program Inputs

2011-10-18 Thread Owen O'Malley
On Tue, Oct 18, 2011 at 1:23 PM, Adam jacobvi...@gmail.com wrote: Does anyone know the syntax for the sudoku example program input and if I can find some datasets for it? There is an example puzzle at:

Re: Sudoku Example Program Inputs

2011-10-18 Thread Owen O'Malley
On Tue, Oct 18, 2011 at 3:20 PM, Owen O'Malley o...@hortonworks.com wrote: On Tue, Oct 18, 2011 at 1:23 PM, Adam jacobvi...@gmail.com wrote: Does anyone know the syntax for the sudoku example program input and if I can find some datasets for it? There is an example puzzle at:
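From memory, a hedged sketch of running the bundled solver and of the input it expects; the examples jar name and the bundled puzzle file name (puzzle1.dta) vary by release, and the digits below are arbitrary for illustration. The input is a plain-text 9x9 grid, one row per line, whitespace-separated, with digits for known cells and '?' for unknowns:

bin/hadoop jar hadoop-examples-0.20.205.0.jar sudoku puzzle1.dta

8 5 ? 3 9 ? ? ? ?
? ? 2 ? ? ? ? ? ?
(nine such rows in total)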

Re: Is there a way to get the version of a remote hadoop instance

2011-10-18 Thread Tao.Zhang2
Shevek, Thanks for your response, getting the protocol version doesn't fit our requirements. I think it's pretty easy for hadoop to provide this kind of information, and this seems very common for those popular NoSQL databases. Anyway, thanks for your reply! Hopefully hadoop can provide

Re: Is there a way to get the version of a remote hadoop instance

2011-10-18 Thread Ted.Xu
Hi Tao, I'm afraid there is no such API. A related JIRA ticket is https://issues.apache.org/jira/browse/HADOOP-7719. Regards, Ted Xu On 10/18/11 8:41 AM, Zhang, Tao tao.zha...@emc.com wrote: Hi, Is there a way to get the version of a remote hadoop instance through java API?