Re: Not able to copy a file to HDFS after installing

2009-02-05 Thread Rajshekar
The name node is localhost with an IP address. Now I checked: when I give bin/hadoop namenode I am getting an error: r...@excel-desktop:/usr/local/hadoop/hadoop-0.17.2.1# bin/hadoop namenode 09/02/05 13:27:43 INFO dfs.NameNode: STARTUP_MSG: /

Re: copying binary files to a SequenceFile

2009-02-05 Thread Rasit OZDAS
Mark, http://stuartsierra.com/2008/04/24/a-million-little-files/comment-page-1 In this link there is a tool to create sequence files from tar.gz and tar.bz2 files. I don't think this is a real solution, but at least it means more free memory and delays the problem (a last-resort workaround). Rasit

Re: HADOOP-2536 supports Oracle too?

2009-02-05 Thread Enis Soztutar
From the exception: java.io.IOException: ORA-00933: SQL command not properly ended I would broadly guess that the Oracle JDBC driver is complaining about how the statement is terminated (statements sent through Oracle JDBC must not end with a ';'), or something similar. You can: 1. download the latest source code of Hadoop, 2. add a print statement
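For illustration, a minimal sketch of configuring HADOOP-2536's DBInputFormat against Oracle with correctly terminated queries; the class, table, and connection details (OracleInputExample, MyRecord, employees, dbhost) are made up:

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.db.DBConfiguration;
    import org.apache.hadoop.mapred.lib.db.DBInputFormat;

    public class OracleInputExample {
      public static void main(String[] args) {
        JobConf conf = new JobConf(OracleInputExample.class);
        DBConfiguration.configureDB(conf, "oracle.jdbc.driver.OracleDriver",
            "jdbc:oracle:thin:@dbhost:1521:orcl", "user", "passwd");
        // Note: neither statement ends with ';' -- a trailing semicolon
        // sent through Oracle JDBC is a classic cause of ORA-00933.
        // MyRecord stands in for a DBWritable implementation (elided).
        DBInputFormat.setInput(conf, MyRecord.class,
            "SELECT id, name FROM employees",
            "SELECT COUNT(id) FROM employees");
      }
    }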

Re: Bad connection to FS.

2009-02-05 Thread Rasit OZDAS
I can add a little method for tracking down namenode failures. I find such problems by running start-all.sh first, then stop-all.sh. If the namenode started without error, stop-all.sh gives the output stopping namenode.. , but in case of an error it says no namenode to stop.. In case of an error,
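Concretely, the check looks like this (a shell session paraphrased from the description above; paths assume the stock scripts):

    # healthy case
    $ bin/start-all.sh
    $ bin/stop-all.sh          # prints: stopping namenode
    # failure case
    $ bin/start-all.sh         # namenode dies during startup
    $ bin/stop-all.sh          # prints: no namenode to stop -- check the logs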

Problem with Counters

2009-02-05 Thread some speed
Hi, Can someone help me with the usage of counters please? I am incrementing a counter in the Reduce method, but I am unable to collect the counter value after the job is completed. It's something like this: public static class Reduce extends MapReduceBase implements Reducer<Text, FloatWritable, Text,
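For context, a hedged sketch of the increment side in the old (pre-0.20) API. The enum placement shown here (in the enclosing job class, not inside Reduce) is where the thread eventually ends up, and the summing body is purely illustrative:

    import java.io.IOException;
    import java.util.Iterator;
    import org.apache.hadoop.io.FloatWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    public class MyJob {
      static enum MyCounter { ct_key1 }  // defined at the top level

      public static class Reduce extends MapReduceBase
          implements Reducer<Text, FloatWritable, Text, FloatWritable> {
        public void reduce(Text key, Iterator<FloatWritable> values,
            OutputCollector<Text, FloatWritable> output, Reporter reporter)
            throws IOException {
          reporter.incrCounter(MyCounter.ct_key1, 1); // bump the custom counter
          float sum = 0;                              // illustrative reduce body
          while (values.hasNext()) sum += values.next().get();
          output.collect(key, new FloatWritable(sum));
        }
      }
    }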

Re: Problem with Counters

2009-02-05 Thread Tom White
Hi Sharath, The code you posted looks right to me. Counters#getCounter() will return the counter's value. What error are you getting? Tom On Thu, Feb 5, 2009 at 10:09 AM, some speed speed.s...@gmail.com wrote: Hi, Can someone help me with the usage of counters please? I am incrementing a

Re: Problem with Counters

2009-02-05 Thread some speed
I tried the following... It compiles, but the value of result always seems to be 0. RunningJob running = JobClient.runJob(conf); Counters ct = new Counters(); ct = running.getCounters(); long result =
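Under the same assumptions, the truncated retrieval line presumably continues along these lines (note the intermediate new Counters() object isn't needed):

    RunningJob running = JobClient.runJob(conf);
    Counters ct = running.getCounters();             // counters of the finished job
    long result = ct.getCounter(MyCounter.ct_key1);  // returns 0 if the counter is unknown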

Re: Problem with Counters

2009-02-05 Thread some speed
Hi Tom, I get the error: cannot find symbol: MyCounter.ct_key1 On Thu, Feb 5, 2009 at 5:51 AM, Tom White t...@cloudera.com wrote: Hi Sharath, The code you posted looks right to me. Counters#getCounter() will return the counter's value. What error are you getting? Tom On Thu,

Re: Problem with Counters

2009-02-05 Thread Rasit OZDAS
Forgot to say: a value of 0 means that the requested counter does not exist. 2009/2/5 Rasit OZDAS rasitoz...@gmail.com: Sharath, I think the static enum definition should be outside the Reduce class. Hadoop probably tries to find it as MyCounter, but it's actually Reduce.MyCounter in your

Re: Problem with Counters

2009-02-05 Thread some speed
Thanks Rasit. I did as you said. 1) Put the static enum MyCounter{ct_key1} just above main() 2) Changed result = ct.findCounter(org.apache.hadoop.mapred.Task$Counter, 1, Reduce.MyCounter).getCounter(); Still it doesn't seem to help. It throws a NullPointerException. It's not able to find the

Re: Problem with Counters

2009-02-05 Thread some speed
In fact I put the enum in my Reduce method because the following link (from Yahoo) says to: http://public.yahoo.com/gogate/hadoop-tutorial/html/module5.html#metrics --- look at the section under Reporting Custom Metrics. 2009/2/5 some speed speed.s...@gmail.com Thanks Rasit. I did as you said.

Re: How to use DBInputFormat?

2009-02-05 Thread Stefan Podkowinski
The 0.19 DBInputFormat class implementation is IMHO only suitable for very simple queries working on only a few datasets. That's because it tries to create splits from the query by 1) getting a count of all rows using the specified count query (a huge performance impact on large tables) 2)
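A rough paraphrase of that split logic (shape only, not the actual 0.19 source; the method and names here are invented for illustration):

    import java.util.ArrayList;
    import java.util.List;

    // Paraphrase of how DBInputFormat derives its splits.
    List<long[]> computeSplitRanges(long rowCount, int numMapTasks) {
      List<long[]> ranges = new ArrayList<long[]>();
      long chunk = rowCount / numMapTasks;   // rowCount comes from the count query
      for (int i = 0; i < numMapTasks; i++) {
        long start = i * chunk;
        long end = (i == numMapTasks - 1) ? rowCount : start + chunk;
        // Each split later re-runs the user query with LIMIT/OFFSET over
        // [start, end), so the database pages through the result set
        // once per split.
        ranges.add(new long[] { start, end });
      }
      return ranges;
    }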

Re: Problem with Counters

2009-02-05 Thread Rasit OZDAS
Sharath, I think the static enum definition should be outside the Reduce class. Hadoop probably tries to find it as MyCounter, but it's actually Reduce.MyCounter in your example. Hope this helps, Rasit 2009/2/5 some speed speed.s...@gmail.com: I tried the following... It compiles

Re: Problem with Counters

2009-02-05 Thread Rasit OZDAS
Sharath, You're using reporter.incrCounter(enumVal, intVal); to increment the counter; I think the method to get it should be similar. Try findCounter(enumVal).getCounter() or getCounter(enumVal). Hope this helps, Rasit 2009/2/5 some speed speed.s...@gmail.com: In fact I put the enum in

Re: How to use DBInputFormat?

2009-02-05 Thread Fredrik Hedberg
Indeed sir. The implementation was designed like you describe for two reasons: first and foremost, to make it as simple as possible for the user to use a JDBC database as input and output for Hadoop; secondly, because of the specific requirements the MapReduce framework brings to the table

Re: Not able to copy a file to HDFS after installing

2009-02-05 Thread Rasit OZDAS
Rajshekar, It seems that your namenode isn't able to load the FsImage file. Here is a thread about a similar issue: http://www.nabble.com/Hadoop-0.17.1-%3D%3E-EOFException-reading-FSEdits-file,-what-causes-this---how-to-prevent--td21440922.html Rasit 2009/2/5 Rajshekar rajasheka...@excelindia.com:

Re: Problem with Counters

2009-02-05 Thread Tom White
Try moving the enum inside the top-level class (as you already did) and then use getCounter(), passing the enum value: public class MyJob { static enum MyCounter{ct_key1}; // Mapper and Reducer defined here public static void main(String[] args) throws IOException { // ...
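Filled out, Tom's suggestion looks roughly like this (a sketch -- job setup and the mapper/reducer bodies are elided, as in his snippet):

    public class MyJob {
      static enum MyCounter { ct_key1 };

      // Mapper and Reducer defined here; the reducer calls
      // reporter.incrCounter(MyCounter.ct_key1, 1);

      public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(MyJob.class);
        // ... set input/output paths, mapper and reducer classes ...
        RunningJob running = JobClient.runJob(conf);
        Counters ct = running.getCounters();
        long result = ct.getCounter(MyCounter.ct_key1); // look up by enum value
        System.out.println("ct_key1 = " + result);
      }
    }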

Re: Regarding Hadoop multi cluster set-up

2009-02-05 Thread Rasit OZDAS
Ian, here is a list under Setting up Hadoop on a single node, Basic Configuration, Jobtracker and Namenode settings. Maybe it's what you're looking for. Cheers, Rasit 2009/2/4 Ian Soboroff ian.sobor...@nist.gov: I would love to see someplace a complete list of the ports that the various Hadoop
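For reference, the two settings named above pin down the namenode and jobtracker ports in hadoop-site.xml; the hostnames and port numbers below are the conventional examples, not requirements:

    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode-host:9000</value>   <!-- namenode RPC port -->
    </property>
    <property>
      <name>mapred.job.tracker</name>
      <value>jobtracker-host:9001</value>        <!-- jobtracker RPC port -->
    </property>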

Re: How to use DBInputFormat?

2009-02-05 Thread Stefan Podkowinski
As far as I understand, the main problem is that you need to create splits from streaming data with an unknown number of records and offsets. It's just the same problem as with externally compressed data (.gz): you need to go through the complete stream (or do a table scan) to create logical splits.

Re: Task tracker archive contains too many files

2009-02-05 Thread Andrew
On Wednesday 04 February 2009 14:25:44 Amareshwari Sriramadasu wrote: Now, there is no way to stop the DistributedCache from unpacking jars. I think it should have an option (through configuration) whether to unpack or not. Can you raise a jira for the same? OK!

Re: How to use DBInputFormat?

2009-02-05 Thread Enis Soztutar
Please see below. Stefan Podkowinski wrote: As far as I understand, the main problem is that you need to create splits from streaming data with an unknown number of records and offsets. It's just the same problem as with externally compressed data (.gz). You need to go through the complete stream

Connect to namenode

2009-02-05 Thread Habermaas, William
After creation and startup of the hadoop namenode, you can only connect to the namenode via hostname and not IP. E.g., the hostname for the box is sunsystem07 and the IP is 10.120.16.99. If you use the following URL, hdfs://10.120.16.99, to connect to the namenode, the following message will be printed:
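A hedged illustration of the mismatch: with fs.default.name set as below, presumably the client compares the authority of the supplied URL textually against the configured value, so the IP form is rejected even though it resolves to the same box (port and path are illustrative):

    <property>
      <name>fs.default.name</name>
      <value>hdfs://sunsystem07:9000</value>
    </property>

    hdfs://sunsystem07:9000/some/path    -- accepted
    hdfs://10.120.16.99:9000/some/path   -- rejected: authority does not match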

Re: Re: Re: Regarding Hadoop multi cluster set-up

2009-02-05 Thread shefali pawar
Hi, I do not think that the firewall is blocking the port, because it has been turned off on both computers! And since it is a random port number, I do not think it should create a problem. I do not understand what is going wrong! Shefali On Wed, 04 Feb 2009 23:23:04 +0530 wrote I'm

FileInputFormat, FileSplit, and LineRecordReader: where are they run?

2009-02-05 Thread Saptarshi Guha
Hello All, In order to get a better understanding of Hadoop, I've started reading the source and have a question. The FileInputFormat reads in files, splits them into split sizes (which may be bigger than the block size), and creates FileSplits. The FileSplits contain the start, the length *and* the locations of

Reporter for Hadoop Streaming?

2009-02-05 Thread S D
Is there a way to use the Reporter interface (or something similar, such as Counters) with Hadoop streaming? Alternatively, how could STDOUT be intercepted for the purpose of updates? If anyone could point me to documentation or examples that cover this, I'd appreciate it. Thanks, John

[event] Cloud Computing: Using the Open Source Hadoop to Generate Data-Intensive Insights

2009-02-05 Thread Bonesata
Registration: http://www.meetup.com/CIO-IT-Executives/calendar/9528874/ Gaining and keeping a competitive edge in Internet offerings has increasingly become a matter of continuously processing enormous volumes of data about users, user activities, Web sites, ads, and Web searches. There is gold

Re: Connect to namenode

2009-02-05 Thread Raghu Angadi
I don't think it is intentional. Please file a jira with all the details about how to reproduce (with actual configuration files). thanks, Raghu. Habermaas, William wrote: After creation and startup of the hadoop namenode, you can only connect to the namenode via hostname and not IP.

Re: Reporter for Hadoop Streaming?

2009-02-05 Thread Arun C Murthy
On Feb 5, 2009, at 1:40 PM, S D wrote: Is there a way to use the Reporter interface (or something similar, such as Counters) with Hadoop streaming? Alternatively, how could STDOUT be intercepted for the purpose of updates? If anyone could point me to documentation or examples that cover
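The documented streaming mechanism behind this: a streaming task updates counters and status by writing specially formatted lines to stderr. A minimal sketch of such a mapper (shown in Java for concreteness, though any executable works the same way; the group and counter names are made up):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class StreamMapper {
      public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
          System.out.println(line + "\t1");  // normal key/value output on stdout
          // Counter update: streaming parses this exact stderr format.
          System.err.println("reporter:counter:MyGroup,LinesSeen,1");
        }
        System.err.println("reporter:status:mapper finished"); // status update
      }
    }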

Re: Reporter for Hadoop Streaming?

2009-02-05 Thread S D
This does it. Thanks! On Thu, Feb 5, 2009 at 9:14 PM, Arun C Murthy a...@yahoo-inc.com wrote: On Feb 5, 2009, at 1:40 PM, S D wrote: Is there a way to use the Reporter interface (or something similar, such as Counters) with Hadoop streaming? Alternatively, how could STDOUT be

slow writes to HDFS

2009-02-05 Thread Mark Kerzner
Hi all, I am writing to HDFS with this simple code: File[] files = new File(fileDir).listFiles(); for (File file : files) { key.set(file.getPath()); byte[] bytes = new FileUtil().readCompleteFile(file);
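Presumably the loop continues by appending each file as one record to a SequenceFile.Writer; a sketch of that continuation under that assumption (readCompleteFile is Mark's own helper, and the output path is made up; imports and the enclosing method are elided as in the snippet above):

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path("/user/mark/files.seq"),   // path is illustrative
        Text.class, BytesWritable.class);
    Text key = new Text();
    BytesWritable value = new BytesWritable();
    for (File file : new File(fileDir).listFiles()) {
      key.set(file.getPath());
      byte[] bytes = new FileUtil().readCompleteFile(file); // Mark's helper
      value.set(bytes, 0, bytes.length);  // copy only the valid bytes
      writer.append(key, value);          // one record per input file
    }
    writer.close();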

Re: FileInputFormat, FileSplit, and LineRecordReader: where are they run?

2009-02-05 Thread Jothi Padmanabhan
The RecordReader code gets executed on the node on which the maps are run. The framework tries to run maps on nodes that contain the split; however, there is no guarantee that maps will only run on nodes that contain the split. If a split spans multiple blocks, an attempt will be made to choose a

can't read the SequenceFile correctly

2009-02-05 Thread Mark Kerzner
Hi, I have written binary files to a SequenceFile, seemingly successfully, but when I read them back with the code below, after the first few reads I get the same number of bytes for different files. What could be going wrong? Thank you, Mark reader = new SequenceFile.Reader(fs, path,
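If the value type is BytesWritable (typical for binary payloads), one common trap matches this symptom exactly: getBytes() returns the whole reused backing buffer, which only ever grows, so its length reflects the largest record seen so far rather than the current one. A sketch of a read loop that copies out only the valid prefix (the Reader line completes Mark's truncated one; fs, path, and conf are as in his snippet):

    Text key = new Text();
    BytesWritable value = new BytesWritable();
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
    while (reader.next(key, value)) {
      int n = value.getLength();  // valid byte count (getSize() on older releases)
      byte[] bytes = new byte[n];
      // Do NOT use value.getBytes().length -- that is the buffer capacity.
      System.arraycopy(value.getBytes(), 0, bytes, 0, n);
      // ... process bytes for this file ...
    }
    reader.close();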

Re: Not able to copy a file to HDFS after installing

2009-02-05 Thread Rajshekar
Hi, Thanks Rasit. Since yesterday evening I have been able to start the namenode; I made a few changes in hadoop-site.xml and it is working now. But the new problem is that I am not able to run map/reduce jobs using .jar files. It gives the following error: had...@excel-desktop:/usr/local/hadoop$ bin/hadoop jar

Re: Not able to copy a file to HDFS after installing

2009-02-05 Thread Rasit OZDAS
Rajshekar, I also have threads for this ;) http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200803.mbox/%3cpine.lnx.4.64.0803132200480.5...@localhost.localdomain%3e http://www.mail-archive.com/hadoop-...@lucene.apache.org/msg03226.html Please try the following: - Give the local file path