Re: Queue support from HDFS

2011-06-24 Thread Jakob Homan
Not directly, but you may wish to take a look at the Kafka project (http://sna-projects.com/kafka/), which we use as a queue and then bring the data periodically into HDFS via an MR job. See this presentation: http://www.slideshare.net/ydn/hug-january-2011-kafka-presentation -Jakob On Fri, Jun

Re: Where is Hadoop 20.3?

2010-08-01 Thread Jakob Homan
Pete- You're correct, there's not yet been a 20.3 release, although it has been a while and Owen's recent bugfix in HADOOP-6881 may make a new release worthwhile soon. -Jakob Pete Tyler wrote: Apologies for the newbie question but I think I'm a little lost. Hadoop 20.2 came out in Feb 2010 but t

Re: user authentication: protect hdfs/job web interface from public

2010-03-05 Thread Jakob Homan
Jiang- Hadoop has support for this via the hadoop.http.filter.initializers property, which allows you to set the name of a class to add as a standard servlet filter for the public-facing websites, such as: hadoop.http.filter.initializers com.widgetcorp.HadoopFilter Each public-f

Re: HTTP secure access

2009-10-20 Thread Jakob Homan
Bogdan- Currently there is no security provided for accessing web pages, although as detailed in Owen's recent presentation (http://www.cloudera.com/sites/all/themes/cloudera/static/hw09/2 - 2-00 Owen OMalley, Yahoo, securitycompatability.pdf -uhm what an odd url - try here: http://www.clo

Re: Storing contents of a file in a java object

2009-09-30 Thread Jakob Homan
Raakhi- Guilherme is correct. Each mapper (and reducer) runs independently and communication between them is not provided for nor encouraged. You may wish to look into the DistributedCache (http://wiki.apache.org/hadoop/FAQ#A8, http://hadoop.apache.org/common/docs/current/mapred_tutorial.

Re: Can we configure two or more datanode under pseudo-distributed mode?

2009-09-25 Thread Jakob Homan
Boris is quite right. My sample config file for dn1 is here: http://dl.getdropbox.com/u/565949/hadoop-site.xml, if you'd like. Notice that you can use the trailing digit of the ports to identify each node. I've never needed to run more than 5, and have five separate config files. Note that having sepa

Re: submitting multiple small jobs simultaneously

2009-08-19 Thread Jakob Homan
JobClient.runJob() method does. You can submit multiple jobs using the submitJob() method and wait for all of them to finish through their RunningJobs, as it seems you want to do. Is this what you're looking for? -Jakob Homan Hadoop at Yahoo! George Jahad wrote: I'm importing a bun
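
The submit-then-wait pattern described above can be sketched without any Hadoop classes. This is only an analogy, not Hadoop code: java.util.concurrent's ExecutorService.submit() stands in for the non-blocking submitJob(), and each returned Future stands in for a RunningJob handle. The class name SubmitAllThenWait and the method runAll are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in: ExecutorService plays the role of the job client,
// Future plays the role of a per-job handle. No Hadoop API is used here.
public class SubmitAllThenWait {

    // Submits every job first, then blocks on each handle in submission order.
    public static List<Integer> runAll(List<Callable<Integer>> jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, jobs.size()));
        try {
            // Phase 1: submit everything; submit() returns immediately.
            List<Future<Integer>> handles = new ArrayList<>();
            for (Callable<Integer> job : jobs) {
                handles.add(pool.submit(job));
            }
            // Phase 2: wait for each job to finish and collect its result.
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> handle : handles) {
                results.add(handle.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("a job failed or was interrupted", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the two phases is that all jobs are in flight before the first wait begins, which is the difference from calling a blocking run method once per job.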

Re: Some issues!

2009-08-18 Thread Jakob Homan
Sugandha- I would suggest you look at the FileSystem interface, which is our starting point for implementing a file system for use with Hadoop. There are several implementations, such as S3FileSystem, that you can look at for inspiration. Jakob Homan Hadoop at Yahoo! Sugandha Naolekar

Re: Ubuntu/Hadoop incompatibilities?

2009-08-17 Thread Jakob Homan
Ubuntu will work fine. The only to-do item is to make sure Sun's Java is installed and pointed to, rather than the OpenJDK that Ubuntu ships with by default. Jakob Homan Hadoop at Yahoo! Dmitry Pushkarev wrote: We have a cluster of over 40 machines all running ubuntu. As long as you ca

Re: Upgrade to 0.20.0

2009-08-17 Thread Jakob Homan
Some more detailed information or, hopefully, some logs would be helpful here. Have you verified that the namenode and datanodes are being started correctly? -Jakob Turner Kunkel wrote: Hello all, I had 0.18.0 working and recently downloaded 0.20.0 and copied over the information from .18 wh

Re: How to re-read the config files

2009-08-13 Thread Jakob Homan
onf.get("dfs.replication")); } } Note from the Javadoc: Values that are added via set methods will overlay values read from the resources. Hope this helps. Write back if you have more questions. Thanks, Jakob Homan Hadoop at Yahoo! Arvind Sharma wrote: Hi, I was wondering if ther
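
The overlay rule quoted from the Javadoc above (explicitly set values win over values read from resources) can be illustrated with a small stand-in. This sketch does not use Hadoop's Configuration class; it models the same precedence with java.util.Properties, where a Properties object built on top of defaults returns the default only when the key has not been set locally. The class name OverlayDemo and the method effectiveValue are hypothetical.

```java
import java.util.Properties;

// Hypothetical stand-in for the overlay behaviour, built on java.util.Properties
// rather than on any Hadoop class.
public class OverlayDemo {

    // Returns the value a lookup would see: setValue if one was set, else resourceValue.
    public static String effectiveValue(String key, String resourceValue, String setValue) {
        // Stand-in for values read from a configuration resource file.
        Properties fromResources = new Properties();
        fromResources.setProperty(key, resourceValue);

        // A Properties created with defaults falls back to them for unset keys.
        Properties conf = new Properties(fromResources);
        if (setValue != null) {
            conf.setProperty(key, setValue); // explicit set overlays the resource value
        }
        return conf.getProperty(key);
    }

    public static void main(String[] args) {
        System.out.println(effectiveValue("dfs.replication", "3", null)); // prints 3
        System.out.println(effectiveValue("dfs.replication", "3", "1"));  // prints 1
    }
}
```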

Re: Creating a job

2009-08-11 Thread Jakob Homan
.html#Job+Submission+and+Monitoring Let us know if anything is unclear after that. Thanks, Jakob Homan Yahoo! Mithila Nagendra wrote: Hello All How do I create a Job in Hadoop using Class Job? And how do I run it? Generally JobClient.runJob(conf) is used, but the parameter is not of the typ

Re: Question on file HDFS file system

2009-08-11 Thread Jakob Homan
/java/org/apache/hadoop/hdfs/server/namenode/INode.java (also INodeDirectory, INodeFile, etc.) For how the fsimage file is laid out, look at FSImage:LoadFSImage or the OfflineImageViewer classes. Hope this helps. -Jakob Homan HDFS/Yahoo! ashish pareek wrote: Hi Everybody, I

Re: Testing Mappers in Hadoop 0.20.0

2009-07-22 Thread Jakob Homan
It looks like there's quite a bit more documentation about MRUnit on the Cloudera site that's not included in the regular documentation. Looks like about twice as much. It would be great if this could be added to the content that's in mrunit/doc. Thanks, Jakob Aaron Kimball wrote: Hi David

Re: Recovery following disk full

2009-07-20 Thread Jakob Homan
The oiv handles the fsimage file but not the edits log, so it wouldn't help in this case. There has been talk about writing a similar tool for the edits log but nothing has been decided. Also, while the oiv will be included in 21, it works on images back to 18 (and maybe earlier). It's standalo