On 29/09/2011 18:02, Joey Echeverria wrote:
Do you close your FileSystem instances at all? IIRC, the FileSystem
instance you use is a singleton and if you close it once, it's closed
for everybody. My guess is you close it in your cleanup method and you
have JVM reuse turned on.
I've hit this
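Joey's point about the shared instance can be sketched like this. This is a minimal Python mock of the caching behavior, not Hadoop's actual Java FileSystem API; the class name, URI, and methods are illustrative only:

```python
# Minimal sketch (NOT Hadoop's real API): illustrates why closing a
# cached, shared filesystem instance breaks every other user in the JVM
# when JVM reuse is on.

class FileSystem:
    _cache = {}  # one shared instance per URI, like Hadoop's internal cache

    def __init__(self, uri):
        self.uri = uri
        self.closed = False

    @classmethod
    def get(cls, uri):
        # Returns the cached instance for this URI, creating it on first use.
        if uri not in cls._cache:
            cls._cache[uri] = cls(uri)
        return cls._cache[uri]

    def close(self):
        self.closed = True

    def read(self, path):
        if self.closed:
            # What the next task in a reused JVM would see.
            raise IOError("Filesystem closed")
        return "<data from %s:%s>" % (self.uri, path)

# One task's cleanup closes the instance ...
fs_a = FileSystem.get("hdfs://namenode:9000")
fs_a.close()

# ... and the next task in the reused JVM gets the same closed object.
fs_b = FileSystem.get("hdfs://namenode:9000")
print(fs_a is fs_b)  # True: same cached instance, already closed
```

The usual advice in real Hadoop code is to let the framework manage the shared instance rather than closing it in per-task cleanup.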
Thanks so much Harsh!
On Thu, Sep 29, 2011 at 12:42 AM, Harsh J ha...@cloudera.com wrote:
Hello Bikash,
The tasks run on the tasktracker, so that is where you'll need to look
for the process ID -- not the JobTracker/client.
Crudely speaking,
$ ssh tasktracker01 # or whichever.
$ jps |
Thanks Varad.
On Wed, Sep 28, 2011 at 9:35 PM, Varad Meru meru.va...@gmail.com wrote:
The process IDs of each individual task can be seen using the jps and jconsole
commands provided by Java.
The jconsole command, run from the command line, provides a GUI screen for
monitoring running tasks within
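For scripting the same jps lookup, something like the following sketch works. It assumes a JDK's jps tool may be on the PATH, and that task JVMs report a main class such as "Child" (true on 0.20-era tasktrackers; this varies by Hadoop version):

```python
# Hedged sketch: list Java process IDs the way `jps` would, falling back
# to an empty list when the JDK's jps tool is not installed.
import shutil
import subprocess

def list_java_pids():
    """Return [(pid, main_class)] as reported by `jps`, or [] without a JDK."""
    if shutil.which("jps") is None:
        return []
    out = subprocess.run(["jps"], capture_output=True, text=True).stdout
    pids = []
    for line in out.splitlines():
        parts = line.split(None, 1)
        if parts:
            pids.append((int(parts[0]), parts[1] if len(parts) > 1 else ""))
    return pids

# A task's JVM shows up with a main class like "Child" on 0.20-era
# tasktrackers; filter for it to find the task's process ID.
tasks = [pid for pid, name in list_java_pids() if name == "Child"]
print(tasks)
```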
Hi,
Does anyone know if Linux containers (a kernel-supported virtualization
technique for providing resource isolation across processes/applications) have
ever been used with Hadoop to provide resource isolation for map/reduce tasks?
If yes, what could be the up/down sides of such
Thanks Harsh.
I did look at the userlogs dir. Although it creates subdirs for each
job/attempt, there are no files in those directories, just the ACL XML file.
I had also looked at the tasktracker log, and all it has is this -
2011-09-30 15:50:05,344 INFO org.apache.hadoop.mapred.TaskTracker:
Hi all,
I have been working with Hadoop core, Hadoop HDFS and Hadoop MapReduce for the
past 8 months.
Now I want to learn other projects under Apache Hadoop such as Pig, Hive, HBase
...
Can you suggest a structured learning path for the Hadoop ecosystem?
I am
On Fri, Sep 30, 2011 at 9:03 AM, bikash sharma sharmabiks...@gmail.comwrote:
Hi,
Does anyone know if Linux containers (a kernel-supported virtualization
technique for providing resource isolation across processes/applications) have
ever been used with Hadoop to provide resource
I am using Nagios to monitor a Hadoop cluster.
I would like to hear input from you guys.
Questions:
1. Would there be any difference between monitoring
TCP port 9000 versus running curl against port 50070 and grepping for namenode?
2. For the JobTracker I will monitor TCP port 9001; any drawbacks?
3. Secondarynamenode
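A rough sketch of the two checks being compared: a bare TCP connect only proves something is listening on the port, while fetching the web UI and grepping proves the daemon is actually serving. The hostnames below are illustrative, and the functions are sketches, not real Nagios plugins:

```python
# Hedged sketch of the two checks: a bare TCP connect (something is
# listening) versus an HTTP fetch of the NameNode web UI (the daemon is
# up and actually serving status pages).
import socket
import urllib.request

def check_tcp(host, port, timeout=3.0):
    """Nagios-style TCP check: True if a connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_http(url, needle, timeout=3.0):
    """curl-and-grep style check: True if the page loads and contains needle."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return needle.lower() in resp.read().decode(errors="replace").lower()
    except OSError:  # URLError and socket.timeout are both OSError subclasses
        return False

# Port 9000 answering only tells you the RPC socket is open; the web UI
# check on 50070 additionally confirms the NameNode is serving requests.
print(check_tcp("namenode.example.com", 9000))            # hostname illustrative
print(check_http("http://namenode.example.com:50070/", "namenode"))
```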
Thanks Edward. So Linux containers are mostly used in Hadoop to ensure
isolation in the sense of security across MapReduce jobs from different users
(even Mesos seems to leverage the same), not for resource fairness?
On Fri, Sep 30, 2011 at 1:39 PM, Edward Capriolo
Are you learning for the sake of experimenting or are there functional
requirements driving you to dive into this space?
*If you are learning for the sake of adding new tools to your portfolio: Look
into high-level overviews of each of the projects and review architecture
solutions that use
I am trying to set up a single-node cluster using hadoop-0.20.204.0, and while setting it up I found that my jobtracker and tasktracker are not starting. I am attaching the exception. I also don't know why, while formatting the namenode, my IP address still doesn't show 127.0.0.1, as follows: 11/09/30 15:50:36 INFO
Now I am able to get the tasktracker and jobtracker running, but I still have
the following problem with the datanode.
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
Incompatible namespaceIDs in /private/tmp/hadoop-hadoop-user/dfs/data: namenode
namespaceID = 798142055;
Since you're only just beginning, and have unknowingly issued multiple
namenode -format commands, simply run the following and restart DN
alone:
$ rm -r /private/tmp/hadoop-hadoop-user/dfs/data
(And please do not reformat namenode, lest you go out of namespace ID
sync yet again -- You can
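A rough sketch of what the datanode is checking when it raises this error: on startup it compares the namespaceID stored in its local dfs/data VERSION file with the namenode's current one, and a re-format gives the namenode a fresh ID. This is a minimal Python illustration; the datanode ID below is made up, and the VERSION format is simplified:

```python
# Hedged illustration of the check behind the "Incompatible namespaceIDs"
# error. The VERSION text is a simplified stand-in for Hadoop's
# properties-style file.

def parse_version(text):
    """Parse key=value lines from a VERSION-style properties file."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            props[key] = value
    return props

# Stale data dir written before the namenode was re-formatted
# (271753622 is a made-up ID for illustration).
datanode_version = """
namespaceID=271753622
storageType=DATA_NODE
"""

namenode_namespace_id = "798142055"  # the ID from the error in this thread

dn_id = parse_version(datanode_version)["namespaceID"]
if dn_id != namenode_namespace_id:
    print("Incompatible namespaceIDs: datanode=%s namenode=%s"
          % (dn_id, namenode_namespace_id))
```

Wiping the stale data directory, as suggested above, lets the restarted datanode adopt the namenode's current namespaceID.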
The October SF Hadoop users meetup will be held Wednesday, October 12, from
7pm to 9pm. This meetup will be hosted by Twitter at their office on Folsom
St. *Please note that due to scheduling constraints, we will begin an hour
later than usual this month.*
As usual, we will use the
Hi,
I am relatively new to Hadoop and was wondering how to do incremental
loads into HDFS.
I have a continuous stream of data flowing into a service which is
writing to an OLTP store. Due to the high volume of data, we cannot do
aggregations on the OLTP store, since this starts affecting the