Hi,
I am relatively new to Hadoop and was wondering how to do incremental
loads into HDFS.
I have a continuous stream of data flowing into a service which is
writing to an OLTP store. Due to the high volume of data, we cannot do
aggregations on the OLTP store, since this starts affecting the writes.
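A minimal sketch of one common pattern, offered purely as an assumption since the original message is cut off here: have the service roll its output into hourly files, push each closed file into a date/hour-partitioned HDFS directory, and run the aggregation MapReduce job only over the new partitions. All paths, file names and the job below are made up:
$ hadoop fs -mkdir /data/events/2011/09/30/14            # partition dir for the hour just closed
$ hadoop fs -put /var/spool/events/events-2011093014.log /data/events/2011/09/30/14/
$ hadoop jar aggregate.jar AggregateJob /data/events/2011/09/30/14 /reports/2011/09/30/14   # hypothetical aggregation job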
The October SF Hadoop users meetup will be held Wednesday, October 12, from
7pm to 9pm. This meetup will be hosted by Twitter at their office on Folsom
St. *Please note that due to scheduling constraints, we will begin an hour
later than usual this month.*
As usual, we will use the discussion-base
Since you're only just beginning, and have unknowingly issued multiple
"namenode -format" commands, simply run the following and restart DN
alone:
$ rm -r /private/tmp/hadoop-hadoop-user/dfs/data
(And please do not reformat the namenode, lest you go out of namespace ID
sync yet again -- you can instead just clear the DN's data directory, as above.)
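For completeness, a hedged sketch of the full sequence; the data directory is the one from this thread, and the daemon script location assumes a stock 0.20-style tarball install:
$ bin/hadoop-daemon.sh stop datanode       # stop the DN so it releases the storage dir
$ rm -r /private/tmp/hadoop-hadoop-user/dfs/data
$ bin/hadoop-daemon.sh start datanode      # on restart the DN adopts the NN's current namespaceID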
Now I am able to get the task tracker and job tracker running, but I still have
the following problem with the datanode.
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
Incompatible namespaceIDs in /private/tmp/hadoop-hadoop-user/dfs/data: namenode
namespaceID = 798142055; datan
I am trying to set up a single-node cluster using hadoop-0.20.204.0, and while setting it up I found that my job tracker and task tracker are not starting. I am attaching the exception. I also don't know why, while formatting the name node, my IP address still doesn't show as 127.0.0.1, as follows: 11/09/30 15:50:36 INFO
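If the question is about the host = line in the STARTUP_MSG block of the format output, that value comes from resolving the machine's hostname, so a quick hedged check (the hostname below is purely illustrative) is:
$ hostname                            # e.g. mybox.local
$ grep "$(hostname)" /etc/hosts       # check what that name resolves to
# many single-node guides map it to the loopback address, e.g.:
# 127.0.0.1   localhost mybox.local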
Are you learning for the sake of experimenting or are there functional
requirements driving you to dive into this space?
*If you are learning for the sake of adding new tools to your portfolio:* Look
into high-level overviews of each of the projects and review architecture
solutions that use them.
Thanks Edward. So Linux containers are mostly used in Hadoop for
ensuring isolation in terms of security across MapReduce jobs from
different users (even Mesos seems to leverage the same), not for resource
fairness?
On Fri, Sep 30, 2011 at 1:39 PM, Edward Capriolo wrote:
> On Fri, Sep
I am using Nagios to monitor a Hadoop cluster.
I would like to hear input from you guys.
Questions:
1. Would there be any difference between monitoring
TCP port 9000 versus curling port 50070 and grepping for "namenode"?
2. For the job tracker I will monitor TCP port 9001 -- any drawbacks?
3. Secondarynamenode wh
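For what it is worth, a hedged sketch of the two NameNode checks question 1 compares, written as stock Nagios plugin invocations; the plugin path, hostname and dfshealth.jsp page are assumptions for a 0.20-era cluster:
$ /usr/lib/nagios/plugins/check_tcp -H namenode.example.com -p 9000
        # only proves the IPC port accepts connections
$ /usr/lib/nagios/plugins/check_http -H namenode.example.com -p 50070 -u /dfshealth.jsp -s NameNode
        # also proves the web UI is up and serving NameNode content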
On Fri, Sep 30, 2011 at 9:03 AM, bikash sharma wrote:
> Hi,
> Does anyone know if Linux containers (which are a kernel-supported
> virtualization technique for providing resource isolation across
> processes/applications) have ever been used with Hadoop to provide resource
> isolation for map/reduce tasks?
Hi all,
I have been working with Hadoop core, Hadoop HDFS and Hadoop MapReduce for the
past 8 months.
Now I want to learn other projects under Apache Hadoop such as Pig, Hive, HBase
...
Can you suggest a learning path for learning about the Hadoop ecosystem in a
structured manner?
I am confused about where to start.
Thanks Harsh.
I did look at the userlogs dir. Although it creates subdirs for each
job/attempt, there are no files in those directories, just the ACL xml file.
I had also looked at the task tracker log, and all it has is this -
2011-09-30 15:50:05,344 INFO org.apache.hadoop.mapred.TaskTracker:
LaunchTaskAction
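For reference, a hedged sketch of what a healthy attempt directory would normally contain; the attempt ID and log root below are made up, and the real root is whatever hadoop.log.dir points to on the TaskTracker node:
$ ls /var/log/hadoop/userlogs/attempt_201109301550_0001_m_000000_0/
stdout  stderr  syslog      # empty or missing files here usually mean the task JVM never got launched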
Hi,
Does anyone know if Linux containers (which are a kernel-supported
virtualization technique for providing resource isolation across
processes/applications) have ever been used with Hadoop to provide resource
isolation for map/reduce tasks?
If yes, what could be the up/down sides of such an approach?
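As a deliberately simplified illustration of the kind of per-task isolation being asked about (not something from this thread), here is a cgroups sketch that caps the memory of one task JVM; the mount point, group name, limit and PID placeholder are all assumptions:
$ mkdir /sys/fs/cgroup/memory/mr-task-42
$ echo 1073741824 > /sys/fs/cgroup/memory/mr-task-42/memory.limit_in_bytes   # 1 GB hard cap for this group
$ echo <child-jvm-pid> > /sys/fs/cgroup/memory/mr-task-42/tasks              # move the Child process into the group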
Thanks Varad.
On Wed, Sep 28, 2011 at 9:35 PM, Varad Meru wrote:
> The process IDs of each individual task can be seen using the jps and jconsole
> commands provided by Java.
>
> The jconsole command, run from the command line, provides a GUI screen for
> monitoring running tasks within Java.
>
> The task
Thanks so much Harsh!
On Thu, Sep 29, 2011 at 12:42 AM, Harsh J wrote:
> Hello Bikash,
>
> The tasks run on the tasktracker, so that is where you'll need to look
> for the process ID -- not the JobTracker/client.
>
> Crudely speaking,
> $ ssh tasktracker01 # or whichever.
> $ jps | grep Child |
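One hedged way to finish the truncated pipeline above (the attempt ID below is made up; jps -m prints each Child JVM's arguments, which in 0.20 include the attempt ID):
$ jps -m | grep Child                                                       # one line per running task JVM
$ jps -m | grep attempt_201109301550_0001_m_000000_0 | awk '{print $1}'     # PID of one specific attempt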
On 29/09/2011 18:02, Joey Echeverria wrote:
Do you close your FileSystem instances at all? IIRC, the FileSystem
instance you use is a singleton and if you close it once, it's closed
for everybody. My guess is you close it in your cleanup method and you
have JVM reuse turned on.
I've hit this i
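If JVM reuse does turn out to be the trigger, a hedged workaround sketch is to turn reuse back off for the job; the jar and driver names below are placeholders, the property name is the 0.20-era one, and the -D flag assumes the driver goes through ToolRunner. The more direct fix, of course, is simply to stop closing the shared FileSystem instance in cleanup().
$ hadoop jar myjob.jar MyDriver -D mapred.job.reuse.jvm.num.tasks=1 <input> <output>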