Hi Stefan,
Yes, 'nice' cannot resolve this problem.
Now, my cluster has 8GB of RAM. My Java heap configuration is:
HDFS DataNode: 1GB
HBase-RegionServer: 1.5GB
MR-TaskTracker: 1GB
MR-child: 512MB (max child tasks is 6: 4 map tasks + 2 reduce tasks)
But the memory usage is still tight.
When I use the Eclipse plugin hadoop-0.18.3-eclipse-plugin.jar and try to connect
to a remote Hadoop DFS, I get an IOException. If I run a map/reduce program, it
outputs:
09/05/12 16:53:52 INFO ipc.Client: Retrying connect to server:
/**.**.**.**:9100. Already tried 0 time(s).
09/05/12 16:53:52 INFO ipc.Cli
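The retry loop above usually means the client simply cannot reach the NameNode/JobTracker
at the configured host and port. One way to rule out the Eclipse plugin itself is to try
the same address from a small standalone program (the host and port below are placeholders;
they must match fs.default.name in your cluster's hadoop-site.xml):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Standalone connectivity check for the DFS address used in the plugin.
// The host and port are placeholders for your own cluster settings.
public class DfsConnectTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode-host:9100");
    FileSystem fs = FileSystem.get(conf);
    // Throws an IOException much like the one above if the NameNode
    // is unreachable or is listening on a different port.
    System.out.println("Root exists: " + fs.exists(new Path("/")));
    fs.close();
  }
}

If this fails as well, the problem is the address, a firewall, or the daemon itself
rather than the plugin.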
Arun C Murthy wrote:
... oh, and getting it to run a marathon too!
http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html
Owen & Arun
Lovely. I will now stick up the pic of you getting the first results in
on your laptop at ApacheCon.
Stefan Will wrote:
Raghu,
I don't actually have exact numbers from jmap, although I do remember that
jmap -histo reported something less than 256MB for this process (before I
restarted it).
I just looked at another DFS process that is currently running and has a VM
size of 1.5GB (~600MB resident).
zsongbo wrote:
Hi Stefan,
Yes, 'nice' cannot resolve this problem.
Now, my cluster has 8GB of RAM. My Java heap configuration is:
HDFS DataNode: 1GB
HBase-RegionServer: 1.5GB
MR-TaskTracker: 1GB
MR-child: 512MB (max child tasks is 6: 4 map tasks + 2 reduce tasks)
But the memory usage is still tight.
Does anyone have any vague ideas when append() may be available for
production usage?
Thanks in advance
-sasha
--
Sasha Dolgy
sasha.do...@gmail.com
Yes, I also found that the TaskTracker should not use so much memory.
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
32480 schubert  35  10 1411m 172m 9212 S    0  2.2  8:54.78 java
The previous 1GB was the default value; I changed the TaskTracker heap to
384MB an hour ago.
Yes, I think the JVM uses way more memory than just its heap. Now some of it
might be just reserved memory, but not actually used (not sure how to tell
the difference). There are also things like thread stacks, jit compiler
cache, direct nio byte buffers etc. that take up process space outside of
the heap.
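As a rough way to see how much of this sits outside the Java heap, the standard
management beans can be queried from inside the process (a minimal sketch; it reports
heap vs. non-heap areas such as permgen and the code cache, but still not thread
stacks or direct NIO buffers):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Print heap vs. non-heap usage for the current JVM. Run inside the
// daemon, or expose it via JMX, to compare against the RES column in top.
public class MemoryReport {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = bean.getHeapMemoryUsage();
    MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
    System.out.printf("heap:     used=%dMB committed=%dMB%n",
        heap.getUsed() >> 20, heap.getCommitted() >> 20);
    System.out.printf("non-heap: used=%dMB committed=%dMB%n",
        nonHeap.getUsed() >> 20, nonHeap.getCommitted() >> 20);
  }
}

The gap between the committed figures here and the resident size in top is roughly
the thread stacks, direct buffers and other native allocations mentioned above.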
Stefan Will wrote:
Yes, I think the JVM uses way more memory than just its heap. Now some of it
might be just reserved memory, but not actually used (not sure how to tell
the difference). There are also things like thread stacks, jit compiler
cache, direct nio byte buffers etc. that take up process space outside of the heap.
Right now data is received in parallel and written to a queue; a single
thread then reads the queue and writes those messages to an
FSDataOutputStream which is kept open, but the messages never get flushed.
I tried flush() and sync() with no joy.
1.
outputStream.writeBytes(rawMessage.toString());
2009-05-12 12:42:17,470 DEBUG [Thread-7] (FSStreamManager.java:28)
hdfs.HdfsQueueConsumer: Thread 19 getting an output stream
2009-05-12 12:42:17,470 DEBUG [Thread-7] (FSStreamManager.java:49)
hdfs.HdfsQueueConsumer: Re-using existing stream
2009-05-12 12:42:17,472 DEBUG [Thread-7] (FSStreamManager
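For reference, on the releases of that era data written to a long-lived
FSDataOutputStream generally does not become visible to readers until a block fills
or the stream is closed; flush() only empties client-side buffers, and sync()'s
durability guarantees were still being worked out as part of the append effort.
A minimal sketch of the pattern described above (the class and method names are
invented for illustration; only the FileSystem calls are real API):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative single-threaded consumer that drains queued messages into
// HDFS. Class and method names are made up for the sketch; data generally
// becomes readable only once close() is called.
public class HdfsQueueConsumerSketch {
  private final FileSystem fs;
  private FSDataOutputStream out;

  public HdfsQueueConsumerSketch(Configuration conf, Path file) throws Exception {
    fs = FileSystem.get(conf);
    out = fs.create(file);       // kept open, as described above
  }

  public void write(String rawMessage) throws Exception {
    out.writeBytes(rawMessage);
    out.flush();                 // empties client-side buffers only
  }

  public void roll(Path next) throws Exception {
    out.close();                 // closing is what makes the data visible
    out = fs.create(next);
  }
}

Rolling to a new file periodically (and closing the old one) is the usual workaround
until a working sync/append is available.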
Ian - Thanks for the detailed analysis. It was these issues that led
me to create a temporary file in NativeS3FileSystem in the first
place. I think we can get NativeS3FileSystem to report progress
though, see https://issues.apache.org/jira/browse/HADOOP-5814.
Ken - I can't see why you would be g
It shows as sold out on the website. Any chance of more seats opening up?
Amandeep
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
On Tue, May 5, 2009 at 2:10 PM, Ajay Anand wrote:
> This year's Hadoop Summit
> (http://developer.yahoo.com/events/hadoopsu
You can register at http://hadoopsummit09.eventbrite.com/
Ajay
-Original Message-
From: Amandeep Khurana [mailto:ama...@gmail.com]
Sent: Tuesday, May 12, 2009 9:55 AM
To: hbase-u...@hadoop.apache.org; core-user@hadoop.apache.org
Subject: Re: Hadoop Summit 2009 - Open for registration
I
Hi,
I'd like to do this in my hodrc file:
client-params = ...,,mapred.child.java.opts="-Dkey=value",...
but HoD doesn't like it:
error: 1 problem found.
Check your command line options and/or your configuration file /hodrc
Any ideas on how to specify nested equals signs? Has anyone ever tried this,
or
On Mon, May 11, 2009 at 9:43 PM, Raghu Angadi wrote:
> stack wrote:
>
>> Thanks Raghu:
>>
>> Here is where it gets stuck: [...]
>>
>
> Is that where it normally gets stuck? That implies it is spending an unusually
> long time at the end of writing a block, which should not be the case.
I studied datan
Hello,
I mentioned this issue before for the case of map tasks. I have 43
reduce tasks, 42 completed, 1 pending and 0 running.
This has been the case for the last 30 minutes. A picture (tiff) of the job
tracker can be found here (http://www.stat.purdue.edu/~sguha/mr.tiff),
since I haven't canceled the jo
Interestingly, when I started other jobs, this one finished.
I have no idea why.
Saptarshi Guha
On Tue, May 12, 2009 at 10:36 PM, Saptarshi Guha
wrote:
> Hello,
> I mentioned this issue before for the case of map tasks. I have 43
> reduce tasks, 42 completed, 1 pending and 0 running.
> This is
Hi,
I have a question about the input that the reducer gets in Hadoop
Streaming.
I wrote simple mapper.sh and reducer.sh script files:
mapper.sh:
#!/bin/bash
while read data
do
  # tokenize the line and emit each token with a count of 1
  echo "$data" | awk '{token=0; while(++token<=NF) print $token"\t1"}'
done
r
Interesting. So, where can I download the benchmark and related
test code?
On Tue, May 12, 2009 at 8:38 AM, Arun C Murthy wrote:
> ... oh, and getting it to run a marathon too!
>
> http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html
>
> Owen & Arun
>
Your Hadoop isn't running at all, or isn't listening on the specified port.
- Try the stop-all.sh command on the namenode. If it says "no namenode to stop",
then take a look at the namenode logs and paste them here if anything seems strange.
- If the namenode logs are OK (filled with INFO messages), then take a look at
al
(raking up a real old thread)
After struggling with this issue for some time now, it seems that accessing
HDFS on EC2 from outside EC2 is not possible.
This is primarily because of https://issues.apache.org/jira/browse/HADOOP-985.
Even if datanode ports are authorized in EC2 and we set the public
I have a similar situation: I have very small files.
I have never tried HBase (I want to), but you can also group them
and write (let's say) 20-30 into a file, so that every small file becomes a key in that
big file.
There are methods in the API with which you can write an object as a file into HDFS,
and read it again
to get
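The API being referred to is presumably SequenceFile (my assumption); a sketch of
packing many small files into one, using each original file name as the key and its
bytes as the value (the paths and names are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Pack many small records into one SequenceFile: each entry's key is the
// original file name, its value is the file contents. Paths are placeholders.
public class SmallFilePacker {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path packed = new Path("/data/packed.seq");

    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, packed, Text.class, BytesWritable.class);
    try {
      byte[] contents = "example payload".getBytes("UTF-8");
      writer.append(new Text("small-file-001"), new BytesWritable(contents));
    } finally {
      writer.close();
    }

    // Reading the entries back:
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, packed, conf);
    Text key = new Text();
    BytesWritable value = new BytesWritable();
    while (reader.next(key, value)) {
      System.out.println(key + " -> " + value.getLength() + " bytes");
    }
    reader.close();
  }
}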
DFSOutputStream.writeChunk() enqueues packets into the data queue and then
returns, so the write is asynchronous.
I want to know the total actual time HDFS spends executing the write operation
(from the writeChunk() call until each replica is written to disk). How can I
get that time?
Thanks.
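As far as I know, DFSOutputStream does not expose per-packet timing, so without
patching it the best a client can do is time the span from the first write until
close() returns, since close() blocks until the pipeline has acknowledged the
outstanding packets. A coarse sketch (the path and size are placeholders; this does
not prove the bytes were synced to disk on every replica, which would require
instrumenting the DataNode or the DFSClient):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Coarse client-side timing of an HDFS write: measure from the first
// write() until close() returns, since close() waits for the pipeline
// to acknowledge the outstanding packets.
public class WriteTiming {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path("/tmp/write-timing-test");

    byte[] buf = new byte[64 * 1024];
    long start = System.currentTimeMillis();
    FSDataOutputStream out = fs.create(p);
    for (int i = 0; i < 1024; i++) {   // ~64MB total
      out.write(buf);
    }
    out.close();                       // blocks until packets are acked
    long elapsed = System.currentTimeMillis() - start;
    System.out.println("write + close took " + elapsed + " ms");
  }
}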