Thanks. Let me try this configuration. I'm using the default settings. Is
assignMultiple set to true in the default settings?
2013/8/10 Karthik Kambatla
> It is possible that you have assignMultiple set to true in your fair
> scheduler configuration - that leads to assigning as many tasks on a single
> node heart
Appreciate your input, Bryan. I will try to reproduce this and check the
namenode log before, during, and after the pause.
Wish me luck.
On Fri, Aug 9, 2013 at 2:09 PM, Bryan Beaudreault
wrote:
> When I've had problems with a slow jobtracker, I've found the issue to be
> one of the following two (so far)
It is possible that you have assignMultiple set to true in your fair
scheduler configuration - that leads to assigning as many tasks on a single
node heartbeat as the node can accommodate. Setting it to false would
assign a single task on each heartbeat and can help spread out the
tasks.
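If you want to try that, a sketch of the relevant mapred-site.xml fragment for the Hadoop 1.x fair scheduler is below (property name as used by the 1.x FairScheduler; verify against your version's docs):

```xml
<!-- mapred-site.xml: hand out at most one task per tasktracker heartbeat -->
<property>
  <name>mapred.fairscheduler.assignmultiple</name>
  <value>false</value>
</property>
```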
When I've had problems with a slow jobtracker, I've found the issue to be
one of the following two possibilities (so far):
- long GC pause (I'm guessing this is not it based on your email)
- hdfs is slow
I haven't dived into the code yet, but circumstantially I've found that
when you submit a job
A while back, I was fighting with jobtracker page hangs: when I browsed to
http://jobtracker:50030, the browser didn't show job info as usual, which
turned out to be caused by keeping too much job history in the jobtracker.
Currently, I am setting up a new cluster with a 40g heap on the namenode and
jobtracker i
The overarching responsibility of a record reader is to return one record,
which conventionally means one line. But as we see in this case, that
cannot always be true: an XML file can have physically multiple lines that
functionally map to a single record, i.e. one logical line. For
this,
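The "one record may span many physical lines" idea can be sketched with plain java.io types: gather lines until a closing tag is seen and return them as one logical record. This is an illustration of the concept, not Hadoop's actual RecordReader API; the class and tag names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;

// Accumulates physical lines into one logical record, delimited by a
// closing XML tag. A real Hadoop RecordReader would wrap logic like this
// inside nextKeyValue()/getCurrentValue().
public class XmlRecordScanner {
    private final BufferedReader in;
    private final String endTag;

    public XmlRecordScanner(BufferedReader in, String endTag) {
        this.in = in;
        this.endTag = endTag;
    }

    /** Returns the next logical record, or null when input is exhausted. */
    public String nextRecord() throws IOException {
        StringBuilder record = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            if (record.length() > 0) record.append('\n');
            record.append(line);
            if (line.contains(endTag)) {
                return record.toString();   // closing tag seen: record complete
            }
        }
        return record.length() > 0 ? record.toString() : null;
    }
}
```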
Correction: It's scan.addColumn(family, qualifier) and not
scan.addFamily(family, qualifier) that I actually used.
Thanks, Narlin M.
On Fri, Aug 9, 2013 at 2:08 PM, Narlin M wrote:
> I am fairly new to the hadoop-hbase environment having started working on
> it very recently, so I hope I am wor
I am fairly new to the hadoop-hbase environment having started working on
it very recently, so I hope I am wording the question correctly.
I am trying to read data from a hadoop-hbase table which has only one
column family named 'DFLT'. This family contains hierarchical column
qualifiers "/source:
Hi,
I'm getting the errors below in the log file while starting the datanode
and tasktracker. I'm using Hadoop 1.1.2 and Java 1.7.0_21.
mmap failed for CEN and END part of zip file
mmap failed for CEN and END part of zip file
mmap failed for CEN and END part of zip file
mmap failed for CEN and END part of zi
Check Altiscale as well.
On Fri, Aug 9, 2013 at 3:05 AM, Dhaval Shah wrote:
> Thanks for the list Marcos. I will go through the slides/links. I think
> that's helpful
>
> Regards,
> Dhaval
>
> --
> *From:* Marcos Luis Ortiz Valmaseda
> *To:* Dhaval Shah
> *Cc:* us
Hi,
I am trying to understand how to write my own Writable.
So basically I am trying to understand how to process records spanning
multiple lines.
Can someone break down for me what needs to be considered in each
method?
I am trying to understand this example:
https://github
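For what it's worth, the core of the Writable contract can be sketched with plain java.io types. This is an illustration, not Hadoop's actual class (a real one would declare `implements org.apache.hadoop.io.Writable`), and the field names are hypothetical:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// The key rules for a custom Writable: write() and readFields() must
// serialize and deserialize the fields in exactly the same order, and
// readFields() must fully reset the object's state, because Hadoop
// reuses Writable instances between records.
public class TextPairWritable {
    private String first = "";
    private String second = "";

    public TextPairWritable() {}            // Hadoop needs a no-arg constructor

    public TextPairWritable(String first, String second) {
        this.first = first;
        this.second = second;
    }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(first);                // order matters: first,
        out.writeUTF(second);               // then second
    }

    public void readFields(DataInput in) throws IOException {
        first = in.readUTF();               // same order as write()
        second = in.readUTF();
    }

    public String getFirst()  { return first; }
    public String getSecond() { return second; }
}
```

A quick sanity check of such a class is a round trip through a byte array: write() to a DataOutputStream, then readFields() from a DataInputStream, and compare the fields.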
I've built a custom LineReader-like class to back a new InputFormat. I
wrote the new LineReader class to handle escaped whitespace that the
util.LineReader doesn't handle. The data I'm reading might have lines that
look something like this:
foo\tbar\tbaz\\tmore\\ndata with escaped stuff\n
I'd lik
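The unescaping step such a reader needs could look like the sketch below. It assumes one reading of the format (an assumption, since the message is truncated): a backslash escapes the next character, so a backslash-tab is a literal tab inside a field, while a bare tab character separates fields.

```java
import java.util.ArrayList;
import java.util.List;

// Splits a record on unescaped tab characters. A backslash escapes the
// following character, which is kept literally in the field (so "\<TAB>"
// yields a tab inside the field and "\\" yields a backslash).
public class EscapedSplitter {
    public static List<String> splitFields(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (c == '\\' && i + 1 < record.length()) {
                cur.append(record.charAt(++i));   // keep escaped char literally
            } else if (c == '\t') {
                fields.add(cur.toString());       // unescaped tab: field boundary
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }
}
```

The same escape-aware scan, applied to '\n' instead of '\t', is what a LineReader replacement would use to find the true end of a record.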
Actually, 1.2.1 is out (and marked stable). I see no reason not to upgrade.
http://hadoop.apache.org/docs/r1.2.1/releasenotes.html
As far as performance goes, when I upgraded our cluster from 1.0.4 to
1.1.2, our small jobs (that took about 1 min each) were taking about 20-30s
less time. So ther
Regards, Viswanathan J.
Like Harsh said, the release notes describe every bug fix and every minor
or major improvement, with a link to the related JIRA for each one.
Just a simple question: why not upgrade directly to 1.2.0?
There are a lot of good improvements and bug fixes there too.
See the Release note
The link Jitendra provided lists all the changes exhaustively. What
exactly are you looking for beyond that? Performance-related changes are
probably noted, so just search the same page for "Performance".
On Fri, Aug 9, 2013 at 7:41 PM, Viswanathan J
wrote:
> I have seen these release not
I have seen these release notes already; any other comments on this
upgrade regarding MR job processing, or any performance improvements?
On Aug 9, 2013 6:27 PM, "Jitendra Yadav" wrote:
> Please refer Hadoop 1.1.2 release notes.
>
> http://hadoop.apache.org/docs/r1.1.2/releasenotes.html
>
> On Fri,
The counter, being num-ops, should up-count and not reset. Note that
your test may be at fault though - calling hsync may not always call
NN#fsync(…) unless you are passing the proper flags to make it always
do so.
On Wed, Aug 7, 2013 at 4:27 PM, lei liu wrote:
> I use hadoop-2.0.5 and config had
There isn't a "discrepancy", but read on: DFS Used counts disk spaces
across DNs. FSCK counts file lengths on HDFS. The former includes
replicated data sizes, plus block checksum metadata consumed space.
The latter does not.
A small (but probably significant) percentage of your files are using
rep
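A back-of-the-envelope estimate makes the gap concrete. This sketch assumes the usual Hadoop defaults (an assumption: io.bytes.per.checksum=512 with a 4-byte CRC per chunk, stored per replica) and ignores the small checksum-file header:

```java
// Rough estimate of "DFS Used" (disk space across datanodes) from the
// fsck total (sum of HDFS file lengths): each replica stores the data
// plus ~4 bytes of CRC metadata per 512 data bytes, and every byte is
// multiplied by the replication factor.
public class DfsUsedEstimate {
    public static long estimateDfsUsed(long fsckTotalBytes, int replication) {
        long perReplica = fsckTotalBytes
                + (fsckTotalBytes / 512) * 4;   // CRC metadata, assumed defaults
        return perReplica * replication;
    }
}
```

With replication 2, a 1 MiB fsck total corresponds to a bit over 2 MiB of "DFS Used", which is the direction of the discrepancy described above.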
Please refer Hadoop 1.1.2 release notes.
http://hadoop.apache.org/docs/r1.1.2/releasenotes.html
On Fri, Aug 9, 2013 at 5:41 PM, Viswanathan J wrote:
> Hi,
>
> Planning to upgrade hadoop from 1.0.3 to 1.1.2, what are the key features
> or advantages.
>
Hi,
Planning to upgrade hadoop from 1.0.3 to 1.1.2, what are the key features
or advantages.
The hadoop version is 1.0.3
2013/8/9 Sandy Ryza
> Hi devdoer,
>
> What version are you using?
>
> -Sandy
>
>
> On Thu, Aug 8, 2013 at 4:25 AM, devdoer bird wrote:
>
>> HI:
>>
>> I configure the FairScheduler with default settings and my job has 19
>> reduce tasks. I found that all the reduce
Hi Shahab,
Thanks for the book and links.
Regards
Olivier
2013/8/9 Shahab Yunus
> Given that your questions are very broad and at high level, I would
> suggest that you should pick up a book or such to go through that. The
> Hadoop: Definitive Guide by Tom White is a great book to start wit
Hi Lei,
MutableCounterLong is a type of counter that can only be increased (the
count is often large compared with what MutableCounterInt can hold). It is
used a lot in the Hadoop metrics system, e.g. DatanodeMetrics.
You can find more details on metrics v2 in the Hadoop wiki link (
http://wiki.apache.org/hado
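The increment-only contract can be sketched with the standard library. This is an illustration of the behavior, not the real class (which lives in org.apache.hadoop.metrics2.lib and integrates with the metrics system):

```java
import java.util.concurrent.atomic.AtomicLong;

// Mimics MutableCounterLong's contract: no decrement or reset is
// exposed, so the value is monotonically non-decreasing for the life
// of the process.
public class MonotonicCounter {
    private final AtomicLong value = new AtomicLong();

    public void incr()           { value.incrementAndGet(); }
    public void incr(long delta) { value.addAndGet(delta); }
    public long value()          { return value.get(); }
}
```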
Hi All,
I have a CDH4 hadoop cluster setup with 3 datanodes and a data replication
factor of 2.
When I try to check the consumed dfs space, I get different values using
the "hdfs dfsadmin -report" and "hdfs fsck" command.
Could anyone please help me understand the reason behind the discrepancy in