From: Xuri Nagarin secs...@gmail.com
Reply-To: user@hadoop.apache.org
Date
On Thu, Oct 10, 2013 at 4:50 PM, Xuri Nagarin secs...@gmail.com wrote:
On Thu, Oct 10, 2013 at 1:27 PM, Pradeep Gollakota
pradeep...@gmail.com wrote:
I don't
Hi,
I am looking for some simple graphing tools to use with Hadoop (bar or line
charts). Most Google searches for Hadoop graphing turn up results for
much more complex graph-analysis tools like Giraph.
Any simple rrdtool like solutions for Hadoop?
TIA,
Xuri
...@gmail.com wrote:
You mean a performance monitoring tool? I have not used any, but you
should search for that term rather than for "graphing".
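For rrdtool-style graphs specifically, one common route (an assumption on my part, not something stated in this thread) is Ganglia: Hadoop's metrics2 system ships a Ganglia sink, and Ganglia stores its data in RRD files and renders exactly this kind of line chart. A minimal hadoop-metrics2.properties sketch, with a placeholder collector host:

```properties
# Push all Hadoop metrics to a Ganglia 3.1+ collector every 10 seconds.
# "gmetad-host" is a placeholder for the Ganglia server.
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.period=10
namenode.sink.ganglia.servers=gmetad-host:8649
datanode.sink.ganglia.servers=gmetad-host:8649
```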
On 10/14/2013 08:03 PM, Xuri Nagarin wrote:
of a job. Hadoop does this reliably
by monitoring all instances, restarting failed ones, etc.
3) You have way too much data to fit on one computer. Same as #2.
You might not need Hadoop if you can run your programs without it.
Lance
On 10/14/2013 08:02 PM, Xuri Nagarin wrote:
Yes, I tested
Hi,
I have a simple Grep job (from the bundled examples) that I am running on an
11-node cluster. Each node has 2x8-core Intel Xeons (shows 32 CPUs with HT
on), 64 GB RAM and 8 x 1 TB disks. I have mappers set to 20 per node.
When I run the Grep job, I notice that CPU gets pegged to 100% on multiple
nodes.

...if the job is I/O bound, you expect to see low CPU usage.
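A quick way to tell the two cases apart on a worker while the job runs (a generic Linux check, not anything Hadoop-specific) is to watch the kernel's user versus iowait CPU counters:

```shell
# The first line of /proc/stat reads: cpu  user nice system idle iowait ...
# (cumulative jiffies). A CPU-bound job keeps growing "user"; a job that is
# mostly waiting on disk grows "iowait" while user time stays comparatively low.
read -r cpu user nice system idle iowait rest < /proc/stat
echo "user=${user} iowait=${iowait}"
```

`mpstat 1` and `iostat -x 1` (from the sysstat package) show the same split continuously, per CPU and per disk, which is usually more convenient than reading /proc/stat by hand.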
On Thu, Oct 10, 2013 at 11:05 AM, Xuri Nagarin secs...@gmail.com wrote:
for your responses.
On Thu, Oct 10, 2013 at 12:29 PM, Xuri Nagarin secs...@gmail.com wrote:
Thanks Pradeep. Does this mean the job is a bad candidate for MR?
Interestingly, running the command-line '/bin/grep' under a streaming job
gives (1) much better disk throughput and (2) CPU load
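For reference, a streaming job along these lines can be sketched as follows; the jar location and HDFS paths are placeholders, not taken from the thread. Each map task is then nothing more than grep reading its input split:

```shell
# Cluster-side invocation (jar path and HDFS paths are hypothetical):
#
#   hadoop jar "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar \
#       -D mapreduce.job.reduces=0 \
#       -input /data/logs \
#       -output /data/grep-out \
#       -mapper '/bin/grep -i error'
#
# The mapper itself is plain grep, which can be sanity-checked locally:
printf 'ok line\nERROR: disk full\nok again\n' | /bin/grep -i error
# prints "ERROR: disk full"
```

With zero reducers, each mapper writes its matches straight to the output directory, so the job stays close to sequential disk I/O rather than burning CPU in the Java record-parsing path.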
Hi,
I am trying to get the Grep example bundled with CDH to read
Sequence/Snappy files.
By default, the program throws errors trying to read Sequence/Snappy files:
java.io.EOFException: Unexpected end of block in input stream
at
...@apache.org wrote:
Errr, what's wrong with discussing these types of issues on list?
Nothing public here, and as long as it's kept to facts, this should
not be a problem and Apache is a fine place to have such discussions.
My 2c.
-Original Message-
From: Xuri Nagarin secs...@gmail.com
Reply-To: user@hadoop.apache.org
Date: Thursday, September 12, 2013 4:39 PM
To: user
I understand it can be a contentious issue, especially given that a lot of
contributors to this list work for one vendor or another or have some
stake in any kind of evaluation. But I see no reason why users should not
be able to compare notes and share experiences. Over time, genuine pain
points
Hi,
I realize there is no perfect spec for data nodes, as a lot depends on use
cases and workloads, but I am curious whether there are any rules of thumb or
no-go zones in terms of how many terabytes per core are OK.
So a few questions, assuming 1 core per HDD holds:
Is there a no-go zone in terms of
Yes, ideally you want to set up a 4th gateway node to run clients.
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Security-Guide/AppxG-Setting-Up-Gateway.html
On Thu, Aug 29, 2013 at 3:11 PM, Raj Hadoop hadoop...@yahoo.com wrote:
Hi,
I am trying to set up a