Hi all,
I wanted to figure out the read and write throughput that occurs in
a Map task (read: reading from the input splits; write: writing the
map output back) inside a JVM. Do we have any counters that can help
me with this? Or where exactly should I focus on tweaking the code to
add some
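As a hedged sketch of the custom-counter route (the built-in task counters such as FILE_BYTES_READ/HDFS_BYTES_READ may already cover the read side): the counter group and names below are hypothetical, and the code assumes the new (mapreduce) API.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Sketch: track approximate bytes read/written per map task via custom counters.
    public class ThroughputMapper extends Mapper<LongWritable, Text, Text, Text> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Count input bytes as records arrive (hypothetical counter group/names).
        context.getCounter("Throughput", "MAP_INPUT_BYTES").increment(value.getLength());

        Text outKey = new Text("k");
        Text outValue = value;
        context.write(outKey, outValue);

        // Count output bytes as records are emitted.
        context.getCounter("Throughput", "MAP_OUTPUT_BYTES")
               .increment(outKey.getLength() + outValue.getLength());
      }
    }

Dividing these counts by the task's elapsed time (visible in the JobTracker UI) would give a rough per-task throughput.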
Hi All,
We had set up a 5-node Hadoop [0.20.2] / HBase [0.90.1] cluster. The cluster was
idle for many days [a week], and today we were unable to stop it; it seems the pid
files were deleted. My question is: I had set the temp folder to a different
directory, so how could the pid files have been deleted? Since I have 5
Sumeet,
To create your pids in another directory, you can set HADOOP_PID_DIR in
your bin/hadoop-env.sh. There's an open issue about it:
https://issues.apache.org/jira/browse/HADOOP-6606
Regards,
Marcos
On 13-04-2011 04:19, Sumeet M Nikam wrote:
Hi All,
We had set up 5 node
Hi Marcos,
Thanks, I will do this. But I am still not clear: if hadoop.tmp.dir points to a
directory other than the default /tmp, then how could the pid files get
deleted? Are there any other threads/processes running which delete
inactive pid files?
Regards,
Sumeet
Hello,
On Wed, Apr 13, 2011 at 5:25 AM, Jeffrey Wang jw...@palantir.com wrote:
Hey all,
I'm trying to format my NameNode (I've done it successfully in the past), but
I'm getting a strange error:
11/04/12 16:47:32 INFO common.Storage: java.io.IOException: Input/output error
at
Guys, I'm not the one who said 'HDFS' unless I had a brain bubble in
my original message. I asked for a distribution mechanism for
code+mappable data. I appreciate the arrival of some suggestions.
Ted is correct that I know quite a bit about mmap; I had a lot to do
with the code in ObjectStore
Hi,
I need to install Hadoop on a 16-node cluster. I have a couple of related
questions:
1. I have installed Hadoop in a shared directory, i.e., there is just one
place where all the Hadoop installation files exist, and all 16 nodes
use the same installation.
Is that an issue, or do I need to
Thank you Harsh,
that works fine!
(looks like the page I was looking at was the same, but for an older
version of hadoop)
Dieter
On Fri, 1 Apr 2011 13:07:38 +0530
Harsh J qwertyman...@gmail.com wrote:
You will need to supply your own Key-comparator Java class by setting
an appropriate
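The exact property name is cut off above, so as an illustrative sketch only: one common way to supply such a class in Java is to extend WritableComparator (the comparator below simply reverses the natural Text ordering).

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;

    // Sketch: a comparator that reverses the natural Text key ordering.
    public class ReverseTextComparator extends WritableComparator {
      public ReverseTextComparator() {
        super(Text.class, true); // true = instantiate keys for deserialization
      }

      @SuppressWarnings("rawtypes")
      @Override
      public int compare(WritableComparable a, WritableComparable b) {
        return -((Text) a).compareTo((Text) b); // invert the default order
      }
    }

With the old API this could be wired up via JobConf.setOutputKeyComparatorClass(ReverseTextComparator.class); with streaming it would typically be passed as a -D property on the command line, as in Harsh's (truncated) suggestion.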
p.s.
Also, while starting dfs using bin/start-dfs.sh, I get the following error:
2011-04-13 09:42:31,729 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host =
Sorry, I don't mean to say you don't know mmap or didn't do cool things in the
past.
But you will see why anyone would have interpreted the original post, given
the title of the posting and the following wording, to mean "can I mmap
files that are in HDFS"
On Mon, Apr 11, 2011 at 3:57 PM, Benson
Point taken.
On Wed, Apr 13, 2011 at 10:33 AM, M. C. Srivas mcsri...@gmail.com wrote:
Sorry, don't mean to say you don't know mmap or didn't do cool things in the
past.
But you will see why anyone would've interpreted this original post, given
the title of the posting and the following
Thanks a lot for the comments, but I set mapred.local.dir to /tmp, which
is a directory on every local machine.
I still got the same error. When I use the same conf file with 3 nodes (I
have this problem when I use 4 nodes), I don't have the problem.
Any idea what the problem may be? Thanks a lot.
Hello,
I am facing trouble using Hadoop streaming to solve a simple
nearest-neighbor problem.
Input data is in the following format:
key'\t'value
The key is the image id for which the nearest neighbor will be computed;
the value is a 100-dimensional vector of floating-point values separated by
I am not sure what the problem is, but your approach seems incorrect unless you
always want to use 1 mapper. You need to make your queries available to all
mappers
(cache them, although I am not sure how to do that with streaming). Then
you definitely want to use a combiner to reduce over each
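For a Java job (rather than streaming), one hedged sketch of the caching idea above uses DistributedCache to ship the query file to every mapper; the file path and class names here are hypothetical.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Sketch: every mapper loads the (hypothetical) cached query file in setup().
    public class NearestNeighborMapper extends Mapper<LongWritable, Text, Text, Text> {
      private final List<String> queries = new ArrayList<String>();

      @Override
      protected void setup(Context context) throws IOException {
        Path[] cached = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        if (cached != null && cached.length > 0) {
          BufferedReader reader = new BufferedReader(new FileReader(cached[0].toString()));
          String line;
          while ((line = reader.readLine()) != null) {
            queries.add(line); // one query vector per line
          }
          reader.close();
        }
      }

      // map() would then compare each input vector against every cached query
      // and emit partial distances for a combiner/reducer to aggregate.
    }

    // Driver side, before submitting the job (URI is hypothetical):
    // DistributedCache.addCacheFile(new java.net.URI("/user/me/queries.txt"), job.getConfiguration());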
Hi,
It's just in my home directory, which is an NFS mount. I moved it off NFS and
it seems to work fine. Is there some reason it doesn't work with NFS?
-Jeffrey
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Wednesday, April 13, 2011 3:48 AM
To:
Your NFS mount is not letting NameNode lock a file.
On Wed, Apr 13, 2011 at 12:38 PM, Jeffrey Wang jw...@palantir.com wrote:
Hi,
It's just in my home directory, which is an NFS mount. I moved it off NFS
and it seems to work fine. Is there some reason it doesn't work with NFS?
-Jeffrey
Is it possible to change the logging level for an individual job? (As
opposed to the cluster as a whole.) E.g., is there some key that I can
set on the job's configuration object that would allow me to bump up the
logging from info to debug just for that particular job?
Thanks,
DR
On Apr 13, 2011, at 12:38 PM, Jeffrey Wang wrote:
It's just in my home directory, which is an NFS mount. I moved it off NFS and
it seems to work fine. Is there some reason it doesn't work with NFS?
Locking on NFS--regardless of application--is a dice roll, especially when
client/server are
Is there an issue with using the regex SerDe when loading text files above 2 GB
in size into Hive? I've been experiencing out-of-memory errors
with a select group of logs when running a Hive job. I have been able
to load the data without a problem if I use split to cut it in half or into thirds.
I have a problem where my input data contains various control characters. I
thought that I could load this data (100+ GB of tab-delimited files) and then
run a Perl streaming script to clean it up (I wanted to take advantage of the
parallelization of the Hadoop framework). However, since some of the data has
^M
Usually this is a 32-bit signed-integer problem somewhere (a Java int overflows
past 2 GB). If you use a 32-bit Java this would be a problem.
On Wed, Apr 13, 2011 at 3:16 PM, hadoopman hadoop...@gmail.com wrote:
Is there an issue with using the regex SerDe with loading into Hive text
files above 2 gigs in size? I've been
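As a hedged illustration of why 2 gigs is the magic boundary Lance points at: a file length stored in a Java int silently wraps once it passes 2^31 - 1 bytes.

    public class TwoGigOverflow {
      public static void main(String[] args) {
        long fileLength = 2L * 1024 * 1024 * 1024 + 1; // just past 2 GiB
        int truncated = (int) fileLength;              // silently overflows
        System.out.println("long length: " + fileLength); // 2147483649
        System.out.println("as int:      " + truncated);  // -2147483647
      }
    }

Any code path that stashes a byte count or offset in an int would misbehave exactly at that size.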
There are systems for file-system plumbing out to user processes;
FUSE does this on Linux, and there is a package for Hadoop. However,
pretending a remote resource is local holds a place of honor in the
system design antipattern hall of fame.
On Wed, Apr 13, 2011 at 7:35 AM, Benson Margulies
If it's Java, and log4j, you can set that package tree to its own logging level.
On Wed, Apr 13, 2011 at 1:52 PM, David Rosenstrauch dar...@darose.net wrote:
Is it possible to change the logging level for an individual job? (As
opposed to the cluster as a whole.) E.g., is there some key that
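A minimal sketch of what Harsh describes, assuming log4j 1.x is on the classpath; the package name is hypothetical, and the same effect could be had from log4j.properties rather than code.

    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;

    public class PerPackageLogging {
      public static void main(String[] args) {
        // Bump one package tree (hypothetical name) to DEBUG without touching the rest.
        Logger.getLogger("com.example.myjob").setLevel(Level.DEBUG);

        // Everything else keeps the root level, e.g. INFO.
        Logger.getRootLogger().setLevel(Level.INFO);
      }
    }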
I appreciate the feedback. I'll check in the morning. We're running
64-bit Ubuntu 10.04 LTS server, and I 'believe' it should all be 64-bit,
but I'll verify that.
Thanks !!
On 04/13/2011 08:31 PM, Lance Norskog wrote:
. Usually this is a positive-integer problem somewhere. If you use a
32-bit
HBase is very good at this kind of thing.
Depending on your aggregation needs, OpenTSDB might be interesting, since it
stores and queries against large amounts of time-ordered data similar to what
you want to do.
It isn't clear whether your data is primarily about current state or
about
Hi,
I want to use the Capacity Scheduler for my Hadoop jobs. I currently have three
queues defined, and they are configured and working properly. I am using Hadoop
0.20.2, and in the new API we are not supposed to use JobConf. So I
need to set the queue name as a property in the Configuration (
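The property name is cut off above, so as a hedged sketch only: in 0.20.x the Capacity Scheduler reads the queue from mapred.job.queue.name (worth verifying against your version), which can be set on the new-API Job's Configuration; the queue name below is hypothetical.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class QueueSubmit {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Route the job to a specific Capacity Scheduler queue (queue name is hypothetical).
        conf.set("mapred.job.queue.name", "research");

        Job job = new Job(conf, "queued-job");
        // ... set mapper/reducer/input/output as usual, then submit ...
      }
    }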
Looks like you are hitting https://issues.apache.org/jira/browse/MAPREDUCE-1621.
-Amareshwari
On 4/13/11 11:39 PM, Shivani Rao raoshiv...@gmail.com wrote:
Hello,
I am facing trouble using hadoop streaming in order to solve a simple
nearest neighbor problem.
Input data is in the following