Try open-source h2o.ai - a CRAN-style package that allows fast, scalable R on
Hadoop in-memory.
One can invoke single-threaded R from the h2o package, and the runtime on clusters
is Java (not R!), so you get better memory management.
http://docs.0xdata.com/deployment/hadoop.html
Hi,
we have a daily Hive script that usually takes a few hours to run. The
other day I noticed one of the jobs was taking well in excess of a few hours.
Digging into it I saw that there were 3 attempts to launch a job on a
single node:
Task Id | Start Time | Finish Time | Error
The IDs of the tasks are different, so the node got killed after failing on
3 different(!) reduce tasks. Reduce task 48 will probably have been
resubmitted to another node.
2014-03-27 10:22 GMT+01:00 Krishna Rao krishnanj...@gmail.com:
Hi,
we have a daily Hive script that usually takes a
I noticed, but none of the jobs ended up being re-submitted! And all 3 of
those jobs failed on the same node. All we know is that the disk on that
node became unresponsive.
On 27 March 2014 09:33, Dieter De Witte drdwi...@gmail.com wrote:
The ids of the tasks are different so the node got
Hi,
All the other mailing lists that I am part of usually put [centos] or
[foobar] in the subject field. Is there a way to set that up on this
mailing list?
Thanks,
Andrew
Hi,
I don't know what mistake I made, but now I get this error:
INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
INFO ipc.Client: Retrying connect
Does netstat -an | grep LISTEN show these ports being listened on?
Can you stat HDFS from the command line, e.g.:
hdfs dfsadmin -report
hdfs fsck /
hdfs dfs -ls /
Also, check out /var/log/hadoop or /var/log/hdfs for more details.
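If those also hang, it can help to confirm which NameNode URI the client is
actually resolving. A minimal sketch, assuming only that your core-site.xml is on
the classpath (the class name here is made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckNameNode {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // "fs.defaultFS" in Hadoop 2.x; older setups may still use "fs.default.name"
        System.out.println("fs.defaultFS = "
                + conf.get("fs.defaultFS", conf.get("fs.default.name")));
        FileSystem fs = FileSystem.get(conf);
        // This first RPC will show the same retry messages if nothing is listening there.
        System.out.println(fs.getFileStatus(new Path("/")));
    }
}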
john
From: Mahmood Naderan [mailto:nt_mahm...@yahoo.com]
Sent:
Could you have a pmem-vs-vmem issue as in:
http://stackoverflow.com/questions/8017500/specifying-memory-limits-with-hadoop
john
From: praveenesh kumar [mailto:praveen...@gmail.com]
Sent: Tuesday, March 25, 2014 7:38 AM
To: user@hadoop.apache.org
Subject: Re: Hadoop Takes 6GB Memory to run one
This discussion may also be relevant to your question:
http://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits
Do you actually need to specify -Xmx6000m for the Java heap, or could it be one
of the other issues discussed?
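For reference, a rough sketch of how the container size and the heap are usually
lined up in MRv2. The property names are the standard mapreduce.* ones; the values
here are placeholders, not recommendations:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemorySettingsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Container size requested from YARN; the NodeManager kills the task if the
        // process tree exceeds this (or the derived virtual-memory limit).
        conf.set("mapreduce.map.memory.mb", "2048");
        // JVM heap inside that container -- keep it a few hundred MB below the
        // container size to leave room for non-heap memory.
        conf.set("mapreduce.map.java.opts", "-Xmx1536m");
        Job job = Job.getInstance(conf, "memory-settings-example");
        // ... set mapper, input and output as usual, then job.waitForCompletion(true)
    }
}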
John
From: John Lilley
One more thing. I ran hadoop namenode and it says that the namenode has not been
formatted!! But I ran classification commands some days ago, and the data dir is
nearly 60GB in size and contains my data.
So why does it say that the namenode has not been formatted? Please see the output:
$ hadoop
Hi Stanley,
Sorry about the confusion, but I'm trying to read a txt file into my Mapper
function. I am trying to copy the file using the -files option when
submitting the Hadoop job.
I try to obtain the filename using the following lines of code in my Mapper:
URI[] localPaths =
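A minimal sketch of the usual pattern, in case it helps. The file name config.txt
and the key/value types are illustrative assumptions; it relies on the fact that a
file passed with -files is symlinked into the task's working directory under its
base name:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ConfigFileMapper extends Mapper<LongWritable, Text, Text, Text> {
    private String firstLine;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // "config.txt" is the base name of the file given to -files when submitting.
        try (BufferedReader reader = new BufferedReader(new FileReader("config.txt"))) {
            firstLine = reader.readLine();
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(new Text(firstLine), value);
    }
}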
Hi all, I have a job running with 1750 maps and 1 reduce and the status has
been the same for the last two hours. Any thoughts?
Thanks, Clay
Next step would be to look in the logs under userlog directory for that job
Sent from my iPhone
On Mar 27, 2014, at 11:08 AM, Clay McDonald stuart.mcdon...@bateswhite.com
wrote:
Hi all, I have a job running with 1750 maps and 1 reduce and the status has
been the same for the last two
Hi Serge,
I'm using the -files option through the hadoop cli.
The following lines of code work:
Path[] localPaths = context.getLocalCacheFiles();
String configFilename = localPaths[0].toString();
However, context.getLocalCacheFiles() is deprecated. What is the correct
equivalent function in
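The non-deprecated accessor in the new API is JobContext.getCacheFiles(). A hedged
sketch of just the setup() body (URI is java.net.URI, Path is
org.apache.hadoop.fs.Path); note it returns the URIs the files were registered
with, not localized paths:

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    // May be null if nothing was added to the distributed cache.
    URI[] cacheFiles = context.getCacheFiles();
    if (cacheFiles != null && cacheFiles.length > 0) {
        // The localized copy sits in the task working directory under the file's
        // base name (or the #fragment alias, if one was given), so plain local
        // file I/O on this name works.
        String configFilename = new Path(cacheFiles[0].getPath()).getName();
    }
}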
I have a simple M/R job using a Mapper only, thus no reducer. The mapper reads
a timestamp from the value, generates a path to the output file, and writes
the key and value to that output file.
The input file is a sequence file, not compressed, stored in HDFS;
it has a size of 162.68 MB.
Output
Have you checked the content of the files you write?
/th
On Thu, 2014-03-27 at 11:43 -0700, Kim Chew wrote:
I have a simple M/R job using Mapper only thus no reducer. The mapper
read a timestamp from the value, generate a path to the output file
and writes the key and value to the output
Yea, gonna do that. 8-)
Kim
On Thu, Mar 27, 2014 at 12:30 PM, Thomas Bentsen t...@bentzn.com wrote:
Have you checked the content of the files you write?
/th
On Thu, 2014-03-27 at 11:43 -0700, Kim Chew wrote:
I have a simple M/R job using Mapper only thus no reducer. The mapper
read a
Thanks Serge, looks like I need to add memory to my datanodes.
Clay McDonald
Cell: 202.560.4101
Direct: 202.747.5962
-Original Message-
From: Serge Blazhievsky [mailto:hadoop...@gmail.com]
Sent: Thursday, March 27, 2014 2:16 PM
To: user@hadoop.apache.org
Cc: user@hadoop.apache.org
It seems to take a very long time to timeout a connection to an invalid NN URI.
Our application is interactive so the defaults of taking many minutes don't
work well. I've tried setting:
conf.set("ipc.client.connect.max.retries", "2");
conf.set("ipc.client.connect.timeout", "7000");
before calling
I am also wondering: say I have two identical timestamps, so they are going to be
written to the same file. Does MultipleOutputs handle appending?
Thanks.
Kim
On Thu, Mar 27, 2014 at 12:30 PM, Thomas Bentsen t...@bentzn.com wrote:
Have you checked the content of the files you write?
/th
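On the appending question: as far as I know, within a single task MultipleOutputs
reuses the same RecordWriter for a given baseOutputPath, so records with identical
timestamps written by that task land in the same file; separate tasks still produce
separate files (suffixed with the task id). A rough map-only sketch of the pattern
(the ts2dir helper and the Text/Text types are made up for illustration):

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class TimestampMapper extends Mapper<Text, Text, Text, Text> {
    private MultipleOutputs<Text, Text> out;

    @Override
    protected void setup(Context context) {
        out = new MultipleOutputs<Text, Text>(context);
    }

    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        String dir = ts2dir(value.toString());   // e.g. "2014/03/27"
        // Writes to the same baseOutputPath within this task go to the same file.
        out.write(key, value, dir + "/part");
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        out.close();   // flush the extra writers
    }

    private String ts2dir(String timestamp) {
        return timestamp.substring(0, 10).replace('-', '/');   // placeholder parsing
    }
}

On the driver side, LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class)
is often used with this pattern so that empty default part files are not created.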
-files was used by hive, not MR.
So it cannot be recognized by your MR job.
Sent from my iPhone5s
On March 28, 2014, at 2:31, Jonathan Poon jkp...@ucdavis.edu wrote:
Hi Serge,
I'm using the -files option through the hadoop cli.
The following lines of code works
Path[] localPaths =
Thanks folks.
I was not aware my input data file had been compressed.
FileOutputFormat.setCompressOutput() was set to true when the file was
written. 8-(
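For anyone hitting the same surprise, a small sketch of where that switch lives on
the job that writes the file (standard output-format API; the job name is a
placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class OutputCompressionExample {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "compression-example");
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        // Pass false here to write an uncompressed sequence file.
        FileOutputFormat.setCompressOutput(job, true);
        // When compression stays on, block compression is the usual choice.
        SequenceFileOutputFormat.setOutputCompressionType(job,
                SequenceFile.CompressionType.BLOCK);
        // ... set mapper, paths and key/value classes as usual ...
    }
}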
Kim
On Thu, Mar 27, 2014 at 5:46 PM, Mostafa Ead mostafa.g@gmail.com wrote:
The following might answer you partially:
Input key is not
Hi all,
The HDFS blocks of a region may be moved to different datanodes (e.g. by the
Hadoop balancer). In such a scenario, is there a way to fix that and move all the
blocks back to the datanode hosting the region's region server? Should the Hadoop
balancer never be used when HBase is in use? Thanks.
Correct, do not use the hadoop balancer on a cluster serving HBase data.
Let HBase handle it at the region level.
To recover locality, you will need to run major compactions of the affected
regions or tables. In hbase shell: major_compact 'tablename'
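If you would rather trigger it from code than from the shell, a sketch using the
0.9x-era client API (newer clients use Connection/Admin instead; 'tablename' is the
same placeholder as above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class MajorCompactTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            // Asynchronously requests a major compaction of every region of the table.
            admin.majorCompact("tablename");
        } finally {
            admin.close();
        }
    }
}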
On Thu, Mar 27, 2014 at 9:33 PM, Libo Yu
Thanks for the answer. If I run a major compaction on a region that has only a
single HFile, whose blocks may reside on different datanodes, will the major
compaction actually occur in this case and recover the data locality? Thanks.
From: bbeaudrea...@hubspot.com
Date: Thu, 27 Mar 2014 21:44:49 -0400
Hi all,
hadoop fsck <path> -files -blocks -locations can list the locations of all blocks
under the path.
Is it possible to list all blocks and the block locations for a given path
programmatically?
Thanks,
Libo
Hi Libo,
DFSClient.getBlockLocations, is this what you want?
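If using the (private) DFSClient class is a concern, the same information is
available through the public FileSystem API. A minimal sketch for a single file,
with the path taken from the command line:

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlockLocations {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path(args[0]));
        // One BlockLocation per block, listing the datanodes that hold replicas.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("offset=" + block.getOffset()
                    + " length=" + block.getLength()
                    + " hosts=" + Arrays.toString(block.getHosts()));
        }
    }
}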
Regards,
Wangda Tan
On Fri, Mar 28, 2014 at 10:03 AM, Libo Yu yu_l...@hotmail.com wrote:
Hi all,
hadoop fsck <path> -files -blocks -locations can list the locations of all
blocks under the path.
Is it possible to list all blocks and the
Thanks, it is useful. If I know the block ID, how do I find out its location?
From: wheele...@gmail.com
Date: Fri, 28 Mar 2014 10:12:10 +0800
Subject: Re: How to get locations of blocks programmatically?
To: user@hadoop.apache.org
Hi Libo, DFSClient.getBlockLocations, is this what you want?
Dear all:
I had a problem today. When I executed the command mapred job
-list on a slave, an error came out. The message is shown below:
14/03/28 11:18:47 INFO Configuration.deprecation: session.id is deprecated.
Instead, use dfs.metrics.session-id
14/03/28 11:18:47 INFO
hi,
when I am trying to execute this command:
hadoop job -history ~/1
it gives an error like:
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.
HADOOP_MAPRED_HOME not found!
Where can I get HADOOP_MAPRED_HOME?
thanks.
Which version of Hadoop are you using? AFAIK the Hadoop mapred home is the
directory where Hadoop is installed or, in other words, where it was untarred.
Thanks
Divye Sheth
On Mar 28, 2014 10:43 AM, Avinash Kujur avin...@gmail.com wrote:
hi,
when i am trying to execute this command:
hadoop job -history ~/1
Yes, it is there. Then why is executing that command throwing such an error?
Do I need to change anything in this hadoop file?
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this