I think if you add the datanode host to the excludes file pointed to by the dfs.hosts.exclude key in the HDFS
conf file and refresh nodes [hadoop dfsadmin -refreshNodes], it might work. Thanks,
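For reference, a sketch of what that might look like (the excludes file path and hostname
below are placeholders) - in hdfs-site.xml (hadoop-site.xml on older releases):

<property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hadoop/conf/excludes</value>
</property>

Then list the datanode hostname in /home/hadoop/conf/excludes (one host per line) and run:

hadoop dfsadmin -refreshNodes

The node should show up as decommissioning on the namenode UI and can be taken down once
decommissioning completes.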
On 1/31/11 2:42 PM, "Jam Ram" wrote:
How to remove multiple datanodes dynamically from the master node without
stopping it?
Hi Saptarshi,
AFAIK, this is an intermediate stage where the old API is still supported while
the new API evolves.
In 0.21, the old API - MultipleOutputFormat - is not deprecated:
http://hadoop.apache.org/mapreduce/docs/r0.21.0/api/index.html. In future it
might be.
From a usage perspective, what Mult
By default, with compressed files you lose the ability to control splits and
the file is essentially read as one split to one mapper.
There has been some discussion in and around this for bzip2 and gzip, and some
fixes were done to allow bzip2 to be splittable. Refer to HADOOP-4012.
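As a rough check of what you get for a given file (a sketch against the 0.21-era io.compress
API; the input path is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.SplittableCompressionCodec;

Configuration conf = new Configuration();
CompressionCodecFactory factory = new CompressionCodecFactory(conf);
// The codec is picked by file extension; gzip is not splittable, bzip2 is after HADOOP-4012.
CompressionCodec codec = factory.getCodec(new Path("input/data.bz2"));
boolean splittable = codec instanceof SplittableCompressionCodec;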
Also Kevin came with
I think -file is only for shipping an executable or a properties file. The .zip
extension does not work. Please have a look at MAPREDUCE-596
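If the intent is to ship an archive that gets unpacked on the task nodes, one thing to try
(a sketch; the streaming jar path, HDFS URI and link name are placeholders, and the archive
must already be on HDFS) is -cacheArchive:

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
  -input in -output out \
  -mapper mapper.py -file mapper.py \
  -cacheArchive hdfs://namenode:9000/user/me/stuff.zip#stuff

The zip is then unpacked into a directory named stuff in each task's working directory.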
Cheers,
On 4/15/10 12:27 PM, "Meng Mao" wrote:
We are trying to pass a zip file into a streaming program. An invocation
looks like this:
/usr/local/hadoop/bin/hadoop
Nachiket,
Not sure which hadoop version/package you are using.
If default hadoop, you might like to check ${hadoop.log.dir}/userlogs/xxx,
${hadoop.log.dir}/logs/history [${hadoop.log.dir} defaults to /tmp if nothing is
specified], and maybe /grid/0/hadoop/var/log/. Check also -
http://hadoop.apac
..dfs -rmr -skipTrash /user/hadoop/.Trash
recreates .Trash on consecutive rmr. -skipTrash can be used generally if you
don't want a backup of deletes; here it is only to illustrate..
On 3/15/10 2:43 PM, "Marcus Herou" wrote:
Hi.
Our disks are getting full and I have found that it is the trashbin t
The last I heard, there were some discussions about instead creating the Solr index
using Hadoop MapReduce rather than pushing the Solr index into HDFS and so on.
SOLR-1045 and SOLR-1301 can provide you more info.
Cheers,
/R
On 2/24/10 4:23 PM, "Rakhi Khatwani" wrote:
Hi,
Has anyone tried creatin
I would say restart the cluster, but I suspect that would not help either -
instead try checking your running process list (eg: a perl/shell script or an
ETL pipeline job) to analyze/kill.
Also wondering if any hadoop dfsadmin commands can help in this scenario..
Cheers,
/R
On 2/2/10 2:50 AM,
You can find out the reason from the JT logs (eg: memory/timeout restrictions)
and adjust the timeout - mapred.task.timeout - or the memory parameters
accordingly. Refer to
http://hadoop.apache.org/common/docs/r0.20.0/cluster_setup.html
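For example, in mapred-site.xml (the value is in milliseconds; 20 minutes here is just an
illustration), or per job with -D mapred.task.timeout=...:

<property>
  <name>mapred.task.timeout</name>
  <value>1200000</value>
</property>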
Cheers,
/R
On 1/29/10 12:22 PM, "john li" wrote:
when hadoop r
Not sure what error you get and whether it is suggestive, but at times where you
place the -libjars option can make a difference. Can you try adding the jar to
your HADOOP_CLASSPATH and then executing?
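Something along these lines, for instance (jar names, driver class and paths are placeholders;
-libjars is only picked up if the driver uses ToolRunner/GenericOptionsParser, and it must come
before your own arguments):

export HADOOP_CLASSPATH=/path/to/dependency.jar
hadoop jar myjob.jar com.example.MyDriver -libjars /path/to/dependency.jar input/ output/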
Cheers,
/R
On 1/20/10 9:50 AM, "Victor Hsieh" wrote:
Hi,
I was trying to run a mapreduce job with
I think TaskInProgress would take care of cleanup after a job is killed. It has
cleanup flows for killed/failed as well as successful jobs. You might like to
look into JobTracker / TaskTracker / HeartbeatResponse as well.
Cheers,
/R
On 1/20/10 10:45 AM, "#YONG YONG CHENG#" wrote:
Good Day,
I am
kherjee" wrote:
Hmmm. I am actually running it from a batch file. Is "hadoop fs -rmr"
not that stable compared to pig's rm OR hadoop's FileSystem?
Let me try your suggestion by writing a cleanup script in pig.
-Thanks,
Prasen
On Tue, Jan 19, 2010 at 10:25 AM, Rekha Joshi
Can you try with dfs, without quotes? If using pig to run jobs you can use rmf
within your script (again w/o quotes) to force remove and avoid an error if the
file/dir is not present. Or if doing this inside a hadoop job, you can use
FileSystem/FileStatus to delete directories. HTH.
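For the FileSystem route, a minimal sketch (the output path is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path dir = new Path("/user/prasen/output");
if (fs.exists(dir)) {
  fs.delete(dir, true);  // true = recursive; the exists() check avoids an error if it is absent
}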
Cheers,
/R
On 1/19/10 10:15
I have come across this error when my gzip file was corrupt, either
incompletely formed or corrupted during the data copy over the network.
Cheers,
/R
On 1/18/10 8:19 AM, "Zheng Lv" wrote:
Hello Everyone,
We have been using SequenceFile(GZIP) to save some data in our product,
but we got the fol
The _SUCCESS file is created after the hadoop job has successfully finished. The setting,
I think, is mapreduce.fileoutputcommitter.marksuccessfuljobs. You can leverage
the existence of this file to kick off your second step.
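For example, from a wrapper script (the output path and follow-up script are placeholders):

# exit status 0 only if the marker file exists
hadoop fs -test -e /user/hadoop/step1-output/_SUCCESS && ./run_step2.sh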
Alternatively you can capture the process id or logs to verify the conclusion
of the fir
Not sure what the whole error is, but you can always alternatively try this -
<property>
  <name>fs.default.name</name>
  <value>s3://BUCKET</value>
</property>
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>ID</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>SECRET</value>
</property>
And I am not sure what the base hadoop version on S3 is, but possibly, if the S3
wiki is correct, try updating conf/hadoo
You might want to check if the queue is also set in the mapred.queue.names property
and that you have access allowed to the queue. Thanks,
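For reference, a sketch of the queue definition in mapred-site.xml (the queue name is a
placeholder; submit ACLs, if enabled, usually live in mapred-queue-acls.xml):

<property>
  <name>mapred.queue.names</name>
  <value>default,myqueue</value>
</property>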
On 12/16/09 12:33 PM, "anjali nair" wrote:
I tried changing the mapred-site.xml by creating the new property. But still
it doesn't work. The job goes to the default q
What's your hadoop version/distribution? In any case, to eliminate the easy
suspects first, what do the hadoop logs say on restart? Did you provide the port in
the job tracker URL? Thanks!
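In case it helps, the usual 0.20-style setup is roughly this (a sketch; the fairscheduler
contrib jar has to be on the JobTracker classpath, and the jar location varies by
distribution) - in mapred-site.xml, then restart the JobTracker:

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>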
On 12/11/09 8:43 AM, "Jeff Zhang" wrote:
Hi all,
I'd like to configure FairScheduler on hadoop. but seems it ca
If it is hadoop 0.20, the files to modify are core-site.xml, hdfs-site.xml and
mapred-site.xml, while the default configs are in
core-default.xml, hdfs-default.xml and mapred-default.xml.
Otherwise, are you saying that providing -D works with the same memory setting but not
via config?
If not, for memor
Everything's in here - http://hadoop.apache.org/common/
On 12/4/09 9:55 AM, "samuellawrence" wrote:
Hai,
I have to start the HADOOP environment using java code (in-process). I would
like to use the APIs to start it.
Could anyone please give me a snippet or a link.
Thanks in Advance.
I remember playing around with dfsadmin some time back, and wasn't this always
the case? I was able to use it for only a few commands, like upgradeProgress,
without getting the superuser issue.
Anyway, as per HADOOP-2659, dfsadmin commands are only meant for admins. Thanks!
On 12/1/09 3:48 PM, "pavel kolodin"
It is better to use the cloudera wordcount pipes example itself, to be sure you are using the
correct params. Also, the issue here seems to be in the build itself. Did you
check what `config.log' suggests to be the cause of the C compiler failure?
Were you able to compile any other C++ program with this build file on you
https://issues.apache.org/jira/browse/MAPREDUCE-655 is fixed in version 0.21.0
On 11/26/09 9:43 PM, "Matthias Scherer" wrote:
Sorry, but I can't find it in the version control system for release 0.20.1:
http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.20.1/src/mapred/org/apache/hadoop/
An exit status of 1 usually indicates configuration issues or incorrect command
invocation in hadoop 0.20 (incorrect params), if not a JVM crash.
In your logs there is no indication of a crash, but some path/command can be the
cause. Can you check if your lib paths/data paths are correct?
If it is a
https://issues.apache.org/jira/browse/HADOOP-372 has valuable information on
InputFormat/MapInput/RecordReader; you may try using the pseudo code. Thanks!
On 11/25/09 9:35 AM, "Gordon Linoff" wrote:
Does anyone have a pointer to code that allows the map to save data in
intermediate files, for u
If you use hadoop fs -ls with a full hdfs:// URI, that will work for your intent. Thanks!
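For example (namenode host and port are placeholders):

hadoop fs -ls hdfs://namenode.example.com:9000/user/mark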
On 11/25/09 2:31 AM, "Mark Kerzner" wrote:
Hi,
I am starting a cluster of Apache Hadoop distributions, like .18 and also
.19. This all works fine, then I log in. I see that the Hadoop daemons are
already working. However,
Hi,
Not sure about your hadoop version, and I haven't done much on a single-machine setup
myself. However, there is an IPC improvement bug filed at
https://issues.apache.org/jira/browse/HADOOP-2864. Thanks!
On 11/24/09 11:22 AM, "onur ascigil" wrote:
I am running Hadoop on a single machine and have some
If your paths/files are alright, I think job.setJarByClass(yourClassName.class)
should work. Thanks!
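A minimal sketch of the new-API driver setup (class and job names are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
Job job = new Job(conf, "example-job");
// Ships the jar that contains YourClassName, so the cluster can find your mapper/reducer classes.
job.setJarByClass(YourClassName.class);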
On 11/24/09 9:58 AM, "Zhengguo 'Mike' SUN" wrote:
Job.setJarByClass()