statement but still getting the same error:
DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip#Config"), conf);
Do you think there could be any problem in distributing a zipped
directory and having Hadoop unzip it recursively?
Thanks!
Akhil
Amareshwari
Hi Akhil,
DistributedCache.addCacheArchive takes a path on HDFS. From your code, it looks
like you are passing a local path.
Also, if you want to create a symlink, you should pass the URI as hdfs://<path>#<link-name>, besides calling
DistributedCache.createSymlink(conf);
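For example, a rough, untested sketch (the HDFS path and namenode address below are placeholders; the zip must already be uploaded to HDFS, e.g. with bin/hadoop fs -put):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;

Configuration conf = new Configuration();
// archive lives on HDFS; "#Config" asks the framework to symlink it as "Config" in the task's cwd
DistributedCache.addCacheArchive(
    new URI("hdfs://namenode:9000/user/akhil1988/Config.zip#Config"), conf);
DistributedCache.createSymlink(conf);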
Thanks
Amareshwari
akhil1988 wrote:
Please a
Is your jar file in local file system or hdfs?
The jar file should be in local fs.
Thanks
Amareshwari
Shravan Mahankali wrote:
I am as well having a similar issue... there is no solution yet!
Thank You,
Shravan Kumar. M
Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
-
one job ran after the other job in one class with the new
api?
Amareshwari Sriramadasu wrote:
HRoger wrote:
Hi
As you know in the "org.apache.hadoop.mapred.jobcontrol.Job" there is a
method called "addDependingJob" but not in
"org.apache.hadoop.mapreduce.Job".Is there some method works like
addDependingJob in "mapreduce" package?
"org.apache.hadoop.mapred.jobcontrol.Job" is moved to
Hi Lance,
Where are you passing the -libjars parameter? It is now a generic option;
it is no longer a parameter of the jar command.
Thanks
Amareshwari
Lance Riedel wrote:
We are trying to upgrade to 0.20 from 0.19.1 due to several issues we are
having. Now our jobs are failing with class not found exc
You can use RunningJob handle to query map/reduce progress.
See api @
http://hadoop.apache.org/core/docs/r0.20.0/api/org/apache/hadoop/mapred/RunningJob.html
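For example, a rough sketch of polling a submitted job (jobConf stands for a JobConf you have already set up; exception handling omitted):

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

JobClient client = new JobClient(jobConf);
RunningJob running = client.submitJob(jobConf);
while (!running.isComplete()) {
  System.out.printf("map %.0f%%  reduce %.0f%%%n",
      running.mapProgress() * 100, running.reduceProgress() * 100);
  Thread.sleep(5000);
}
System.out.println(running.isSuccessful() ? "job succeeded" : "job failed");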
Thanks
Amareshwari
Jothi Padmanabhan wrote:
Look at JobClient -- There are some useful methods there.
For example, displayTasks and moni
Regards
Sandhya
On Tue, Apr 28, 2009 at 2:02 PM, Amareshwari Sriramadasu
wrote:
Hi Sandhya,
Which version of HADOOP are you using? There could be
directories in mapred/local, pre 0.17. Now, there should not be any such
directories.
From version 0.17 onwards, the attempt directories will be present only
at mapred/local/taskTracker/jobCache// . If you are
seeing the dire
You can add your jar to the distributed cache and add it to the classpath by
passing it in the configuration property "mapred.job.classpath.archives".
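If you are driving the job from Java (the streaming command line can set the same property with -jobconf), a hedged sketch would be (archive path is a placeholder and must already be on HDFS; imports as in the DistributedCache example above):

JobConf conf = new JobConf(MyJob.class);
// ship the jar through the distributed cache ...
DistributedCache.addCacheArchive(new URI("/user/peter/combiner-libs.jar"), conf);
// ... and ask the framework to put it on the task classpath
conf.set("mapred.job.classpath.archives", "/user/peter/combiner-libs.jar");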
-Amareshwari
Peter Skomoroch wrote:
If I need to use a custom streaming combiner jar in Hadoop 18.3, is there a
way to add it to the classpath without the
Elia Mazzawi wrote:
Is there a command that I can run from the shell that says whether a job
passed or failed?
I found these, but they don't really say pass/fail; they only say what
is running and the percent complete.
this shows what is running
./hadoop job -list
and this shows the completion
./hadoop
Set mapred.jobtracker.retirejob.interval and mapred.userlog.retain.hours
to a higher value. By default, their values are 24 hours. These might be
the reason for the failure, though I'm not sure.
Thanks
Amareshwari
Billy Pearson wrote:
I am seeing on one of my long running jobs about 50-60 hours that
Can you look for exceptions from Jetty in the JobTracker logs and report them here? That
would tell us the cause of the ERROR 500.
Thanks
Amareshwari
Nathan Marz wrote:
Sometimes I am unable to access a job's details and instead only see the error below.
I am seeing this on the 0.19.2 branch.
HTTP ERROR: 500
Internal Server Error
Saptarshi Guha wrote:
Hello,
I would like to produce side effect files which will be later copied
to the outputfolder.
I am using FileOutputFormat, and in the Map's close() method I copy
files (from the local tmp/ folder) to
FileOutputFormat.getWorkOutputPath(job);
FileOutputFormat.getWorkOut
into future releases.
cheers,
ckw
On Mar 12, 2009, at 8:20 PM, Amareshwari Sriramadasu wrote:
Are you seeing reducers getting spawned from the web UI? Then it is a bug.
If not, there won't be reducers spawned; it could be a job-setup/
job-cleanup task that is running on a reduce slot. See HADOOP-3150 and
HADOOP-4261.
-Amareshwari
Chris K Wensel wrote:
May have found the answer, waiting on
Until 0.18.x, files are not added to the client-side classpath. Use 0.19,
and run the following command to use a custom input format:
bin/hadoop jar contrib/streaming/hadoop-0.19.0-streaming.jar -mapper
mapper.pl -reducer org.apache.hadoop.mapred.lib.IdentityReducer -input
test.data -output test-output -fi
This is due to HADOOP-5233. Got fixed in branch 0.19.2
-Amareshwari
Nathan Marz wrote:
Every now and then, I have jobs that stall forever with one map task
remaining. The last map task remaining says it is at "100%" and in the
logs, it says it is in the process of committing. However, the task
Is your job a streaming job?
If so, which version of Hadoop are you using? What is the configured
value for stream.non.zero.exit.is.failure? Can you set
stream.non.zero.exit.is.failure to true and try again?
Thanks
Amareshwari
Saptarshi Guha wrote:
Hello,
I have given a case where my mapper sh
Are you hitting HADOOP-2771?
-Amareshwari
Sandy wrote:
Hello all,
For the sake of benchmarking, I ran the standard hadoop wordcount example on
an input file using 2, 4, and 8 mappers and reducers for my job.
In other words, I do:
time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 2
[HADOOP-1722] Make streaming to handle non-utf8 byte array
http://issues.apache.org/jira/browse/HADOOP-1722
is committed to branch 0.21
Yasuyuki Watanabe wrote:
Hi,
I would like to know the status of binary input/output format
support for streaming.
We found HADOOP-3227 and it was open. So we
n parallel with this job, but it's of the
same priority. The other job had failed when the job I'm describing
got hung.
On Feb 24, 2009, at 10:46 PM, Amareshwari Sriramadasu wrote:
Nathan Marz wrote:
I have a large job operating on over 2 TB of data, with about 5
input splits. For
Nathan Marz wrote:
I have a large job operating on over 2 TB of data, with about 5
input splits. For some reason (as yet unknown), tasks started failing
on two of the machines (which got blacklisted). 13 mappers failed in
total. Of those 13, 8 of the tasks were able to execute on another
m
Arun C Murthy wrote:
On Feb 23, 2009, at 2:01 AM, Bing TANG wrote:
Hi, everyone,
Could someone tell me the principle of "-file" when using Hadoop
Streaming. I want to ship a big file to the slaves, so how does it work?
Does Hadoop use "SCP" to copy? How does Hadoop deal with the -file option?
No, -file ju
You should implement the Tool interface and submit jobs.
For example, see org.apache.hadoop.examples.WordCount.
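In outline, the pattern looks like this (untested sketch; class and job names are placeholders and the job setup is elided):

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    // getConf() already carries the generic options (-D, -libjars, ...) parsed by ToolRunner
    JobConf job = new JobConf(getConf(), MyJob.class);
    job.setJobName("my-job");
    // ... set input/output paths, mapper and reducer here ...
    JobClient.runJob(job);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new MyJob(), args));
  }
}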
-Amareshwari
Wu Wei wrote:
Hi,
I used to submit Hadoop jobs with the utility RunJar.main() on Hadoop
0.18. On Hadoop 0.19, because the commandLineConfig of JobClient was
null, I got a Null
Yes. The configuration is read only when the taskTracker starts.
You can see more discussion on jira HADOOP-5170
(http://issues.apache.org/jira/browse/HADOOP-5170) for making it per job.
-Amareshwari
jason hadoop wrote:
I certainly hope it changes but I am unaware that it is in the todo queue a
Bill Au wrote:
I have enabled persistent completed jobs status and can see them in HDFS.
However, they are not listed in the jobtracker's UI after the jobtracker is
restarted. I thought that jobtracker will automatically look in HDFS if it
does not find a job in its memory cache. What am I miss
Nathan Marz wrote:
I have some unit tests which run MapReduce jobs and test the
inputs/outputs in standalone mode. I recently started using
DistributedCache in one of these jobs, but now my tests fail with
errors such as:
Caused by: java.io.IOException: Incomplete HDFS URI, no host:
hdfs:///
Nick Cen wrote:
Hi,
I have a Hadoop cluster with 4 PCs, and I want to integrate Hadoop and
Lucene together, so I copied some of the source code from Nutch's Indexer
class. But when I run my job, I found that there is only 1 reducer running
on 1 PC, so the performance is not as good as expected.
Andrew wrote:
I've noticed that task tracker moves all unpacked jars into
${hadoop.tmp.dir}/mapred/local/taskTracker.
We are using a lot of external libraries, that are deployed via "-libjars"
option. The total number of files after unpacking is about 20 thousand.
After running a number of
putFormat use LineRecordReader.)
-Amareshwari
Any thoughts?
John
On Sun, Feb 1, 2009 at 11:00 PM, Amareshwari Sriramadasu <
amar...@yahoo-inc.com> wrote:
Which version of hadoop are you using?
You can directly use -inputformat
org.apache.hadoop.mapred.lib.NLineInputFormat for your st
roach, can you point me to an example of what kind of
param should be specified? I appreciate your help.
Thanks,
SD
On Thu, Jan 29, 2009 at 10:49 PM, Amareshwari Sriramadasu <
amar...@yahoo-inc.com> wrote:
You can use NLineInputFormat for this, which splits one line (N=1, by
default) a
Anum Ali wrote:
Hi,
Need some kind of guidance on getting started with Hadoop installation and
system setup. I am a newbie regarding Hadoop. Our system OS is Fedora 8;
should I start from a stable release of Hadoop or get the development
version from svn (from the contribute site)?
Thank You
Kris Jirapinyo wrote:
Hi all,
I am using counters in Hadoop via the reporter. I can see this custom
counter fine after I run my job. However, if somehow I restart the cluster,
then when I look into the Hadoop Job History, I can't seem to find the
information of my previous counter values an
You can use NLineInputFormat for this, which splits one line (N=1, by
default) as one split.
So, each map task processes one line.
See
http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html
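For instance, a quick sketch with the old mapred API (the input path is a placeholder; N left at 1 for clarity):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

JobConf conf = new JobConf(MyJob.class);
conf.setInputFormat(NLineInputFormat.class);
// number of lines handed to each map task; the default is 1
conf.setInt("mapred.line.input.format.linespermap", 1);
FileInputFormat.setInputPaths(conf, new Path("/user/sd/input"));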
-Amareshwari
S D wrote:
Hello,
I have a clarifying question
Edwin wrote:
Hi
I am looking for a way to interrupt a thread that entered
JobClient.runJob(). The runJob() method keeps polling the JobTracker until
the job is completed. After reading the source code, I know that the
InterruptException is caught in runJob(). Thus, I can't interrupt it using
Thre
patektek wrote:
Hello list, I am trying to add some functionality to Hadoop-core and I am
having serious issues
debugging it. I have searched in the list archive and still have not been
able to resolve the issues.
Simple question:
If I want to insert "LOG.INFO()" statements in Hadoop code is not
Saptarshi Guha wrote:
Sorry, I see - every line is now a map task - one split, one task (in
this case N=1 line per split).
Is that correct?
Saptarshi
You are right. NLineInputFormat splits N lines of input as one split and
each split is given to a map task.
By default, N is 1. N can be configured th
From the exception you pasted, it looks like your io.serializations did
not set the SerializationFactory properly. Do you see any logs on your
console for adding serialization class?
Can you try running your app in pseudo-distributed mode, instead of
LocalJobRunner?
You can find pseudo distribu
You can use Job Control.
See
http://hadoop.apache.org/core/docs/r0.19.0/mapred_tutorial.html#Job+Control
http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/jobcontrol/Job.html
and
http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/jobcontrol/JobControl.htm
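A rough sketch of chaining two dependent jobs with the jobcontrol package (firstConf and secondConf stand for JobConf objects you have already set up; exception handling omitted):

import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

Job first  = new Job(firstConf);
Job second = new Job(secondConf);
second.addDependingJob(first);        // second starts only after first succeeds

JobControl control = new JobControl("my-chain");
control.addJob(first);
control.addJob(second);

new Thread(control).start();          // JobControl implements Runnable
while (!control.allFinished()) {
  Thread.sleep(1000);
}
control.stop();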
You can also have a look at NLineInputFormat.
@http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html
Thanks
Amareshwari
Abdul Qadeer wrote:
Dmitry,
If you are talking about Text data, then the splits can be anywhere. But
LineRecordReader will take c
is the location specified by the configuration property
"hadoop.job.history.user.location". If you don't specify anything for the property,
the job history logs will be created in the job's output directory. So, to view your history, give
your jobOutputDir if you haven't specified any location.
Hop
Saptarshi Guha wrote:
Hello,
I had previously emailed regarding heap size issue and have discovered
that the hadoop-site.xml is not loading completely, i.e
Configuration defaults = new Configuration();
JobConf jobConf = new JobConf(defaults, XYZ.class);
System.out.println("1:"+jo
Sean Shanny wrote:
To all,
Version: hadoop-0.17.2.1-core.jar
I have created a MapFile.
What I don't seem to be able to do is correctly place the MapFile in
the DistributedCache and then make use of it in a map method.
I need the following info please:
1.How and where to place the MapFi
Saptarshi Guha wrote:
Caught it in action.
Running ps -e -o 'vsz pid ruser args' |sort -nr|head -5
on a machine where the map task was running
04812 16962 sguha/home/godhuli/custom/jdk1.6.0_11/jre/bin/java
-Djava.library.path=/home/godhuli/custom/hadoop/bin/../lib/native/Linux-amd64-64:/home
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
2008-12-23 19:04:57,781 INFO org.apache.hadoop.mapred.JobTracker:
Removed completed task 'attempt_200812221742_0075_r_00_2' from
'tracker_hnode1.cor.mystrands.in:localhost/127.0.0.1:37971'
Thanks,
RDH
On Dec 23, 2008, at 1:
You can report status from a streaming job by emitting
reporter:status:<message> on stderr.
See documentation @
http://hadoop.apache.org/core/docs/r0.18.2/streaming.html#How+do+I+update+status+in+streaming+applications%3F
But from the exception trace, it doesn't look like a lack of
report (timeout). The tr
You can set the configuration property
"mapred.task.tracker.http.address" to 0.0.0.0:0 . If the port is given
as 0, then the server will start on a free port.
Thanks
Amareshwari
Sagar Naik wrote:
- check hadoop-default.xml
in here you will find all the ports used. Copy the xml-nodes from
hado
Arv Mistry wrote:
I'm using hadoop 0.17.0. Unfortunately I can't upgrade to 0.19.0 just
yet.
I'm trying to control the number of extraneous files. I noticed there
are the following log files produced by Hadoop:
On Slave
- userlogs (for each map/reduce job)
Hi Aayush,
Do you want one map to run one command? You can give an input file
consisting of one command per line. Use NLineInputFormat, which
splits N lines of input as one split, i.e. gives N lines to one map for
processing. By default, N is one. Then your map can just run the shell
command on the input line. W
Message-
From: Amareshwari Sriramadasu [mailto:[EMAIL PROTECTED]
Sent: Friday, November 28, 2008 10:56 AM
To: core-user@hadoop.apache.org
Subject: Re: Error with Sequence File in hadoop-18
It got fixed in 0.18.3 (HADOOP-4499).
-Amareshwari
Palleti, Pallavi wrote:
Hi,
I am getting "Check sum ok was sent" errors when I am using hadoop. Can
someone please let me know why this error is coming and how to avoid it.
It was running perfectly fine when I used hadoop-17. And, this error
Jeremy Chow wrote:
Hi list,
I added a property dfs.hosts.exclude to my conf/hadoop-site.xml. Then
refreshed my cluster with command
bin/hadoop dfsadmin -refreshNodes
It showed that it can only shut down the DataNode process but does not
shut down the TaskTracker process on each s
tim robertson wrote:
Hi all,
I am running an MR job which scans 130M records and then tries to
group them into around 64,000 files.
The Map does the grouping of the record by determining the key, and
then I use a MultipleTextOutputFormat to write the file based on the
key:
@Override
returns the
value as N Lines?
Thanks
Rahul
On Mon, Nov 17, 2008 at 9:43 AM, Amareshwari Sriramadasu
<[EMAIL PROTECTED]> wrote:
Hi Rahul,
How did you set the configuration
"mapred.line.input.format.linespermap" and your input forma
Setting the Configuration in the run() method will also work. You have to extend
LineRecordReader and override the next() method to return N lines as the value
instead of 1 line.
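A rough, untested sketch of such an override against the 0.18/0.19 mapred LineRecordReader (class name and the N parameter are placeholders):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.LineRecordReader;

public class NLinesRecordReader extends LineRecordReader {
  private final int linesPerRecord;

  public NLinesRecordReader(Configuration job, FileSplit split, int n) throws IOException {
    super(job, split);
    this.linesPerRecord = n;
  }

  @Override
  public synchronized boolean next(LongWritable key, Text value) throws IOException {
    Text line = new Text();
    StringBuilder buffer = new StringBuilder();
    int read = 0;
    // concatenate up to N lines into a single value
    while (read < linesPerRecord && super.next(key, line)) {
      if (read > 0) buffer.append('\n');
      buffer.append(line.toString());
      read++;
    }
    if (read == 0) return false;
    value.set(buffer.toString());
    return true;
  }
}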
Thanks
Amareshwari
Hi Rahul,
How did you set the configuration "mapred.line.input.format.linespermap"
and your input format? You have to set them in hadoop-site.xml or pass
them through the -D option to the job.
NLineInputFormat will split N lines of input as one split. So, each map
gets N lines.
But the RecordReade
Jeremy Pinkham wrote:
We are using the distributed cache in one of our jobs and have noticed
that the local copies on all of the task nodes never seem to get cleaned
up. Is there a mechanism in the API to tell the framework that those
copies are no longer needed so they can be deleted? I've tri
some speed wrote:
I was wondering if it was possible to read the input for a map function from
2 different files:
1st file ---> user-input file from a particular location(path)
2nd file ---> A resultant file (has just one pair) from a
previous MapReduce job. (I am implementing a chain MapReduce
shahab mehmandoust wrote:
I'm trying to write a daemon that periodically wakes up and runs map/reduce
jobs, but I've had little luck. I've tried different ways (including using
cascading) and I keep arriving at the below exception:
java.lang.OutOfMemoryError: Java heap space
at
org.apache.had
Nathan Marz wrote:
Hello all,
Occasionally when running jobs, Hadoop fails to clean up the
"_temporary" directories it has left behind. This only appears to
happen when a task is killed (e.g. by speculative execution), and the
data that task has outputted so far is not cleaned up. Is this a kn
Some more links:
http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Other+Useful+Features
http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Debugging
-Amareshwari
Arun C Murthy wrote:
On Oct 30, 2008, at 1:16 PM, Scott Whitecross wrote:
Is the presentation online as
hem as a jar file, are
there any other ways to do that?
Thanks
Mike
From: Amareshwari Sriramadasu <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Tuesday, October 28, 2008 11:58:33 PM
Subject: Re: How do I include customized InputFormat, InputSp
Hi,
How are you passing your classes to the pipes job? If you are passing
them as a jar file, you can use the -libjars option. From branch 0.19, the
libjar files are added to the client classpath also.
Thanks
Amareshwari
Zhengguo 'Mike' SUN wrote:
Hi,
I implemented customized classes for InputF
Has your task-tracker started? I mean, do you see non-zero nodes on your
job tracker UI?
-Amareshwari
John Babilon wrote:
Hello,
I've been trying to get Hadoop up and running on a Windows Desktop running
Windows XP. I've installed Cygwin and Hadoop. I run the start-all.sh script,
it starts
Hi,
From 0.19, the jars added using -libjars are available on the client
classpath also, fixed by HADOOP-3570.
Thanks
Amareshwari
Mahadev Konar wrote:
Hi Tarandeep,
the libjars option does not add the jar on the client side. There is an
open jira for that (I don't remember which one)...
O
Hi Naama,
Yes. It is possible to specify them using the APIs
FileInputFormat#setInputPaths() and FileOutputFormat#setOutputPath().
You can specify the FileSystem URI for the path.
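For example (old mapred API; the URIs below are placeholders, and any filesystem the cluster can reach should work):

JobConf conf = new JobConf(MyJob.class);
// read input from one filesystem ...
FileInputFormat.setInputPaths(conf, new Path("hdfs://namenode-a:9000/user/naama/input"));
// ... and write output to another
FileOutputFormat.setOutputPath(conf, new Path("file:///tmp/naama/output"));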
Thanks,
Amareshwari
Naama Kraus wrote:
Hi,
I wanted to know if it is possible to use different file systems for Map
Re
This is because the non-zero exit status of the streaming process was not
treated as a failure until 0.17. In 0.17, you can specify the
configuration property "stream.non.zero.exit.is.failure" as "true" to
consider a non-zero exit as failure. From 0.18, the default value
for stream.non.zero.exit
Are you seeing HADOOP-2009?
Thanks
Amareshwari
Nathan Marz wrote:
Unfortunately, setting those environment variables did not help my
issue. It appears that the "HADOOP_LZO_LIBRARY" variable is not
defined in either LzoCompressor.c or LzoDecompressor.c. Where is this
variable supposed to be set?
mlink in the local running directory, correct?
Just like the cacheFile option? If not, how can I then specify which
class to use?
cheers,
Christian
Amareshwari Sriramadasu wrote:
Dennis Kubes wrote:
If I understand what you are asking, you can use -cacheArchive with
the path to the jar to include the jar file in the classpath of your
streaming job.
Dennis
You can also use the -cacheArchive option to include a jar file and symlink
the unjarred directory from the cwd by pro
Per Jacobsson wrote:
Hi all.
I've got a beginner question: Are there any best practices for how to do
logging from a task? Essentially I want to log warning messages under
certain conditions in my map and reduce tasks, and be able to review them
later.
stdout, stderr and the logs using common
You can add more paths to input using
FileInputFormat.addInputPath(JobConf, Path).
You can also specify comma separated filenames as input path using
FileInputFormat.setInputPaths(JobConf, String commaSeparatedPaths)
More details at
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoo
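A quick sketch of both forms (paths are placeholders; old mapred API):

JobConf conf = new JobConf(MyJob.class);
// add paths one at a time ...
FileInputFormat.addInputPath(conf, new Path("/logs/2008/11"));
FileInputFormat.addInputPath(conf, new Path("/logs/2008/12"));
// ... or set them all at once as a comma-separated list
FileInputFormat.setInputPaths(conf, "/logs/2008/11,/logs/2008/12");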
The mapred framework kills map/reduce tasks if they don't report
status within 10 minutes. If your mapper/reducer needs more time, it
should report status using
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/Reporter.html
More documentation at
http://hadoop.apache.org/c
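For example, a sketch of the map() method of an old-API Mapper whose per-record work is slow (doExpensiveWork and the key/value types are placeholders):

public void map(LongWritable key, Text value,
                OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
  for (int step = 0; step < 10; step++) {
    doExpensiveWork(value, step);                 // placeholder for the real, slow work
    reporter.progress();                          // tells the framework the task is still alive
    reporter.setStatus("finished step " + step);  // optional human-readable status
  }
}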
You can get the file name accessed by the mapper using the config
property "map.input.file"
Thanks
Amareshwari
Deyaa Adranale wrote:
Hi,
I need to know inside my mapper, the name of the file that contains
the current record.
I saw that I can access the name of the input directories inside
ma
The error "Could not find any valid local directory for task" means that
the task could not find a local directory to write file, mostly because
there is no enough space on any of the disks.
Thanks
Amareshwari
Shirley Cohen wrote:
Hi,
Does anyone know what the following error means?
hadoop-
Arv Mistry wrote:
I'll try again: can anyone tell me whether it should be possible to run Hadoop
in pseudo-distributed mode (i.e. everything on one machine) and then
submit a mapred job using the ToolRunner from another machine against that
Hadoop configuration?
Cheers Arv
Yes, it is possible.
Hi Srilatha,
You can download hadoop release tar ball from
http://hadoop.apache.org/core/releases.html
You will find hadoop-*-examples.jar when you untar it.
Thanks,
Amareshwari
us latha wrote:
Hi All,
I am trying to run the wordcount example on a single-node Hadoop setup.
Could anyone please poin
same,
https://issues.apache.org/jira/browse/HADOOP-3850. You can give your
inputs there.
Thanks
Amareshwari
Paco
On Mon, Jul 28, 2008 at 1:42 AM, Amareshwari Sriramadasu
<[EMAIL PROTECTED]> wrote:
HistoryViewer is used in JobClient to view the history files in the
directory provi
the directory.
Thanks
Amareshwari
Paco NATHAN wrote:
Thank you, Amareshwari -
That helps. Hadn't noticed HistoryViewer before. It has no JavaDoc.
What is a typical usage? In other words, what would be the
"outputDir" value in the context of ToolRunner, JobClient, etc. ?
Pa
Can you have a look at org.apache.hadoop.mapred.HistoryViewer and see if
it makes sense?
Thanks
Amareshwari
Paco NATHAN wrote:
We have a need to access data found in the JobTracker History link.
Specifically in the "Analyse This Job" analysis. Must be run in Java,
between jobs, in the same code
The proposal on http://issues.apache.org/jira/browse/HADOOP-3386 takes
care of this.
Thanks
Amareshwari
Amareshwari Sriramadasu wrote:
If the task tracker didn't receive a KillJobAction, it's true that the job
directory will not be removed.
And your observation is correct that some task trackers didn't receive
KillJobAction for the job.
If a reduce task has finished before the job completion, the task will
be sent KillTaskAction.
Looks like
C G wrote:
Hi All:
I have mapred.tasktracker.tasks.maximum set to 4 in our conf/hadoop-site.xml, yet I frequently see 5-6 instances of org.apache.hadoop.mapred.TaskTracker$Child running on the slave nodes. Is there another setting I need to tweak in order to dial back the number of childr
Taeho Kang wrote:
Set "mapred.tasktracker.tasks.maximum"
and each node will be able to process N number of tasks - map or/and reduce.
Please note that once you set "mapred.tasktracker.tasks.maximum",
"mapred.tasktracker.map.tasks.maximum" and
"mapred.tasktracker.reduce.tasks.maximum" setting wil
You can put your external jar in the DistributedCache and symlink the
jar in the current working directory of the task by setting the value of
mapred.create.symlink to true. More details can be found at
http://issues.apache.org/jira/browse/HADOOP-1660.
The jar can also be added to classpath usin
You can have a look at TextInputFormat, KeyValueTextInputFormat etc at
http://svn.apache.org/viewvc/hadoop/core/trunk/src/java/org/apache/hadoop/mapred/
coneybeare wrote:
I want to alter the default <"key", "line"> input format to be <"key", "line
number:" + "line"> so that my mapper can have
Arun C Murthy wrote:
On Apr 3, 2008, at 5:36 PM, Jason Venner wrote:
For the first day or so, when the jobs are viewable via the main page
of the job tracker web interface, the job-specific counters are also
visible. Once the job is only visible in the history page, the
counters are not vis
LineRecordReader.readLine() is deprecated by
HADOOP-2285(http://issues.apache.org/jira/browse/HADOOP-2285) because it was
slow.
But streaming still uses the method. HADOOP-2826
(http://issues.apache.org/jira/browse/HADOOP-2826) will remove the usage in
streaming.
This change should improve str
Norbert Burger wrote:
I'm trying to use the cacheArchive command-line option with the
hadoop-0.15.3-streaming.jar. I'm using the option as follows:
-cacheArchive hdfs://host:50001/user/root/lib.jar#lib
Unfortunately, my Perl scripts fail with an error consistent with not being
able to find th
Hi Andreas,
Looks like your mapper is not available to the streaming jar. Where is
your mapper script? Did you use distributed cache to distribute the mapper?
You can use -file to make it part of the
jar, or use -cacheFile /dist/wordloadmf#workloadmf to distribute the
script. Distributing this way
Thanks Matt for the info.
I raised a Jira for this at
https://issues.apache.org/jira/browse/HADOOP-3039
Thanks
Amareshwari
Matt Kent wrote:
Or maybe I can't use attachments, so here are the stack traces inline:
--task tracker
2008-03-17 21:58:30
Hi Andrey,
I think that is a classpath problem.
Can you try using the patch at
https://issues.apache.org/jira/browse/HADOOP-2622 and see if you still have
the problem?
Thanks
Amareshwari.
Andrey Pankov wrote:
Hi all,
I'm still new to Hadoop. I'd like to use Hadoop streaming in order to
combine map