Thanks. The config file format looks like the example below:
@tag_name0 name0 {value00, value01, value02}
@tag_name1 name1 {value10, value11, value12}
and I am reading it from HDFS. Then how can I parse them?
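One way to do it (an untested sketch, not from this thread; the class name TagConfigParser and the exact whitespace handling are my own assumptions) is to read the file line by line and pull the pieces out with a regular expression:

import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagConfigParser {

  /** One parsed line of the form "@tag name {v0, v1, v2}". */
  public static class Entry {
    public final String tag;
    public final String name;
    public final List<String> values;
    Entry(String tag, String name, List<String> values) {
      this.tag = tag;
      this.name = name;
      this.values = values;
    }
  }

  // @<tag> <name> { comma-separated values }
  private static final Pattern LINE =
      Pattern.compile("^@(\\S+)\\s+(\\S+)\\s*\\{([^}]*)\\}\\s*$");

  public static Map<String, Entry> parse(BufferedReader in) throws IOException {
    Map<String, Entry> entries = new LinkedHashMap<String, Entry>();
    String line;
    while ((line = in.readLine()) != null) {
      Matcher m = LINE.matcher(line.trim());
      if (!m.matches()) {
        continue;                          // skip blank or unrecognized lines
      }
      List<String> values = new ArrayList<String>();
      for (String v : m.group(3).split(",")) {
        values.add(v.trim());
      }
      entries.put(m.group(1), new Entry(m.group(1), m.group(2), values));
    }
    return entries;
  }
}

Ted's reply further down in this thread covers where the BufferedReader would come from when the file is on HDFS (FileSystem.open rather than FileReader).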
On Apr 3, 2008, at 5:36 PM, Jason Venner wrote:
For the first day or so, when the jobs are viewable via the main
page of the job tracker web interface, the job-specific counters
are also visible. Once the job is only visible in the history page,
the counters are not visible.
Is it possible to view the counters of the older jobs?
Hi Amar, Theodore, Arun,
Thanks for your reply. Actually I am new to hadoop, so I can't figure out much.
I have written the following code for an inverted index. This code maps each word
in the documents to the ids of the documents that contain it,
e.g.: apple -> file1, file123
The main functions of the code are:
public class HadoopProgr
For the first day or so, when the jobs are viewable via the main page of
the job tracker web interface, the job-specific counters are also
visible. Once the job is only visible in the history page, the counters
are not visible.
Is it possible to view the counters of the older jobs?
--
Jason
Thanks Yuri! I followed your pattern here and the version where you make the
system call directly to -put onto DFS works for me. I did not set
$ENV{HADOOP_HEAPSIZE}=300;
and it seems to work fine (I didn't try setting this variable to see if it
failed).
I also used perl's built-in File::Temp mechanism
Interesting you should say this.
I have been using this exact example (slightly modified) as an interview
question lately. I have to admit I stole it from Doug's Hadoop slides.
If you have a 1TB database with 100-byte records and you want to update 1% of
them, how long will it take?
Assume for ar
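With some assumed numbers (say roughly 10 ms per random seek and 100 MB/s of sequential transfer, which are my assumptions, not part of the question as quoted), the arithmetic goes like this: 1 TB of 100-byte records is about 10^10 records, so 1% is about 10^8 updates. Updating them in place with one seek each costs roughly 10^8 * 10 ms = 10^6 seconds, on the order of 12 days. Streaming through the whole terabyte and rewriting it costs roughly 2 * 10^12 bytes / 10^8 bytes/s = 2 * 10^4 seconds, around six hours. That gap is the point of the example: reading and rewriting everything sequentially beats seeking to the 1% you want to change.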
This only happens if you add a class from the jar to the JobConf
creation line.
JobConf conf = new JobConf(MyClass.class);
JobConf
public JobConf(Class exampleClass)
Construct a map/reduce job configuration.
Parameters:
exampleClass - a class whose containing jar is used as the job's jar.
You're right. Java isn't really that slow. I re-examined the Java code for
the standalone program and found I was using an unbuffered output method.
After I changed it to a buffered method, the Java code running time was
comparable to the C++ one. This also means the 1000% speed-up I got was
quite
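The original program isn't shown, but the change described is the usual one of wrapping the writer in a buffer, roughly like this (the file name and the loop are made up for illustration):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class BufferedOutputDemo {
  public static void main(String[] args) throws IOException {
    // Unbuffered: new FileWriter("out.txt") on its own pushes every write()
    // straight through the stream, one small write at a time.
    // Buffered: writes are collected in a 64 KB buffer and flushed in big chunks.
    BufferedWriter out = new BufferedWriter(new FileWriter("out.txt"), 64 * 1024);
    for (int i = 0; i < 1000000; i++) {
      out.write("line " + i);
      out.newLine();
    }
    out.close();
  }
}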
Ted Dunning wrote:
I sympathize fully with Owen's thoughts here, but Andrej's point that
(essentially) users ought to be able to do it if they really, really want to
is a good one.
One particular scenario where having the ability to update blocks would
be beneficial is when flipping flags in r
Yeah, everything is packaged into one jar...I've been copying those jars
everywhere which didn't seem right, hence the question.
Thanks,
C G
Ted Dunning <[EMAIL PROTECTED]> wrote:
The easiest way is to package all of your code (classes and jars) into a
single jar file which you then execute.
distcp supports multiple sources (like Unix cp) and if the specified source is
a directory, it copies the entire directory. So, you could either do
distcp src1 src2 ... src100 dst
or
first copy all srcs to srcdir, and then
distcp srcdir dstdir
I have no experience with S3 and EC2. Not sure
The easiest way is to package all of your code (classes and jars) into a
single jar file which you then execute. When you instantiate a JobClient
and run a job, your jar gets copied to all necessary nodes. The machine you
use to launch the job need not even be in the cluster, just able to see th
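A minimal driver sketch of that flow (old org.apache.hadoop.mapred API; the identity mapper/reducer and the path arguments are just placeholders, and in a real job your own classes would live in the same jar):

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class MyJob {
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(MyJob.class);     // the jar containing MyJob becomes the job jar
    conf.setJobName("my-job");
    conf.setMapperClass(IdentityMapper.class);   // stand-in mapper packaged in the same jar
    conf.setReducerClass(IdentityReducer.class); // stand-in reducer packaged in the same jar
    conf.setOutputKeyClass(LongWritable.class);
    conf.setOutputValueClass(Text.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);                      // submits the job; the jar is copied to the task nodes
  }
}

It would typically be launched with bin/hadoop jar your-job.jar MyJob <input> <output>.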
Hi All:
When deploying a jar file containing code for a Hadoop job, is it necessary
to copy the jar to the same path on all nodes in the grid, or just on the node
which will launch the job?
Thanks,
C G
Yingyuan,
I cannot give you a detailed answer, just a guess.
Maybe Arun can chime in and provide more details.
My guess is that it has to do with the fact that the DFSClient returns the
same (cached) FileSystem handle for every connection request to the same
namenode, but libhdfs would return yo
I found it was a slight oversight on my part. I was copying the files into S3
using Firefox EC2 UI, and then trying to access those files on S3 using hadoop.
The S3 filesystem provided by hadoop doesn't work with standard files. When I
used hadoop to upload the files into S3 instead of Firefox
Here is how we (attempt to) do it:
Reducer (in streaming) writes one file for each different key it receives as
input.
Here's some example code in perl:
my $envdir = $ENV{'mapred_output_dir'};
my $fs = ($envdir =~ s/^file://);   # strips "file:"; true when the output dir is a local/NFS path
if ($fs) {
    # output goes onto NFS: open one output file per key
    # ($key and the exact call are assumed; the original snippet is cut off at open())
    open(FH, '>', "$envdir/$key") or die "cannot open $envdir/$key: $!";
}
Release 0.16.2 fixes critical bugs in 0.16.1. Note that HBase
releases are now maintained at http://hadoop.apache.org/hbase/ and
HBase has been removed from this release.
For Hadoop release details and downloads, visit:
http://hadoop.apache.org/core/releases.html
Thanks to all who contributed.
I sympathize fully with Owen's thoughts here, but Andrej's point that
(essentially) users ought to be able to do it if they really, really want to
is a good one.
It IS true that the original point of hadoop is high-performance sequential
writing applications. It does that, more or less, pretty well.
That depends on where the file is. If you are reading a file on a normal
file system, you use normal Java functions. If you are reading a file from
HDFS, you use hadoop functions.
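For example (a minimal sketch using the old org.apache.hadoop.mapred API; the property name "my.config.path" and the class names are made up), the read can go in the mapper's configure():

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ConfigReadingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  public void configure(JobConf job) {
    try {
      // For an HDFS path, use the Hadoop FileSystem API, not java.io.FileReader.
      Path confFile = new Path(job.get("my.config.path"));   // hypothetical property
      FileSystem fs = FileSystem.get(job);
      BufferedReader reader =
          new BufferedReader(new InputStreamReader(fs.open(confFile)));
      String line;
      while ((line = reader.readLine()) != null) {
        // parse each "@tag name {v0, v1, v2}" line here
      }
      reader.close();
    } catch (IOException e) {
      throw new RuntimeException("could not read config file", e);
    }
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // normal map logic goes here
  }
}

If the file lives on every node's local filesystem instead, a plain FileReader is fine.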
On 4/3/08 1:22 AM, "Jeremy Chow" <[EMAIL PROTECTED]> wrote:
> Hi list,
>
> If I define a method named configure
On Apr 3, 2008, at 3:53 AM, Andrzej Bialecki wrote:
Hmm ... Exactly why random writes are not possible? For performance
reasons? Or the problem of synchronization of replicas?
The HDFS protocols to support random write would be much more
complicated. Furthermore, part of the performance of
Hello,
Since this problem is related to Hadoop's CLASSPATH, just set
HADOOP_CLASSPATH or CLASSPATH to include the hadoop core jar.
---
Peeyush
On Wed, 2008-04-02 at 13:51 -0300, Anisio Mendes Lacerda wrote:
> Hi,
>
> my colleagues and I are implementing a small search engine in my University
> L
Owen O'Malley wrote:
On Apr 2, 2008, at 11:39 PM, Garri Santos wrote:
Hi!
I'm starting to take a look at hadoop and the whole HDFS idea. I'm wondering
if it's just fine to update or overwrite a file copied to hadoop?
No. Although we are making progress on HADOOP-1700, which would allow
appends.
the config file is a normal text file.
--
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.
http://coderplay.javaeye.com
Hi list,
If I define a method named configure in a mapper class that tries to read a
config file before all map tasks start, which class should I choose?
A normal FileReader from the JDK or another Reader provided by hadoop? Can
anyone give me an example?
Thx,
Jeremy
--
My research interests are di