Hi Folks,
I'm using the Java client to run queries on Hive. Please suggest a way to
kill a query whenever I need to, or a way to find the job ID so that I can
kill it myself.
Regards
Vikas Srivastava
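[For reference, one rough way to do this from Java with the old JobClient API
(Hadoop 0.20.x), assuming the Hive query runs as a normal MapReduce job (possibly
several) whose name you can recognize; from the shell, hadoop job -list followed
by hadoop job -kill <job_id> does the same:]

  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.JobStatus;
  import org.apache.hadoop.mapred.RunningJob;

  JobConf conf = new JobConf();                        // picks up your cluster config
  JobClient client = new JobClient(conf);
  for (JobStatus status : client.jobsToComplete()) {   // running and pending jobs
      RunningJob job = client.getJob(status.getJobID());
      System.out.println(status.getJobID() + "  " + job.getJobName());
      // job.killJob();  // kill the job backing your Hive query
  }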
Dude
check from where you fired the Hadoop command:
$HADOOP_HOME/bin/hadoop namenode -format
Regards
Vikas srivastava
From: Ambreen Mahvash [mailto:ambreen.mahv...@appsassociates.com]
Sent: Wednesday, June 27, 2012 3:02 PM
To: common-user@hadoop.apache.org
Subject: Error while
I want to be able to submit my teragen and terasort jobs via oozie.
I have tried different things in workflow.xml to no avail.
Has anybody had any success doing so? Can you share your workflow.xml file?
Many thanks
-James
Hi Masoud
One reducer would definitely emit one output file. If you are looking at
just one file as your final result on the local file system, then once the MR
job is done use hadoop fs -getmerge <hdfs src dir> <local dst file>.
Sent from BlackBerry® on Airtel
-Original Message-
From: Masoud
Date: Tue, 20 Mar 2012 19
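[For reference, the same merge can be done from Java with FileUtil.copyMerge — a
rough sketch, with made-up paths:]

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.FileUtil;
  import org.apache.hadoop.fs.Path;

  Configuration conf = new Configuration();
  FileSystem hdfs = FileSystem.get(conf);
  FileSystem local = FileSystem.getLocal(conf);
  // merge all part files under the job output dir into one local file
  FileUtil.copyMerge(hdfs, new Path("/user/masoud/output"),
                     local, new Path("/tmp/result.txt"),
                     false /* keep the source */, conf, null);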
Hi Masoud
Set -D mapred.reduce.tasks=n, i.e. to any higher value.
Sent from BlackBerry® on Airtel
-Original Message-
From: Masoud
Date: Tue, 20 Mar 2012 17:52:58
To:
Reply-To: common-user@hadoop.apache.org
Subject: Increasing number of Reducers
Hi all,
we have a cluster with 32 machin
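[A minimal sketch of the suggestion above, done in a Java driver with the old
API; the class name is a placeholder. Unlike the map-task count, which is only
a hint, the reducer count is honored exactly:]

  JobConf conf = new JobConf(MyJob.class);
  conf.setNumReduceTasks(8);   // same effect as -D mapred.reduce.tasks=8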
-- Forwarded message --
From: hadoop hive
Date: Fri, Mar 16, 2012 at 2:04 PM
Subject: Hive with JDBC
To: u...@hive.apache.org
HI folks,
I'm facing a problem: when I fire a query through Java code, it
returns around half a million records, which makes the result set hang
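[A rough sketch of the pattern that usually avoids this, assuming the old Hive
JDBC driver (org.apache.hadoop.hive.jdbc.HiveDriver) and a made-up table: iterate
the ResultSet row by row and process as you go, instead of collecting all half a
million rows in memory — or add a LIMIT to the query:]

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
  Connection con =
      DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
  Statement stmt = con.createStatement();
  ResultSet rs = stmt.executeQuery("SELECT col1, col2 FROM my_table");
  while (rs.next()) {
      handleRow(rs.getString(1), rs.getString(2));  // your own per-row handler
  }
  rs.close();
  con.close();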
your job means
nothing, because the Hadoop MapReduce platform only checks this parameter when
it starts. This is a system configuration.
You need to set it in your conf/mapred-site.xml file and restart your
Hadoop MapReduce daemons.
On Fri, Mar 9, 2012 at 7:32 PM, Mohit Anch
Mohit
It is a job-level config parameter. For plain MapReduce jobs you can set
it through the CLI as
hadoop jar ... -D mapred.map.tasks=n
You should be able to do it in Pig as well.
However, the number of map tasks for a job is governed by the input splits and
the InputFormat you are
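[For -D options like this to be picked up at all, the driver has to go through
ToolRunner/GenericOptionsParser — a minimal sketch, with the class and job wiring
as placeholders:]

  import org.apache.hadoop.conf.Configured;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.util.Tool;
  import org.apache.hadoop.util.ToolRunner;

  public class MyJob extends Configured implements Tool {
      public int run(String[] args) throws Exception {
          // getConf() already contains any -D key=value pairs from the command line
          JobConf conf = new JobConf(getConf(), MyJob.class);
          // ... set mapper, reducer, input/output paths here ...
          JobClient.runJob(conf);
          return 0;
      }
      public static void main(String[] args) throws Exception {
          System.exit(ToolRunner.run(new MyJob(), args));
      }
  }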
OK, can I build it with Ant?
On Mon, Feb 27, 2012 at 12:49 PM, alo alt wrote:
> hive?
> You are then on the wrong list, for hive related questions refer to:
> u...@hive.apache.org
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
>
> On Feb 27, 2012, at
> For storing snappy compressed files in HDFS you should use Pig or Flume.
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
>
> On Feb 27, 2012, at 7:28 AM, hadoop hive wrote:
>
> > thanks Alex,
> >
> > I'm using Apache Hadoop; the steps I followed:
thanks Alex,
I'm using Apache Hadoop; the steps I followed:
1: untar snappy
2: entry in mapred-site.xml
this can be used like deflate only (i.e. only when overwriting a file)
On Mon, Feb 27, 2012 at 11:50 AM, alo alt wrote:
> Hi,
>
>
> https://ccp.cloudera.com/display/CDHDOC/Snappy+
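[Assuming the snappy native libraries are in place — stock 0.20.2 does not ship a
SnappyCodec; it usually comes from the hadoop-snappy add-on or a vendor build —
the job-level switches for step 2 would look roughly like this in a driver:]

  // enable snappy for intermediate map output and for the final job output
  conf.setBoolean("mapred.compress.map.output", true);
  conf.set("mapred.map.output.compression.codec",
           "org.apache.hadoop.io.compress.SnappyCodec");
  conf.setBoolean("mapred.output.compress", true);
  conf.set("mapred.output.compression.codec",
           "org.apache.hadoop.io.compress.SnappyCodec");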
Hey folks,
I'm using Hadoop 0.20.2 (r911707). Please tell me how to install Snappy and how
to use it for compression and decompression.
Regards
Vikas Srivastava
ith loading of
small xml files that are stored efficiently.
Regards
Bejoy K S
From handheld, Please excuse typos.
-Original Message-
From: Mohit Anchlia
Date: Wed, 22 Feb 2012 12:29:26
To: ;
Subject: Re: Splitting files on new line using hadoop fs
On Wed, Feb 22, 2012 at 12:2
Hi Mohit
AFAIK there is no default mechanism available for this in Hadoop.
The file is split into blocks based purely on the configured block size during
the HDFS copy. While processing the file using MapReduce, the record reader
takes care of the new lines, even if a line spans multiple
HI Folks,
Right now I have a replication factor of 2, but I want to make it three
for some tables. How can I do that for specific tables, so that whenever
data is loaded into those tables it is automatically replicated
to three nodes?
Or do I need to change the replication for all the tables?
and
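[Since a Hive table is ultimately just a directory in HDFS, one approach is to
leave dfs.replication at 2 and raise it only for the table's warehouse path, e.g.
hadoop fs -setrep -R 3 /user/hive/warehouse/my_table — or from Java, as in the
sketch below (the path is illustrative). Note this only affects existing files;
newly loaded files still get the loading client's dfs.replication, so it has to
be re-applied or the client configured accordingly:]

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  FileSystem fs = FileSystem.get(new Configuration());
  // bump one existing file of the table to replication 3
  fs.setReplication(new Path("/user/hive/warehouse/my_table/part-00000"), (short) 3);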
Hi McNeil
Have a look at Oozie. It is meant for workflow management in
Hadoop and can serve your purpose.
--Original Message--
From: W.P. McNeill
To: Hadoop Mailing List
ReplyTo: common-user@hadoop.apache.org
Subject: How do I synchronize Hadoop jobs?
Sent: Feb 16, 2012 00
ount. The <key, value> to the map is <"dummy", "foo bar beta">. The canonical Hadoop program would tokenize this line of text
>> and output <"foo", 1> and so on. How would the MultithreadedMapper know
>> how to further divide this line of text into, say: [<"dummy", "foo bar">, <"dummy", "beta">] for 2 threads
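[For what it's worth, MultithreadedMapper does not sub-divide a record: each of
its threads pulls whole input records and calls the wrapped map() on them, so no
further splitting of the line is needed. A minimal sketch of wiring it up with
the new API; the inner mapper class name is a placeholder:]

  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

  Job job = new Job(conf, "threaded wordcount");
  job.setMapperClass(MultithreadedMapper.class);
  // WordCountMapper does the real work; 2 threads consume records concurrently
  MultithreadedMapper.setMapperClass(job, WordCountMapper.class);
  MultithreadedMapper.setNumberOfThreads(job, 2);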
ast hour or so
http://kickstarthadoop.blogspot.com/2012/02/enable-multiple-threads-in-mapper-aka.html
--
Rob
On 10 February 2012 14:20, Harsh J wrote:
> Hello again,
>
> On Fri, Feb 10, 2012 at 7:31 PM, Rob Stewart wrote:
>> OK, take word count. The <key, value> to the map is <"dummy", "foo bar beta"
rote:
> Thanks for your reply !
>
> I think I installed Hadoop correctly, because when I ran the wordcount example
> I got the correct output. But I didn't know how to install Hive, so I installed
> Hive via https://cwiki.apache.org/confluence/display/Hive/GettingStarted include
> installed Hado
hey Luca,
you can use
conf.set("mapred.textoutputformat.separator", " ");
hope it works fine
regards
Vikas Srivastava
On Thu, Feb 9, 2012 at 3:57 PM, Luca Pireddu wrote:
> Hello list,
>
> I'm trying to specify from the command line an empty string as the
> key-value separator for TextOutpu
hine ?
> Please also paste the output of the "kingul2" namenode logs.
>
> Regards,
>
> Robin
>
>
> On 02/08/12 13:06, Guruprasad B wrote:
>
> Hi,
>
> I am Guruprasad from Bangalore (India). I need help in setting up hadoop
> platform. I am very much new
:
Reply-To: common-user@hadoop.apache.org
Subject: Re: Processing compressed files in Hadoop
Hi Bejoy,
Thanks for your response. I know how to index Lzo files, however I am
curious on whether I can still use my custom InputFormats to process the
compressed LZO files or if I have to implement new
our custom
driver classes for each jobs.
Regards
Bejoy K S
From handheld, Please excuse typos.
-Original Message-
From: "W.P. McNeill"
Date: Wed, 8 Feb 2012 10:17:53
To: ;
Subject: Re: Preferred ways to specify input and output directories to Hadoop
jobs
Right. There is
xcuse typos.
-Original Message-
From: "W.P. McNeill"
Date: Wed, 8 Feb 2012 10:00:55
To: Hadoop Mailing List
Reply-To: common-user@hadoop.apache.org
Subject: Preferred ways to specify input and output directories to Hadoop jobs
How do you like to specify input and output dire
lease excuse typos.
-Original Message-
From: Leonardo Urbina
Sender: flechadeor...@gmail.com
Date: Wed, 8 Feb 2012 12:39:54
To:
Reply-To: common-user@hadoop.apache.org
Subject: Processing compressed files in Hadoop
Hello everyone,
I run a daily job that takes files in a variety of diff
/hadoop jar hadoop-0.20.2-examples.jar sort -inFormat
org.apache.hadoop.mapred.TextInputFormat /user/sangroya/test1 outtest16
Running on 1 nodes to sort from hdfs://localhost:54310/user/sangroya/test1
into hdfs://localhost:54310/user/sangroya/outtest16 with 1 reduces.
Job started: Wed Feb 08 14:53:14
also adding
by their IPs
Regards
Vikas Srivastava
On Wed, Feb 8, 2012 at 11:28 AM, Harsh J wrote:
> Hi,
>
> Can you provide your tasktracker startup log as a pastebin.com link?
> Also your JT log grepped for "Adding a new node"?
>
> On Wed, Feb 8, 2012 at 11:
Hi Folks,
I added a node to the cluster and restarted the cluster, but it's taking a long
time for all the servers to come live in the JobTracker UI; it's only showing
the newly added server in the cluster.
Is there any specific reason for this?
Thanks
Vikas Srivastava
> a first start with flume:
> http://mapredit.blogspot.com/2011/10/centralized-logfile-management-across.html
>
> Facebook's scribe could also work for you.
>
> - Alex
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
>
> On Feb 7, 2012, at 11:03 AM
: What's the best practice for loading logs into hdfs while using hive
to do log analytics?
Hi all,
Sorry if it is not appropriate to send one thread into two maillist.
I'm trying to use hadoop and hive to do some log analysis jobs.
Our system generate lots of logs every day, for e
codec='theCodecYouPrefer'
You'd get the blocks compressed in the output dir.
You can use the API to read from standard input like:
- get the hadoop conf
- register the required compression codec
- write to a CompressionOutputStream.
You should get a well detailed explanation on the same from the book '
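[A rough sketch of those three steps; the codec and output path are illustrative:]

  import java.io.OutputStream;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IOUtils;
  import org.apache.hadoop.io.compress.CompressionCodec;
  import org.apache.hadoop.io.compress.CompressionOutputStream;
  import org.apache.hadoop.io.compress.GzipCodec;
  import org.apache.hadoop.util.ReflectionUtils;

  Configuration conf = new Configuration();                     // get hadoop conf
  CompressionCodec codec =                                      // the codec you prefer
      ReflectionUtils.newInstance(GzipCodec.class, conf);
  FileSystem fs = FileSystem.get(conf);
  OutputStream raw = fs.create(new Path("/logs/today.gz"));
  CompressionOutputStream out = codec.createOutputStream(raw);  // compressed stream
  IOUtils.copyBytes(System.in, out, 4096, false);               // read from stdin
  out.finish();
  out.close();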
@hadoop.apache.org
ReplyTo: bing...@asu.edu
Subject: How to Set the Value of hadoop.tmp.dir?
Sent: Feb 7, 2012 02:04
Dear all,
I am a new Hadoop learner. The version I use is 1.0.0.
I tried to set a new value (instead of /tmp) for the parameter
hadoop.tmp.dir in core-site.xml, hdfs-site.xml and mapred
ply-To: common-user@hadoop.apache.org
> > Subject: Re: Can I write to an compressed file which is located in hdfs?
> >
> > sorry, this sentence is wrong,
> >
> > I can't compress these logs every hour and then put them into hdfs.
> >
> > it should be
,
I can't compress these logs every hour and then put them into hdfs.
it should be
I can compress these logs every hour and then put them into hdfs.
2012/2/6 Xiaobin She
>
> hi all,
>
> I'm testing hadoop and hive, and I want to use them in log analysis.
>
> H
On Fri, Feb 3, 2012 at 4:56 PM, hadoop hive wrote:
> hey folks,
>
> I'm getting this error while running MapReduce, and it comes up in the
> reduce phase:
>
>
> 2012-02-03 16:41:19,780 WARN org.apache.hadoop.mapred.ReduceTask:
> attempt_201201271626_528
hey folks,
I'm getting this error while running MapReduce, and it comes up in the reduce
phase:
2012-02-03 16:41:19,780 WARN org.apache.hadoop.mapred.ReduceTask:
attempt_201201271626_5282_r_00_0 copy failed:
attempt_201201271626_5282_m_07_2 from hadoopdata3
2012-02-03 16:41:19,954 WARN or
hey folks,
I'm getting an error when starting my datanode. Can anyone tell me what
this error is about?
2012-02-03 11:57:02,947 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
10.0.3.31:50010, storageID=DS-1677953808-10.0.3.31-50010-1318330317888,
infoPort=50075, ipcPo
Hey,
Can anyone help me with this? I have increased the reduce slowstart to 0.25,
but it still hangs after copy.
Tell me what else I can change to make it work fine.
regards
Vikas Srivastava
On Wed, Jan 25, 2012 at 7:45 PM, praveenesh kumar wrote:
> Yeah, I am doing it; currently it's on 2
:
> I am facing issues while trying to run a job from windows (through
> eclipse) on my hadoop cluster on my RHEL VM's. When I run it as "run on
> hadoop" it works fine, but when I run it as a java application, it throws
> classnotfound exception
>
> INFO: Task
couple of days.
>
> On Friday, January 27, 2012, hadoop hive wrote:
> > Hey Harsh,
> >
> > but after some time they become available one by one in the JobTracker URL.
> >
> > Any idea why they add up so slowly?
> >
> > regards
> > Vikas
> >
> >
ommunication errors in their logs? Did you
> perhaps bring up a firewall accidentally, that was not present before?
>
> On Fri, Jan 27, 2012 at 4:47 PM, hadoop hive wrote:
> > Hey folks,
> >
> > I'm facing a problem with the JobTracker URL; I added a node to
Hey folks,
I'm facing a problem with the JobTracker URL. I added a node to the
cluster, and after some time I restarted the cluster; then I found that my job
tracker is showing the recently added node in "nodes", but the rest of the
nodes are not available, not even in "blacklist".
does anyone have any i
hey, there must be some problem with the key or value; the reducer didn't find
the expected value.
On Fri, Jan 27, 2012 at 1:23 AM, Rajesh Sai T wrote:
> Hi,
>
> I'm new to Hadoop. I'm trying to write my custom data types for Writable
> types, so that the Map class will produce my
This problem arose after adding a node, so I started the balancer to
rebalance the cluster.
On Wed, Jan 25, 2012 at 4:38 PM, praveenesh kumar wrote:
> @hadoophive
>
> Can you explain more by "balance the cluster" ?
>
> Thanks,
> Praveenesh
>
> On Wed, Jan 25, 2
I faced the same issue, but after some time, when I balanced the cluster, the
jobs started running fine.
On Wed, Jan 25, 2012 at 3:34 PM, praveenesh kumar wrote:
> Hey,
>
> Can anyone explain to me what the reduce > copy phase in the reducer section is?
> The (K, List(V)) is passed to the reducer. Is reduce
Your JobTracker is not running.
On Wed, Jan 11, 2012 at 7:08 PM, praveenesh kumar wrote:
> Jobtracker webUI suddenly stopped showing. It was working fine before.
> What could be the issue ? Can anyone guide me how can I recover my WebUI ?
>
> Thanks,
> Praveenesh
>
generally, I wonder what the smallest desirable compressed record size is
in the hadoop universe.
- Tim.
From: Tony Burton [tbur...@sportingindex.com]
Sent: Monday, January 09, 2012 10:02 AM
To: common-user@hadoop.apache.org
Subject: RE: has bzip2
Can anyone please tell me this:
I want to know from where the JobTracker sends a task (task ID) to the
TaskTracker for scheduling,
i.e. where it creates the (task ID, TaskTracker) pairs.
Thanks & Regards,
Mohmmadanis Moulavi
Student,
MTech (Computer Sci. & Engg.)
Walchand college of Engg. Sangli (M.S.)
and then the matrix
multiplication takes place in there.
Regards
Bejoy K S
-Original Message-
From: Mike Spreitzer
Date: Fri, 18 Nov 2011 14:52:05
To:
Reply-To: common-user@hadoop.apache.org
Subject: RE: Matrix multiplication in Hadoop
Well, this mismatch may tell me something
al Message-
From: Sharad Agarwal
Date: Thu, 17 Nov 2011 18:31:08
To: ; ;
Reply-To: common-user@hadoop.apache.org
Subject: Announcing Bangalore Hadoop Meetup Group
Hi Bangalore Area Hadoop Developers and Users,
There is a lot of interest in Hadoop and Big Data space in Bangalore. Many
folks
patch.
I will try and use 0.21
Raj
>
>From: Tim Broberg
>To: "common-user@hadoop.apache.org" ; Raj V
>; Joey Echeverria
>Sent: Friday, November 11, 2011 10:25 AM
>Subject: RE: Input split for a streaming job!
>
>
>
>What
rrent hadoop 0.20.20x releases. Both
the old and new Map reduce APIs work fine.
Hope it helps!..
Regards
Bejoy K S
-Original Message-
From: Amir Sanjar
Date: Thu, 3 Nov 2011 12:42:06
To:
Reply-To: common-user@hadoop.apache.org
Subject: Hadoop 20.2 release compability
is ther
some blog posts, hadoop
articles, what's upcoming in hadoop, new developments in the big data technology
stack etc.
I'd be happy to contribute something. But I believe the HUG was created by the
Yahoo B'lor team before the last Hadoop India Summit, and they should have a good
amount of hadoop materia
Hey
There is a hadoop user group in B'lor. But I think it is not really
active. You can subscribe to it at hug-blr-subscr...@yahoogroups.com
--Original Message--
From: real great..
To: common-user
ReplyTo: common-user@hadoop.apache.org
Subject: Bangalore meetups
Sent: Oct 30,
Oct 2011 17:31:27
> To: ;
> Reply-To: common-user@hadoop.apache.org
> Subject: mapreduce linear chaining: ClassCastException
>
> Hi all,
> I am trying a simple extension of WordCount example in Hadoop. I want to
> get a frequency of wordcounts in descending order. To that
Message-
From: "Periya.Data"
Date: Fri, 14 Oct 2011 17:31:27
To: ;
Reply-To: common-user@hadoop.apache.org
Subject: mapreduce linear chaining: ClassCastException
Hi all,
I am trying a simple extension of WordCount example in Hadoop. I want to
get a frequency of wordcounts in descen
e: Fri, 14 Oct 2011 17:31:27
To: ;
Reply-To: common-user@hadoop.apache.org
Subject: mapreduce linear chaining: ClassCastException
Hi all,
I am trying a simple extension of WordCount example in Hadoop. I want to
get a frequency of wordcounts in descending order. To that I employ a linear
chain
o leverage the parallel processing power
of hadoop. You need to have a mini cluster at least for performance
benchmarking and processing relatively large volumes of data.
Hope it helps!..
--Original Message--
From: Aishwarya Venkataraman
Sender: avenk...@eng.ucsd.edu
To: common-user@hadoop.
>
> You mean putting your unix dir contents into hdfs? If so use hadoop fs
> -copyFromLocal src destn
> --Original Message--
> From: Jignesh Patel
> To: common-user@hadoop.apache.org
> To: bejoy.had...@gmail.com
> Subject: Re: hdfs directory location
> Sent: Oc
Jignesh
Sorry, I didn't get your query: 'how I can link it with HDFS
directory structure?'
You mean putting your unix dir contents into hdfs? If so, use hadoop fs
-copyFromLocal src destn
--Original Message--
From: Jignesh Patel
To: common-user@hadoo
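[The Java equivalent of that command, for reference — the paths are made up:]

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  FileSystem fs = FileSystem.get(new Configuration());
  // same as: hadoop fs -copyFromLocal /home/jignesh/data /user/jignesh/data
  fs.copyFromLocalFile(new Path("/home/jignesh/data"), new Path("/user/jignesh/data"));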
Hi Mohit
I'm really not sure how many of the map reduce developers use the map
reduce eclipse plugin. AFAIK majority don't. As Jignesh mentioned you can get
it from the hadoop distribution folder as soon as you unzip the same.
My suggested approach would be: if you are on Windo
Jignesh
You are creating a dir in hdfs with that command. The dir won't be in your
local file system but in hdfs. Issue a command like
hadoop fs -ls /user/hadoop-user/citation/
and you can see the dir you created in hdfs.
If you want to create a dir on local unix, use a simple linux command
ou to take the following steps
-understand hadoop, hdfs and mapreduce
-understand the word count example code you already ran
-understand each and every parameters mentioned in the Driver Class along with
Map and Reduce class, get a grip on the parameters used in map and reduce
methods and why.
-
Hi Abdelrahman Kamel
Your Name Node is in safe mode now. Either wait till it automatically
comes out of safe mode, or you can manually make it exit safe mode with the
following command:
hadoop dfsadmin -safemode leave
If your cluster was not put in safe mode manually and it happened
ssage-
From: trang van anh
Date: Wed, 05 Oct 2011 14:06:11
To:
Reply-To: common-user@hadoop.apache.org
Subject: Re: Error using hadoop distcp
which host runs the task that throws the exception? Ensure that each
data node knows the other data nodes in the hadoop cluster -> add a "ub16" entry
Sam
Your understanding is right; hadoop definitely works great with large
volumes of data. But not every file needs to be in the range of
giga-, tera-, or petabytes. Mostly, when it is said that hadoop processes
terabytes of data, that is the total data processed by a map reduce job (rather
jobs
dir, right?
Second question: once a job is executed, do the logs from all task
trackers get dumped to the HADOOP_HOME/logs/history dir on the name node/job
tracker?
Third question: how do I enable the DEBUG mode of the logger? Or is it enabled
by default? If not, what is the logger mode enabled by default?
ubject: Out of heap space errors on TTs
> To: common-user@hadoop.apache.org
>
> > Hey guys,
> >
> > I am running hive and I am trying to join two tables (2.2GB and
> > 136MB) on a
> > cluster of 9 nodes (replication = 3)
> >
> > Hadoop version - 0.20.2
> >
Hello,
Can anyone please explain the use of the Transaction Log on the NameNode?
AFAIK, it logs the details of files created and deleted in the Hadoop cluster.
Thanks.
Hello,
What is the size of the metadata if a file occupies a single
block (of 64 MB) on HDFS? Is there any specific formula with which we can
calculate the size of the metadata?
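[There is no exact published formula, but the usual rule of thumb is that each
namespace object — a file inode, directory, or block — costs on the order of 150
bytes of NameNode heap. A single-block file is two objects (one inode plus one
block), so very roughly 2 x 150 = ~300 bytes of metadata, whether the block holds
1 KB or the full 64 MB; the real figure varies with the file name length,
permissions, and block list stored alongside.]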
Hi guys,
I'm trying to sort a 2.5 GB sequence file in one mapper using its
implemented Sort function, but it's taking so long that the map is killed for
not reporting.
I would increase the default time to get reports from the mapper, but I'll do
this only if sorting using SequenceFile.so
Hi guys,
This is a program which used to work, but I have probably changed something,
and now it is taking me a lot of time to figure out what is causing the problem.
In the mapper:
map-function:
--
...
tempSeq = new Path(job.get("mapred.local.dir")+"/
Redirecting to common-user,
you can check the hadoop version using any of the following methods.
CLI: use the hadoop version command:
bin/hadoop version
Web interface:
check the Name Node or Job Tracker web interface; it will show the version number.
-
Ravi
On Fri, Jan 28, 2011 at 11:24 AM, wrote
What are the possible causes of a SocketTimeoutException?
11/01/28 19:01:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketTimeoutException: 69000 millis timeout while waiting for
channel to be ready for connect. ch :
java.nio.channels.SocketChannel[conn
ral such efforts.
>>>
>>> Pig has PigMix
>>>
>>> Hadoop has terasort and likely some others as well.
>>>
>>
>> Hadoop has the terasort, and grid mix. There is even a new version of the
>> grid mix coming out. Look at:
>>
>> h
ould view output only from
> Map1 and nothing from Map2 in the logs.
>
> Regards,
> Raakhi
>
> On Fri, Jul 17, 2009 at 6:59 PM, jason hadoop >wrote:
>
> > There are some examples for chain mapping in the example bundle for
> chapter
> > 8 of Pro Hadoop.
>
wrote:
> Thank you for your reply. I am a novice; I do not quite understand how to
> check dfs.data.dir for more than eight minutes?
> My settings are dfs.data.dir
>
> dfs.data.dir
> /b, /c, /d, /e, /f, /g, /h, /i, /j, /k, /l
>
>
> Each directory has a disk
There are some examples for chain mapping in the example bundle for chapter
8 of Pro Hadoop.
One thing that may not be clear is that the chain of mappers executes within
the single task JVM assigned to each map task,
and the mappers chained to the reduce execute in the jvm assigned to the
reduce
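[A minimal sketch of that wiring with the old API; the AMap/BMap/WCReducer class
names are placeholders:]

  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.lib.ChainMapper;
  import org.apache.hadoop.mapred.lib.ChainReducer;

  JobConf job = new JobConf(conf, ChainExample.class);
  // both mappers run back to back inside the same map-task JVM
  ChainMapper.addMapper(job, AMap.class,
      LongWritable.class, Text.class, Text.class, Text.class,
      true, new JobConf(false));
  ChainMapper.addMapper(job, BMap.class,
      Text.class, Text.class, Text.class, IntWritable.class,
      true, new JobConf(false));
  // the reducer, and any mappers chained after it, share the reduce-task JVM
  ChainReducer.setReducer(job, WCReducer.class,
      Text.class, IntWritable.class, Text.class, IntWritable.class,
      true, new JobConf(false));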
, mingyang wrote:
> I am using hadoop to store my media files, but when the number of documents
> is more than one million,
> then about 10-20 minutes after Hadoop starts my datanode automatically goes down.
> The namenode log shows a loss of heartbeat, but my datanode looks normal to me,
> and port 50010 can be a nor
that let you log what is going on in the field comparator or field
partitioner.
On Thu, Jul 16, 2009 at 11:05 PM, jason hadoop wrote:
> In the example code for Pro Hadoop there are some shims for the
> fieldcomparator classes, that let you log what is going on in the
> partitioner.
&g
In the example code for Pro Hadoop there are some shims for the
fieldcomparator classes, that let you log what is going on in the
partitioner.
Also, it is very useful, if cumbersome, to step through that in the debugger.
On Thu, Jul 16, 2009 at 10:59 PM, David_ca wrote:
> Hi,
>
> I am
>
> > Map-side join is almost always more efficient, but only handles some
> cases.
> > Reduce side joins always work, but require a complete map/reduce job to
> get
> > the join.
> >
> > -- Owen
> >
>
--
Pro Hadoop, a book to guide you from beginner to
, Boyu Zhang wrote:
> Thank you for your suggestion.
>
> I have done that plenty of times, and every time I delete the pids and the
> files /tmp/hadoop-name that the namenode format generated. But I got the same
> error over and over.
>
>
> I found out that after I start-dfs.sh,
e files) then the big file can be
> processed
> in parallel by multiple mappers (each processing a split of the file).
> However, if the compression codec that you use does not support file
> splitting, then the whole file will be processed by one mapper and you won't
> achieve parallel
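[On versions that ship the SplittableCompressionCodec interface (0.21+ and some
vendor backports) you can check a given file programmatically — a hedged sketch,
where conf and inputPath are assumed to be in scope:]

  import org.apache.hadoop.io.compress.CompressionCodec;
  import org.apache.hadoop.io.compress.CompressionCodecFactory;
  import org.apache.hadoop.io.compress.SplittableCompressionCodec;

  CompressionCodecFactory factory = new CompressionCodecFactory(conf);
  CompressionCodec codec = factory.getCodec(inputPath);  // null => uncompressed
  boolean splittable = (codec == null)
      || (codec instanceof SplittableCompressionCodec);  // e.g. bzip2
  System.out.println(inputPath + " splittable: " + splittable);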
Thanks Tom.
The single reducer is greatly limiting in local mode.
On Tue, Jul 14, 2009 at 3:15 AM, Tom White wrote:
> There's a Jira to fix this here:
> https://issues.apache.org/jira/browse/MAPREDUCE-434
>
> Tom
>
> On Mon, Jul 13, 2009 at 12:34 AM, jason hadoop
> wr
If you have received this communication in
> error, please notify the sender and delete all copies of this message.
> Persistent Systems Ltd. does not accept any liability for virus infected
> mails.
>
--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.
sks(4);
>
> before starting the job and it seems that Hadoop overrides the
> variable, as it says:
>
> 09/07/12 12:07:40 INFO mapred.MapTask: numReduceTasks: 1
>
> Thanks!
> Rares
>
--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http:/
:
> A few comments before I answer:
> 1) Each time you send an email, we receive two emails. Is your mail client
> misconfigured?
> 2) You already asked this question in another thread :). See my response
> there.
>
> Short answer: <
>
> http://hadoop.apache.org/co
this issue and solved it? :)
>
>
> --
> Andraz Tori, CTO
> Zemanta Ltd, New York, London, Ljubljana
> www.zemanta.com
> mail: and...@zemanta.com
> tel: +386 41 515 767
> twitter: andraz, skype: minmax_test
>
>
>
>
--
Pro Hadoop, a book to guide you from begi
There will be no updates :)
On Fri, Jul 10, 2009 at 11:37 AM, Ted Dunning wrote:
> And NEVER expect updates to these variables to work like you think.
>
> On Thu, Jul 9, 2009 at 8:24 PM, jason hadoop
> wrote:
>
> > Then use them as needed in the map method of your mappe
; >
> > Hey Ram.
> > The problem is I initialize these variables in the run function after
> > receiving the cmd line arguments.
> > I want to access the same vars in the map function.
> > Is there a different way other than passing the variables through a Conf
> ob
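[The Conf object is the standard route — a rough sketch with a made-up key; as
noted above, later updates to the value will not propagate to running tasks:]

  // in the driver's run(), after parsing the command line:
  conf.set("myjob.threshold", args[0]);

  // in the mapper (old API), read it back once per task:
  private String threshold;
  public void configure(JobConf job) {
      threshold = job.get("myjob.threshold");
  }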
r the ten most
> visited pages on a certain site and so on.
>
> Wondering if that is even possible with hadoop or if I need to process the
> file outside of hadoop.
>
> Cheers
>
> /Marcus
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
>
In the example code from Pro Hadoop, there is a sample map reduce job that uses
a map-side join to merge the files into a single output.
It is part of the chapter 9 examples.
On Wed, Jul 8, 2009 at 4:55 PM, Ted Dunning wrote:
> On Wed, Jul 8, 2009 at 3:38 PM, Owen O'Malley wrote:
>
>
Just out of curiosity, what happens when you run your script by hand?
On Wed, Jul 8, 2009 at 8:09 AM, Rares Vernica wrote:
> On Tue, Jul 7, 2009 at 10:26 PM, jason hadoop
> wrote:
> >
> > The mapper has no control at the point where your mymapper.sh script is
> > runni
d ?
> I log the counter call incrCounter.
> The log stmt counters are fine but the Reporter counters are not matching.
>
> Am I missing something?
>
> -Sagar
>
>
>
>
>
--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?t
ould end up in your job's userlog.
You could redirect it to a temporary file to make it available in /tmp.
Chapter 8 of Pro Hadoop covers some fine details of streaming jobs.
It may be that there is something going on in the environment that is
resulting in your permission denied error.
On
to do it in parallel.
>
> I would guess this has been done before? Is there example code
> anywhere? I can imagine creating a mapper-only job with a list of files
> as input, but how do I easily write to HDFS from a mapper?
>
> Thanks,
> \EF
>
--
Pro Hadoop, a boo
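[To the quoted question — writing to HDFS from inside a map task is just the
normal FileSystem API; a rough sketch with the old API, where the output path and
payload are illustrative and job is the JobConf kept from configure():]

  public void map(LongWritable key, Text fileName,
                  OutputCollector<Text, Text> out, Reporter reporter)
          throws IOException {
      FileSystem fs = FileSystem.get(job);
      // include the task id in the name so speculative/parallel tasks don't collide
      Path dst = new Path("/results/" + fileName + "-" + job.get("mapred.task.id"));
      FSDataOutputStream os = fs.create(dst);
      os.writeBytes("...");                 // the actual payload
      os.close();
  }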
lse);
>
> or
>
> conf.setMapSpeculativeExecution(false);
> conf.setReduceSpeculativeExecution(false);
>
> Thibaut
>
>
> marcusherou wrote:
> >
> > Hi.
> >
> > I've noticed that hadoop spawns parallel copies of the same task on
> > different hosts. I've underst
lot of smaller zip files (not gzip) that need to be
> processed. I put these into a SequenceFile outside of hadoop and then
> upload to hdfs. Once in hdfs, I have the mapper read the SequenceFile with
> each record being a zip file, then read it in as bytes that get
> decompressed, an
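[For reference, the decompress step could look roughly like this, assuming each
SequenceFile record is (Text fileName, BytesWritable zipBytes):]

  import java.io.ByteArrayInputStream;
  import java.util.zip.ZipEntry;
  import java.util.zip.ZipInputStream;

  public void map(Text fileName, BytesWritable zipBytes, Context context)
          throws IOException, InterruptedException {
      // wrap the record's bytes; use getLength(), as getBytes() may be padded
      ZipInputStream zin = new ZipInputStream(
          new ByteArrayInputStream(zipBytes.getBytes(), 0, zipBytes.getLength()));
      ZipEntry entry;
      while ((entry = zin.getNextEntry()) != null) {
          // ... read the entry's bytes and emit records from them ...
      }
      zin.close();
  }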
;
>
>
> On 7/6/09 12:23 PM, "Songting Chen" wrote:
>
>
>
> No response from the Hadoop cluster then - stop/start map/reduce would
> solve the problem.
>
> Note: HDFS has no such issue.
> Is it a common problem (we use v.19)?
>
> Thanks,
> -Songti