Hi,
I am trying unsuccessfully to apply a patch (HADOOP-6835) to hadoop-0.20.2
(64-bit Ubuntu 10.04)
I have downloaded the tar.gz and can build the project -
I tried to apply the patch from
https://issues.apache.org/jira/browse/HADOOP-6835
(specifically
Is it running in safe mode? Hadoop will run in safe mode for a moment when it
starts.
So I'm pretty new to Hadoop, just learning it for work, and starting to play
with some of our data on a VM cluster to see it work, and to make sure it can
do what we need to. By and large, very cool, I think I'm getting the hang of
it, but when I try and make a custom composite key class, it
On Fri, Sep 10, 2010 at 1:08 PM, leibnitz se3g2...@gmail.com wrote:
Is it running in safe mode? Hadoop will run in safe mode for a moment when it starts.
How do I find out if it's running in safe mode?
After the clue about datanode failures given in earlier replies, I did check
the datanode logs and they
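For reference, safe mode can be checked (and, if necessary, left) from the
command line; a quick sketch assuming the 0.20 shell commands:

hadoop dfsadmin -safemode get     # reports whether safe mode is ON or OFF
hadoop dfsadmin -safemode leave   # tells the namenode to leave safe mode

The namenode web UI (port 50070 by default) also shows a banner while safe
mode is on.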
If I submit a jar that has a lib directory that contains a bunch of
jars, shouldn't those jars be in the classpath and available to all nodes?
The reason I ask this is because I am trying to submit a jar myjar.jar
that has the following structure
--src
\ (My source classes)
-- lib
\
On Sep 10, 2010, at 11:53 AM, Mark wrote:
If I submit a jar that has a lib directory that contains a bunch of jars,
shouldn't those jars be in the classpath and available to all nodes?
Are you using distributed cache?
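For context, one common way to get dependency jars onto every task node is to
ship them through the distributed cache rather than nesting them inside the
job jar. A minimal sketch, assuming the driver runs through ToolRunner so the
generic -libjars option gets parsed (class and jar names below are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver. Because it goes through ToolRunner, the generic
// options are parsed, so dependency jars can be shipped to every task with:
//   hadoop jar myjar.jar MyDriver -libjars dep1.jar,dep2.jar <in> <out>
public class MyDriver extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        // Alternative: add a jar already sitting in HDFS to the task classpath:
        // org.apache.hadoop.filecache.DistributedCache
        //     .addFileToClassPath(new Path("/libs/some-dep.jar"), conf);
        Job job = new Job(conf, "my job");
        job.setJarByClass(MyDriver.class);
        // mapper/reducer/output-type setup for the real job would go here
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
    }
}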
Lewis Crawford wrote:
I am trying unsuccessfully to apply a patch (HADOOP-6835) to hadoop-0.20.2
(64-bit Ubuntu 10.04)
Using ant on the command line I was able to build the project again
and generate a new jar, hadoop-0.20.3-dev-core.jar, which I copied back
into $HADOOP_HOME and started
Yes that seems to have done the trick!
Thanks
Lewis.
On 10 September 2010 20:39, Greg Roelofs roel...@yahoo-inc.com wrote:
Lewis Crawford wrote:
I am trying unsuccessfully to apply a patch (HADOOP-6835) to hadoop-0.20.2
(64-bit Ubuntu 10.04)
using ant on the command line I was able to
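For anyone hitting the same thing, the apply-and-rebuild steps described
above look roughly like this; a sketch assuming the patch file attached to
the JIRA issue and the stock 0.20 ant build (the exact patch level may
differ):

cd hadoop-0.20.2
patch -p0 < HADOOP-6835.patch        # patch attached to the JIRA issue
ant jar                              # rebuilds build/hadoop-0.20.3-dev-core.jar
cp build/hadoop-0.20.3-dev-core.jar $HADOOP_HOME/
# then restart the daemons so the new jar is picked up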
Hello,
I am new to Hadoop. Can anybody suggest an example or procedure for
outputting the top N items with the maximum total count, where the input
file has an (Item, count) pair on each line?
Items can repeat.
Thanks
Neil
http://neilghosh.com
Welcome to the land of the fuzzy elephant!
Of course there are many ways to do it. Here is one; it might not be brilliant
or the right way, but I am sure you will get more :)
Use the identity mapper...
job.setMapperClass(Mapper.class);
then have one reducer
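Fleshing that suggestion out, a minimal sketch of such a job (class names are
made up; instead of the identity mapper it uses a one-line parsing mapper so
that plain TextInputFormat input of the form item<TAB>count becomes
(item, count) pairs):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TopNDriver {

    // Turns "item<TAB>count" lines into (item, count) pairs.
    public static class ParseMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = line.toString().split("\t");
            if (parts.length == 2) {
                ctx.write(new Text(parts[0]),
                          new LongWritable(Long.parseLong(parts[1])));
            }
        }
    }

    // Sums the counts for each item; with one reduce task every item
    // passes through this single reducer.
    public static class SumReducer
            extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text item, Iterable<LongWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable c : counts) {
                total += c.get();
            }
            ctx.write(item, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "top-n");
        job.setJarByClass(TopNDriver.class);
        job.setMapperClass(ParseMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setNumReduceTasks(1);              // "then have one reducer"
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

As the rest of the thread points out, this only groups and sums; selecting
the top N still has to happen inside that single reducer (see the tree-map
sketch further down).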
I don't know. I'm running in a fully distributed environment, i.e. not
local or pseudo-distributed.
On 9/10/10 12:03 PM, Allen Wittenauer wrote:
On Sep 10, 2010, at 11:53 AM, Mark wrote:
If I submit a jar that has a lib directory that contains a bunch of jars,
shouldn't those jars be in the classpath
Hi Sonal,
The 0.21.0 jars are not available in Maven yet, since the process for
publishing them post split has changed.
See HDFS-1292 and MAPREDUCE-1929.
Cheers,
Tom
On Fri, Sep 10, 2010 at 1:33 PM, Sonal Goyal sonalgoy...@gmail.com wrote:
Hi,
Can someone please point me to the Maven repo
Thanks James,
This gives me only N results for sure but not necessarily the top N
I have used the Item as Key and Count as Value as input to the reducer.
and my reducing logic is to sum the count for a particular item.
Now my output comes as grouped but not in order.
Do I need to use custom
Thanks Aaron. I employed two jobs and solved the problem.
I was just wondering whether there is any way it can be done in a single job,
so that disk/network I/O is lower and no temporary storage is required between
the first and second jobs.
Neil
On Sat, Sep 11, 2010 at 4:37 AM, Aaron Baff
Hi Neil,
Uniques and Top N, as well as percentiles, are inherently difficult to
distribute/parallelize since you have to have a global view of the dataset.
You can optimize the computations given some assumptions about the input
(the # of unique values, prevalence of the most frequent value
Hi Alex,
Thanks so much for the reply. As of now I don't have any issue with two
jobs; I was just making sure that I am not missing any obvious way of writing
the program as one job. I will get back if I need to optimize performance
based on a specific pattern of input.
Thank you all so much.
If I deploy one jar (that contains a lib directory with all the required
dependencies), shouldn't that jar inherently be distributed to all the
nodes?
On 9/10/10 2:49 PM, Mark wrote:
I dont know? I'm running in a fully distributed environment.. ie not
local or psuedo.
On 9/10/10 12:03 PM,
Have you considered using something higher-level like Pig or Hive? Are
there reasons why you need to process at this low level?
-Original Message-
From: Aaron Baff [mailto:aaron.b...@telescope.tv]
Sent: Friday, September 10, 2010 11:50 PM
To: common-user@hadoop.apache.org
Subject: Custom
Is the footer on this email a little rough for content that will be passed
around and made indexable on the internets?
Just saying :)
Cheers
James
Sent from my mobile. Please excuse the typos.
On 2010-09-10, at 8:01 PM, Kaluskar, Sanjay skalus...@informatica.com wrote:
Have you considered
Are the libs exploded inside the main jar? If not, then no, it probably won't
work.
James
Sent from my mobile. Please excuse the typos.
On 2010-09-10, at 7:43 PM, Mark static.void@gmail.com wrote:
If I deploy 1 jar (that contains a lib directory with all the required
dependencies)
Assuming N is not too large, in the sense that your reducers can keep a tree
map of N elements, you can have your reducer maintain the top N elements in a
tree map (or a priority queue, or a heap, whatever), with counts as keys in
the tree map. As the reducers progress, you throw away the
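To make that concrete, a rough sketch of such a reducer (N and the class name
are made up; it assumes (Text item, LongWritable count) pairs from the map
side and a single reduce task, so it could slot into the driver sketched
earlier in the thread):

import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Keeps only the N largest totals in a TreeMap keyed by count, then emits
// them (largest first) once every group has been seen.
public class TopNReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    private static final int N = 10;   // hypothetical N
    private final TreeMap<Long, String> topN = new TreeMap<Long, String>();

    @Override
    protected void reduce(Text item, Iterable<LongWritable> counts, Context ctx)
            throws IOException, InterruptedException {
        long total = 0;
        for (LongWritable c : counts) {
            total += c.get();
        }
        // Note: items with equal totals overwrite each other here; use a
        // TreeMap<Long, List<String>> or a composite key if ties matter.
        topN.put(total, item.toString());
        if (topN.size() > N) {
            topN.remove(topN.firstKey());   // throw away the current smallest
        }
    }

    @Override
    protected void cleanup(Context ctx) throws IOException, InterruptedException {
        for (Map.Entry<Long, String> e : topN.descendingMap().entrySet()) {
            ctx.write(new Text(e.getValue()), new LongWritable(e.getKey()));
        }
    }
}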