Stuart,
We use Hadoop in parts of our ETL processing for our data warehouse.
We ran into a similar problem of needing to share about 60 million
key/value pairs (dimension keys) amongst the mapper jobs running in the
final phase of our ETL process. Our cluster is a small 3 machine, 20
core setup. We went with memcached to share the data; a lookup can
fail, but that should not happen too often.
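The lookup-with-fallback pattern described above can be sketched as follows. This is only an illustration, not code from the thread: the class and method names are made up, and the memcached client is stubbed with a plain HashMap so the sketch is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: dimension keys are resolved through a shared
// cache (memcached in the thread; stubbed here with a HashMap), and a
// missed lookup falls back to a sentinel value rather than failing the
// mapper.
public class DimensionKeyLookup {
    private final Map<String, Long> cache = new HashMap<>();

    public void put(String naturalKey, long surrogateKey) {
        cache.put(naturalKey, surrogateKey);
    }

    // Return the surrogate key, or -1 when the lookup misses,
    // mirroring "a lookup can fail, but that should not happen too
    // often".
    public long lookup(String naturalKey) {
        Long surrogate = cache.get(naturalKey);
        return surrogate != null ? surrogate : -1L;
    }

    public static void main(String[] args) {
        DimensionKeyLookup lookup = new DimensionKeyLookup();
        lookup.put("url:/home", 42L);
        System.out.println(lookup.lookup("url:/home"));    // prints 42
        System.out.println(lookup.lookup("url:/missing")); // prints -1
    }
}
```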
Thanks.
--sean
Sean Shanny
ssha...@tripadvisor.com
On Jan 14, 2009, at 9:47 PM, Delip Rao wrote:
Hi,
I need to lookup a large number of key/value pairs in my map(). Is
there any indexed hashtable available as a part of the Hadoop I/O API?
I am
stumped on how to use a MapFile.Reader based on a file in the
DistributedCache.
Thanks.
--sean
Sean Shanny
ssha...@tripadvisor.com
On Dec 28, 2008, at 10:59 PM, Amareshwari Sriramadasu wrote:
Sean Shanny wrote:
To all,
Version: hadoop-0.17.2.1-core.jar
I have created a MapFile.
What I don't seem to be able to do is correctly place the MapFile in
the DistributedCache and then make use of it in a map method.
I need the following info please:
1. How and where to place the MapFile directory so that it can be
picked up by the DistributedCache.
Thanks for your suggestion but unfortunately it did not fix the issue.
Thanks.
--sean
Sean Shanny
ssha...@tripadvisor.com
On Dec 25, 2008, at 8:19 AM, Devaraj Das wrote:
IIRC, enabling symlink creation for your files should solve the
problem.
Call DistributedCache.createSymlink
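That suggestion can be sketched roughly as below, assuming Hadoop 0.17-era APIs (org.apache.hadoop.filecache.DistributedCache) on the classpath. The HDFS path and the "url" symlink name here are illustrative, not taken from the thread; this is a configuration sketch, not a complete job.

```java
import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class CacheSetup {
    public static void addMapFileToCache(JobConf conf) throws Exception {
        // The fragment after '#' names the symlink created in each
        // task's working directory.
        DistributedCache.addCacheFile(new URI("/2008-12-19/url#url"), conf);
        // Without this call, no symlinks are created and the MapFile
        // is only reachable via the cache's internal local path.
        DistributedCache.createSymlink(conf);
        // Inside the mapper, the MapFile directory is then reachable
        // as the local path "url", e.g.:
        //   new MapFile.Reader(FileSystem.getLocal(conf), "url", conf);
    }
}
```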
To all,
Version: hadoop-0.17.2.1-core.jar
I created a MapFile on a local node.
I put the files into the HDFS using the following commands:
$ bin/hadoop fs -copyFromLocal /tmp/ur/data /2008-12-19/url/data
$ bin/hadoop fs -copyFromLocal /tmp/ur/index /2008-12-19/url/index
and placed them