How to -moveToLocal to local machine instead of remote machine?

2011-07-31 Thread Gabriele Kahlout
I think I have much the same problem. I have a large segment that I can neither index, because there is not enough room, nor -copyToLocal, since "local" here is the server. How can I instead -copyToLocal where local is my own terminal machine rather than the server? On Tue, Jun 7, 2011 at 9:30 PM, Joe Greenawalt wro

Hadoop cluster network requirement

2011-07-31 Thread jonathan.hwang
I was asked by our IT folks whether we can put the Hadoop name node's storage on a shared disk storage unit. Does anyone have experience of how much IO throughput is required on the name nodes? What are the latency/data throughput requirements between the master and data nodes - can this tolerate net

DFSClient Protocol and FileSystem class

2011-07-31 Thread jagaran das
What is the difference between the DFSClient protocol and the FileSystem class in Hadoop DFS (HDFS)? Both of these classes are used for connecting a remote client to the namenode in HDFS. So, I wanted to know the advantages of one over the other and which one is suitable for remote-client connection

Re: How to -moveToLocal to local machine instead of remote machine?

2011-07-31 Thread Gabriele Kahlout
It looks like karmasphere can handle it: INFO com.karmasphere.studio.hadoop.netbeans.filesystem.CommonsVfsBrowserTransferHandler importData: Past filenode: FileNode for file:///Users/simpatico/Documents, state=Updated; type=folder INFO com.karmasphere.studio.hadoop.netbeans.filesystem.FileOperatio

Re: DFSClient Protocol and FileSystem class

2011-07-31 Thread Tsz Wo Sze
Hi JD, FileSystem is a public API but DFSClient is an internal class. For developing Hadoop applications, we should use FileSystem. Tsz-Wo From: jagaran das To: "common-user@hadoop.apache.org" Sent: Sunday, July 31, 2011 2:50 PM Subject: DFSClient Protocol

Re: Hadoop cluster network requirement

2011-07-31 Thread Allen Wittenauer
On Jul 31, 2011, at 12:08 PM, wrote: > I was asked by our IT folks if we can put hadoop name nodes storage using a > shared disk storage unit. What do you mean by "shared disk storage unit"? There are lots of products out there that would claim this, so actual deployment semantic

Re: Moving Files to Distributed Cache in MapReduce

2011-07-31 Thread Allen Wittenauer
We really need to add a working example to the wiki and link to it from the FAQ page. Any volunteers? On Jul 29, 2011, at 7:49 PM, Michael Segel wrote: > > Here's the meat of my post earlier... > Sample code on putting a file on the cache: > DistributedCache.addCacheFile(new URI(path+"MyFil

RE: Hadoop cluster network requirement

2011-07-31 Thread Saqib Jang -- Margalla Communications
Thanks, I'm independently doing some digging into Hadoop networking requirements and had a couple of quick follow-ups. Could I have some specific info on why different data centers cannot be supported for master node and data node comms? Also, what may be the benefits/use cases for such a scenar

Re: Hadoop cluster network requirement

2011-07-31 Thread Allen Wittenauer
On Jul 31, 2011, at 7:30 PM, Saqib Jang -- Margalla Communications wrote: > Thanks, I'm independently doing some digging into Hadoop networking > requirements and > had a couple of quick follow-ups. Could I have some specific info on why > different data centers > cannot be supported for master

Re: how to use TotalOrderPartitioner

2011-07-31 Thread Amareshwari Sri Ramadasu
The example Sort, at org.apache.hadoop.examples.Sort, uses TotalOrderPartitioner with InputSampler. You can have a look at it. Thanks Amareshwari On 7/29/11 11:20 PM, "Sofia Georgiakaki" wrote: Good evening, does anyone have an example of how I can use the TotalOrderPartitioner (with InputSam
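
For readers who just want the idea behind total-order partitioning before opening the Sort example, here is a minimal plain-Java sketch. It is not Hadoop's TotalOrderPartitioner or InputSampler and uses no Hadoop classes; the class name TotalOrderSketch and the methods splitPoints and partition are hypothetical names made up for illustration. The assumption is only the general technique: sort a sample of keys, take evenly spaced split points, and send each key to the partition whose range contains it, so that partition i holds only keys that sort before those in partition i+1.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: hypothetical names, not part of Hadoop.
final class TotalOrderSketch {

    // Sort the sampled keys and pick numPartitions - 1 evenly spaced split points.
    static String[] splitPoints(List<String> sample, int numPartitions) {
        String[] sorted = sample.toArray(new String[0]);
        Arrays.sort(sorted);
        String[] splits = new String[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            int index = (int) ((long) i * sorted.length / numPartitions);
            splits[i - 1] = sorted[index];
        }
        return splits;
    }

    // Map a key to a partition number in [0, splits.length].
    // Keys below the first split go to partition 0; a key equal to a split
    // point goes to the partition that starts at that split point.
    static int partition(String key, String[] splits) {
        int pos = Arrays.binarySearch(splits, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("d", "a", "h", "f", "b", "g", "c", "e");
        String[] splits = splitPoints(sample, 4);
        System.out.println("split points: " + Arrays.toString(splits));
        for (String key : new String[] {"a", "d", "f", "z"}) {
            System.out.println(key + " -> partition " + partition(key, splits));
        }
    }
}
```

Because the assignment is monotone in the key, concatenating the sorted output of partition 0, 1, 2, ... in order gives a globally sorted result. How well balanced the partitions are depends entirely on how representative the sample is.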