Building a distributed system

2016-07-18 Thread Richard Whitehead
Hello, I wonder if the community can help me get started. I’m trying to design the architecture of a project and I think that using some Apache Hadoop technologies may make sense, but I am completely new to distributed systems and to Apache (I am a very experienced developer, but my expertise

Re: Building a distributed system

2016-07-18 Thread Marcin Tustin
I think you're confused as to what these things are. The fundamental question is do you want to run one job on sub parts of the data, then stitch their results together (in which case hive/map-reduce/spark will be for you), or do you essentially already have splitting to computer-sized chunks figu

Re: FileSystem object's close method is not called. hadoop 2.7.2

2016-07-18 Thread Chris Nauroth
Hello Michael, Historically, there has never been a firm requirement that clients must call FileSystem#close upon finishing usage of an instance. I think the history here is that the close method was not part of the initial API definition, and when it was added, there were already a lot of exi

Re: Hadoop Installation on Windows 7 in 64 bit

2016-07-18 Thread Arpit Agarwal
Hi Vinodh, Are there any spaces in your JAVA_HOME path? If so, you need to use the short (8.3) path, e.g. c:\progra~1\java (assuming you haven’t done so already).
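
A quick way to check the condition described above is sketched below. This is only an illustration: the helper name needs_short_path is hypothetical and not part of Hadoop, and the script merely reports whether JAVA_HOME contains a space; it does not compute the 8.3 form for you.

```python
import os


def needs_short_path(java_home: str) -> bool:
    """Return True if the path contains a space, i.e. the case where
    the short (8.3) form of the path is suggested in this thread.
    Hypothetical helper for illustration only."""
    return " " in java_home


# Read JAVA_HOME from the environment; an empty string means it is unset.
java_home = os.environ.get("JAVA_HOME", "")
if not java_home:
    print("JAVA_HOME is not set")
elif needs_short_path(java_home):
    print(f"JAVA_HOME contains spaces: {java_home!r}; consider the short (8.3) form")
else:
    print("JAVA_HOME has no spaces")
```

On Windows, running `dir /x` in the parent directory lists the short names, which is one way to find the 8.3 form to put in JAVA_HOME.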

Re: Building a distributed system

2016-07-18 Thread Ravi Prakash
Welcome to the community, Richard! I suspect Hadoop can be more useful than just splitting and stitching back data. Depending on your use cases, it may come in handy to manage your machines, restart failed tasks, schedule work when data becomes available, etc. I wouldn't necessarily count it out.

Where's official Docker image for Hadoop?

2016-07-18 Thread Klaus Ma
Hi team, Does anyone know where the official Docker images are? If there are none, I'd like to contribute a Dockerfile. By the way, do we have an official Docker Hub account for Hadoop? If you have any suggestions, please let me know.

Re: Where's official Docker image for Hadoop?

2016-07-18 Thread Roman Shaposhnik
On Mon, Jul 18, 2016 at 5:34 PM, Klaus Ma wrote:
> Hi team, Does anyone know where's official docker images? If not, I'd like to contribute the Dockerfile for it.
I am just curious, what's your use case? Also, you may want to look at the following "prior art" in the area of Hadoop/Docker:

Re: Where's official Docker image for Hadoop?

2016-07-18 Thread Klaus Ma
I'd like to deploy YARN on Kubernetes. I built Docker images with Apache Hadoop, and I'd like to contribute them to the Hadoop source if no official images exist. It would be great if Hadoop had an official place for those images.

Re: Where's official Docker image for Hadoop?

2016-07-18 Thread Deepak Vohra
The cloudera/quickstart image is Cloudera's Docker image for Hadoop. https://hub.docker.com/r/cloudera/quickstart/ Also refer to: http://www.cloudera.com/documentation/enterprise/5-6-x/topics/quickstart_docker_container.html http://blog.cloudera.com/blog/2015/12/docker-is-the-new-quickstart-option-for-apache-h

Re: Where's official Docker image for Hadoop?

2016-07-18 Thread Deepak Vohra
A custom implementation would have to be developed using some container orchestration service such as Kubernetes. Create a cluster of Pods (container sets) with different daemons running in different Pods and scale the Pods. For example, start ResourceManager on one instance and NodeManager on m