Re: Building a distributed system

2016-07-19 Thread Marcin Tustin
i Prakash <ravihad...@gmail.com> Sent: Monday, July 18, 2016 7:45 PM To: Marcin Tustin <mtus...@handybook.com> Cc: Richard Whitehead <richard.whiteh...@ieee.org>; user@hadoop.apache.org Subject: Re: Building a distributed system Welcome

Re: Building a distributed system

2016-07-19 Thread Richard Whitehead
can’t tell if that’s supported. I think we are going to drop this for now; the barrier to getting started is so high. Thanks a lot for your help, Richard From: Mirko Kämpf Sent: Tuesday, July 19, 2016 9:53 AM To: Richard Whitehead Subject: Re: Building a distributed system Hello Richard

Re: Building a distributed system

2016-07-19 Thread Richard Whitehead
@hadoop.apache.org Subject: Re: Building a distributed system Welcome to the community, Richard! I suspect Hadoop can be more useful than just splitting data and stitching it back together. Depending on your use cases, it may come in handy to manage your machines, restart failed tasks, schedule work when

Re: Building a distributed system

2016-07-18 Thread Ravi Prakash
Welcome to the community, Richard! I suspect Hadoop can be more useful than just splitting data and stitching it back together. Depending on your use cases, it may come in handy to manage your machines, restart failed tasks, schedule work when data becomes available, etc. I wouldn't necessarily count it out.

Re: Building a distributed system

2016-07-18 Thread Marcin Tustin
I think you're confused as to what these things are. The fundamental question is: do you want to run one job on subparts of the data, then stitch their results together (in which case Hive/MapReduce/Spark will be for you), or do you essentially already have splitting to computer-sized chunks

Building a distributed system

2016-07-18 Thread Richard Whitehead
Hello, I wonder if the community can help me get started. I’m trying to design the architecture of a project and I think that using some Apache Hadoop technologies may make sense, but I am completely new to distributed systems and to Apache (I am a very experienced developer, but my