I agree with Bejoy. The problem you describe sounds like building a workflow, which is what Oozie is designed for.
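
To make the "start only once the files are completely written" idea concrete, here is a minimal sketch of a marker-file check. It assumes the writing process can drop an empty marker file when it finishes, which may not hold if the writers are outside your control. It uses the local filesystem through Python's pathlib as a stand-in for HDFS. The names `is_ready`, `wait_until_ready` and `demo` are illustrative only and are not part of Oozie or Hadoop; the marker name `_SUCCESS` follows the convention Hadoop MapReduce uses for completed job output.

```python
import tempfile
import time
from pathlib import Path

# Assumed marker name; any name the writer and reader agree on would work.
DONE_FLAG = "_SUCCESS"


def is_ready(directory: Path, done_flag: str = DONE_FLAG) -> bool:
    """Return True once the marker file exists in the directory."""
    return (directory / done_flag).is_file()


def wait_until_ready(directory: Path, timeout_s: float = 5.0,
                     poll_s: float = 0.1) -> bool:
    """Poll for the marker file; return False if it never shows up in time."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready(directory):
            return True
        time.sleep(poll_s)
    # One last check so a marker created during the final sleep is not missed.
    return is_ready(directory)


def demo() -> None:
    """Simulate a writer that finishes and then drops the marker."""
    with tempfile.TemporaryDirectory() as tmp:
        incoming = Path(tmp)
        (incoming / "server.log").write_text("still being written")
        print(is_ready(incoming))   # data present, marker absent
        (incoming / DONE_FLAG).touch()
        print(is_ready(incoming))   # marker present
```

The reader never inspects the data files themselves, so it cannot interfere with a write in progress. The cost is that the writer has to cooperate by creating the marker last.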

Thanks
hemanth

On Wed, Sep 26, 2012 at 12:22 AM, Bejoy Ks <bejoy.had...@gmail.com> wrote:

> Hi Peter
>
> AFAIK oozie has a mechanism to achieve this. You can trigger your jobs as
> soon as the files are written to a certain hdfs directory.
>
>
> On Tue, Sep 25, 2012 at 10:23 PM, Peter Sheridan <psheri...@millennialmedia.com> wrote:
>
>> These are log files being deposited by other processes, which we may
>> not have control over.
>>
>> We don't want multiple processes to write to the same files; we just
>> don't want to start our jobs until they have been completely written.
>>
>> Sorry for lack of clarity & thanks for the response.
>>
>>
>> --Pete
>>
>> From: Bertrand Dechoux <decho...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
>> Date: Tuesday, September 25, 2012 12:33 PM
>> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
>> Subject: Re: Detect when file is not being written by another process
>>
>> Hi,
>>
>> Multiple files and aggregation or something like hbase?
>>
>> Could you tell us more about your context? What are the volumes? Why do
>> you want multiple processes to write to the same file?
>>
>> Regards
>>
>> Bertrand
>>
>> On Tue, Sep 25, 2012 at 6:28 PM, Peter Sheridan <psheri...@millennialmedia.com> wrote:
>>
>>> Hi all.
>>>
>>> We're using Hadoop 1.0.3. We need to pick up a set of large (4+GB)
>>> files when they've finished being written to HDFS by a different process.
>>> There doesn't appear to be an API specifically for this. We had
>>> discovered through experimentation that the FileSystem.append() method can
>>> be used for this purpose: it will fail if another process is writing to
>>> the file.
>>>
>>> However: when running this on a multi-node cluster, using that API
>>> actually corrupts the file. Perhaps this is a known issue? Looking at the
>>> bug tracker I see https://issues.apache.org/jira/browse/HDFS-265 and a
>>> bunch of similar-sounding things.
>>>
>>> What's the right way to solve this problem? Thanks.
>>>
>>>
>>> --Pete
>>>
>>
>>
>> --
>> Bertrand Dechoux
>>