Hi Harsh,

Yes, thank you. We are using the sync() API, and we are still unable to read unclosed files in MapReduce. We are able to cat non-closed files; would that have been possible if we had not used the sync() call?
Has anybody tried running an M/R job on non-closed files? Are we missing something?

10x,
Niv

On Mon, Mar 5, 2012 at 3:42 PM, Harsh J <ha...@cloudera.com> wrote:
> Niv,
>
> Did you also try the sync() approach I mentioned? Did that not work?
> CDH3u2 does have the sync() API in it, so you can use it right away.
>
> On Sun, Mar 4, 2012 at 11:26 PM, Niv Mizrahi <n...@taykey.com> wrote:
> > hi harsh,
> >
> > thank you for the quick response.
> > we are currently running with cdh3u2.
> >
> > i have run map-reduces in many forms on non-closed files:
> > 1. streaming -mapper /bin/cat
> > 2. run word count
> > 3. run our own java job.
> >
> > output parts are always empty, the jobs ended successfully.
> >
> > running hadoop fs -cat on the same input returns results.
> >
> > am i doing something wrong ?
> >
> > niv
> >
> > On Sun, Mar 4, 2012 at 6:49 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Technically, yes, you can run MR jobs on non-closed files (it'll run
> >> the reader in the same way as your -cat), but you would only be able
> >> to read until the last complete block, or until the point sync() was
> >> called on the output stream.
> >>
> >> It is better if your file-writer uses the sync() API judiciously to
> >> mark sync points after a considerable amount of records, so that your
> >> MR readers in tasks read until whole records and not just block
> >> boundaries.
> >>
> >> For a description of the sync() API, read the section 'Coherency Model' in
> >> Tom White's "Hadoop: The Definitive Guide" (O'Reilly), Page 68.
> >>
> >> On Sun, Mar 4, 2012 at 8:07 PM, Niv Mizrahi <n...@taykey.com> wrote:
> >> > hi all,
> >> >
> >> > we are looking for a way to map-reduce on non-closed files.
> >> > we are currently able to run
> >> > hadoop fs -cat <non-closed-file>
> >> >
> >> > non-closed files - files that are currently being written, and have not
> >> > been closed yet.
> >> >
> >> > is there any way to run map-reduce on non-closed files ??
> >> >
> >> > 10x in advance for any answer
> >> > --
> >> > Niv Mizrahi
> >> > Taykey | www.taykey.com
> >>
> >> --
> >> Harsh J
> >
> > --
> > Niv Mizrahi
> > Taykey | www.taykey.com
>
> --
> Harsh J

--
*Niv Mizrahi*
Taykey | www.taykey.com
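
For reference, the visibility rule Harsh describes in the quoted reply (a reader of a non-closed file sees data up to the last complete block, or up to the point where sync() was last called) can be made concrete with a small calculation. This is only a minimal sketch in plain Java, not HDFS code: the class name VisibleLengthSketch, the method visibleBytes, and the 100-byte block size are made up for illustration, and it models nothing beyond the rule as stated in this thread.

```java
// Toy model of the rule quoted in this thread; hypothetical names, not part of Hadoop.
class VisibleLengthSketch {

    /**
     * Estimates how many bytes a new reader could see in a file that is
     * still open for writing: the larger of the last complete block
     * boundary and the last sync point.
     */
    static long visibleBytes(long bytesWritten, long blockSize, long lastSyncOffset) {
        if (blockSize <= 0 || bytesWritten < 0 || lastSyncOffset < 0) {
            throw new IllegalArgumentException(
                "blockSize must be positive; other arguments must be non-negative");
        }
        // End offset of the last fully written block (integer division rounds down).
        long lastCompleteBlockEnd = (bytesWritten / blockSize) * blockSize;
        // A sync point cannot lie beyond what has actually been written.
        long syncPoint = Math.min(lastSyncOffset, bytesWritten);
        return Math.max(lastCompleteBlockEnd, syncPoint);
    }

    public static void main(String[] args) {
        long blockSize = 100; // illustrative value only; real HDFS blocks are far larger

        System.out.println(visibleBytes(250, blockSize, 0));   // 200: two complete blocks, never synced
        System.out.println(visibleBytes(250, blockSize, 230)); // 230: sync point is past the block boundary
        System.out.println(visibleBytes(50, blockSize, 0));    // 0: no complete block and never synced
        System.out.println(visibleBytes(50, blockSize, 40));   // 40: only the synced prefix is visible
    }
}
```

Under this model, a writer that has not yet filled a block and has not called sync() exposes nothing to a new reader, which is why calling sync() after a batch of whole records is the suggested approach.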