Re: FileSystem iterative or limited alternative to listStatus()

2014-01-11 Thread Ted Yu
Can you utilize the following API? public FileStatus[] listStatus(Path f, PathFilter filter) Cheers On Sat, Jan 11, 2014 at 3:52 PM, John Lilley wrote: > Is there an HDFS file system method for listing a directory's contents > iteratively, or at least stopping at some limit? We have an appl

FileSystem iterative or limited alternative to listStatus()

2014-01-11 Thread John Lilley
Is there an HDFS file system method for listing a directory's contents iteratively, or at least stopping at some limit? We have an application in which the user can type wildcards, which our app expands. However, during interactive phases we only want the first matching file, to check its format.
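The "stop at the first match" idea in this question can be sketched with the JDK alone. The example below is only an illustration on a local directory using java.nio.file.DirectoryStream; it does not touch HDFS, and the class and method names (FirstMatch, firstMatch, demoDirectory) are hypothetical. On HDFS itself, FileSystem.listLocatedStatus(Path) and FileSystem.listFiles(Path, boolean) return a RemoteIterator, which could presumably be consumed in the same "take one entry and stop" fashion.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop or of the original thread.
class FirstMatch {

    // Returns the first directory entry whose name matches the glob,
    // without collecting the full listing into an array first.
    static Optional<Path> firstMatch(Path dir, String glob) {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            Iterator<Path> it = stream.iterator();
            return it.hasNext() ? Optional.of(it.next()) : Optional.<Path>empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small temporary directory so the sketch can be run as-is.
    static Path demoDirectory() {
        try {
            Path dir = Files.createTempDirectory("firstmatch-demo");
            Files.createFile(dir.resolve("part-00000.csv"));
            Files.createFile(dir.resolve("notes.txt"));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = demoDirectory();
        System.out.println(firstMatch(dir, "*.csv"));
    }
}
```

The stream is closed as soon as the first entry has been read, so the cost does not depend on reading every entry into memory, which is the property the question asks for.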

How to make AM terminate if client crashes?

2014-01-11 Thread John Lilley
We have a YARN application that we want to automatically terminate if the YARN client disconnects or crashes. Is it possible to configure the YarnClient-RM connection so that if the client terminates the RM automatically terminates the AM? Or do we need to build our own logic (e.g. a direct cl

Expressions in MapReduce

2014-01-11 Thread unmesha sreeveni
Are we able to evaluate expressions in MapReduce? Say I have a CSV file with 2 columns, and the user gives an expression col1 + col2 = col3. Are we able to do this? And when the user later wants col1 - col2 = col4, can we do it in the same MapReduce job (dynamic change of expressions)? -- *Thanks
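One way to picture a "dynamic" expression is to treat it as a string that is parsed at run time and applied to each CSV row. The sketch below is plain Java, not a MapReduce job, and assumes expressions of the form "colN <op> colM" with 1-based column numbers; the class name ColumnExpression and that grammar are assumptions made for illustration. In an actual job the expression string could be handed to the mapper through the job Configuration, so that changing the expression would not require changing the code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the supported grammar is an assumption for this sketch.
class ColumnExpression {

    // Matches "colN <op> colM" where <op> is one of + - * /
    private static final Pattern FORM =
            Pattern.compile("\\s*col(\\d+)\\s*([-+*/])\\s*col(\\d+)\\s*");

    // Evaluates the expression against one comma-separated row.
    static double evaluate(String expression, String csvLine) {
        Matcher m = FORM.matcher(expression);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported expression: " + expression);
        }
        String[] fields = csvLine.split(",");
        // Column numbers in the expression are 1-based.
        double left = Double.parseDouble(fields[Integer.parseInt(m.group(1)) - 1].trim());
        double right = Double.parseDouble(fields[Integer.parseInt(m.group(3)) - 1].trim());
        switch (m.group(2).charAt(0)) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            default:  return left / right;
        }
    }

    public static void main(String[] args) {
        String row = "7,3";
        // Append the computed column to the row, as a mapper might emit it.
        System.out.println(row + "," + evaluate("col1 + col2", row));
        System.out.println(row + "," + evaluate("col1 - col2", row));
    }
}
```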

Re: what all can be done using MR

2014-01-11 Thread unmesha sreeveni
For that, do we have to write a custom class for the value in order to pass all the columns as the value (i.e. 2 values in the example)? Or just concatenate them and emit the result? On Sat, Jan 11, 2014 at 9:46 PM, Chris Mawata wrote: > Results will be sorted by key so make A the key and put the rest in the >

Re: what all can be done using MR

2014-01-11 Thread Chris Mawata
Results will be sorted by key, so make A the key and put the rest in the value. Chris On Jan 11, 2014 10:11 AM, "unmesha sreeveni" wrote: > What about sorting? > Actually it is done by MapReduce itself. > But if we are giving a CSV file as input and trying to sort one/multiple > column...Whether

Re: what all can be done using MR

2014-01-11 Thread unmesha sreeveni
What about sorting? Actually it is done by MapReduce itself. But if we give a CSV file as input and sort on one or multiple columns, do the corresponding columns move with them? E.g. foo.csv: B,2,3 A,4,6. When we sort on the first column, will the result be A,4
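The advice in this thread (make the first column the key and carry the rest of the row as the value) can be simulated in plain Java. In the sketch below a TreeMap stands in for the framework's sort-by-key step; it is not a MapReduce job, the class name SortByFirstColumn is hypothetical, and every input line is assumed to contain at least one comma. The remaining columns are kept as one concatenated string, which corresponds to the "just concatenate" option rather than a custom value class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of "key = first column, value = rest of the row".
class SortByFirstColumn {

    static List<String> sortRows(List<String> csvLines) {
        // TreeMap keeps keys in sorted order, standing in for the shuffle/sort phase.
        TreeMap<String, List<String>> shuffled = new TreeMap<>();
        for (String line : csvLines) {
            int comma = line.indexOf(',');           // assumes a comma is present
            String key = line.substring(0, comma);    // first column
            String value = line.substring(comma + 1); // remaining columns, concatenated
            shuffled.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        // "Reduce" step: re-join each key with the values that travelled with it.
        List<String> sorted = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : shuffled.entrySet()) {
            for (String value : entry.getValue()) {
                sorted.add(entry.getKey() + "," + value);
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        // The foo.csv rows from the question.
        System.out.println(sortRows(Arrays.asList("B,2,3", "A,4,6")));
    }
}
```

Because each value stays attached to its key, the other columns of a row move together with the sorted column.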

Re: Find max and min of a column in a csvfile

2014-01-11 Thread unmesha sreeveni
Thanks Jiayu and John Hancock. That was a very nice hint for me. John, that was a really good link you provided, but I don't know Pig; I am using Java. Is there any Java-related document, like this one: http://packtlib.packtpub.com/library/hadoop-mapreduce-cookbook/ch06lvl1sec66# On Sat, Jan 11, 2014 at 6:

Re: Find max and min of a column in a csvfile

2014-01-11 Thread John Hancock
Unmesha, You may want to write your own mapper and reducer for the purpose of learning more about map-reduce programming techniques. However, the Pig documentation also discusses aggregate functions such as max() which may save you some time: http://pig.apache.org/docs/r0.12.0/udf.html -John
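For readers who want the Java route instead of Pig, the shape of a hand-written min/max computation can be sketched without any Hadoop classes. The code below is an illustration only: extract plays the role of a map step that pulls one column out of a CSV line, and minMax plays the role of a reduce step that folds the values. The class name ColumnMinMax, the 0-based column index, and the assumption that the column is numeric are all choices made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical plain-Java sketch of a map step and a reduce step.
class ColumnMinMax {

    // "Map" step: parse the chosen column (0-based index) of one CSV line.
    static double extract(String csvLine, int columnIndex) {
        return Double.parseDouble(csvLine.split(",")[columnIndex].trim());
    }

    // "Reduce" step: fold all values into a {min, max} pair.
    static double[] minMax(Iterable<Double> values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }

    // Runs both steps over a list of lines.
    static double[] minMaxOfColumn(List<String> csvLines, int columnIndex) {
        List<Double> values = new ArrayList<>();
        for (String line : csvLines) {
            values.add(extract(line, columnIndex));
        }
        return minMax(values);
    }

    public static void main(String[] args) {
        double[] result = minMaxOfColumn(Arrays.asList("B,2,3", "A,4,6"), 1);
        System.out.println("min=" + result[0] + " max=" + result[1]);
    }
}
```

In a real job, emitting every value under a single constant key would send them all to one reducer, so a combiner that applies the same fold is a common way to reduce the data shuffled.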