Re: Spark2.1 installation issue

2017-07-27 Thread Vikash Kumar
... use the Cloudera forums / support line instead of the Apache group. > On Thu, Jul 27, 2017 at 10:54 AM, Vikash Kumar > <vikash.ku...@oneconvergence.com> wrote: > > I have installed the Spark 2 parcel through Cloudera CDH 12.0. I see some issues > > there. Looks like it didn't get configured

Spark2.1 installation issue

2017-07-27 Thread Vikash Kumar
I have installed the Spark 2 parcel through Cloudera CDH 12.0. I see some issues there. It looks like it didn't get configured properly. $ spark2-shell Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream at
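For context, this particular NoClassDefFoundError usually means the Hadoop client jars are missing from Spark's classpath. A minimal configuration sketch, assuming a working `hadoop` client on the PATH (the exact fix on a CDH parcel install may differ, so treat this as a workaround rather than the definitive repair):

```shell
# Hypothetical workaround: put the Hadoop client jars on Spark's classpath.
# On CDH the parcel normally configures this; if it didn't, exporting
# SPARK_DIST_CLASSPATH before launching the shell is a common remedy.
export SPARK_DIST_CLASSPATH="$(hadoop classpath)"
spark2-shell
```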

Split RDD by key and save to different files

2016-09-07 Thread Vikash Kumar
I need to split an RDD[key, Iterable[Value]] and save each key's records to a different file. e.g. I have records like: customerId, name, age, sex 111,abc,34,M 122,xyz,32,F 111,def,31,F 122,trp,30,F 133,jkl,35,M I need to write 3 different files based on customerId file1: 111,abc,34,M 111,def,31,F file2:
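The per-key split in the question can be sketched in plain Scala (no Spark required) to show the grouping step; in Spark itself, `df.write.partitionBy("customerId")` or `rdd.groupByKey` with a custom output format is the usual route. The object name `SplitByKey` is illustrative:

```scala
// Plain-Scala sketch of the per-key split (object name is hypothetical).
// In Spark the same effect is commonly achieved with
// df.write.partitionBy("customerId"), which writes one directory per key.
object SplitByKey {
  // Group raw CSV lines by their first field (the customerId).
  def splitByKey(records: Seq[String]): Map[String, Seq[String]] =
    records.groupBy(_.split(",")(0).trim)

  def main(args: Array[String]): Unit = {
    val records = Seq(
      "111,abc,34,M",
      "122,xyz,32,F",
      "111,def,31,F",
      "122,trp,30,F",
      "133,jkl,35,M")
    // Each entry of the map corresponds to one output file.
    splitByKey(records).foreach { case (id, rows) =>
      println(s"file for $id: ${rows.mkString(" | ")}")
    }
  }
}
```

Each map entry would then be written to its own file (or, in Spark, its own partition directory).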

get and append file name in record being reading

2016-06-01 Thread Vikash Kumar
How can I get the file name of each record being read? Suppose input file ABC_input_0528.txt contains 111,abc,234 222,xyz,456 and input file ABC_input_0531.txt contains 100,abc,299 200,xyz,499, and I need to create one final output with the file name in each record using DataFrames. My output
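A plain-Scala sketch of the desired tagging (the helper name `tagLines` is made up); in Spark SQL the built-in `input_file_name()` function, discussed in the related thread below, achieves the same thing per row:

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

// Plain-Scala sketch: append the source file's name to every record.
// (In Spark SQL, org.apache.spark.sql.functions.input_file_name() does
// this per row without reading files on the driver.)
object TagWithFileName {
  def tagLines(file: File): Seq[String] = {
    val src = Source.fromFile(file)
    try src.getLines().map(line => s"$line,${file.getName}").toList
    finally src.close()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample file mirroring the question's data.
    val f = File.createTempFile("ABC_input_", ".txt")
    val pw = new PrintWriter(f)
    pw.println("111,abc,234"); pw.println("222,xyz,456"); pw.close()
    tagLines(f).foreach(println)
    f.delete()
  }
}
```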

Re: how to get the file name of a record being read in Spark

2016-05-31 Thread Vikash Kumar
Can anybody suggest a different solution using inputFileName or input_file_name? On Tue, May 31, 2016 at 11:43 PM, Vikash Kumar <vikashsp...@gmail.com> wrote: > thanks Ajay, but I have the below code to generate the dataframes, so I wanted > to change only the df to achieve this. I thought i

Re: how to get the file name of a record being read in Spark

2016-05-31 Thread Vikash Kumar
file(s)...") val df: DataFrame = readTextFile(sqlContext) On Tue, May 31, 2016 at 11:26 PM, Ajay Chander <itsche...@gmail.com> wrote: > Hi Vikash, > > These are my thoughts: read the input directory using wholeTextFiles(), > which would give a paired RDD with key as

how to get the file name of a record being read in Spark

2016-05-31 Thread Vikash Kumar
I have a requirement in which I need to read the input files from a directory and append the file name to each record in the output. e.g. I have directory /input/files/ which has the following files: ABC_input_0528.txt ABC_input_0531.txt suppose input file ABC_input_0528.txt contains 111,abc,234

How to achieve nested for loop in Spark

2016-03-02 Thread Vikash Kumar
Can we implement nested for/while loops in Spark? I have to convert some SQL procedure code into Spark, and it has multiple loops and processing steps that I want to implement in Spark. How do I implement this? 1. open cursor and fetch for personType 2. open cursor and fetch for personGroup
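The two-cursor pattern above can be sketched as nested loops over in-memory collections (object name and sample values are hypothetical); in Spark, such nested cursors are typically rewritten as a join or Cartesian product between the two datasets rather than driver-side loops:

```scala
// Plain-Scala sketch of the nested-cursor pattern from the SQL procedure.
// Outer cursor: personType; inner cursor: personGroup.
// In Spark the idiomatic translation is usually dfTypes.crossJoin(dfGroups)
// or a keyed join, not explicit loops over collected rows.
object NestedCursors {
  def crossProcess(types: Seq[String], groups: Seq[String]): Seq[(String, String)] =
    for {
      t <- types   // outer cursor: fetch each personType
      g <- groups  // inner cursor: fetch each personGroup
    } yield (t, g)

  def main(args: Array[String]): Unit =
    crossProcess(Seq("employee", "contractor"), Seq("admin", "guest"))
      .foreach(println)
}
```

Each yielded pair corresponds to one iteration of the inner loop body in the original procedure.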