Re: Programatically create RDDs based on input

2015-11-02 Thread amit tewari
Thanks Natu, Ayan.

I was able to create an array of DataFrames (Spark 1.3+):

DataFrame[] dfs = new DataFrame[uniqueFileIds.length];
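
A likely reason the DataFrame array compiles while the JavaRDD one did not: DataFrame is not a generic type, so Java allows arrays of it. Arrays of an unbounded wildcard type are allowed as well. The sketch below illustrates both cases with hypothetical stand-in classes (Frame for DataFrame, Box<T> for JavaRDD<T>), which are assumptions made so that it compiles without Spark on the classpath.

```java
public class ReifiableArrayDemo {
    // Hypothetical stand-ins; neither class is part of Spark.
    static final class Frame {            // non-generic, like DataFrame
        final String id;
        Frame(String id) { this.id = id; }
    }

    static final class Box<T> {           // generic, like JavaRDD<T>
        final T value;
        Box(T value) { this.value = value; }
    }

    // One Frame per id: array creation is legal because Frame is not generic.
    static Frame[] framesFor(String[] ids) {
        Frame[] frames = new Frame[ids.length];
        for (int i = 0; i < ids.length; i++) {
            frames[i] = new Frame(ids[i]);
        }
        return frames;
    }

    // One Box per id: an array of Box<?> is legal, an array of Box<String> is not.
    static Box<?>[] boxesFor(String[] ids) {
        Box<?>[] boxes = new Box<?>[ids.length];
        for (int i = 0; i < ids.length; i++) {
            boxes[i] = new Box<String>(ids[i]);
        }
        return boxes;
    }

    public static void main(String[] args) {
        String[] ids = {"a", "b", "c"};
        System.out.println(framesFor(ids).length);
        System.out.println(boxesFor(ids).length);
    }
}
```

Elements read back from the wildcard array are typed as Object, so a List is usually more convenient when the element type matters.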

Thanks
Amit


Re: Programatically create RDDs based on input

2015-10-31 Thread Natu Lauchande
Hi Amit,

I don't see any default constructor in the JavaRDD docs:
https://spark.apache.org/docs/latest/api/java/org/apache/spark/api/java/JavaRDD.html

Have you tried the following?

// needs java.util.List and java.util.ArrayList
List<JavaRDD<String>> jRDD = new ArrayList<JavaRDD<String>>();

jRDD.add(jsc.textFile("/file1.txt"));
jRDD.add(jsc.textFile("/file2.txt"));
// ..

Natu


Re: Programatically create RDDs based on input

2015-10-31 Thread ayan guha
Yes, this can be done. A quick Python equivalent:

# In Driver
fileList = ["/file1.txt", "/file2.txt"]
rdds = []  # one RDD per input file
for f in fileList:
    rdd = jsc.textFile(f)
    rdds.append(rdd)



On Sat, Oct 31, 2015 at 11:09 PM, amit tewari 
wrote:

> Hi
>
> I need to be able to create RDDs programmatically inside my
> program (e.g. based on a variable number of input files).
>
> Can this be done?
>
> I need this as I want to run the following statement inside an iteration:
>
> JavaRDD<String> rdd1 = jsc.textFile("/file1.txt");
>
> Thanks
> Amit
>



-- 
Best Regards,
Ayan Guha


Re: Programatically create RDDs based on input

2015-10-31 Thread amit tewari
Thanks Ayan, that's something similar to what I am looking for, but trying
the same in Java gives a compile error:

JavaRDD<String> jRDD[] = new JavaRDD<String>[3];

//Error: Cannot create a generic array of JavaRDD<String>
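
For context on the error above: Java rejects array creation for a parameterized type because arrays check their element type at runtime, while generic type arguments are erased. The usual workaround is a List. Here is a minimal sketch of that, in which Box<T> is a hypothetical stand-in for JavaRDD<T> (an assumption made so the snippet compiles without Spark) and boxesFor is an illustrative helper name.

```java
import java.util.ArrayList;
import java.util.List;

public class GenericArrayDemo {
    // Hypothetical stand-in for JavaRDD<T>; not part of Spark.
    static final class Box<T> {
        final T value;
        Box(T value) { this.value = value; }
    }

    // Builds one Box per input path; each Box just holds its path as a placeholder.
    static List<Box<String>> boxesFor(String... paths) {
        // Box<String>[] arr = new Box<String>[paths.length]; // rejected: generic array creation
        List<Box<String>> boxes = new ArrayList<>(paths.length);
        for (String p : paths) {
            boxes.add(new Box<>(p));
        }
        return boxes;
    }

    public static void main(String[] args) {
        List<Box<String>> boxes = boxesFor("/file1.txt", "/file2.txt");
        System.out.println(boxes.size());
    }
}
```

With Spark on the classpath, the same shape would presumably be a List<JavaRDD<String>> filled with jsc.textFile(path) inside the loop.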

Thanks
Amit



On Sat, Oct 31, 2015 at 5:46 PM, ayan guha  wrote:

> Corrected a typo...
>
> # In Driver
> fileList=["/file1.txt","/file2.txt"]
> rdds = []
> for f in fileList:
>  rdd = jsc.textFile(f)
>  rdds.append(rdd)
>
>


Re: Programatically create RDDs based on input

2015-10-31 Thread ayan guha
My Java knowledge is limited, but you could try a HashMap and put the RDDs
in it?
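
A rough sketch of that map idea, keyed by file path. The loader is passed in as a function so the snippet runs without Spark; loadAll and the String::length placeholder loader are illustrative assumptions. With Spark, the loader would presumably be a call such as path -> jsc.textFile(path), and the value type JavaRDD<String>.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class RddMapDemo {
    // Applies the loader to every path and stores the result under that path.
    static <V> Map<String, V> loadAll(List<String> paths, Function<String, V> loader) {
        Map<String, V> byPath = new LinkedHashMap<>(); // preserves insertion order
        for (String p : paths) {
            byPath.put(p, loader.apply(p));
        }
        return byPath;
    }

    public static void main(String[] args) {
        // Placeholder loader: maps each path to its length instead of reading a file.
        Map<String, Integer> loaded =
                loadAll(Arrays.asList("/file1.txt", "/file2.txt"), String::length);
        System.out.println(loaded.keySet());
    }
}
```

A map makes it easy to look up the result for a particular input file later, which a plain list or array does not.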


-- 
Best Regards,
Ayan Guha