Re: Programatically create RDDs based on input
Thanks Natu, Ayan.

I was able to create an array of DataFrames (Spark 1.3+):

    DataFrame[] dfs = new DataFrame[uniqueFileIds.length];

Thanks
Amit

On Sun, Nov 1, 2015 at 10:58 AM, Natu Lauchande wrote:
> Hi Amit,
>
> I don't see any default constructor in the JavaRDD docs:
> https://spark.apache.org/docs/latest/api/java/org/apache/spark/api/java/JavaRDD.html
>
> Have you tried the following?
>
>     // needs java.util.ArrayList and java.util.List
>     List<JavaRDD<String>> jRDD = new ArrayList<>();
>     jRDD.add(jsc.textFile("/file1.txt"));
>     jRDD.add(jsc.textFile("/file2.txt"));
>     // ..
>
> Natu
Re: Programatically create RDDs based on input
Hi Amit,

I don't see any default constructor in the JavaRDD docs:
https://spark.apache.org/docs/latest/api/java/org/apache/spark/api/java/JavaRDD.html

Have you tried the following?

    // needs java.util.ArrayList and java.util.List
    List<JavaRDD<String>> jRDD = new ArrayList<>();
    jRDD.add(jsc.textFile("/file1.txt"));
    jRDD.add(jsc.textFile("/file2.txt"));
    // ..

Natu

On Sat, Oct 31, 2015 at 11:18 PM, ayan guha wrote:
> My Java knowledge is limited, but you could try a HashMap and put the RDDs
> in it?
>
> --
> Best Regards,
> Ayan Guha
Re: Programatically create RDDs based on input
My Java knowledge is limited, but you could try a HashMap and put the RDDs in it?

On Sun, Nov 1, 2015 at 4:34 AM, amit tewari wrote:
> Thanks Ayan, that's something similar to what I am looking for, but trying
> the same in Java gives a compile error:
>
>     JavaRDD<String> jRDD[] = new JavaRDD<String>[3];
>     // Error: Cannot create a generic array of JavaRDD<String>
>
> Thanks
> Amit

--
Best Regards,
Ayan Guha
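A minimal Java sketch of the HashMap idea above, keyed by input path. It is written against a generic loader function so that it compiles without Spark: the class name RddsByPath and the method loadByPath are hypothetical, and the type parameter R stands in for JavaRDD<String>. With Spark on the classpath, the loader would presumably be jsc::textFile, which is an assumption here rather than something tested in this thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: maps each input path to whatever the loader returns.
final class RddsByPath {
    private RddsByPath() {}

    // R stands in for JavaRDD<String>. Assumption: in a Spark program the
    // loader argument would be jsc::textFile.
    static <R> Map<String, R> loadByPath(List<String> paths, Function<String, R> loader) {
        Map<String, R> byPath = new LinkedHashMap<>(); // preserves input order
        for (String path : paths) {
            byPath.put(path, loader.apply(path));
        }
        return byPath;
    }

    public static void main(String[] args) {
        // Stand-in loader that only tags the path, used in place of jsc.textFile.
        Map<String, String> rdds =
            loadByPath(Arrays.asList("/file1.txt", "/file2.txt"), p -> "rdd:" + p);
        System.out.println(rdds.keySet());
    }
}
```

A map avoids the generic-array restriction entirely and lets later code look up an RDD by the file it came from. LinkedHashMap is used only so that iteration follows the order of the input list.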
Re: Programatically create RDDs based on input
Thanks Ayan, that's something similar to what I am looking for, but trying the same in Java gives a compile error:

    JavaRDD<String> jRDD[] = new JavaRDD<String>[3];
    // Error: Cannot create a generic array of JavaRDD<String>

Thanks
Amit

On Sat, Oct 31, 2015 at 5:46 PM, ayan guha wrote:
> Corrected a typo...
>
>     # In Driver
>     fileList = ["/file1.txt", "/file2.txt"]
>     rdds = []
>     for f in fileList:
>         rdd = jsc.textFile(f)
>         rdds.append(rdd)
>
> --
> Best Regards,
> Ayan Guha
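On the "Cannot create a generic array" error reported above: Java does not allow array creation expressions with a parameterized element type, so new JavaRDD<String>[3] is rejected at compile time. The sketch below shows two common workarounds. FakeRdd is a hypothetical stand-in for JavaRDD so that the snippet compiles without Spark, and the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JavaRDD<T>; it only records where it came from.
final class FakeRdd<T> {
    final String source;

    FakeRdd(String source) {
        this.source = source;
    }
}

final class GenericArrayWorkaround {
    private GenericArrayWorkaround() {}

    // Workaround 1: use a List, which has no restriction on generic elements.
    static List<FakeRdd<String>> asList(String... paths) {
        List<FakeRdd<String>> rdds = new ArrayList<>(paths.length);
        for (String path : paths) {
            rdds.add(new FakeRdd<>(path));
        }
        return rdds;
    }

    // Workaround 2: create a raw array and cast it. The cast is unchecked,
    // so the compiler warning is suppressed explicitly.
    @SuppressWarnings("unchecked")
    static FakeRdd<String>[] asArray(String... paths) {
        FakeRdd<String>[] rdds = (FakeRdd<String>[]) new FakeRdd[paths.length];
        for (int i = 0; i < paths.length; i++) {
            rdds[i] = new FakeRdd<>(paths[i]);
        }
        return rdds;
    }

    public static void main(String[] args) {
        System.out.println(asList("/file1.txt", "/file2.txt").size());
        System.out.println(asArray("/file1.txt", "/file2.txt")[1].source);
    }
}
```

The list form is usually the safer choice, since the array form gives up compile-time checking of the element type. An array of a non-generic type, such as DataFrame[] in Spark 1.3+, does not hit this restriction at all.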
Re: Programatically create RDDs based on input
Yes, this can be done. Quick Python equivalent:

    # In Driver: build one RDD per input file
    fileList = ["/file1.txt", "/file2.txt"]
    rdds = []
    for f in fileList:
        rdd = jsc.textFile(f)
        rdds.append(rdd)

On Sat, Oct 31, 2015 at 11:09 PM, amit tewari wrote:
> Hi
>
> I need to be able to create RDDs programmatically inside my program
> (e.g. based on a variable number of input files).
>
> Can this be done?
>
> I need this because I want to run the following statement inside an
> iteration:
>
>     JavaRDD<String> rdd1 = jsc.textFile("/file1.txt");
>
> Thanks
> Amit

--
Best Regards,
Ayan Guha