Re: Fine control with sc.sequenceFile

2015-06-29 Thread Koert Kuipers
see also: https://github.com/apache/spark/pull/6848 On Mon, Jun 29, 2015 at 12:48 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote: sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "67108864") sc.sequenceFile(getMostRecentDirectory(tablePath, _.startsWith(_)).get + "/*",

Re: Fine control with sc.sequenceFile

2015-06-28 Thread ๏̯͡๏
val hadoopConf = new Configuration(sc.hadoopConfiguration); hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize", "67108864") — but neither sc.hadoopConfiguration(hadoopConf) nor sc.hadoopConfiguration = hadoopConf compiles; both threw errors. On Sun, Jun 28, 2015 at 9:32 PM, Ted Yu

Fine control with sc.sequenceFile

2015-06-28 Thread ๏̯͡๏
I can do this: val hadoopConf = new Configuration(sc.hadoopConfiguration); hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize", "67108864"); sc.newAPIHadoopFile( path + "/*.avro", classOf[AvroKeyInputFormat[GenericRecord]], classOf[AvroKey[GenericRecord]],
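The truncated call above can be sketched in full. This is a sketch only: `path` and the Avro classes come from the snippet, and the key point is that `newAPIHadoopFile` accepts a per-job `Configuration` argument, so the shared `sc.hadoopConfiguration` is left untouched:

```scala
// Sketch, assuming a SparkContext `sc` and an input directory `path`
// (both from the message above), with Avro on the classpath.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.NullWritable
import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat

// Clone the shared configuration so the split size applies to this job only.
val hadoopConf = new Configuration(sc.hadoopConfiguration)
hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize", "67108864") // 64 MB

val records = sc.newAPIHadoopFile(
  path + "/*.avro",
  classOf[AvroKeyInputFormat[GenericRecord]],
  classOf[AvroKey[GenericRecord]],
  classOf[NullWritable],
  hadoopConf) // per-RDD conf: no mutation of sc.hadoopConfiguration
```

Passing the cloned `Configuration` as the last argument sidesteps the "no setter" problem discussed later in the thread.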

Re: Fine control with sc.sequenceFile

2015-06-28 Thread ๏̯͡๏
sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "67108864") sc.sequenceFile(getMostRecentDirectory(tablePath, _.startsWith(_)).get + "/*", classOf[Text], classOf[Text]) works. On Sun, Jun 28, 2015 at 9:46 PM, Ted Yu yuzhih...@gmail.com wrote: There isn't a setter for
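The working approach can be sketched as follows. `getMostRecentDirectory` and `tablePath` are the poster's own helpers (not shown in the thread); the string quoting stripped by the archive is restored:

```scala
// Sketch, assuming a SparkContext `sc` plus the poster's helpers
// `getMostRecentDirectory` and `tablePath`.
import org.apache.hadoop.io.Text

// Set the 64 MB max split size on the shared Hadoop configuration
// *before* calling sequenceFile, which captures the configuration.
sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "67108864")

val rdd = sc.sequenceFile(
  getMostRecentDirectory(tablePath, _.startsWith(_)).get + "/*",
  classOf[Text],
  classOf[Text])
```

Note this mutates the shared `sc.hadoopConfiguration`, so the split size also applies to any Hadoop RDD created afterwards (the caveat Ted raises below in the thread).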

Re: Fine control with sc.sequenceFile

2015-06-28 Thread Ted Yu
There isn't a setter for sc.hadoopConfiguration. You can directly change the value of a parameter in sc.hadoopConfiguration. However, see the note in the scaladoc: '''Note:''' As it will be reused in all Hadoop RDDs, it's better not to modify it unless you plan to set some global configurations for

Re: Fine control with sc.sequenceFile

2015-06-28 Thread Ted Yu
sequenceFile() calls hadoopFile(), where: val confBroadcast = broadcast(new SerializableConfiguration(hadoopConfiguration)). You can set the parameter in sc.hadoopConfiguration before calling sc.sequenceFile(). Cheers On Sun, Jun 28, 2015 at 9:23 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
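Because hadoopFile() snapshots the configuration into a broadcast variable at call time, the ordering of the set() and the sequenceFile() call matters. A minimal sketch, assuming a SparkContext `sc` and a hypothetical `inputPath`:

```scala
// The configuration is wrapped in a SerializableConfiguration and broadcast
// when sequenceFile()/hadoopFile() is called, so later changes do not
// affect an already-created RDD.
import org.apache.hadoop.io.Text

sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "67108864")
val small = sc.sequenceFile(inputPath, classOf[Text], classOf[Text]) // sees 64 MB splits

// Too late for `small`: it already holds a broadcast copy of the old value.
sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "134217728")
```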