Re: Using sc.HadoopConfiguration in Python

2015-05-14 Thread ayan guha
Super, it worked. Thanks.

On Fri, May 15, 2015 at 12:26 AM, Ram Sriharsha wrote:
> Here is an example of how I would pass in the S3 parameters to the
> hadoop configuration in pyspark. You can do something similar for other
> parameters you want to pass to the hadoop configuration.
>
> hadoopConf = sc._jsc.hadoopConfiguration()

Re: Using sc.HadoopConfiguration in Python

2015-05-14 Thread Ram Sriharsha
Here is an example of how I would pass in the S3 parameters to the hadoop configuration in pyspark. You can do something similar for other parameters you want to pass to the hadoop configuration:

hadoopConf = sc._jsc.hadoopConfiguration()
hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
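The archived message is cut off at this point; a minimal sketch of the full pattern it describes, assuming the standard fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey property names, with placeholder credentials and path:

    # Sketch only: ACCESS_KEY, SECRET_KEY, and the s3:// path are placeholders.
    hadoopConf = sc._jsc.hadoopConfiguration()
    hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
    hadoopConf.set("fs.s3.awsAccessKeyId", "ACCESS_KEY")
    hadoopConf.set("fs.s3.awsSecretAccessKey", "SECRET_KEY")
    rdd = sc.textFile("s3://some-bucket/some/path")  # read through the configured connector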

Re: Using sc.HadoopConfiguration in Python

2015-05-14 Thread ayan guha
Thanks for the reply, but _jsc does not seem to have anything to pass hadoop configs. Can you illustrate your answer a bit more? TIA...

On Wed, May 13, 2015 at 12:08 AM, Ram Sriharsha wrote:
> yes, the SparkContext in the Python API has a reference to the
> JavaSparkContext (jsc)
>
> https://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.SparkContext

Re: Using sc.HadoopConfiguration in Python

2015-05-12 Thread Ram Sriharsha
Yes, the SparkContext in the Python API has a reference to the JavaSparkContext (jsc), through which you can access the hadoop configuration:

https://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.SparkContext

On Tue, May 12, 2015 at 6:39 AM, ayan guha wrote:
> Hi
>
> I found this method in the scala API but not in the python API (1.3.1).
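A minimal illustration of what that reference looks like in practice (the app name and the property key/value are placeholders):

    from pyspark import SparkContext

    sc = SparkContext(appName="hadoop-conf-demo")         # placeholder app name
    hadoopConf = sc._jsc.hadoopConfiguration()            # JVM-side org.apache.hadoop.conf.Configuration, via Py4J
    hadoopConf.set("some.hadoop.property", "some-value")  # placeholder key/value
    print(hadoopConf.get("some.hadoop.property"))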

Using sc.HadoopConfiguration in Python

2015-05-12 Thread ayan guha
Hi,

I found this method in the scala API but not in the python API (1.3.1). Basically, I want to change the blocksize in order to read a binary file using sc.binaryRecords but with multiple partitions (for testing, I want to generate partitions smaller than the default blocksize). Is it possible in python? If so, how?
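For reference, a sketch of the approach that resolved this thread, applied to the blocksize use case. The file name and record length are placeholders, and the choice of fs.local.block.size (dfs.blocksize for HDFS) as the knob that shrinks the input splits is an assumption, not something spelled out in the thread:

    # Assumption: lowering the filesystem block size makes binaryRecords
    # produce more, smaller input splits. fs.local.block.size covers the
    # local filesystem; dfs.blocksize is the HDFS equivalent.
    hadoopConf = sc._jsc.hadoopConfiguration()
    hadoopConf.set("fs.local.block.size", str(1 * 1024 * 1024))  # 1 MB blocks
    records = sc.binaryRecords("data/records.bin", recordLength=100)
    print(records.getNumPartitions())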