Super, it worked. Thanks
On Fri, May 15, 2015 at 12:26 AM, Ram Sriharsha wrote:
Here is an example of how I would pass in the S3 parameters to the Hadoop
configuration in PySpark. You can do something similar for other parameters
you want to pass to the Hadoop configuration:

hadoopConf = sc._jsc.hadoopConfiguration()
hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
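A slightly fuller sketch of the same idea (the credential keys, placeholder values, and the example bucket path below are assumptions for illustration; the exact property names can vary with the Hadoop version and the s3/s3n URI scheme):

hadoopConf = sc._jsc.hadoopConfiguration()
# Use the native S3 filesystem implementation for s3:// URIs
hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
# Credentials -- replace the placeholders with real values
hadoopConf.set("fs.s3.awsAccessKeyId", "YOUR_ACCESS_KEY")
hadoopConf.set("fs.s3.awsSecretAccessKey", "YOUR_SECRET_KEY")

# Subsequent reads pick up the configuration set above
rdd = sc.textFile("s3://your-bucket/path/to/data.txt")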
Thanks for the reply, but _jsc does not have anything to pass Hadoop
configs. Can you illustrate your answer a bit more? TIA...
On Wed, May 13, 2015 at 12:08 AM, Ram Sriharsha wrote:
Yes, the SparkContext in the Python API has a reference to the
JavaSparkContext (jsc), through which you can access the Hadoop configuration:
https://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.SparkContext
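For concreteness, a minimal sketch of that access path (assuming a live SparkContext named sc; the configuration key here is purely hypothetical, just to show the set/get round trip):

hadoopConf = sc._jsc.hadoopConfiguration()    # org.apache.hadoop.conf.Configuration, reached via Py4J
hadoopConf.set("my.example.key", "my-value")  # hypothetical key, for illustration only
print(hadoopConf.get("my.example.key"))       # prints: my-value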
On Tue, May 12, 2015 at 6:39 AM, ayan guha wrote:
Hi
I found this method in the Scala API but not in the Python API (1.3.1).
Basically, I want to change the block size in order to read a binary file using
sc.binaryRecords but with multiple partitions (for testing I want to
generate partitions smaller than the default block size).
Is it possible in Python? If so, how?
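Tying the thread together, one way this could be done is to shrink the maximum input split size through the same hadoopConfiguration() handle before calling sc.binaryRecords. This is only a sketch: the split-size key, the 1 MB value, the record length, and the file path are all assumptions, and whether the key takes effect can depend on the Hadoop version and input format.

# Ask for splits of at most ~1 MB so a single file becomes several partitions
hadoopConf = sc._jsc.hadoopConfiguration()
hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize", str(1024 * 1024))

# binaryRecords(path, recordLength) yields an RDD of fixed-length byte strings
records = sc.binaryRecords("hdfs:///tmp/data.bin", recordLength=16)
print(records.getNumPartitions())  # should be > 1 for a file larger than ~1 MB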