spark.local.dir

http://spark.apache.org/docs/latest/configuration.html
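
To make that a bit more concrete, here is a minimal sketch (not a definitive setup) of pointing Spark's scratch space at directories on encrypted volumes. The mount paths and helper names are hypothetical and only for illustration. Per the configuration page above, spark.local.dir is where map output files and RDDs stored on disk end up, and the cluster manager overrides it with SPARK_LOCAL_DIRS (Standalone) or LOCAL_DIRS (YARN). spark.io.encryption.enabled exists from Spark 2.1 on; please check the docs for your version to see exactly which files it covers.

```python
# Sketch: gather the Spark properties that control where scratch data is
# written and render them as spark-submit --conf arguments.
# The directory paths used below are hypothetical examples.

def build_spill_conf(local_dirs, encrypt_io=True):
    """Return Spark properties controlling scratch-space location."""
    conf = {
        # Comma-separated list of scratch directories (map output files,
        # RDDs stored on disk).
        "spark.local.dir": ",".join(local_dirs),
    }
    if encrypt_io:
        # Local disk I/O encryption (Spark 2.1 and later).
        conf["spark.io.encryption.enabled"] = "true"
    return conf


def to_submit_args(conf):
    """Turn a property dict into a flat list of --conf key=value arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


conf = build_spill_conf(["/mnt/encrypted1/spark", "/mnt/encrypted2/spark"])
print(" ".join(to_submit_args(conf)))
```

On YARN the equivalent knob is the NodeManager's yarn.nodemanager.local-dirs setting, so those are the directories that would need to sit on encrypted disks there.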

On Fri, Apr 28, 2017 at 8:51 AM, Shashi Vishwakarma <
shashi.vish...@gmail.com> wrote:

> Yes, I am using HDFS. I am just trying to understand a couple of points.
>
> Two kinds of encryption would be required.
>
> 1. Data in Motion - This could be achieved by enabling SSL -
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_spark-component-guide/content/spark-encryption.html
>
> 2. Data at Rest - HDFS Encryption can be applied.
>
> Apart from this, when Spark executes a job, each disk available on every
> node needs to be encrypted.
>
> I can have multiple disks on each node, and encrypting all of them could be
> a costly operation. Therefore I was trying to identify the folders where
> Spark can spill data during job execution.
>
> Once these folders are identified, only those specific disks need to be encrypted.
>
> Thanks
> Shashi
>
>
>
>
> On Fri, Apr 28, 2017 at 4:34 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> Why don't you use whole disk encryption?
>> Are you using HDFS?
>>
>> On 28. Apr 2017, at 16:57, Shashi Vishwakarma <shashi.vish...@gmail.com>
>> wrote:
>>
>> Agreed, Jörn. Disk encryption is one option that will help to secure data,
>> but how do I know where Spark is spilling temp files, shuffle data and
>> application data?
>>
>> Thanks
>> Shashi
>>
>> On Fri, Apr 28, 2017 at 3:54 PM, Jörn Franke <jornfra...@gmail.com>
>> wrote:
>>
>>> You can use disk encryption as provided by the operating system.
>>> Additionally, you may think about shredding disks after they are not used
>>> anymore.
>>>
>>> > On 28. Apr 2017, at 14:45, Shashi Vishwakarma <
>>> shashi.vish...@gmail.com> wrote:
>>> >
>>> > Hi All
>>> >
>>> > I am dealing with a Spark requirement where the client (a banking
>>> > client, for whom security is a major concern) needs all Spark processing
>>> > to happen securely.
>>> >
>>> > For example, all communication between the Spark client and server
>>> > (driver and executor communication) should be on a secure channel. Even
>>> > when Spark spills to disk based on the storage level (Mem+Disk), the data
>>> > should not be written unencrypted to local disk, or there should be some
>>> > workaround to prevent the spill.
>>> >
>>> > I did some research but could not find a concrete solution. Let me
>>> > know if someone has done this.
>>> >
>>> > Any guidance would be a great help.
>>> >
>>> > Thanks
>>> > Shashi
>>>
>>
>>
>
