[ 
https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15652217#comment-15652217
 ] 

Krish Dey edited comment on SPARK-5682 at 11/9/16 10:29 PM:
------------------------------------------------------------

The constructor still appears to be unchanged. Shouldn't it be modified to 
accommodate encryption of data spilled to disk? Moreover, instead of 
hard-coding DummySerializerInstance, it should be possible to pass in any 
SerializerInstance.

public UnsafeSorterSpillWriter(BlockManager blockManager, int fileBufferSize,
    ShuffleWriteMetrics writeMetrics, int numRecordsToWrite) throws IOException {
  final Tuple2<TempLocalBlockId, File> spilledFileInfo =
    blockManager.diskBlockManager().createTempLocalBlock();
  this.file = spilledFileInfo._2();
  this.blockId = spilledFileInfo._1();
  this.numRecordsToWrite = numRecordsToWrite;
  // Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter.
  // Our write path doesn't actually use this serializer (since we end up calling the `write()`
  // OutputStream methods), but DiskBlockObjectWriter still calls some methods on it. To work
  // around this, we pass a dummy no-op serializer.
  writer = blockManager.getDiskWriter(
    blockId, file, DummySerializerInstance.INSTANCE, fileBufferSize, writeMetrics);
  // Write the number of records
  writeIntToBuffer(numRecordsToWrite, 0);
  writer.write(writeBuffer, 0, 4);
}
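To illustrate what encrypting the spill stream could look like, here is a self-contained sketch using only the JDK's javax.crypto API with AES in CTR mode (the mode named in this issue). This is not Spark's actual implementation: the class name, key handling, and the temp-file plumbing are all hypothetical stand-ins; the real codecs (JceAesCtrCryptoCodec / OpensslAesCtrCryptoCodec) obtain key material from job credentials rather than generating it locally.

```java
import java.io.*;
import java.security.SecureRandom;
import javax.crypto.*;
import javax.crypto.spec.IvParameterSpec;

public class CtrSpillSketch {
    // Encrypts payload to a temp file with AES/CTR, reads it back, and
    // returns the decrypted bytes. Key and IV are generated locally here
    // purely for illustration.
    static byte[] roundTrip(byte[] payload) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        File spill = File.createTempFile("spill", ".data");
        spill.deleteOnExit();

        // Encrypt on the way to disk by wrapping the file stream, analogous
        // to wrapping the OutputStream underneath a DiskBlockObjectWriter.
        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        try (OutputStream out = new CipherOutputStream(new FileOutputStream(spill), enc)) {
            out.write(payload);
        }

        // Decrypt on read-back with the same key and IV.
        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (InputStream in = new CipherInputStream(new FileInputStream(spill), dec)) {
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] out = roundTrip("spilled sorter records".getBytes("UTF-8"));
        System.out.println(new String(out, "UTF-8"));
    }
}
```

Because CTR is a stream mode, the wrapper adds no padding and the on-disk length matches the plaintext length plus whatever header the spill format writes, which is why a stream-wrapping approach fits the spill writer's byte-oriented write path.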



was (Author: krish.dey):
The method still appears to be unchanged. Shouldn't it be modified to 
accommodate encryption of data spilled to disk?

public UnsafeSorterSpillWriter(BlockManager blockManager, int fileBufferSize,
    ShuffleWriteMetrics writeMetrics, int numRecordsToWrite) throws IOException {
  final Tuple2<TempLocalBlockId, File> spilledFileInfo =
    blockManager.diskBlockManager().createTempLocalBlock();
  this.file = spilledFileInfo._2();
  this.blockId = spilledFileInfo._1();
  this.numRecordsToWrite = numRecordsToWrite;
  // Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter.
  // Our write path doesn't actually use this serializer (since we end up calling the `write()`
  // OutputStream methods), but DiskBlockObjectWriter still calls some methods on it. To work
  // around this, we pass a dummy no-op serializer.
  writer = blockManager.getDiskWriter(
    blockId, file, DummySerializerInstance.INSTANCE, fileBufferSize, writeMetrics);
  // Write the number of records
  writeIntToBuffer(numRecordsToWrite, 0);
  writer.write(writeBuffer, 0, 4);
}


> Add encrypted shuffle in spark
> ------------------------------
>
>                 Key: SPARK-5682
>                 URL: https://issues.apache.org/jira/browse/SPARK-5682
>             Project: Spark
>          Issue Type: New Feature
>          Components: Shuffle
>            Reporter: liyunzhang_intel
>            Assignee: Ferdinand Xu
>             Fix For: 2.1.0
>
>         Attachments: Design Document of Encrypted Spark 
> Shuffle_20150209.docx, Design Document of Encrypted Spark 
> Shuffle_20150318.docx, Design Document of Encrypted Spark 
> Shuffle_20150402.docx, Design Document of Encrypted Spark 
> Shuffle_20150506.docx
>
>
> Encrypted shuffle was introduced in Hadoop 2.6 and makes the shuffle data 
> path safer. This feature is needed in Spark as well. AES is a specification 
> for the encryption of electronic data; CTR is one of its five common modes. 
> We use the two codecs JceAesCtrCryptoCodec and OpensslAesCtrCryptoCodec, 
> which are also used in Hadoop's encrypted shuffle, to enable Spark encrypted 
> shuffle. JceAesCtrCryptoCodec uses the encryption algorithms the JDK 
> provides, while OpensslAesCtrCryptoCodec uses the ones OpenSSL provides.
> Because UGI credential info is used in the process of encrypted shuffle, we 
> first enable encrypted shuffle on the Spark-on-YARN framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
