[ 
https://issues.apache.org/jira/browse/PARQUET-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257952#comment-14257952
 ] 

Manish Agarwal edited comment on PARQUET-154 at 12/24/14 5:03 AM:
------------------------------------------------------------------

For write: if the path is not qualified and we pass the path and the file
system separately in the write call, there should not be any issue. It would
open a handle on the unqualified path using the file system.

Once we do that, ParquetFileWriter(FileSystem fs, MessageType schema, Path
file) will no longer need to call FileSystem fs =
file.getFileSystem(configuration); because it will use the file system passed
in by the user.
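
The idea above can be illustrated without any Hadoop dependency. This is only
a minimal sketch of the design choice, not Parquet or Hadoop code: InMemoryFs,
SketchFileWriter, writeVia and the example URIs are all hypothetical stand-ins
invented for illustration. It shows that when the writer takes the file system
from the caller, an unqualified path ends up on that file system and no other.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class FileSystemInjectionSketch {

    /** Hypothetical stand-in for a file system; NOT org.apache.hadoop.fs.FileSystem. */
    static class InMemoryFs {
        final String uri;
        final Map<String, ByteArrayOutputStream> files = new HashMap<>();

        InMemoryFs(String uri) {
            this.uri = uri;
        }

        /** Creates a new file and fails if it already exists (no overwrite). */
        OutputStream create(String path) throws IOException {
            if (files.containsKey(path)) {
                throw new IOException("File already exists: " + path);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            files.put(path, out);
            return out;
        }
    }

    /** Hypothetical writer: uses the caller's file system instead of deriving one from the path. */
    static class SketchFileWriter {
        private final OutputStream out;

        SketchFileWriter(InMemoryFs fs, String unqualifiedPath) throws IOException {
            // No lookup of a file system from the path: the caller decides.
            this.out = fs.create(unqualifiedPath);
        }

        void write(byte[] bytes) throws IOException {
            out.write(bytes);
        }
    }

    /** Writes the payload through the given file system and returns the bytes stored there. */
    static int writeVia(InMemoryFs fs, String path, String payload) throws IOException {
        SketchFileWriter writer = new SketchFileWriter(fs, path);
        writer.write(payload.getBytes(StandardCharsets.UTF_8));
        return fs.files.get(path).size();
    }

    public static void main(String[] args) throws IOException {
        InMemoryFs defaultFs = new InMemoryFs("file:///");
        InMemoryFs clusterFs = new InMemoryFs("hdfs://example-cluster");

        int stored = writeVia(clusterFs, "/tmp/data.parquet", "PAR1");

        System.out.println(stored + " bytes written via " + clusterFs.uri);
        System.out.println("default fs has the file: "
                + defaultFs.files.containsKey("/tmp/data.parquet"));
    }
}
```

The same unqualified path string is used throughout; only the file system
object handed to the writer determines where the data lands.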


was (Author: manish.agarwal):
For write: if the path is not qualified and we pass the path and the file
system separately in the write call, there should not be any issue. It would
open a handle on the unqualified path using the file system.

Once we do that, the constructor becomes:
+  public ParquetFileWriter(FileSystem fs, MessageType schema, Path file) throws IOException {
+    super();
+    this.schema = schema;
+    // FileSystem fs = file.getFileSystem(configuration);   <== NOT REQUIRED 
+    this.out = fs.create(file, false);
+  }


And the writer will be:
+  public ParquetWriter(
+      Path file,
+      WriteSupport<T> writeSupport,
+      CompressionCodecName compressionCodecName,
+      int blockSize,
+      int pageSize,
+      int dictionaryPageSize,
+      boolean enableDictionary,
+      boolean validating,
+      WriterVersion writerVersion,
+      Configuration conf,
+      FileSystem fs) throws IOException {
+
+    WriteSupport.WriteContext writeContext = writeSupport.init(conf);
+    MessageType schema = writeContext.getSchema();
+
+    ParquetFileWriter fileWriter = new ParquetFileWriter(fs, schema, file);
+    fileWriter.start();
+
+    CodecFactory codecFactory = new CodecFactory(conf);
+    CodecFactory.BytesCompressor compressor = codecFactory.getCompressor(compressionCodecName, 0);
+    this.writer = new InternalParquetRecordWriter<T>(
+        fileWriter,
+        writeSupport,
+        schema,
+        writeContext.getExtraMetaData(),
+        blockSize,
+        pageSize,
+        compressor,
+        dictionaryPageSize,
+        enableDictionary,
+        validating,
+        writerVersion);
+  }




> parquet Constructors not taking file System object
> --------------------------------------------------
>
>                 Key: PARQUET-154
>                 URL: https://issues.apache.org/jira/browse/PARQUET-154
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: parquet-mr_1.6.0
>            Reporter: Manish Agarwal
>
> I am trying to create a file in the Parquet file format and in the RC file
> format.
>
> No Parquet constructor accepts a FileSystem object as an argument. This means
> that I have to prepend the URI of the file system to the file path every
> time I need to create a new file.
> In the RC format, the file system object can be passed to the constructor.
> The advantage of passing the file system object into the constructor is that
> I can specify my YARN instance's file system pointer to be used while
> creating the file, and it is quite straightforward. For example, the RC file
> constructors use the file system that we have passed.
> In Parquet, I see the file system object everywhere being derived from the
> parameters or created afresh.
> Is there some reason we avoided using the FileSystem object in the
> constructor? If we allow a constructor that takes a file system object as
> well, I would not have to worry about modifying my file name to contain the
> URI part.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
