[
https://issues.apache.org/jira/browse/PARQUET-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257952#comment-14257952
]
Manish Agarwal edited comment on PARQUET-154 at 12/24/14 5:03 AM:
------------------------------------------------------------------
For write: if the path is not qualified and we pass the path and the file system
separately in the write call, there should not be any issue. The writer would open
a handle on the unqualified path using the supplied file system.
Once we do that, ParquetFileWriter(FileSystem fs, MessageType schema, Path
file) will no longer need to call FileSystem fs =
file.getFileSystem(configuration); because it will use the file system passed in
by the user.
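To illustrate the point above, here is a minimal, self-contained sketch. It does not use the real Hadoop or Parquet classes: SketchFileSystem, deriveFromPath and writeWithExplicitFs are hypothetical stand-ins, and the URIs are made up. It only models the difference between deriving the file system from an unqualified path and using the one the caller passes in.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class ExplicitFileSystemSketch {

    // Hypothetical stand-in for org.apache.hadoop.fs.FileSystem: it only
    // remembers its URI and which paths were created on it.
    static final class SketchFileSystem {
        final URI uri;
        final List<String> createdPaths = new ArrayList<>();

        SketchFileSystem(String uri) {
            this.uri = URI.create(uri);
        }

        void create(String path) {
            createdPaths.add(path);
        }
    }

    // Models deriving the file system from the path: only a path qualified
    // with the target's scheme and authority selects the target; an
    // unqualified path falls back to the default file system.
    static SketchFileSystem deriveFromPath(String path,
                                           SketchFileSystem defaultFs,
                                           SketchFileSystem target) {
        URI pathUri = URI.create(path);
        boolean qualifiedForTarget =
            target.uri.getScheme().equals(pathUri.getScheme())
                && target.uri.getAuthority().equals(pathUri.getAuthority());
        return qualifiedForTarget ? target : defaultFs;
    }

    // Models the proposed write call: the caller supplies the file system,
    // so the path can stay unqualified.
    static void writeWithExplicitFs(SketchFileSystem fs, String path) {
        fs.create(path);
    }

    public static void main(String[] args) {
        SketchFileSystem local = new SketchFileSystem("file:///");
        SketchFileSystem cluster = new SketchFileSystem("hdfs://namenode:8020");
        String unqualified = "/data/out.parquet";

        // Deriving from the unqualified path picks the default, not the cluster.
        System.out.println(deriveFromPath(unqualified, local, cluster) == local);

        // Passing the file system explicitly writes where the caller intended.
        writeWithExplicitFs(cluster, unqualified);
        System.out.println(cluster.createdPaths);
    }
}
```

In this model the caller never has to rewrite the file name to carry the URI part; the file system argument alone decides where the file is created.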
was (Author: manish.agarwal):
For write: if the path is not qualified and we pass the path and the file system
separately in the write call, there should not be any issue. The writer would open
a handle on the unqualified path using the supplied file system.
Once we do that,
+ public ParquetFileWriter(FileSystem fs, MessageType schema, Path file) throws IOException {
+ super();
+ this.schema = schema;
+ // FileSystem fs = file.getFileSystem(configuration); <== NOT REQUIRED
+ this.out = fs.create(file, false);
+ }
and the writer will be:
+ public ParquetWriter(
+ Path file,
+ WriteSupport<T> writeSupport,
+ CompressionCodecName compressionCodecName,
+ int blockSize,
+ int pageSize,
+ int dictionaryPageSize,
+ boolean enableDictionary,
+ boolean validating,
+ WriterVersion writerVersion,
+ Configuration conf,
+ FileSystem fs) throws IOException {
+
+ WriteSupport.WriteContext writeContext = writeSupport.init(conf);
+ MessageType schema = writeContext.getSchema();
+
+ ParquetFileWriter fileWriter = new ParquetFileWriter(fs, schema, file);
+ fileWriter.start();
+
+ CodecFactory codecFactory = new CodecFactory(conf);
+ CodecFactory.BytesCompressor compressor = codecFactory.getCompressor(compressionCodecName, 0);
+ this.writer = new InternalParquetRecordWriter<T>(
+ fileWriter,
+ writeSupport,
+ schema,
+ writeContext.getExtraMetaData(),
+ blockSize,
+ pageSize,
+ compressor,
+ dictionaryPageSize,
+ enableDictionary,
+ validating,
+ writerVersion);
+ }
> parquet Constructors not taking file System object
> --------------------------------------------------
>
> Key: PARQUET-154
> URL: https://issues.apache.org/jira/browse/PARQUET-154
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: parquet-mr_1.6.0
> Reporter: Manish Agarwal
>
> I am trying to create a file in Parquet file format and in RC file format.
>
> No Parquet constructor accepts a FileSystem object as an argument. This means
> that I have to prepend the URI of the file system to the
> file path every time I need to create a new file.
> In RC format, the FileSystem object can be passed to the
> constructor.
> The advantage of passing the FileSystem object to the constructor is that
> I can specify my YARN instance's file system pointer to be used when
> creating the file, and it is quite straightforward. For example, the RC file
> constructors use the file system that we have passed.
> In Parquet, I see the FileSystem object everywhere being derived from the
> parameters or created afresh.
> Is there some reason we avoided using the FileSystem object in the constructor?
> If we also allowed a constructor taking a FileSystem object, I would not have to
> worry about modifying my file name to contain the URI part.