This discussion is moving to the JIRA. Please refer to the JIRA if you are interested in this.

On 9 Mar 2016 6:31 p.m., "Sean Owen" <so...@cloudera.com> wrote:
> From your JIRA, it seems like you're referring to the "part-*" files.
> These files are effectively an internal representation, and I would
> not expect them to have such an extension. For example, you're not
> really guaranteed that the way the data breaks up leaves each file a
> valid JSON doc.
>
> On Wed, Mar 9, 2016 at 5:49 AM, Hyukjin Kwon <gurwls...@gmail.com> wrote:
> > Hi all,
> >
> > Currently, the output from the CSV, TEXT and JSON data sources does not
> > have file extensions such as .csv, .txt and .json (except for compression
> > extensions such as .gz, .deflate and .bz2).
> >
> > In addition, it looks like Parquet uses extensions such as .gz.parquet or
> > .snappy.parquet depending on the compression codec, whereas ORC has no
> > such codec-specific extensions and just uses .orc.
> >
> > I tried to search for JIRAs related to this but could not find any yet.
> > I did not open a JIRA directly because I feel this may already have been
> > considered.
> >
> > Could I open a JIRA for these inconsistent file extensions?
> >
> > I would be thankful if you could give me some feedback.
> >
> > Thanks!
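
To illustrate Sean's point about the part files, here is a minimal sketch in plain Python (no Spark involved). It assumes a part file holds one JSON object per line; the sample records and the helper names are made up for illustration. Under that assumption, each line parses on its own, but the file as a whole is not a single valid JSON document.

```python
import json

# Hypothetical contents of one "part-*" file: one JSON object per line.
# These records are invented for illustration only.
part_file_text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'


def parses_as_single_json_document(text):
    """Return True if the whole text is one valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def parse_json_lines(text):
    """Parse every non-empty line as its own JSON document."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# The file as a whole is rejected, but the individual lines parse fine.
whole_file_ok = parses_as_single_json_document(part_file_text)
records = parse_json_lines(part_file_text)
```

So a reader that expects a .json file to contain exactly one document could reject such a part file, which may be one reason not to expect the extension there.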