jhorstmann opened a new pull request #8686:
URL: https://github.com/apache/arrow/pull/8686
This makes it much easier to analyze Parquet files, for example by
processing the output with other command-line tools such as `jq`.
I'm opening this as a draft for now because I'd like feedback on two points:
whether this output format should be the default, and whether the dependency
on `serde_json` should be optional.
Example output:
```
$ parquet-read parquet-testing/data/alltypes_plain.parquet 2 | jq .
{
  "id": 4,
  "bool_col": true,
  "tinyint_col": 0,
  "smallint_col": 0,
  "int_col": 0,
  "bigint_col": 0,
  "float_col": 0,
  "double_col": 0,
  "date_string_col": "03/01/09",
  "string_col": "0",
  "timestamp_col": "2009-03-01 01:00:00 +01:00"
}
{
  "id": 5,
  "bool_col": false,
  "tinyint_col": 1,
  "smallint_col": 1,
  "int_col": 1,
  "bigint_col": 10,
  "float_col": 1.100000023841858,
  "double_col": 10.1,
  "date_string_col": "03/01/09",
  "string_col": "1",
  "timestamp_col": "2009-03-01 01:01:00 +01:00"
}
```
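As an illustration of downstream processing beyond `jq`, here is a minimal Python sketch that consumes a stream of JSON records like the ones above. It assumes the tool emits one JSON object per row, concatenated and separated only by whitespace; the helper name `iter_json_records` and the shortened `sample` records are hypothetical and are not part of this pull request.

```python
import json


def iter_json_records(text):
    """Yield each JSON value from a string of whitespace-separated JSON values.

    Handles both one-object-per-line output and pretty-printed output
    (such as the result of piping through `jq .`).
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace (including newlines) between records.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        record, pos = decoder.raw_decode(text, pos)
        yield record


# Hypothetical sample: a subset of the columns shown in the example output.
sample = """
{"id": 4, "bool_col": true, "int_col": 0, "double_col": 0, "string_col": "0"}
{"id": 5, "bool_col": false, "int_col": 1, "double_col": 10.1, "string_col": "1"}
"""

records = list(iter_json_records(sample))

# Example queries: ids of rows where bool_col is true, and a column sum.
ids_where_true = [r["id"] for r in records if r["bool_col"]]
total_double = sum(r["double_col"] for r in records)
```

In practice the text would come from standard input (for example `sys.stdin.read()`) with the tool's output piped into the script, rather than from an inline string.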