jhorstmann opened a new pull request #8686:
URL: https://github.com/apache/arrow/pull/8686


   Printing rows as JSON makes it much easier to analyze Parquet files, for 
example by processing the output with other command line tools like `jq`.
   
   I'm opening this as a draft for now since I'd like some feedback on whether 
this should be the default or whether the dependency on `serde_json` should be 
optional.
   
   Example output:
   
   ```
   $ parquet-read parquet-testing/data/alltypes_plain.parquet 2 | jq .
   {
     "id": 4,
     "bool_col": true,
     "tinyint_col": 0,
     "smallint_col": 0,
     "int_col": 0,
     "bigint_col": 0,
     "float_col": 0,
     "double_col": 0,
     "date_string_col": "03/01/09",
     "string_col": "0",
     "timestamp_col": "2009-03-01 01:00:00 +01:00"
   }
   {
     "id": 5,
     "bool_col": false,
     "tinyint_col": 1,
     "smallint_col": 1,
     "int_col": 1,
     "bigint_col": 10,
     "float_col": 1.100000023841858,
     "double_col": 10.1,
     "date_string_col": "03/01/09",
     "string_col": "1",
     "timestamp_col": "2009-03-01 01:01:00 +01:00"
   }
   ```
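
For consumers other than `jq`, the same output can be read by a short script. The following is only a sketch, not part of this PR: it assumes `parquet-read` writes a sequence of JSON objects to stdout, either one per line or pretty-printed as above. The helper name `iter_json_objects` and the shortened sample string are hypothetical.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode rejects leading whitespace, so skip it between documents.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Shortened sample modelled on the example output above (assumed shape).
sample = '{"id": 4, "string_col": "0"}\n{"id": 5, "string_col": "1"}\n'
for row in iter_json_objects(sample):
    print(row["id"], row["string_col"])
```

In a pipeline, the script would pass `sys.stdin.read()` to `iter_json_objects` instead of the sample string. Parsing with `raw_decode` rather than splitting on newlines means the script does not care whether each object sits on one line.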

