Hello Team,

I have been a consumer of Apache Parquet through Apache Hive for several
years now.  For a long time, logging in Parquet has been pretty painful.
Some of the logging was going to STDOUT and some was going to Log4J.
Overall, though, the framework has been too verbose, spewing many log lines
about internal details of Parquet that I don't understand.

The logging has gotten a lot better with recent releases moving solidly
into SLF4J.  That is awesome and very welcome.  However (opinion alert), I
think the logging is still too verbose.  I think Parquet should be a silent
partner in data processing.  If everything is going well, it should be
silent (nothing logged above DEBUG level).  If things are going wrong, it
should throw an Exception.

If an operator suspects Parquet is the issue (and that's rarely the first
thing to check), they can set the logging for all of the Loggers in the
entire Parquet package (org.apache.parquet) to DEBUG to get the required
information.  Not to mention, the less logging it does, the faster it will
be.
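
As a rough sketch of what that operator step could look like in code: the
example below assumes SLF4J is bound to java.util.logging (the slf4j-jdk14
binding), where SLF4J's DEBUG corresponds to Level.FINE.  With Log4j or
Logback the same thing would normally be a one-line change in the logging
configuration instead.  The class and method names here are hypothetical and
only for illustration.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ParquetLogLevels {
    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a configured logger could otherwise be garbage collected.
    private static final Logger PARQUET_ROOT =
            Logger.getLogger("org.apache.parquet");

    /** Turns on DEBUG-equivalent (FINE) output for all org.apache.parquet loggers. */
    public static void enableParquetDebug() {
        // Child loggers without an explicit level inherit this one.
        PARQUET_ROOT.setLevel(Level.FINE);

        // Assumption: the default root handlers are in use. They drop
        // anything below INFO unless their own level is lowered too.
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            handler.setLevel(Level.FINE);
        }
    }

    public static void main(String[] args) {
        enableParquetDebug();
        // Any logger under the package now reports FINE as enabled.
        Logger child =
                Logger.getLogger("org.apache.parquet.hadoop.ParquetFileReader");
        System.out.println("FINE enabled: " + child.isLoggable(Level.FINE));
    }
}
```

Loggers outside org.apache.parquet keep their own levels, so the rest of the
application stays as quiet as it was before.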

I've opened this discussion because I've got two PRs related to this topic
ready to go:

PARQUET-1758
PARQUET-1761

Thanks,
David
