[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging is Too Verbose
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17313051#comment-17313051 ] ASF GitHub Bot commented on PARQUET-1758: - gszadovszky commented on a change in pull request #742: URL: https://github.com/apache/parquet-mr/pull/742#discussion_r605519876 ## File path: parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java ## @@ -111,18 +111,18 @@ private void checkRead() throws IOException { if (current == totalCountLoadedSoFar) { if (current != 0) { totalTimeSpentProcessingRecords += (System.currentTimeMillis() - startedAssemblingCurrentBlockAt); -if (LOG.isInfoEnabled()) { -LOG.info("Assembled and processed " + totalCountLoadedSoFar + " records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords + " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / totalTimeSpentProcessingRecords) + " cell/ms"); +if (LOG.isDebugEnabled()) { +LOG.debug("Assembled and processed " + totalCountLoadedSoFar + " records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords + " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / totalTimeSpentProcessingRecords) + " cell/ms"); Review comment: I agree with @Fokko. Could you update this? ## File path: parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java ## @@ -111,18 +111,18 @@ private void checkRead() throws IOException { if (current == totalCountLoadedSoFar) { if (current != 0) { totalTimeSpentProcessingRecords += (System.currentTimeMillis() - startedAssemblingCurrentBlockAt); -if (LOG.isInfoEnabled()) { -LOG.info("Assembled and processed " + totalCountLoadedSoFar + " records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords + " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / totalTimeSpentProcessingRecords) + " cell/ms"); +if (LOG.isDebugEnabled()) { +LOG.debug("Assembled and processed " + totalCountLoadedSoFar + " records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords + " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / totalTimeSpentProcessingRecords) + " cell/ms"); final long totalTime = totalTimeSpentProcessingRecords + totalTimeSpentReadingBytes; if (totalTime != 0) { final long percentReading = 100 * totalTimeSpentReadingBytes / totalTime; final long percentProcessing = 100 * totalTimeSpentProcessingRecords / totalTime; -LOG.info("time spent so far " + percentReading + "% reading ("+totalTimeSpentReadingBytes+" ms) and " + percentProcessing + "% processing ("+totalTimeSpentProcessingRecords+" ms)"); +LOG.debug("time spent so far " + percentReading + "% reading ("+totalTimeSpentReadingBytes+" ms) and " + percentProcessing + "% processing ("+totalTimeSpentProcessingRecords+" ms)"); Review comment: I agree with @Fokko. Could you update this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > InternalParquetRecordReader Logging is Too Verbose > -- > > Key: PARQUET-1758 > URL: https://issues.apache.org/jira/browse/PARQUET-1758 > Project: Parquet > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > > A low-level library like Parquet should be pretty quiet. It should just do > its work and keep quiet. Most issues should be addressed by throwing > Exceptions, and the occasional warning message otherwise it will clutter the > logging for the top-level application. If debugging is required, > administrator can enable it for the specific workload. > *Warning:* This is my opinion. No stats to back it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging is Too Verbose
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312501#comment-17312501 ] ASF GitHub Bot commented on PARQUET-1758: - dossett commented on pull request #742: URL: https://github.com/apache/parquet-mr/pull/742#issuecomment-811179580 +1 (non-binding) this seems sensible to me -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > InternalParquetRecordReader Logging is Too Verbose > -- > > Key: PARQUET-1758 > URL: https://issues.apache.org/jira/browse/PARQUET-1758 > Project: Parquet > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > > A low-level library like Parquet should be pretty quiet. It should just do > its work and keep quiet. Most issues should be addressed by throwing > Exceptions, and the occasional warning message otherwise it will clutter the > logging for the top-level application. If debugging is required, > administrator can enable it for the specific workload. > *Warning:* This is my opinion. No stats to back it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014416#comment-17014416 ] David Mollitor commented on PARQUET-1758: - I think the general idea is that almost all logging is DEBUG level for such a library. It may be advantageous to setup YETUS so that the automated builds are with DEBUG log enabled, but my feeling is that most logging shouldn't be enabled by default. > InternalParquetRecordReader Logging it Too Verbose > -- > > Key: PARQUET-1758 > URL: https://issues.apache.org/jira/browse/PARQUET-1758 > Project: Parquet > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > > A low-level library like Parquet should be pretty quiet. It should just do > its work and keep quiet. Most issues should be addressed by throwing > Exceptions, and the occasional warning message otherwise it will clutter the > logging for the top-level application. If debugging is required, > administrator can enable it for the specific workload. > *Warning:* This is my opinion. No stats to back it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014010#comment-17014010 ] David Mollitor commented on PARQUET-1758: - Debug logging would only help performance since less time would be spent logging. > InternalParquetRecordReader Logging it Too Verbose > -- > > Key: PARQUET-1758 > URL: https://issues.apache.org/jira/browse/PARQUET-1758 > Project: Parquet > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > > A low-level library like Parquet should be pretty quiet. It should just do > its work and keep quiet. Most issues should be addressed by throwing > Exceptions, and the occasional warning message otherwise it will clutter the > logging for the top-level application. If debugging is required, > administrator can enable it for the specific workload. > *Warning:* This is my opinion. No stats to back it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013979#comment-17013979 ] Junjie Chen commented on PARQUET-1758: -- It might be better to draft a discussion on mail list for this, some user may use these logs for analysis. The debug mode might impact the performance. > InternalParquetRecordReader Logging it Too Verbose > -- > > Key: PARQUET-1758 > URL: https://issues.apache.org/jira/browse/PARQUET-1758 > Project: Parquet > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > > A low-level library like Parquet should be pretty quiet. It should just do > its work and keep quiet. Most issues should be addressed by throwing > Exceptions, and the occasional warning message otherwise it will clutter the > logging for the top-level application. If debugging is required, > administrator can enable it for the specific workload. > *Warning:* This is my opinion. No stats to back it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013869#comment-17013869 ] Michael Heuer commented on PARQUET-1758: +1, excessive logging from Parquet has been a pain for us downstream for many years > InternalParquetRecordReader Logging it Too Verbose > -- > > Key: PARQUET-1758 > URL: https://issues.apache.org/jira/browse/PARQUET-1758 > Project: Parquet > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > > A low-level library like Parquet should be pretty quiet. It should just do > its work and keep quiet. Most issues should be addressed by throwing > Exceptions, and the occasional warning message otherwise it will clutter the > logging for the top-level application. If debugging is required, > administrator can enable it for the specific workload. > *Warning:* This is my opinion. No stats to back it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013789#comment-17013789 ] ASF GitHub Bot commented on PARQUET-1758: - belugabehr commented on pull request #742: PARQUET-1758: InternalParquetRecordReader Logging it Too Verbose URL: https://github.com/apache/parquet-mr/pull/742 Make sure you have checked _all_ steps below. ### Jira - [X] My PR addresses the following [PARQUET-1758](https://issues.apache.org/jira/browse/PARQUET/PARQUET-1758) issues and references them in the PR title. For example, "PARQUET-1234: My Parquet PR" - https://issues.apache.org/jira/browse/PARQUET-1758 - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). ### Tests - [X] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [X] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [X] In case of new functionality, my PR adds documentation that describes how to use it. - All the public functions and the classes in the PR contain Javadoc that explain what it does This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > InternalParquetRecordReader Logging it Too Verbose > -- > > Key: PARQUET-1758 > URL: https://issues.apache.org/jira/browse/PARQUET-1758 > Project: Parquet > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > > A low-level library like Parquet should be pretty quiet. It should just do > its work and keep quiet. Most issues should be addressed by throwing > Exceptions, and the occasional warning message otherwise it will clutter the > logging for the top-level application. If debugging is required, > administrator can enable it for the specific workload. > *Warning:* This is my opinion. No stats to back it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)