[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging is Too Verbose

2021-04-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17313051#comment-17313051
 ] 

ASF GitHub Bot commented on PARQUET-1758:
-

gszadovszky commented on a change in pull request #742:
URL: https://github.com/apache/parquet-mr/pull/742#discussion_r605519876



##
File path: 
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java
##
@@ -111,18 +111,18 @@ private void checkRead() throws IOException {
 if (current == totalCountLoadedSoFar) {
   if (current != 0) {
 totalTimeSpentProcessingRecords += (System.currentTimeMillis() - 
startedAssemblingCurrentBlockAt);
-if (LOG.isInfoEnabled()) {
-LOG.info("Assembled and processed " + totalCountLoadedSoFar + " 
records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords 
+ " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " 
rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / 
totalTimeSpentProcessingRecords) + " cell/ms");
+if (LOG.isDebugEnabled()) {
+LOG.debug("Assembled and processed " + totalCountLoadedSoFar + " 
records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords 
+ " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " 
rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / 
totalTimeSpentProcessingRecords) + " cell/ms");

Review comment:
   I agree with @Fokko. Could you update this?

##
File path: 
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java
##
@@ -111,18 +111,18 @@ private void checkRead() throws IOException {
 if (current == totalCountLoadedSoFar) {
   if (current != 0) {
 totalTimeSpentProcessingRecords += (System.currentTimeMillis() - 
startedAssemblingCurrentBlockAt);
-if (LOG.isInfoEnabled()) {
-LOG.info("Assembled and processed " + totalCountLoadedSoFar + " 
records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords 
+ " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " 
rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / 
totalTimeSpentProcessingRecords) + " cell/ms");
+if (LOG.isDebugEnabled()) {
+LOG.debug("Assembled and processed " + totalCountLoadedSoFar + " 
records from " + columnCount + " columns in " + totalTimeSpentProcessingRecords 
+ " ms: "+((float)totalCountLoadedSoFar / totalTimeSpentProcessingRecords) + " 
rec/ms, " + ((float)totalCountLoadedSoFar * columnCount / 
totalTimeSpentProcessingRecords) + " cell/ms");
 final long totalTime = totalTimeSpentProcessingRecords + 
totalTimeSpentReadingBytes;
 if (totalTime != 0) {
 final long percentReading = 100 * totalTimeSpentReadingBytes / 
totalTime;
 final long percentProcessing = 100 * 
totalTimeSpentProcessingRecords / totalTime;
-LOG.info("time spent so far " + percentReading + "% reading 
("+totalTimeSpentReadingBytes+" ms) and " + percentProcessing + "% processing 
("+totalTimeSpentProcessingRecords+" ms)");
+LOG.debug("time spent so far " + percentReading + "% reading 
("+totalTimeSpentReadingBytes+" ms) and " + percentProcessing + "% processing 
("+totalTimeSpentProcessingRecords+" ms)");

Review comment:
   I agree with @Fokko. Could you update this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> InternalParquetRecordReader Logging is Too Verbose
> --
>
> Key: PARQUET-1758
> URL: https://issues.apache.org/jira/browse/PARQUET-1758
> Project: Parquet
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>
> A low-level library like Parquet should be pretty quiet.  It should just do 
> its work and keep quiet.  Most issues should be addressed by throwing 
> Exceptions, and the occasional warning message otherwise it will clutter the 
> logging for the top-level application.  If debugging is required, 
> administrator can enable it for the specific workload.
> *Warning:* This is my opinion. No stats to back it up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging is Too Verbose

2021-03-31 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312501#comment-17312501
 ] 

ASF GitHub Bot commented on PARQUET-1758:
-

dossett commented on pull request #742:
URL: https://github.com/apache/parquet-mr/pull/742#issuecomment-811179580


   +1 (non-binding) this seems sensible to me


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> InternalParquetRecordReader Logging is Too Verbose
> --
>
> Key: PARQUET-1758
> URL: https://issues.apache.org/jira/browse/PARQUET-1758
> Project: Parquet
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>
> A low-level library like Parquet should be pretty quiet.  It should just do 
> its work and keep quiet.  Most issues should be addressed by throwing 
> Exceptions, and the occasional warning message otherwise it will clutter the 
> logging for the top-level application.  If debugging is required, 
> administrator can enable it for the specific workload.
> *Warning:* This is my opinion. No stats to back it up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose

2020-01-13 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014416#comment-17014416
 ] 

David Mollitor commented on PARQUET-1758:
-

I think the general idea is that almost all logging is DEBUG level for such a 
library.  It may be advantageous to setup YETUS so that the automated builds 
are with DEBUG log enabled, but my feeling is that most logging shouldn't be 
enabled by default.

> InternalParquetRecordReader Logging it Too Verbose
> --
>
> Key: PARQUET-1758
> URL: https://issues.apache.org/jira/browse/PARQUET-1758
> Project: Parquet
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>
> A low-level library like Parquet should be pretty quiet.  It should just do 
> its work and keep quiet.  Most issues should be addressed by throwing 
> Exceptions, and the occasional warning message otherwise it will clutter the 
> logging for the top-level application.  If debugging is required, 
> administrator can enable it for the specific workload.
> *Warning:* This is my opinion. No stats to back it up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose

2020-01-12 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014010#comment-17014010
 ] 

David Mollitor commented on PARQUET-1758:
-

Debug logging would only help performance since less time would be spent 
logging.

> InternalParquetRecordReader Logging it Too Verbose
> --
>
> Key: PARQUET-1758
> URL: https://issues.apache.org/jira/browse/PARQUET-1758
> Project: Parquet
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>
> A low-level library like Parquet should be pretty quiet.  It should just do 
> its work and keep quiet.  Most issues should be addressed by throwing 
> Exceptions, and the occasional warning message otherwise it will clutter the 
> logging for the top-level application.  If debugging is required, 
> administrator can enable it for the specific workload.
> *Warning:* This is my opinion. No stats to back it up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose

2020-01-12 Thread Junjie Chen (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013979#comment-17013979
 ] 

Junjie Chen commented on PARQUET-1758:
--

It might be better to draft a discussion on mail list for this, some user may 
use these logs for analysis. The debug mode might impact the performance.

> InternalParquetRecordReader Logging it Too Verbose
> --
>
> Key: PARQUET-1758
> URL: https://issues.apache.org/jira/browse/PARQUET-1758
> Project: Parquet
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>
> A low-level library like Parquet should be pretty quiet.  It should just do 
> its work and keep quiet.  Most issues should be addressed by throwing 
> Exceptions, and the occasional warning message otherwise it will clutter the 
> logging for the top-level application.  If debugging is required, 
> administrator can enable it for the specific workload.
> *Warning:* This is my opinion. No stats to back it up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose

2020-01-12 Thread Michael Heuer (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013869#comment-17013869
 ] 

Michael Heuer commented on PARQUET-1758:


+1, excessive logging from Parquet has been a pain for us downstream for many 
years

> InternalParquetRecordReader Logging it Too Verbose
> --
>
> Key: PARQUET-1758
> URL: https://issues.apache.org/jira/browse/PARQUET-1758
> Project: Parquet
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>
> A low-level library like Parquet should be pretty quiet.  It should just do 
> its work and keep quiet.  Most issues should be addressed by throwing 
> Exceptions, and the occasional warning message otherwise it will clutter the 
> logging for the top-level application.  If debugging is required, 
> administrator can enable it for the specific workload.
> *Warning:* This is my opinion. No stats to back it up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose

2020-01-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013789#comment-17013789
 ] 

ASF GitHub Bot commented on PARQUET-1758:
-

belugabehr commented on pull request #742: PARQUET-1758: 
InternalParquetRecordReader Logging it Too Verbose
URL: https://github.com/apache/parquet-mr/pull/742
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following 
[PARQUET-1758](https://issues.apache.org/jira/browse/PARQUET/PARQUET-1758) 
issues and references them in the PR title. For example, "PARQUET-1234: My 
Parquet PR"
 - https://issues.apache.org/jira/browse/PARQUET-1758
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Tests
   
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines. In 
addition, my commits follow the guidelines from "[How to write a good git 
commit message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [X] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain Javadoc that 
explain what it does
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> InternalParquetRecordReader Logging it Too Verbose
> --
>
> Key: PARQUET-1758
> URL: https://issues.apache.org/jira/browse/PARQUET-1758
> Project: Parquet
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>
> A low-level library like Parquet should be pretty quiet.  It should just do 
> its work and keep quiet.  Most issues should be addressed by throwing 
> Exceptions, and the occasional warning message otherwise it will clutter the 
> logging for the top-level application.  If debugging is required, 
> administrator can enable it for the specific workload.
> *Warning:* This is my opinion. No stats to back it up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)