[ 
https://issues.apache.org/jira/browse/HADOOP-19597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HADOOP-19597:
-----------------------------------------
    Description: 
Currently, the warning message is logged once (or at most twice after 
HADOOP-8865) when we first use (set/get) a deprecated configuration key and 
most of the time this happens early on during system startup. 

Users tend to set/get properties in their job scripts/applications that keep 
running on an hourly/daily/etc. basis. When a problem comes up, the user will 
package the latest logs and send them to us (developers/support) for further 
analysis and troubleshooting. However, it's very likely that these logs will 
not contain information about the deprecated usages, which might be crucial for 
advancing the investigation.

On the other hand, a warning that appears just once is not that worrisome, so 
even if the users/customers/developers see it, they can easily ignore it and 
move on, thinking that no action is needed on their end.

The above scenarios are based on applications such as the Hive Metastore and 
HiveServer2, which use the Hadoop Configuration and usually run for 
weeks/months without a restart.

I propose to change the existing logic to always log a message when a 
deprecated configuration key is in use. This will minimize the risk of losing 
important deprecation logs and will also simplify the implementation.
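
To illustrate the difference between the two policies, here is a minimal, 
self-contained sketch. It is not Hadoop's Configuration code: the class name, 
the "old.key"/"new.key" mapping, the message text, and the in-memory warning 
list are all hypothetical stand-ins chosen for the example. It only shows that 
the log-once policy needs extra bookkeeping (a set of already-warned keys), 
while the always-log policy does not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a configuration class; not Hadoop's Configuration. */
public class DeprecationLoggingSketch {
    // Hypothetical mapping from a deprecated key to its replacement.
    private final Map<String, String> deprecatedToNew = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();
    // Keys already warned about; only needed by the log-once policy.
    private final Set<String> alreadyWarned = new HashSet<>();
    // Warnings are collected here; a real implementation would call a logger.
    private final List<String> warnings = new ArrayList<>();
    private final boolean logOnce;

    public DeprecationLoggingSketch(boolean logOnce) {
        this.logOnce = logOnce;
        deprecatedToNew.put("old.key", "new.key");
    }

    /** Maps a deprecated key to its replacement, warning according to the policy. */
    private String resolve(String key) {
        String replacement = deprecatedToNew.get(key);
        if (replacement == null) {
            return key; // not deprecated, nothing to report
        }
        // Always-log emits unconditionally; log-once emits only the first time.
        if (!logOnce || alreadyWarned.add(key)) {
            warnings.add(key + " is deprecated. Instead, use " + replacement);
        }
        return replacement;
    }

    public void set(String key, String value) {
        values.put(resolve(key), value);
    }

    public String get(String key) {
        return values.get(resolve(key));
    }

    public List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        for (boolean once : new boolean[] {true, false}) {
            DeprecationLoggingSketch conf = new DeprecationLoggingSketch(once);
            conf.set("old.key", "v");
            conf.get("old.key");
            conf.get("old.key");
            System.out.println((once ? "log-once: " : "always-log: ")
                + conf.warnings().size() + " warning(s)");
        }
    }
}
```

With one set and two gets of the deprecated key, the log-once variant records 
a single warning and the always-log variant records three, so a log excerpt 
taken long after startup would still show the deprecated usage.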

Moreover, since there is a dedicated logger (HADOOP-9487) for these warning 
messages, applications/users can suppress/limit the log content by changing 
their logging configuration. It may not be as precise as logging each 
deprecation once, but it can still address the verbosity concern that was raised 
in the past.
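
As a rough illustration of how a dedicated logger lets users control the 
verbosity, the sketch below raises the level of a single named logger so that 
warnings sent to it are dropped. It uses java.util.logging only to stay 
self-contained; Hadoop itself logs through a different logging stack, where 
the equivalent change would be made in the logging configuration file. The 
logger name is assumed to be "org.apache.hadoop.conf.Configuration.deprecation", 
and the class, handler, and message text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch: silencing one dedicated logger without touching others. */
public class DedicatedLoggerSketch {
    // Assumed name of the dedicated deprecation logger.
    static final String DEPRECATION_LOGGER =
        "org.apache.hadoop.conf.Configuration.deprecation";

    /** Stores messages in memory so the effect of the level change is observable. */
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /** Emits n warnings with the logger set to the given level; returns how many got through. */
    static int warningsDelivered(Level level, int n) {
        Logger logger = Logger.getLogger(DEPRECATION_LOGGER);
        CapturingHandler handler = new CapturingHandler();
        logger.setUseParentHandlers(false); // keep the console quiet for the demo
        logger.addHandler(handler);
        logger.setLevel(level);
        try {
            for (int i = 0; i < n; i++) {
                logger.warning("old.key is deprecated. Instead, use new.key");
            }
            return handler.messages.size();
        } finally {
            logger.removeHandler(handler);
        }
    }

    public static void main(String[] args) {
        System.out.println("level WARNING: " + warningsDelivered(Level.WARNING, 3));
        System.out.println("level SEVERE:  " + warningsDelivered(Level.SEVERE, 3));
    }
}
```

Raising only this logger's level filters out the deprecation warnings while 
leaving every other logger untouched, which is the coarse-grained control 
described above.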

+History+
In HADOOP-6105, where the logging was first introduced, the initial intention 
was to log on every usage, as proposed here. Just before merging, there were 
some concerns that this may clutter the logs, and a more conservative approach 
was adopted:

bq. The warning message will be printed for every set and also for first get of 
deprecated key after every reload of configuration.

Then HADOOP-8197 came in, claiming that it is a bug to log a warning on every 
use, and restricted the logging even more.

Even with HADOOP-8197 in place, some users argued that there are use cases 
where it makes sense to completely suppress these warnings, so HADOOP-9487 
introduced a dedicated logger just for the warning messages.

  was:
Currently, the warning message is logged once (or at most twice after 
HADOOP-8865) when we first use (set/get) a deprecated configuration key and 
most of the time this happens early on during system startup. 

Users tend to set/get properties in their job scripts/applications that keep 
running on an hourly/daily/etc. basis. When a problem comes up, the user will 
package the latest logs and send them to us (developers/support) for further 
analysis and troubleshooting. However, it's very likely that these logs will 
not contain information about the deprecated usages, which might be crucial for 
advancing the investigation.

On the other hand, a warning that appears just once is not that worrisome, so 
even if the users/customers/developers see it, they can easily ignore it and 
move on, thinking that no action is needed on their end.

The above scenarios are based on applications such as the Hive Metastore and 
HiveServer2, which use the Hadoop Configuration and usually run for 
weeks/months without a restart.

I propose to change the existing logic to always log a message when a 
deprecated configuration key is in use. This will minimize the risk of losing 
important deprecation logs and will also simplify the implementation.

Moreover, since there is a dedicated logger (HADOOP-8197) for these warning 
messages, applications/users can suppress/limit the log content by changing 
their logging configuration. It may not be as precise as logging each 
deprecation once, but it can still address the verbosity concern that was raised 
in the past.

+History+
In HADOOP-6105, where the logging was first introduced, the initial intention 
was to log on every usage, as proposed here. Just before merging, there were 
some concerns that this may clutter the logs, and a more conservative approach 
was adopted:

bq. The warning message will be printed for every set and also for first get of 
deprecated key after every reload of configuration.

Then HADOOP-8197 came in, claiming that it is a bug to log a warning on every 
use, and restricted the logging even more.

Even with HADOOP-8197 in place, some users argued that there are use cases 
where it makes sense to completely suppress these warnings, so HADOOP-9487 
introduced a dedicated logger just for the warning messages.


> Log warning message on every set/get of a deprecated configuration property
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-19597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19597
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: hadoop-common
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major


