[ 
https://issues.apache.org/jira/browse/ARROW-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nic Crane updated ARROW-12914:
------------------------------
    Summary: [C++] ArrowLog with FATAL level is not robust if running in the 
service  (was: C++ ArrowLog with FATAL level is not robust if running in the 
service)

> [C++] ArrowLog with FATAL level is not robust if running in the service
> -----------------------------------------------------------------------
>
>                 Key: ARROW-12914
>                 URL: https://issues.apache.org/jira/browse/ARROW-12914
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 1.0.0, 2.0.0, 3.0.0, 4.0.0, 5.0.0
>            Reporter: Gang Wu
>            Priority: Major
>
> We are a public cloud service provider and use Apache Arrow C++ as a 
> dependency in our provisioned services. However, ArrowLog (CerrLog) with 
> level ARROW_FATAL crashes the service because it calls std::abort(). The 
> details are below.
>  
> 1. The Arrow C++ implementation defines the macro ARROW_CHECK 
> ([https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/logging.h#L61])
>  and uses it everywhere. If the condition in ARROW_CHECK is not met, it 
> creates an ArrowLog object with the level ARROW_FATAL and then logs a message.
> {code:cpp}
> #define ARROW_CHECK(condition)                                               \
>   ARROW_PREDICT_TRUE(condition)                                              \
>   ? ARROW_IGNORE_EXPR(0)                                                     \
>   : ::arrow::util::Voidify() &                                               \
>           ::arrow::util::ArrowLog(__FILE__, __LINE__,                        \
>                                   ::arrow::util::ArrowLogLevel::ARROW_FATAL) \
>               << " Check failed: " #condition " "
> {code}
>  
> 2. The ArrowLog uses CerrLog 
> ([https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/logging.cc#L62])
>  as the logging provider if GLog is not enabled.
> {code:cpp}
> #ifdef ARROW_USE_GLOG
> typedef google::LogMessage LoggingProvider;
> #else
> typedef CerrLog LoggingProvider;
> #endif
> {code}
>  
> 3. The problem is in the destructor of CerrLog. It prints a backtrace and 
> then calls std::abort(). This crashes the service process and makes the 
> service unavailable.
> {code:cpp}
>   virtual ~CerrLog() {
>     if (has_logged_) {
>       std::cerr << std::endl;
>     }
>     if (severity_ == ArrowLogLevel::ARROW_FATAL) {
>       PrintBackTrace();
>       std::abort();
>     }
>   }
> {code}
>  
> I have traced this back to https://issues.apache.org/jira/browse/ARROW-2138 
> and it seems that the behavior is expected. I'd suggest a more robust 
> approach that removes the abort call: the code should throw an exception 
> where the fatal error occurs, and the logger's job should be limited to 
> logging the message. Thoughts?
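
The suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Arrow's API or a proposed patch: `CheckFailure`, `CheckFailureStream`, `ThrowOnFailure`, `SKETCH_CHECK`, and `HandleRequest` are all hypothetical names. The sketch reuses the precedence trick of the Voidify helper shown in ARROW_CHECK, but the exception is thrown from an ordinary operator at the call site rather than from a destructor, because destructors are implicitly noexcept since C++11 and a throw there would end in std::terminate.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical exception type for failed checks (not part of Arrow).
class CheckFailure : public std::runtime_error {
 public:
  using std::runtime_error::runtime_error;
};

// Hypothetical message builder: it only collects and logs text.
// Its (implicit) destructor does nothing special, so it never aborts.
class CheckFailureStream {
 public:
  CheckFailureStream(const char* file, int line, const char* condition) {
    message_ << file << ":" << line << ": Check failed: " << condition << " ";
  }

  template <typename T>
  CheckFailureStream& operator<<(const T& value) {
    message_ << value;
    return *this;
  }

  // Logs the collected message to stderr, then throws it as an exception.
  [[noreturn]] void Raise() const {
    std::cerr << message_.str() << std::endl;
    throw CheckFailure(message_.str());
  }

 private:
  std::ostringstream message_;
};

// Hypothetical counterpart of Voidify: operator& binds more loosely than
// operator<<, so it receives the fully built message and throws from a
// regular function instead of from a destructor.
struct ThrowOnFailure {
  [[noreturn]] void operator&(const CheckFailureStream& stream) const {
    stream.Raise();
  }
};

// Hypothetical macro; both branches of the conditional have type void.
#define SKETCH_CHECK(condition)                    \
  (condition) ? static_cast<void>(0)               \
              : ThrowOnFailure() &                 \
                    CheckFailureStream(__FILE__, __LINE__, #condition)

// Example call site: a failed check becomes an exception that the
// service can catch per request instead of losing the whole process.
int HandleRequest(int rows) {
  SKETCH_CHECK(rows >= 0) << "rows=" << rows;
  return rows * 2;
}
```

With this shape, a caller can wrap `HandleRequest` in a try/catch for `CheckFailure` and turn a failed check into an error response. Whether such an approach would be acceptable for Arrow's own invariant checks is a separate design question that this sketch does not settle.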



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
