taiyang-li commented on code in PR #8023:
URL: https://github.com/apache/incubator-gluten/pull/8023#discussion_r1863004722
##########
cpp-ch/local-engine/Storages/Output/NormalFileWriter.cpp:
##########
@@ -19,13 +19,134 @@
#include <QueryPipeline/QueryPipeline.h>
#include <Poco/URI.h>
#include <Common/DebugUtils.h>
+#include <Columns/ColumnConst.h>
+#include <Columns/ColumnArray.h>
+#include <Columns/ColumnMap.h>
namespace local_engine
{
+using namespace DB;
+
const std::string SubstraitFileSink::NO_PARTITION_ID{"__NO_PARTITION_ID__"};
const std::string SparkPartitionedBaseSink::DEFAULT_PARTITION_NAME{"__HIVE_DEFAULT_PARTITION__"};
+/// For Nullable(Map(K, V)) or Nullable(Array(T)), if the i-th row is null, we must make sure its nested data is empty.
+/// It is for ORC/Parquet writing compatibility. For more details, refer to
+/// https://github.com/apache/incubator-gluten/issues/8022 and https://github.com/apache/incubator-gluten/issues/8021
+static ColumnPtr truncateNestedDataIfNull(const ColumnPtr & column)
Review Comment:
It is only needed in the current file. It will not be too late to move it into a
separate header file once it is referenced from multiple places.
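
For readers unfamiliar with the invariant described in the doc comment above, here is a minimal sketch of the idea. It does not use the ClickHouse column classes: `NullableArrayColumn` is a hypothetical, simplified stand-in for a Nullable(Array(Int32)) column (flattened values, per-row end offsets, and a null map), and this `truncateNestedDataIfNull` is an illustration written for that stand-in, not the function in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a Nullable(Array(Int32)) column.
// This is NOT the ClickHouse ColumnNullable/ColumnArray API.
struct NullableArrayColumn
{
    std::vector<int32_t> data;     // flattened nested values of all rows
    std::vector<size_t> offsets;   // offsets[i] is the end position of row i in data
    std::vector<uint8_t> null_map; // null_map[i] == 1 means row i is null
};

// Returns a copy in which every null row owns zero nested elements,
// so that a null row never carries leftover nested data.
NullableArrayColumn truncateNestedDataIfNull(const NullableArrayColumn & col)
{
    assert(col.offsets.size() == col.null_map.size());

    NullableArrayColumn out;
    out.null_map = col.null_map;
    out.offsets.reserve(col.offsets.size());

    size_t prev = 0;
    for (size_t i = 0; i < col.offsets.size(); ++i)
    {
        const size_t end = col.offsets[i];
        if (!col.null_map[i])
        {
            // Keep the nested values of non-null rows unchanged.
            out.data.insert(
                out.data.end(),
                col.data.begin() + static_cast<std::ptrdiff_t>(prev),
                col.data.begin() + static_cast<std::ptrdiff_t>(end));
        }
        // For a null row nothing was appended, so its new range is empty.
        out.offsets.push_back(out.data.size());
        prev = end;
    }
    return out;
}
```

In this model, a null row that previously spanned two nested values ends up with an offset equal to the previous row's offset, which is the "nested data is empty" condition the comment asks for.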
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]