[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-11-10 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r520459478



##
File path: libminifi/test/Utils.h
##
@@ -29,4 +30,11 @@
 return std::forward(instance).method(std::forward(args)...); \
   }
 
-#endif  // LIBMINIFI_TEST_UTILS_H_
+std::string repeat(const std::string& str, std::size_t count) {

Review comment:
   reimplemented repeat using join





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503770589



##
File path: extensions/libarchive/CompressContent.h
##
@@ -33,19 +33,14 @@
 #include "core/Property.h"
 #include "core/logging/LoggerConfiguration.h"
 #include "io/ZlibStream.h"
+#include "utils/Enum.h"
 
 namespace org {
 namespace apache {
 namespace nifi {
 namespace minifi {
 namespace processors {
 
-#define COMPRESSION_FORMAT_ATTRIBUTE "use mime.type attribute"
-#define COMPRESSION_FORMAT_GZIP "gzip"
-#define COMPRESSION_FORMAT_BZIP2 "bzip2"
-#define COMPRESSION_FORMAT_XZ_LZMA2 "xz-lzma2"
-#define COMPRESSION_FORMAT_LZMA "lzma"
-
 #define MODE_COMPRESS "compress"
 #define MODE_DECOMPRESS "decompress"

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503770589



##
File path: extensions/libarchive/CompressContent.h
##
@@ -33,19 +33,14 @@
 #include "core/Property.h"
 #include "core/logging/LoggerConfiguration.h"
 #include "io/ZlibStream.h"
+#include "utils/Enum.h"
 
 namespace org {
 namespace apache {
 namespace nifi {
 namespace minifi {
 namespace processors {
 
-#define COMPRESSION_FORMAT_ATTRIBUTE "use mime.type attribute"
-#define COMPRESSION_FORMAT_GZIP "gzip"
-#define COMPRESSION_FORMAT_BZIP2 "bzip2"
-#define COMPRESSION_FORMAT_XZ_LZMA2 "xz-lzma2"
-#define COMPRESSION_FORMAT_LZMA "lzma"
-
 #define MODE_COMPRESS "compress"
 #define MODE_DECOMPRESS "decompress"

Review comment:
   done, made it into an enum





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503770153



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -81,39 +101,42 @@ void CompressContent::initialize() {
 }
 
 void CompressContent::onSchedule(core::ProcessContext *context, 
core::ProcessSessionFactory *sessionFactory) {
-  std::string value;
   context->getProperty(CompressLevel.getName(), compressLevel_);
   context->getProperty(CompressMode.getName(), compressMode_);
-  context->getProperty(CompressFormat.getName(), compressFormat_);
+
+  {
+std::string compressFormatStr;
+context->getProperty(CompressFormat.getName(), compressFormatStr);
+std::transform(compressFormatStr.begin(), compressFormatStr.end(), 
compressFormatStr.begin(), ::tolower);
+compressFormat_ = 
ExtendedCompressionFormat::parse(compressFormatStr.c_str());
+if (!compressFormat_) {
+  throw Exception(PROCESS_SCHEDULE_EXCEPTION, "Unknown compression format: 
\"" + compressFormatStr + "\"");
+}
+  }

Review comment:
   done, moved the checking into the extraction (getProperty) itself





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503769523



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -60,10 +60,29 @@ core::Property CompressContent::EncapsulateInTar(
   "If false, on compression the content of the 
FlowFile simply gets compressed, and on decompression a simple compressed 
content is expected.\n"
   "true is the behaviour compatible with older MiNiFi 
C++ versions, false is the behaviour compatible with NiFi.")
 ->isRequired(false)->withDefaultValue(true)->build());
+core::Property CompressContent::BatchSize(
+core::PropertyBuilder::createProperty("Batch Size")
+->withDescription("Maximum number of FlowFiles processed in a single 
session")
+->withDefaultValue(1)->build());
 
 core::Relationship CompressContent::Success("success", "FlowFiles will be 
transferred to the success relationship after successfully being compressed or 
decompressed");
 core::Relationship CompressContent::Failure("failure", "FlowFiles will be 
transferred to the failure relationship if they fail to compress/decompress");
 
+std::map 
CompressContent::compressionFormatMimeTypeMap_{

Review comment:
   made them constant, the member variable formatting is all over the 
place, we have both snake_case and camelCase in BinFiles, CompressContent and 
MergeContent (and possibly others as well), I wouldn't touch their names for 
now, maybe in a separate PR renaming them all at once?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503763908



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -45,11 +45,11 @@ core::Property CompressContent::CompressFormat(
 core::PropertyBuilder::createProperty("Compression 
Format")->withDescription("The compression format to use.")
 ->isRequired(false)
 ->withAllowableValues({
-  COMPRESSION_FORMAT_ATTRIBUTE,
-  COMPRESSION_FORMAT_GZIP,
-  COMPRESSION_FORMAT_BZIP2,
-  COMPRESSION_FORMAT_XZ_LZMA2,
-  
COMPRESSION_FORMAT_LZMA})->withDefaultValue(COMPRESSION_FORMAT_ATTRIBUTE)->build());
+  toString(ExtendedCompressionFormat::USE_MIME_TYPE),
+  toString(CompressionFormat::GZIP),
+  toString(CompressionFormat::BZIP2),
+  toString(CompressionFormat::XZ_LZMA2),
+  
toString(CompressionFormat::LZMA)})->withDefaultValue(toString(ExtendedCompressionFormat::USE_MIME_TYPE))->build());

Review comment:
   done, added means to query all enum representations, i.e. 
`ExtendedCompressionFormat::values()`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503739298



##
File path: extensions/libarchive/BinFiles.cpp
##
@@ -259,9 +279,13 @@ void BinFiles::onTrigger(const 
std::shared_ptr &context, c
 }
   }
 
-  auto flow = session->get();
+  for (size_t i = 0; i < batchSize_; ++i) {
+auto flow = session->get();
+
+if (flow == nullptr) {
+  break;

Review comment:
   the problem with yielding is that, this processor (MergeContent) can 
function correctly even if there are no incoming flowFiles (it can still emit 
already processed merged files)
   this is not true for CompressContent so added a yield there





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503739298



##
File path: extensions/libarchive/BinFiles.cpp
##
@@ -259,9 +279,13 @@ void BinFiles::onTrigger(const 
std::shared_ptr &context, c
 }
   }
 
-  auto flow = session->get();
+  for (size_t i = 0; i < batchSize_; ++i) {
+auto flow = session->get();
+
+if (flow == nullptr) {
+  break;

Review comment:
   the problem with yielding is that, this processor can function correctly 
even if there are no incoming flowFiles (it can still emit already processed 
merged files)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503713117



##
File path: extensions/libarchive/BinFiles.cpp
##
@@ -259,9 +279,13 @@ void BinFiles::onTrigger(const 
std::shared_ptr &context, c
 }
   }
 
-  auto flow = session->get();
+  for (size_t i = 0; i < batchSize_; ++i) {
+auto flow = session->get();
+
+if (flow == nullptr) {
+  break;

Review comment:
   👍 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-13 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503712927



##
File path: extensions/libarchive/BinFiles.cpp
##
@@ -38,12 +38,34 @@ namespace nifi {
 namespace minifi {
 namespace processors {
 
-core::Property BinFiles::MinSize("Minimum Group Size", "The minimum size of 
for the bundle", "0");
-core::Property BinFiles::MaxSize("Maximum Group Size", "The maximum size for 
the bundle. If not specified, there is no maximum.", "");
-core::Property BinFiles::MinEntries("Minimum Number of Entries", "The minimum 
number of files to include in a bundle", "1");
-core::Property BinFiles::MaxEntries("Maximum Number of Entries", "The maximum 
number of files to include in a bundle. If not specified, there is no 
maximum.", "");
-core::Property BinFiles::MaxBinAge("Max Bin Age", "The maximum age of a Bin 
that will trigger a Bin to be complete. Expected format is  ", "");
-core::Property BinFiles::MaxBinCount("Maximum number of Bins", "Specifies the 
maximum number of bins that can be held in memory at any one time", "100");
+core::Property BinFiles::MinSize(
+core::PropertyBuilder::createProperty("Minimum Group Size")
+->withDescription("The minimum size of for the bundle")
+->withDefaultValue(0)->build());
+core::Property BinFiles::MaxSize(
+core::PropertyBuilder::createProperty("Maximum Group Size")
+->withDescription("The maximum size for the bundle. If not specified, 
there is no maximum.")
+
->withType(core::StandardValidators::get().UNSIGNED_LONG_VALIDATOR)->build());
+core::Property BinFiles::MinEntries(
+core::PropertyBuilder::createProperty("Minimum Number of Entries")
+->withDescription("The minimum number of files to include in a bundle")
+->withDefaultValue(1)->build());
+core::Property BinFiles::MaxEntries(
+core::PropertyBuilder::createProperty("Maximum Number of Entries")
+->withDescription("The maximum number of files to include in a bundle. If 
not specified, there is no maximum.")
+
->withType(core::StandardValidators::get().UNSIGNED_INT_VALIDATOR)->build());
+core::Property BinFiles::MaxBinAge(
+core::PropertyBuilder::createProperty("Max Bin Age")
+->withDescription("The maximum age of a Bin that will trigger a Bin to be 
complete. Expected format is  ")
+
->withType(core::StandardValidators::get().TIME_PERIOD_VALIDATOR)->build());
+core::Property BinFiles::MaxBinCount(
+core::PropertyBuilder::createProperty("Maximum number of Bins")
+->withDescription("Specifies the maximum number of bins that can be held 
in memory at any one time")
+->withDefaultValue(100)->build());
+core::Property BinFiles::BatchSize(
+core::PropertyBuilder::createProperty("Batch Size")
+->withDescription("Maximum number of FlowFiles processed in a single 
session")
+->withDefaultValue(1)->build());

Review comment:
   I wanted to preserve the original behavior, as this is an optimazation 
feature the user can configure on an as-needed basis.
   Should we change it, and if so, what do you think would be the appropriate 
default?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-12 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503699011



##
File path: libminifi/include/utils/Enum.h
##
@@ -0,0 +1,190 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#pragma once
+
+#include 
+#include 
+#include 
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace utils {
+
+#define COMMA(...) ,
+#define MSVC_HACK(x) x
+
+#define PICK_(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, 
_15, ...) _15
+#define COUNT(...) \
+  MSVC_HACK(PICK_(__VA_ARGS__, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 
0))
+
+#define CONCAT_(a, b) a ## b
+#define CONCAT(a, b) CONCAT_(a, b)
+
+#define CALL(Fn, ...) MSVC_HACK(Fn(__VA_ARGS__))
+#define SPREAD(...) __VA_ARGS__
+
+#define FOR_EACH(fn, delim, ARGS) \
+  CALL(CONCAT(FOR_EACH_, COUNT ARGS), fn, delim, SPREAD ARGS)
+
+#define FOR_EACH_0(...)
+#define FOR_EACH_1(fn, delim, _1) fn(_1)
+#define FOR_EACH_2(fn, delim, _1, _2) fn(_1) delim() fn(_2)
+#define FOR_EACH_3(fn, delim, _1, _2, _3) fn(_1) delim() fn(_2) delim() fn(_3)
+#define FOR_EACH_4(fn, delim, _1, _2, _3, _4) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4)
+#define FOR_EACH_5(fn, delim, _1, _2, _3, _4, _5) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4) delim() fn(_5)
+#define FOR_EACH_6(fn, delim, _1, _2, _3, _4, _5, _6) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4) delim() \
+  fn(_5) delim() fn(_6)
+#define FOR_EACH_7(fn, delim, _1, _2, _3, _4, _5, _6, _7) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4) delim() \
+  fn(_5) delim() fn(_6) delim() fn(_7)
+#define FOR_EACH_8(fn, delim, _1, _2, _3, _4, _5, _6, _7, _8) \

Review comment:
   we could give something like this a try:
   `#define FOR_EACH_8(fn, delim, _1, ...)  fn(_1) delim() FOR_EACH_7(fn, 
delim, __VA_ARGS__)`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-12 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503697223



##
File path: libminifi/include/utils/Enum.h
##
@@ -0,0 +1,190 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#pragma once
+
+#include 
+#include 
+#include 
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace utils {
+
+#define COMMA(...) ,
+#define MSVC_HACK(x) x
+
+#define PICK_(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, 
_15, ...) _15
+#define COUNT(...) \
+  MSVC_HACK(PICK_(__VA_ARGS__, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 
0))
+
+#define CONCAT_(a, b) a ## b
+#define CONCAT(a, b) CONCAT_(a, b)
+
+#define CALL(Fn, ...) MSVC_HACK(Fn(__VA_ARGS__))
+#define SPREAD(...) __VA_ARGS__
+
+#define FOR_EACH(fn, delim, ARGS) \
+  CALL(CONCAT(FOR_EACH_, COUNT ARGS), fn, delim, SPREAD ARGS)
+
+#define FOR_EACH_0(...)
+#define FOR_EACH_1(fn, delim, _1) fn(_1)
+#define FOR_EACH_2(fn, delim, _1, _2) fn(_1) delim() fn(_2)
+#define FOR_EACH_3(fn, delim, _1, _2, _3) fn(_1) delim() fn(_2) delim() fn(_3)
+#define FOR_EACH_4(fn, delim, _1, _2, _3, _4) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4)
+#define FOR_EACH_5(fn, delim, _1, _2, _3, _4, _5) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4) delim() fn(_5)
+#define FOR_EACH_6(fn, delim, _1, _2, _3, _4, _5, _6) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4) delim() \
+  fn(_5) delim() fn(_6)
+#define FOR_EACH_7(fn, delim, _1, _2, _3, _4, _5, _6, _7) \
+  fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4) delim() \
+  fn(_5) delim() fn(_6) delim() fn(_7)
+#define FOR_EACH_8(fn, delim, _1, _2, _3, _4, _5, _6, _7, _8) \

Review comment:
   I wish, but recursive macro expansion is a no-go 😢 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-12 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r503696850



##
File path: libminifi/test/Utils.h
##
@@ -29,4 +30,11 @@
 return std::forward(instance).method(std::forward(args)...); \
   }
 
-#endif  // LIBMINIFI_TEST_UTILS_H_
+std::string repeat(const std::string& str, std::size_t count) {

Review comment:
   oh nice, I was looking in the StringUtils for something like this, and 
even found join, but did not put the pieces together





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-09 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r502398968



##
File path: extensions/libarchive/BinFiles.h
##
@@ -146,36 +146,23 @@ class Bin {
 // BinManager Class
 class BinManager {
  public:
-  // Constructor
-  /*!
-   * Create a new BinManager
-   */
-  BinManager()
-  : minSize_(0),
-maxSize_(ULLONG_MAX),
-maxEntries_(INT_MAX),
-minEntries_(1),
-binAge_(ULLONG_MAX),
-binCount_(0),
-logger_(logging::LoggerFactory::getLogger()) {
-  }
   virtual ~BinManager() {
 purge();
   }
   void setMinSize(const uint64_t &size) {
-minSize_ = size;
+minSize_ = {size};
   }
   void setMaxSize(const uint64_t &size) {
-maxSize_ = size;
+maxSize_ = {size};
   }
-  void setMaxEntries(const int &entries) {
-maxEntries_ = entries;
+  void setMaxEntries(const uint32_t &entries) {
+maxEntries_ = {entries};
   }
-  void setMinEntries(const int &entries) {
-minEntries_ = entries;
+  void setMinEntries(const uint32_t &entries) {
+minEntries_ = {entries};
   }
   void setBinAge(const uint64_t &age) {
-binAge_ = age;
+binAge_ = {age};

Review comment:
   braces are added to prevent implicit narrowing, so we get a compile 
error for the following:
   ```
   int a = 0;
   long b = 3;
   a = {b};
   ```
   +1 for passig by values

##
File path: extensions/libarchive/BinFiles.h
##
@@ -146,36 +146,23 @@ class Bin {
 // BinManager Class
 class BinManager {
  public:
-  // Constructor
-  /*!
-   * Create a new BinManager
-   */
-  BinManager()
-  : minSize_(0),
-maxSize_(ULLONG_MAX),
-maxEntries_(INT_MAX),
-minEntries_(1),
-binAge_(ULLONG_MAX),
-binCount_(0),
-logger_(logging::LoggerFactory::getLogger()) {
-  }
   virtual ~BinManager() {
 purge();
   }
   void setMinSize(const uint64_t &size) {
-minSize_ = size;
+minSize_ = {size};
   }
   void setMaxSize(const uint64_t &size) {
-maxSize_ = size;
+maxSize_ = {size};
   }
-  void setMaxEntries(const int &entries) {
-maxEntries_ = entries;
+  void setMaxEntries(const uint32_t &entries) {
+maxEntries_ = {entries};
   }
-  void setMinEntries(const int &entries) {
-minEntries_ = entries;
+  void setMinEntries(const uint32_t &entries) {
+minEntries_ = {entries};
   }
   void setBinAge(const uint64_t &age) {
-binAge_ = age;
+binAge_ = {age};

Review comment:
   braces are added to prevent implicit narrowing, so we get a compile 
error for the following:
   ```
   int a = 0;
   long b = 3;
   a = {b};
   ```
   +1 for passing by values





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-09 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r502398968



##
File path: extensions/libarchive/BinFiles.h
##
@@ -146,36 +146,23 @@ class Bin {
 // BinManager Class
 class BinManager {
  public:
-  // Constructor
-  /*!
-   * Create a new BinManager
-   */
-  BinManager()
-  : minSize_(0),
-maxSize_(ULLONG_MAX),
-maxEntries_(INT_MAX),
-minEntries_(1),
-binAge_(ULLONG_MAX),
-binCount_(0),
-logger_(logging::LoggerFactory::getLogger()) {
-  }
   virtual ~BinManager() {
 purge();
   }
   void setMinSize(const uint64_t &size) {
-minSize_ = size;
+minSize_ = {size};
   }
   void setMaxSize(const uint64_t &size) {
-maxSize_ = size;
+maxSize_ = {size};
   }
-  void setMaxEntries(const int &entries) {
-maxEntries_ = entries;
+  void setMaxEntries(const uint32_t &entries) {
+maxEntries_ = {entries};
   }
-  void setMinEntries(const int &entries) {
-minEntries_ = entries;
+  void setMinEntries(const uint32_t &entries) {
+minEntries_ = {entries};
   }
   void setBinAge(const uint64_t &age) {
-binAge_ = age;
+binAge_ = {age};

Review comment:
   braces are added to prevent implicit narrowing, so we get a compile 
error for the following:
   ```
   int a = 0;
   long b = 3;
   a = {b};
   ```
   +1 for passig by values

##
File path: extensions/libarchive/BinFiles.h
##
@@ -146,36 +146,23 @@ class Bin {
 // BinManager Class
 class BinManager {
  public:
-  // Constructor
-  /*!
-   * Create a new BinManager
-   */
-  BinManager()
-  : minSize_(0),
-maxSize_(ULLONG_MAX),
-maxEntries_(INT_MAX),
-minEntries_(1),
-binAge_(ULLONG_MAX),
-binCount_(0),
-logger_(logging::LoggerFactory::getLogger()) {
-  }
   virtual ~BinManager() {
 purge();
   }
   void setMinSize(const uint64_t &size) {
-minSize_ = size;
+minSize_ = {size};
   }
   void setMaxSize(const uint64_t &size) {
-maxSize_ = size;
+maxSize_ = {size};
   }
-  void setMaxEntries(const int &entries) {
-maxEntries_ = entries;
+  void setMaxEntries(const uint32_t &entries) {
+maxEntries_ = {entries};
   }
-  void setMinEntries(const int &entries) {
-minEntries_ = entries;
+  void setMinEntries(const uint32_t &entries) {
+minEntries_ = {entries};
   }
   void setBinAge(const uint64_t &age) {
-binAge_ = age;
+binAge_ = {age};

Review comment:
   braces are added to prevent implicit narrowing, so we get a compile 
error for the following:
   ```
   int a = 0;
   long b = 3;
   a = {b};
   ```
   +1 for passing by values





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-01 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r498163298



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -171,7 +183,7 @@ void CompressContent::onTrigger(const 
std::shared_ptr &con
   std::shared_ptr processFlowFile = session->create(flowFile);
   bool success = false;
   if (encapsulateInTar_) {
-CompressContent::WriteCallback callback(compressMode_, compressLevel_, 
compressFormat, flowFile, session);
+CompressContent::WriteCallback callback(compressMode_, compressLevel_, 
toString(compressFormat), flowFile, session);

Review comment:
   we should forward the enum itself instead of its string representation





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-01 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r498162379



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -81,86 +101,78 @@ void CompressContent::initialize() {
 }
 
 void CompressContent::onSchedule(core::ProcessContext *context, 
core::ProcessSessionFactory *sessionFactory) {
-  std::string value;
   context->getProperty(CompressLevel.getName(), compressLevel_);
   context->getProperty(CompressMode.getName(), compressMode_);
-  context->getProperty(CompressFormat.getName(), compressFormat_);
+
+  {
+std::string compressFormatStr;
+context->getProperty(CompressFormat.getName(), compressFormatStr);
+std::transform(compressFormatStr.begin(), compressFormatStr.end(), 
compressFormatStr.begin(), ::tolower);
+compressFormat_ = 
ExtendedCompressionFormat::parse(compressFormatStr.c_str());
+  }
+
   context->getProperty(UpdateFileName.getName(), updateFileName_);
   context->getProperty(EncapsulateInTar.getName(), encapsulateInTar_);
+  context->getProperty(BatchSize.getName(), batchSize_);
 
-  logger_->log_info("Compress Content: Mode [%s] Format [%s] Level [%d] 
UpdateFileName [%d] EncapsulateInTar [%d]",
-  compressMode_, compressFormat_, compressLevel_, updateFileName_, 
encapsulateInTar_);
-
-  // update the mimeTypeMap
-  compressionFormatMimeTypeMap_["application/gzip"] = COMPRESSION_FORMAT_GZIP;
-  compressionFormatMimeTypeMap_["application/bzip2"] = 
COMPRESSION_FORMAT_BZIP2;
-  compressionFormatMimeTypeMap_["application/x-bzip2"] = 
COMPRESSION_FORMAT_BZIP2;
-  compressionFormatMimeTypeMap_["application/x-lzma"] = 
COMPRESSION_FORMAT_LZMA;
-  compressionFormatMimeTypeMap_["application/x-xz"] = 
COMPRESSION_FORMAT_XZ_LZMA2;
-  fileExtension_[COMPRESSION_FORMAT_GZIP] = ".gz";
-  fileExtension_[COMPRESSION_FORMAT_LZMA] = ".lzma";
-  fileExtension_[COMPRESSION_FORMAT_BZIP2] = ".bz2";
-  fileExtension_[COMPRESSION_FORMAT_XZ_LZMA2] = ".xz";
+  logger_->log_info("Compress Content: Mode [%s] CompressionFormat [%s] Level 
[%d] UpdateFileName [%d] EncapsulateInTar [%d]",
+  compressMode_, toString(compressFormat_), compressLevel_, 
updateFileName_, encapsulateInTar_);
 }
 
 void CompressContent::onTrigger(const std::shared_ptr 
&context, const std::shared_ptr &session) {
+  for (size_t i = 0; i < batchSize_; ++i) {
+if (onTriggerImpl(context, session) != TriggerResult::CONTINUE) {
+  break;
+}
+  }
+}
+
+CompressContent::TriggerResult CompressContent::onTriggerImpl(const 
std::shared_ptr &context, const 
std::shared_ptr &session) {
   std::shared_ptr flowFile = session->get();
 
   if (!flowFile) {
-return;
+return TriggerResult::BREAK;

Review comment:
   since this is the only place we return a `BREAK` we could extract the 
flowFile in the `onTrigger` and call this something like `processFlowFile` and 
eliminate the `TriggerResult`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-01 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r498162629



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -81,86 +101,78 @@ void CompressContent::initialize() {
 }
 
 void CompressContent::onSchedule(core::ProcessContext *context, 
core::ProcessSessionFactory *sessionFactory) {
-  std::string value;
   context->getProperty(CompressLevel.getName(), compressLevel_);
   context->getProperty(CompressMode.getName(), compressMode_);
-  context->getProperty(CompressFormat.getName(), compressFormat_);
+
+  {
+std::string compressFormatStr;
+context->getProperty(CompressFormat.getName(), compressFormatStr);
+std::transform(compressFormatStr.begin(), compressFormatStr.end(), 
compressFormatStr.begin(), ::tolower);
+compressFormat_ = 
ExtendedCompressionFormat::parse(compressFormatStr.c_str());
+  }
+
   context->getProperty(UpdateFileName.getName(), updateFileName_);
   context->getProperty(EncapsulateInTar.getName(), encapsulateInTar_);
+  context->getProperty(BatchSize.getName(), batchSize_);
 
-  logger_->log_info("Compress Content: Mode [%s] Format [%s] Level [%d] 
UpdateFileName [%d] EncapsulateInTar [%d]",
-  compressMode_, compressFormat_, compressLevel_, updateFileName_, 
encapsulateInTar_);
-
-  // update the mimeTypeMap
-  compressionFormatMimeTypeMap_["application/gzip"] = COMPRESSION_FORMAT_GZIP;
-  compressionFormatMimeTypeMap_["application/bzip2"] = 
COMPRESSION_FORMAT_BZIP2;
-  compressionFormatMimeTypeMap_["application/x-bzip2"] = 
COMPRESSION_FORMAT_BZIP2;
-  compressionFormatMimeTypeMap_["application/x-lzma"] = 
COMPRESSION_FORMAT_LZMA;
-  compressionFormatMimeTypeMap_["application/x-xz"] = 
COMPRESSION_FORMAT_XZ_LZMA2;
-  fileExtension_[COMPRESSION_FORMAT_GZIP] = ".gz";
-  fileExtension_[COMPRESSION_FORMAT_LZMA] = ".lzma";
-  fileExtension_[COMPRESSION_FORMAT_BZIP2] = ".bz2";
-  fileExtension_[COMPRESSION_FORMAT_XZ_LZMA2] = ".xz";
+  logger_->log_info("Compress Content: Mode [%s] CompressionFormat [%s] Level 
[%d] UpdateFileName [%d] EncapsulateInTar [%d]",
+  compressMode_, toString(compressFormat_), compressLevel_, 
updateFileName_, encapsulateInTar_);
 }
 
 void CompressContent::onTrigger(const std::shared_ptr 
&context, const std::shared_ptr &session) {
+  for (size_t i = 0; i < batchSize_; ++i) {
+if (onTriggerImpl(context, session) != TriggerResult::CONTINUE) {
+  break;
+}
+  }
+}
+
+CompressContent::TriggerResult CompressContent::onTriggerImpl(const 
std::shared_ptr &context, const 
std::shared_ptr &session) {
   std::shared_ptr flowFile = session->get();
 
   if (!flowFile) {
-return;
+return TriggerResult::BREAK;
   }
 
   session->remove(flowFile);
 
-  std::string compressFormat = compressFormat_;
-  if (compressFormat_ == COMPRESSION_FORMAT_ATTRIBUTE) {
+  CompressionFormat::Type compressFormat;
+  if (compressFormat_ == ExtendedCompressionFormat::USE_MIME_TYPE) {
 std::string attr;
 flowFile->getAttribute(core::SpecialFlowAttribute::MIME_TYPE, attr);
 if (attr.empty()) {
   logger_->log_error("No %s attribute existed for the flow, route to 
failure", core::SpecialFlowAttribute::MIME_TYPE);
   session->transfer(flowFile, Failure);
-  return;
+  return TriggerResult::CONTINUE;
 }
 auto search = compressionFormatMimeTypeMap_.find(attr);
 if (search != compressionFormatMimeTypeMap_.end()) {
   compressFormat = search->second;
 } else {
-  logger_->log_info("Mime type of %s is not indicated a support format, 
route to success", attr);
+  logger_->log_info("Mime type of %s is not indicated a support 
CompressionFormat, route to success", attr);

Review comment:
   Ctrl+R error





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-10-01 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r498160741



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -42,14 +42,14 @@ core::Property CompressContent::CompressMode(
 core::PropertyBuilder::createProperty("Mode")->withDescription("Indicates 
whether the processor should compress content or decompress content.")
 ->isRequired(false)->withAllowableValues({MODE_COMPRESS, 
MODE_DECOMPRESS})->withDefaultValue(MODE_COMPRESS)->build());
 core::Property CompressContent::CompressFormat(
-core::PropertyBuilder::createProperty("Compression 
Format")->withDescription("The compression format to use.")
+core::PropertyBuilder::createProperty("Compression 
CompressionFormat")->withDescription("The compression CompressionFormat to 
use.")

Review comment:
   restore property name





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org