[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread trkurc
Github user trkurc commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161325906
  
So, some evidence that this was intentional. 

1) it is *harder* to handle the single record case without arrays. 
@markap14 appears to have contributed this as part of NIFI-855 (even though 
@joemeszaros added the comments and was the last to touch these lines of code)
2) the processor description clearly states it has different behavior with 
multiple records
3) It makes sense to me that if you were administering a flow that only 
ever had a single avro record per flow file, you would NOT want it inside an 
array

I believe 3 is a valid use case, and this change may break flows. I highly 
recommend adding a flag to preserve this behavior. I explained more in jira 
(https://issues.apache.org/jira/browse/NIFI-1234)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread apiri
Github user apiri commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161330101
  
@trkurc 
I am failing at the Github and JIRA comment game, but as commented on the 
issue (and I am additionally editing myself for context here):

Sorry, missed some of your earlier comments (these were from last night 
when you expanded the cases) in flipping between here and Github. Also hate the 
property bloat, but think it is necessary to avoid the breaking change. Had a 
hard time grokking the use cases that steered it away from bug status, so am 
onboard with your viewpoint.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread apiri
Github user apiri commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161357053
  
Made the changes to include a property which defaults to the old behavior 
but enables control over whether a single record is placed in a container


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread trkurc
Github user trkurc commented on a diff in the pull request:

https://github.com/apache/nifi/pull/136#discussion_r46499759
  
--- Diff: 
nifi-nar-bundles/nifi-avro-bundle/nifi-avro-processors/src/main/java/org/apache/nifi/processors/avro/ConvertAvroToJSON.java
 ---
@@ -116,51 +124,59 @@ public void onTrigger(ProcessContext context, 
ProcessSession session) throws Pro
 }
 
 final String containerOption = 
context.getProperty(CONTAINER_OPTIONS).getValue();
+final boolean useContainer = 
containerOption.equals(CONTAINER_ARRAY);
+// Wrap a single record (inclusive of no records) only when a 
container is being used
+final boolean wrapSingleRecord = 
context.getProperty(WRAP_SINGLE_RECORD).asBoolean() && useContainer;
 
 try {
 flowFile = session.write(flowFile, new StreamCallback() {
 @Override
 public void process(final InputStream rawIn, final 
OutputStream rawOut) throws IOException {
 try (final InputStream in = new 
BufferedInputStream(rawIn);
-
-final OutputStream out = new 
BufferedOutputStream(rawOut);
-final DataFileStream reader = new 
DataFileStream<>(in, new GenericDatumReader())) {
+ final OutputStream out = new 
BufferedOutputStream(rawOut);
+ final DataFileStream reader = new 
DataFileStream<>(in, new GenericDatumReader())) {
 
 final GenericData genericData = GenericData.get();
 
-if (reader.hasNext() == false ) {
-out.write(EMPTY_JSON_OBJECT);
-return;
+int recordCount = 0;
+GenericRecord currRecord = null;
+if (reader.hasNext()) {
+currRecord = reader.next();
+recordCount++;
 }
-int recordCount = 1;
-GenericRecord reuse = reader.next();
-// Only open container if more than one record
-if(reader.hasNext() && 
containerOption.equals(CONTAINER_ARRAY)){
+
+// Open container if desired output is an array 
format and there are are multiple records or
+// if configured to wrap single record
+if (reader.hasNext() && useContainer || 
wrapSingleRecord) {
 out.write('[');
 }
-
out.write(genericData.toString(reuse).getBytes(StandardCharsets.UTF_8));
+
+// Determine the initial output record, inclusive 
if we should have an empty set of Avro records
+final byte[] outputBytes = (currRecord == null) ? 
EMPTY_JSON_OBJECT : 
genericData.toString(currRecord).getBytes(StandardCharsets.UTF_8);
+out.write(outputBytes);
 
 while (reader.hasNext()) {
-if (containerOption.equals(CONTAINER_ARRAY)) {
+if (useContainer) {
 out.write(',');
 } else {
 out.write('\n');
 }
 
-reuse = reader.next(reuse);
-
out.write(genericData.toString(reuse).getBytes(StandardCharsets.UTF_8));
+currRecord = reader.next(currRecord);
+
out.write(genericData.toString(currRecord).getBytes(StandardCharsets.UTF_8));
 recordCount++;
 }
 
-// Only close container if more than one record
-if (recordCount > 1 && 
containerOption.equals(CONTAINER_ARRAY)) {
+// Close container if desired output is an array 
format and there are multiple records or if
+// configured to wrap a single record
+if (recordCount > 1 && useContainer || 
wrapSingleRecord) {
--- End diff --

For next time, comment is good, but parens would make the logic more clear 
for the next coder to come along:
```
if ((recordCount > 1 && useContainer) || wrapSingleRecord) {
```



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org 

[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread trkurc
Github user trkurc commented on a diff in the pull request:

https://github.com/apache/nifi/pull/136#discussion_r46500106
  
--- Diff: 
nifi-nar-bundles/nifi-avro-bundle/nifi-avro-processors/src/main/java/org/apache/nifi/processors/avro/ConvertAvroToJSON.java
 ---
@@ -116,51 +124,59 @@ public void onTrigger(ProcessContext context, 
ProcessSession session) throws Pro
 }
 
 final String containerOption = 
context.getProperty(CONTAINER_OPTIONS).getValue();
+final boolean useContainer = 
containerOption.equals(CONTAINER_ARRAY);
+// Wrap a single record (inclusive of no records) only when a 
container is being used
+final boolean wrapSingleRecord = 
context.getProperty(WRAP_SINGLE_RECORD).asBoolean() && useContainer;
 
 try {
 flowFile = session.write(flowFile, new StreamCallback() {
 @Override
 public void process(final InputStream rawIn, final 
OutputStream rawOut) throws IOException {
 try (final InputStream in = new 
BufferedInputStream(rawIn);
-
-final OutputStream out = new 
BufferedOutputStream(rawOut);
-final DataFileStream reader = new 
DataFileStream<>(in, new GenericDatumReader())) {
+ final OutputStream out = new 
BufferedOutputStream(rawOut);
+ final DataFileStream reader = new 
DataFileStream<>(in, new GenericDatumReader())) {
 
 final GenericData genericData = GenericData.get();
 
-if (reader.hasNext() == false ) {
-out.write(EMPTY_JSON_OBJECT);
-return;
+int recordCount = 0;
+GenericRecord currRecord = null;
+if (reader.hasNext()) {
+currRecord = reader.next();
+recordCount++;
 }
-int recordCount = 1;
-GenericRecord reuse = reader.next();
-// Only open container if more than one record
-if(reader.hasNext() && 
containerOption.equals(CONTAINER_ARRAY)){
+
+// Open container if desired output is an array 
format and there are are multiple records or
+// if configured to wrap single record
+if (reader.hasNext() && useContainer || 
wrapSingleRecord) {
 out.write('[');
 }
-
out.write(genericData.toString(reuse).getBytes(StandardCharsets.UTF_8));
+
+// Determine the initial output record, inclusive 
if we should have an empty set of Avro records
+final byte[] outputBytes = (currRecord == null) ? 
EMPTY_JSON_OBJECT : 
genericData.toString(currRecord).getBytes(StandardCharsets.UTF_8);
+out.write(outputBytes);
 
 while (reader.hasNext()) {
-if (containerOption.equals(CONTAINER_ARRAY)) {
+if (useContainer) {
 out.write(',');
 } else {
 out.write('\n');
 }
 
-reuse = reader.next(reuse);
-
out.write(genericData.toString(reuse).getBytes(StandardCharsets.UTF_8));
+currRecord = reader.next(currRecord);
+
out.write(genericData.toString(currRecord).getBytes(StandardCharsets.UTF_8));
 recordCount++;
 }
 
-// Only close container if more than one record
-if (recordCount > 1 && 
containerOption.equals(CONTAINER_ARRAY)) {
+// Close container if desired output is an array 
format and there are multiple records or if
+// configured to wrap a single record
+if (recordCount > 1 && useContainer || 
wrapSingleRecord) {
--- End diff --

actually, isn't that the wrong logic? shouldn't it be 

if (useContainer && (recordCount > 1 || wrapSingleRecord))


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread trkurc
Github user trkurc commented on a diff in the pull request:

https://github.com/apache/nifi/pull/136#discussion_r46501389
  
--- Diff: 
nifi-nar-bundles/nifi-avro-bundle/nifi-avro-processors/src/main/java/org/apache/nifi/processors/avro/ConvertAvroToJSON.java
 ---
@@ -116,51 +124,59 @@ public void onTrigger(ProcessContext context, 
ProcessSession session) throws Pro
 }
 
 final String containerOption = 
context.getProperty(CONTAINER_OPTIONS).getValue();
+final boolean useContainer = 
containerOption.equals(CONTAINER_ARRAY);
+// Wrap a single record (inclusive of no records) only when a 
container is being used
+final boolean wrapSingleRecord = 
context.getProperty(WRAP_SINGLE_RECORD).asBoolean() && useContainer;
 
 try {
 flowFile = session.write(flowFile, new StreamCallback() {
 @Override
 public void process(final InputStream rawIn, final 
OutputStream rawOut) throws IOException {
 try (final InputStream in = new 
BufferedInputStream(rawIn);
-
-final OutputStream out = new 
BufferedOutputStream(rawOut);
-final DataFileStream reader = new 
DataFileStream<>(in, new GenericDatumReader())) {
+ final OutputStream out = new 
BufferedOutputStream(rawOut);
+ final DataFileStream reader = new 
DataFileStream<>(in, new GenericDatumReader())) {
 
 final GenericData genericData = GenericData.get();
 
-if (reader.hasNext() == false ) {
-out.write(EMPTY_JSON_OBJECT);
-return;
+int recordCount = 0;
+GenericRecord currRecord = null;
+if (reader.hasNext()) {
+currRecord = reader.next();
+recordCount++;
 }
-int recordCount = 1;
-GenericRecord reuse = reader.next();
-// Only open container if more than one record
-if(reader.hasNext() && 
containerOption.equals(CONTAINER_ARRAY)){
+
+// Open container if desired output is an array 
format and there are are multiple records or
+// if configured to wrap single record
+if (reader.hasNext() && useContainer || 
wrapSingleRecord) {
 out.write('[');
 }
-
out.write(genericData.toString(reuse).getBytes(StandardCharsets.UTF_8));
+
+// Determine the initial output record, inclusive 
if we should have an empty set of Avro records
+final byte[] outputBytes = (currRecord == null) ? 
EMPTY_JSON_OBJECT : 
genericData.toString(currRecord).getBytes(StandardCharsets.UTF_8);
+out.write(outputBytes);
 
 while (reader.hasNext()) {
-if (containerOption.equals(CONTAINER_ARRAY)) {
+if (useContainer) {
 out.write(',');
 } else {
 out.write('\n');
 }
 
-reuse = reader.next(reuse);
-
out.write(genericData.toString(reuse).getBytes(StandardCharsets.UTF_8));
+currRecord = reader.next(currRecord);
+
out.write(genericData.toString(currRecord).getBytes(StandardCharsets.UTF_8));
 recordCount++;
 }
 
-// Only close container if more than one record
-if (recordCount > 1 && 
containerOption.equals(CONTAINER_ARRAY)) {
+// Close container if desired output is an array 
format and there are multiple records or if
+// configured to wrap a single record
+if (recordCount > 1 && useContainer || 
wrapSingleRecord) {
--- End diff --

@apiri how logic flows through code is a personal preference. I was 
confused for a spell, but there were unit tests checking the behavior. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/136


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread olegz
Github user olegz commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161287833
  
I am on the side of @apiri. I can't imagine that was the desirable behavior 
either, since it was simply inconsistent. If one **explicitly** requests an 
array, than it should return an array, no matter how many elements in it. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread apiri
Github user apiri commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161289671
  
It covers the case of records but not record.  I read the documentation to 
mean that as long as the input flowfile was of Avro formatted record(s), it 
would perform the associated conversion but admittedly the comments in the code 
provide much more context.

Regardless, have updated the comments and pushed the branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-02 Thread trkurc
Github user trkurc commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161487382
  
@olegz - I am also guilty of missing comments between github and jira. the 
choice of array or none for container wasn't in 0.3.0, so it wasn't an explicit 
option then. I was basing a lot of my assumptions for "correct" behavior on the 
original patch submitted for NIFI-855 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-01 Thread apiri
GitHub user apiri opened a pull request:

https://github.com/apache/nifi/pull/136

NIFI-1234 Correcting container functionality for single Avro records.

@joemeszaros Had a user open an issue concerning the processor's handling 
of events with only one record.  I agree that the functionality seems a bit odd 
and was wondering if there was a specific case that would be missed should the 
containerOption be the only item to dictate format.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apiri/incubator-nifi NIFI-1234

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/136.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #136


commit 4e07619fd48737b003361070a583330abd2e5d8b
Author: Aldrin Piri 
Date:   2015-12-01T22:38:46Z

NIFI-1234 Correcting container functionality for single Avro records.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-01 Thread apiri
Github user apiri commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161178953
  
@trkurc good catches on the comments. Overall would you feel like this is a 
bug? That's the impression I get and trying to think if this is adjusted is it 
breaking? I imagine it could be but have a hard time envisioning that behavior 
being desirable. Thoughts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1234 Correcting container functionality fo...

2015-12-01 Thread trkurc
Github user trkurc commented on the pull request:

https://github.com/apache/nifi/pull/136#issuecomment-161179877
  
both the code and processor description strongly indicated this was 
intentional.

"If an incoming FlowFile contains a stream of multiple Avro records, the 
resultant FlowFile will contain a JSON Array containing all of the Avro 
records."


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---