[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/2256


---


[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-12 Thread MikeThomsen
Github user MikeThomsen commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2256#discussion_r150412388
  
--- Diff: 
nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-processors/src/main/java/org/apache/nifi/hbase/PutHBaseRecord.java
 ---
@@ -127,6 +127,18 @@
 .defaultValue("1000")
 .build();
 
+protected static final AllowableValue NULL_FIELD_EMPTY = new 
AllowableValue("empty-string", "Empty String", "Use an empty string");
+protected static final AllowableValue NULL_FIELD_SKIP  = new 
AllowableValue("skip-field", "Skip Field", "Skip the field (don't process it at 
all).");
--- End diff --

Yeah, I think that something like that is warranted here.


---


[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-12 Thread MikeThomsen
Github user MikeThomsen commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2256#discussion_r150412256
  
--- Diff: 
nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-processors/src/main/java/org/apache/nifi/hbase/PutHBaseRecord.java
 ---
@@ -127,6 +127,18 @@
 .defaultValue("1000")
 .build();
 
+protected static final AllowableValue NULL_FIELD_EMPTY = new 
AllowableValue("empty-string", "Empty String", "Use an empty string");
--- End diff --

My original intent was to make it easier on people writing Spark or MR jobs 
against HBase so they could assume in this case that the cell is there but has 
something akin to a null value. However, the points you made here do stand and 
should be added.


---


[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-10 Thread ijokarumawak
Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2256#discussion_r150375738
  
--- Diff: 
nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-processors/src/main/java/org/apache/nifi/hbase/PutHBaseRecord.java
 ---
@@ -127,6 +127,18 @@
 .defaultValue("1000")
 .build();
 
+protected static final AllowableValue NULL_FIELD_EMPTY = new 
AllowableValue("empty-string", "Empty String", "Use an empty string");
--- End diff --

I assume the purpose of putting an empty bytes is to empty an existing cell 
data in the column family. If so, I think we should mention that in the 
description of this and 'Skip Field' options so that user can understand 
ramifications, such as 'Skipping null value doesn't clear existing cell value.' 
.. etc


---


[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-10 Thread ijokarumawak
Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2256#discussion_r150375912
  
--- Diff: 
nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-processors/src/main/java/org/apache/nifi/hbase/PutHBaseRecord.java
 ---
@@ -127,6 +127,18 @@
 .defaultValue("1000")
 .build();
 
+protected static final AllowableValue NULL_FIELD_EMPTY = new 
AllowableValue("empty-string", "Empty String", "Use an empty string");
+protected static final AllowableValue NULL_FIELD_SKIP  = new 
AllowableValue("skip-field", "Skip Field", "Skip the field (don't process it at 
all).");
+
+protected static final PropertyDescriptor NULL_FIELD_STRATEGY = new 
PropertyDescriptor.Builder()
+.name("hbase-record-null-field-strategy")
+.displayName("Null Field Strategy")
+.required(true)
+.defaultValue("empty-string")
--- End diff --

IMHO, I personally prefer having 'Skip Field' as the default value, because 
I think this processor is used more for ingesting stream of data, than updating 
existing data. In that sense, I prefer not to store empty bytes to minimize 
resource utilization.


---


[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-10 Thread ijokarumawak
Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2256#discussion_r150375690
  
--- Diff: 
nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-processors/src/main/java/org/apache/nifi/hbase/PutHBaseRecord.java
 ---
@@ -127,6 +127,18 @@
 .defaultValue("1000")
 .build();
 
+protected static final AllowableValue NULL_FIELD_EMPTY = new 
AllowableValue("empty-string", "Empty String", "Use an empty string");
--- End diff --

Shouldn't the name of this option be 'Empty Bytes'? As there's Record data 
types other than String, 'Empty Bytes' sounds more appropriate to me.


---


[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-10 Thread ijokarumawak
Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2256#discussion_r150375195
  
--- Diff: 
nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-processors/src/main/java/org/apache/nifi/hbase/PutHBaseRecord.java
 ---
@@ -127,6 +127,18 @@
 .defaultValue("1000")
 .build();
 
+protected static final AllowableValue NULL_FIELD_EMPTY = new 
AllowableValue("empty-string", "Empty String", "Use an empty string");
+protected static final AllowableValue NULL_FIELD_SKIP  = new 
AllowableValue("skip-field", "Skip Field", "Skip the field (don't process it at 
all).");
--- End diff --

If a record only has 'Row Identifier Field', I got the following error with 
'Skip Field' strategy. This is another edge case, but do we want to treat such 
records as no-op, so that other records in the same batch operation can be put 
unaffected?

```
2017-11-11 14:09:46,868 ERROR [Timer-Driven Process Thread-2] 
org.apache.nifi.hbase.PutHBaseRecord 
PutHBaseRecord[id=9237f831-4db5-30d0-95e6-9402aa75bac0] Failed to put records 
to HBase.: java.lang.IllegalArgumentException: No columns to insert
java.lang.IllegalArgumentException: No columns to insert
at 
org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1515)
at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.validatePut(BufferedMutatorImpl.java:147)
at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.doMutate(BufferedMutatorImpl.java:134)
at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:105)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1050)
at 
org.apache.nifi.hbase.HBase_1_1_2_ClientService.put(HBase_1_1_2_ClientService.java:315)
```


---


[GitHub] nifi pull request #2256: NIFI-4578 Added strategy for dealing with nullable ...

2017-11-07 Thread MikeThomsen
GitHub user MikeThomsen opened a pull request:

https://github.com/apache/nifi/pull/2256

NIFI-4578 Added strategy for dealing with nullable fields in PutHBase…

…Record.

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/MikeThomsen/nifi NIFI-4578

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2256.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2256


commit d71ca4cd0be94e286b8b063d679444663d061668
Author: Mike Thomsen 
Date:   2017-11-07T16:24:46Z

NIFI-4578 Added strategy for dealing with nullable fields in PutHBaseRecord.




---