Re: ListS3 emitted record twice

2017-12-06 Thread James Wing
Neil,

I'm not aware of this problem for ListS3.  I don't mean to suggest there are no
issues, rather that many users might not notice, or have come to accept, some
variance in the accuracy of ListS3.  If you can persuade ListS3 to do it
again, that would be great :).

We did recently hear a report of similar behavior in the
similarly-implemented ListGCSBucket processor that does the same list
operation for Google Cloud Storage.  In my brief experience troubleshooting
ListGCSBucket, the issue appears to be that GCS would report different last
modified timestamps in different list API responses, despite what I
believed to be a single write.  I rationalized that as a product of
eventual consistency when write and list operations were taking place
within a few seconds.  That explanation would not make sense with a 10-week-old
file.

One outcome of the ListGCSBucket episode was that a DetectDuplicate processor
placed after the list processor to check for unique keys can be an effective
workaround.

Thanks,

James

On Wed, Dec 6, 2017 at 11:24 AM, Neil Derraugh <
neil.derra...@intellifylearning.com> wrote:

> I have a slowly changing S3 bucket.  It has about 10 files in it.
>
> Prior to today the bucket's most recently modified file was modified
> on September 15, 2017 2:54:40 PM.
>
> One of the files just got updated today (December 6, 2017 4:58:22 PM) and
> ListS3 emitted it properly.  But it also (re-)emitted that file last
> modified on September 15, 2017 2:54:40 PM.  I checked the etags from
> September and today on the spurious file and they match.  Confusing
> behavior.
>
> Anybody seen anything like this before, or know why it happened?
>
> Thanks,
> Neil
>


RE: [EXT] CDC like updates on Nifi

2017-12-06 Thread Peter Wicks (pwicks)
Alberto,

You probably just need to try out the options and see what works best (Avro or 
ORC, etc…).

With the Avro option, you wouldn’t need to change the type of your main HIVE 
table; keep that as ORC.
Only the staging table would use Avro. Then call HiveQL to merge the data from 
your staging table into your main table. Let your cluster's CPU power crunch 
through the data to do the merge.
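
A minimal sketch of that staging-to-main merge issued over JDBC (the URL,
table, and column names below are placeholders, and Hive's MERGE statement
requires Hive 2.2+ with ACID enabled on the target table):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class StagingMerge {
        public static void main(String[] args) throws Exception {
            // Placeholder URL; point it at your HiveServer2 instance.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hive-host:10000/default");
                 Statement stmt = conn.createStatement()) {
                // One set-based statement lets the cluster crunch the whole
                // batch, instead of issuing one statement per changed row.
                stmt.execute(
                    "MERGE INTO main_table t USING staging_avro s ON t.id = s.id " +
                    "WHEN MATCHED AND s.change_type = 'D' THEN DELETE " +
                    "WHEN MATCHED THEN UPDATE SET col_b = s.col_b, col_c = s.col_c " +
                    "WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.col_b, s.col_c)");
            }
        }
    }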

If you split the data using SplitRecord into individual rows, then you could 
probably route on the transaction type. But working with individual rows in 
NiFi adds a lot of overhead, and just imagine executing 10k HiveQL statements 
instead of one big one… If you have ACID enabled I guess it would all 
get recombined, but the overhead of calling that many statements would be 
really high.

--Peter

From: Alberto Bengoa [mailto:albe...@propus.com.br]
Sent: Thursday, December 07, 2017 02:27
To: users@nifi.apache.org
Subject: Re: [EXT] CDC like updates on Nifi

On Tue, Dec 5, 2017 at 11:55 PM, Peter Wicks (pwicks) 
> wrote:

Alberto,

Hello Peter,

Thanks for your answer.



Since it sounds like you have control over the structure of the tables, this 
should be doable.



If you have a changelog table for each table this will probably be easier, and 
in your changelog table you’ll need to make sure you have a good transaction 
timestamp column and a change type column (I/U/D). Then use QueryDatabaseTable 
to tail your change log table, one copy of QueryDatabaseTable for each change 
table.

Yes. This is the way I'm trying to do it. I have the TimeStamp and Operation 
type columns as "metadata columns" and all the other "data columns" of each 
table.



Now your changes are in easy-to-ingest Avro files. For HIVE I’d probably use an 
external table with the Avro schema, this makes it easy to use PutHDFS to load 
the file and make it accessible from HIVE. I haven’t used Phoenix, sorry.

Hmm. Sounds interesting.

I was planning to use ORC because it allows transactions (to make updates / 
deletes). Avro does not support transactions, but changing data using HDFS instead 
of HiveQL would be an option.

Would it be possible to update fields of specific records using PutHDFS?

On my changelog table I do not have the entire row data when triggered by an 
update. I just have the values of changed fields (unchanged fields have null 
values on changelog tables).

_TimeStamp                       _Operation  Column_A  Column_B  Column_C
2017-12-01 14:35:56:204 - 02:00  3           7501
2017-12-01 14:35:56:211 - 02:00  4           7501      1234
2017-12-01 15:25:35:945 - 02:00  3           7503
2017-12-01 15:25:35:945 - 02:00  4           7503      5678

In the example above, we had two update operations (_Operation = 4). Column_B 
was changed, Column_C was not, so Column_C does not carry its prior value.


If you have a single change table for all tables, then you can still use the 
above pattern, but you’ll need a middle step where you extract and rebuild the 
changes. Maybe if you store the changes in JSON you could extract them using 
one of the Record parsers and then rebuild the data row. Much harder though.

I have one changelog table for each table.

Considering that I would use HiveQL to update tables on the Data Lake, could I 
use a RouteOnContent processor to create SQL queries according to the 
_Operation type?






Thanks,

  Peter



Thank you!

Alberto


From: Alberto Bengoa 
[mailto:albe...@propus.com.br]
Sent: Wednesday, December 06, 2017 06:24
To: users@nifi.apache.org
Subject: [EXT] CDC like updates on Nifi



Hey folks,



I read about Nifi CDC processor for MySQL and other CDC "solutions" with Nifi 
found on Google, like these:



https://community.hortonworks.com/idea/53420/apache-nifi-processor-to-address-cdc-use-cases-for.html

https://community.hortonworks.com/questions/88686/change-data-capture-using-nifi-1.html

https://community.hortonworks.com/articles/113941/change-data-capture-cdc-with-apache-nifi-version-1-1.html



I'm trying a different approach to acquire fresh information from tables, using 
triggers on source database's tables to write changes to a "changelog table".



This is done, but my questions are:



Would NiFi be capable of reading these tables and transforming the data to 
generate equivalent SQL queries (insert/update/delete) to send to Hive and/or 
Phoenix with the currently available processors?



Which would be the best / suggested flow?



The objective is to keep tables on the Data Lake as up-to-date as possible for 
real-time analyses.



Cheers,

Alberto



Re: Hyphenated Tables and Columns names

2017-12-06 Thread Alberto Bengoa
Matt,

Not sure if it's related.

I'm trying to use a timestamp column as Maximum-value Column, but it keeps
looping.

I have set Use Avro Logical Types = true on my QueryDatabaseTable processor.

The original columns values are like this:

_Time-Stamp
-------------------------------
2017-12-01 14:35:56:204 - 02:00
2017-12-01 14:35:56:211 - 02:00
2017-12-01 15:25:35:945 - 02:00
2017-12-01 15:25:35:945 - 02:00
2017-12-01 15:28:23:046 - 02:00

So I'm converting to timestamp millis using CAST("_Time-Stamp" AS
TIMESTAMP) "_Time-Stamp"

_Time-Stamp
-----------------------
2017-12-01 14:35:56.204
2017-12-01 14:35:56.211
2017-12-01 15:25:35.945
2017-12-01 15:25:35.945
2017-12-01 15:28:23.046

The state seems right:

Key                                   Value                    Scope
pub."man_fabrica-cdc"@!@_time-stamp   2017-12-04 15:33:23.995  Cluster

Any clue?

Thank you!

Alberto

On Wed, Dec 6, 2017 at 4:06 PM, Alberto Bengoa 
wrote:

> Matt,
>
> Perfect! Enabled and working now.
>
> Thank you!
>
> Cheers,
> Alberto
>
>
>
> On Wed, Dec 6, 2017 at 3:54 PM, Matt Burgess  wrote:
>
>> Alberto,
>>
>> What version of NiFi are you using? As of version 1.1.0,
>> QueryDatabaseTable has a "Normalize Table/Column Names" property that
>> you can set to true, and it will replace all Avro-illegal characters
>> with underscores.
>>
>> Regards,
>> Matt
>>
>>
>> On Wed, Dec 6, 2017 at 12:06 PM, Alberto Bengoa 
>> wrote:
>> > Hey Folks,
>> >
>> > I'm facing an odd situation with NiFi and tables / columns that have
>> > hyphens in their names (traceback below).
>> >
>> > I found in the Avro spec [1] that hyphens are not allowed, so it makes
>> > sense to get this error.
>> >
>> > Is there any way to deal with this situation in NiFi instead of changing
>> > table/column names or creating views to rename the hyphenated names?
>> >
>> > I'm getting this error on the first processor (QueryDatabaseTable) of my
>> > flow.
>> >
>> > Thanks!
>> >
>> > Alberto Bengoa
>> >
>> > [1] - https://avro.apache.org/docs/1.7.7/spec.html#Names
>> >
>> >
>> > 2017-12-06 14:37:25,809 ERROR [Timer-Driven Process Thread-2]
>> > o.a.n.p.standard.QueryDatabaseTable
>> > QueryDatabaseTable[id=9557387b-bbd6-1b2f-b68b-5a4458986794] Unable to
>> > execute SQL select query SELECT "_Change-Sequence" FROM PUB.man_factory_cdc
>> > due to org.apache.nifi.processor.exception.ProcessException: Error during
>> > database query or conversion of records to Avro.: {}
>> > org.apache.nifi.processor.exception.ProcessException: Error during database
>> > query or conversion of records to Avro.
>> > at org.apache.nifi.processors.standard.QueryDatabaseTable.lambda$onTrigger$0(QueryDatabaseTable.java:289)
>> > at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2526)
>> > at org.apache.nifi.processors.standard.QueryDatabaseTable.onTrigger(QueryDatabaseTable.java:283)
>> > at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1118)
>> > at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>> > at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>> > at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> > at java.lang.Thread.run(Thread.java:745)
>> > Caused by: org.apache.avro.SchemaParseException: Illegal character in:
>> > _Change-Sequence
>> >
>>
>
>


Re: [EXT] CDC like updates on Nifi

2017-12-06 Thread Alberto Bengoa
On Tue, Dec 5, 2017 at 11:55 PM, Peter Wicks (pwicks) 
wrote:

> Alberto,
>
Hello Peter,

Thanks for your answer.

>
>
> Since it sounds like you have control over the structure of the tables,
> this should be doable.
>
>
>
> If you have a changelog table for each table this will probably be easier,
> and in your changelog table you’ll need to make sure you have a good
> transaction timestamp column and a change type column (I/U/D). Then use
> QueryDatabaseTable to tail your change log table, one copy of
> QueryDatabaseTable for each change table.
>

Yes. This is the way I'm trying to do it. I have the TimeStamp and
Operation type columns as "metadata columns" and all the other "data
columns" of each table.

>
>
> Now your changes are in easy-to-ingest Avro files. For HIVE I’d probably
> use an external table with the Avro schema, this makes it easy to use
> PutHDFS to load the file and make it accessible from HIVE. I haven’t used
> Phoenix, sorry.
>

Hmm. Sounds interesting.

I was planning to use ORC because it allows transactions (to make updates
/ deletes). Avro does not support transactions, but changing data using HDFS
instead of HiveQL would be an option.

Would it be possible to update fields of specific records using PutHDFS?

On my changelog table I do not have the entire row data when triggered by
an update. I just have the values of changed fields (unchanged fields have
null values on changelog tables).

_TimeStamp                       _Operation  Column_A  Column_B  Column_C
2017-12-01 14:35:56:204 - 02:00  3           7501
2017-12-01 14:35:56:211 - 02:00  4           7501      1234
2017-12-01 15:25:35:945 - 02:00  3           7503
2017-12-01 15:25:35:945 - 02:00  4           7503      5678

In the example above, we had two update operations (_Operation = 4).
Column_B was changed, Column_C was not, so Column_C does not carry its prior value.


> If you have a single change table for all tables, then you can still use
> the above patter, but you’ll need a middle step where you extract and
> rebuild the changes. Maybe if you store the changes in JSON you could
> extract them using one of the Record parsers and then rebuild the data row.
> Much harder though.
>

I have one changelog table for each table.

Considering that I would use HiveQL to update tables on the Data Lake, could
I use a RouteOnContent processor to create SQL queries according to the
_Operation type?

>
>

>


> Thanks,
>
>   Peter
>
>
>

Thank you!

Alberto


> *From:* Alberto Bengoa [mailto:albe...@propus.com.br]
> *Sent:* Wednesday, December 06, 2017 06:24
> *To:* users@nifi.apache.org
> *Subject:* [EXT] CDC like updates on Nifi
>
>
>
> Hey folks,
>
>
>
> I read about Nifi CDC processor for MySQL and other CDC "solutions" with
> Nifi found on Google, like these:
>
>
>
> https://community.hortonworks.com/idea/53420/apache-nifi-pro
> cessor-to-address-cdc-use-cases-for.html
>
> https://community.hortonworks.com/questions/88686/change-dat
> a-capture-using-nifi-1.html
>
> https://community.hortonworks.com/articles/113941/change-dat
> a-capture-cdc-with-apache-nifi-version-1-1.html
>
>
>
> I'm trying a different approach to acquire fresh information from tables,
> using triggers on source database's tables to write changes to a "changelog
> table".
>
>
>
> This is done, but my questions are:
>
>
>
> Would NiFi be capable of reading these tables and transforming the data to
> generate equivalent SQL queries (insert/update/delete) to send to Hive
> and/or Phoenix with the currently available processors?
>
>
>
> Which would be the best / suggested flow?
>
>
>
> The objective is to keep tables on the Data Lake as up-to-date as possible
> for real-time analyses.
>
>
>
> Cheers,
>
> Alberto
>


Re: Hyphenated Tables and Columns names

2017-12-06 Thread Alberto Bengoa
Matt,

Perfect! Enabled and working now.

Thank you!

Cheers,
Alberto



On Wed, Dec 6, 2017 at 3:54 PM, Matt Burgess  wrote:

> Alberto,
>
> What version of NiFi are you using? As of version 1.1.0,
> QueryDatabaseTable has a "Normalize Table/Column Names" property that
> you can set to true, and it will replace all Avro-illegal characters
> with underscores.
>
> Regards,
> Matt
>
>
> On Wed, Dec 6, 2017 at 12:06 PM, Alberto Bengoa 
> wrote:
> > Hey Folks,
> >
> > I'm facing an odd situation with NiFi and tables / columns that have
> > hyphens in their names (traceback below).
> >
> > I found in the Avro spec [1] that hyphens are not allowed, so it makes
> > sense to get this error.
> >
> > Is there any way to deal with this situation in NiFi instead of changing
> > table/column names or creating views to rename the hyphenated names?
> >
> > I'm getting this error on the first processor (QueryDatabaseTable) of my
> > flow.
> >
> > Thanks!
> >
> > Alberto Bengoa
> >
> > [1] - https://avro.apache.org/docs/1.7.7/spec.html#Names
> >
> >
> > 2017-12-06 14:37:25,809 ERROR [Timer-Driven Process Thread-2]
> > o.a.n.p.standard.QueryDatabaseTable
> > QueryDatabaseTable[id=9557387b-bbd6-1b2f-b68b-5a4458986794] Unable to
> > execute SQL select query SELECT "_Change-Sequence" FROM PUB.man_factory_cdc
> > due to org.apache.nifi.processor.exception.ProcessException: Error during
> > database query or conversion of records to Avro.: {}
> > org.apache.nifi.processor.exception.ProcessException: Error during database
> > query or conversion of records to Avro.
> > at org.apache.nifi.processors.standard.QueryDatabaseTable.lambda$onTrigger$0(QueryDatabaseTable.java:289)
> > at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2526)
> > at org.apache.nifi.processors.standard.QueryDatabaseTable.onTrigger(QueryDatabaseTable.java:283)
> > at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1118)
> > at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
> > at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
> > at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > at java.lang.Thread.run(Thread.java:745)
> > Caused by: org.apache.avro.SchemaParseException: Illegal character in:
> > _Change-Sequence
> >
>


Re: Hyphenated Tables and Columns names

2017-12-06 Thread Matt Burgess
Alberto,

What version of NiFi are you using? As of version 1.1.0,
QueryDatabaseTable has a "Normalize Table/Column Names" property that
you can set to true, and it will replace all Avro-illegal characters
with underscores.

Regards,
Matt
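
For illustration, a rough sketch of that kind of normalization (an
approximation; the exact rules NiFi applies may differ). Avro names must
match [A-Za-z_][A-Za-z0-9_]*, so everything else becomes an underscore:

    public class AvroNameNormalizer {

        // Sketch of normalizing a name to Avro's allowed form; NiFi's real
        // implementation may differ in detail.
        static String normalize(String name) {
            String n = name.replaceAll("[^A-Za-z0-9_]", "_");
            // A leading digit is also illegal for Avro names.
            return n.matches("^\\d.*") ? "_" + n : n;
        }

        public static void main(String[] args) {
            System.out.println(normalize("_Change-Sequence")); // _Change_Sequence
            System.out.println(normalize("_Time-Stamp"));      // _Time_Stamp
        }
    }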


On Wed, Dec 6, 2017 at 12:06 PM, Alberto Bengoa  wrote:
> Hey Folks,
>
> I'm facing an odd situation with NiFi and tables / columns that have hyphens
> in their names (traceback below).
>
> I found in the Avro spec [1] that hyphens are not allowed, so it makes sense
> to get this error.
>
> Is there any way to deal with this situation in NiFi instead of changing
> table/column names or creating views to rename the hyphenated names?
>
> I'm getting this error on the first processor (QueryDatabaseTable) of my
> flow.
>
> Thanks!
>
> Alberto Bengoa
>
> [1] - https://avro.apache.org/docs/1.7.7/spec.html#Names
>
>
> 2017-12-06 14:37:25,809 ERROR [Timer-Driven Process Thread-2]
> o.a.n.p.standard.QueryDatabaseTable
> QueryDatabaseTable[id=9557387b-bbd6-1b2f-b68b-5a4458986794] Unable to
> execute SQL select query SELECT "_Change-Sequence" FROM PUB.man_factory_cdc
> due to org.apache.nifi.processor.exception.ProcessException: Error during
> database query or conversion of records to Avro.: {}
> org.apache.nifi.processor.exception.ProcessException: Error during database
> query or conversion of records to Avro.
> at
> org.apache.nifi.processors.standard.QueryDatabaseTable.lambda$onTrigger$0(QueryDatabaseTable.java:289)
> at
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2526)
> at
> org.apache.nifi.processors.standard.QueryDatabaseTable.onTrigger(QueryDatabaseTable.java:283)
> at
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1118)
> at
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
> at
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
> at
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.avro.SchemaParseException: Illegal character in:
> _Change-Sequence
>


Hyphenated Tables and Columns names

2017-12-06 Thread Alberto Bengoa
Hey Folks,

I'm facing an odd situation with NiFi and tables / columns that have
hyphens in their names (traceback below).

I found in the Avro spec [1] that hyphens are not allowed, so it makes sense
to get this error.

Is there any way to deal with this situation in NiFi instead of changing
table/column names or creating views to rename the hyphenated names?

I'm getting this error on the first processor (QueryDatabaseTable) of my
flow.

Thanks!

Alberto Bengoa

[1] - https://avro.apache.org/docs/1.7.7/spec.html#Names


2017-12-06 14:37:25,809 ERROR [Timer-Driven Process Thread-2]
o.a.n.p.standard.QueryDatabaseTable
QueryDatabaseTable[id=9557387b-bbd6-1b2f-b68b-5a4458986794] Unable to
execute SQL select query SELECT "_Change-Sequence" FROM PUB.man_factory_cdc
due to org.apache.nifi.processor.exception.ProcessException: Error during
database query or conversion of records to Avro.: {}
org.apache.nifi.processor.exception.ProcessException: Error during database
query or conversion of records to Avro.
at
org.apache.nifi.processors.standard.QueryDatabaseTable.lambda$onTrigger$0(QueryDatabaseTable.java:289)
at
org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2526)
at
org.apache.nifi.processors.standard.QueryDatabaseTable.onTrigger(QueryDatabaseTable.java:283)
at
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1118)
at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
at
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.avro.SchemaParseException: Illegal character in:
_Change-Sequence


Re: NiFi copying windows .part files

2017-12-06 Thread Ravi Papisetti (rpapiset)
File names are ending with .part (.part is getting suffixed to the filename, not 
prefixed).

In the case of files from a Linux file system, a "." (dot) gets prefixed while 
the transfer is in progress.

Thanks,
Ravi Papisetti

On 06/12/17, 1:47 AM, "Joe Witt"  wrote:

Imagine a filename construct where you wanted to pick up any file that
begins with the phrase 'start' but does NOT end in the phrase 'part'.

The name is of a form 'begin.middle.end'.

This filename start.middle.ok would get picked up.

This filename start.middle.part would not.

The pattern for that example would be
  start\..+\.(?!part)

The key part of that is the negative lookahead for ensuring it does
not end in part.

Thanks
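
A quick way to check that pattern outside NiFi (a sketch; the trailing .*
below assumes the File Filter requires the regex to match the full filename):

    import java.util.regex.Pattern;

    public class PartFilterCheck {
        public static void main(String[] args) {
            // Negative lookahead: pick up "start.<middle>.<end>" names unless
            // the final segment is exactly "part". The trailing ".*" is an
            // assumption about full-string matching.
            Pattern p = Pattern.compile("start\\..+\\.(?!part$).*");
            System.out.println(p.matcher("start.middle.ok").matches());   // true
            System.out.println(p.matcher("start.middle.part").matches()); // false
        }
    }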

On Wed, Dec 6, 2017 at 2:29 AM, Ravi Papisetti (rpapiset)
 wrote:
> Yeah, that is a good idea, but we are already using this option to copy
> files with a certain prefix. Not sure how I can use this field to meet both
> the exclusion and inclusion criteria.
>
> Any thoughts.
>
> Thanks,
> Ravi Papisetti
>
> On 06/12/17, 1:26 AM, "Joe Witt"  wrote:
>
> Ravi
>
> Please use the 'File Filter' property of ListFile to control ignoring
> filenames until they no longer end in 'part'.
>
> Thanks
>
> On Wed, Dec 6, 2017 at 2:14 AM, Ravi Papisetti (rpapiset)
>  wrote:
> > Hi,
> >
> > We are using Apache NiFi 1.3.0.
> >
> > We have a process flow to copy files from NFS to HDFS (with the processors
> > ListFile, FetchFile and PutHDFS).
> >
> > In the NiFi process flow, ListFile is configured to listen to a directory on
> > NFS. When a file (ex: x.csv) is being copied from a Windows machine to NFS
> > (while the transfer is in the middle), a part file (x.csv.part) is created at
> > NFS until the transfer is complete.
> >
> > ListFile picked up this x.csv.part file, FetchFile picked it up to transfer
> > to HDFS, and the file name was not updated back to x.csv in HDFS when the
> > transfer completed.
> >
> > But in the case of a file from a Linux file system, while the file copy to
> > NFS is in progress it is created as .x.csv, and when the transfer is
> > complete, at both NFS and HDFS, the filename is updated to x.csv (from
> > .x.csv).
> >
> > Any thoughts on how we can configure ListFile not to pick up these part
> > files, or any configuration in NiFi that fixes file names for these Windows
> > part files?
> >
> > Appreciate your help.
> >
> > Thanks,
> >
> > Ravi Papisetti
>
>




Re: unable to start InvokeHTTP processor in secure Nifi 1.4.0 cluster....

2017-12-06 Thread Josh Anderton
Joe,

If I can find some time I would love to, but I'm not sure when that may happen.
If someone beats me to the punch, it won't hurt my feelings, but if the
JIRA is still open when I get some time, I may take a stab at it.

Thanks,
Josh
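
For reference, a tiny sketch of the failure mode described in the diagnosis
quoted below; the assumption is that with only a truststore configured, the
SSL context service hands InvokeHTTP a null keystore type:

    import java.security.KeyStore;

    public class KeystoreNpeDemo {
        public static void main(String[] args) throws Exception {
            // What InvokeHTTP effectively ends up doing when the
            // SSLContextService defines only a truststore (assumption based
            // on the diagnosis below): the keystore type comes back null.
            String keystoreType = null;
            KeyStore ks = KeyStore.getInstance(keystoreType); // throws NullPointerException
        }
    }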

On Wed, Dec 6, 2017 at 9:19 AM, Joe Witt  wrote:

> Josh - great find and response!  Thanks.  Any chance you'd like to
> make a PR for it?
>
> On Wed, Dec 6, 2017 at 9:15 AM, dan young  wrote:
> > Heya Josh,
> >
> > Awesome!  This seemed to get me past at least starting the InvokeHTTP.  I
> > will try the flow out later this morning.  Thank you for the follow-up!
> >
> > Regards,
> >
> > Dano
> >
> >
> > On Tue, Dec 5, 2017 at 10:39 PM Josh Anderton 
> > wrote:
> >>
> >> Hi Dan/Joe,
> >>
> >> I have encountered the same issue and after a bit of digging it appears
> as
> >> if during the update to OkHttp3 a bug was introduced in the
> >> setSslFactoryMethod.  The issue is that the method attempts to prepare a
> >> keystore even if properties for the keystore are not defined in the
> >> SSLContextFactory.  The exception is being thrown around line 571 of
> >> InvokeHTTP as a keystore is attempted to be initialized without a
> keystore
> >> type.
> >>
> >> The good news is that there appears to be an easy workaround (not fully
> >> tested yet) which is to define a keystore in your SSLContextFactory,
> you can
> >> even use the same properties already defined for your truststore and I
> >> believe your processor will start working.
> >>
> >> Please let me know if I have misdiagnosed or if there are issues with
> the
> >> workaround.
> >>
> >> Thanks,
> >> Josh
> >>
> >> On Tue, Dec 5, 2017 at 9:42 AM, dan young  wrote:
> >>>
> >>> Hello Joe,
> >>>
> >>> Here's the JIRA. LMK if you need additional details.
> >>>
> >>> https://issues.apache.org/jira/browse/NIFI-4655
> >>>
> >>> Regards,
> >>>
> >>> Dano
> >>>
> >>> On Mon, Dec 4, 2017 at 10:46 AM Joe Witt  wrote:
> 
>  Dan
> 
>  Please share as much of your config for the processor as you can.
>  Also, please file a JIRA for this.  There is definitely a bug that
>  needs to be addressed if you can make an NPE happen.
> 
>  Thanks
> 
>  On Mon, Dec 4, 2017 at 12:27 PM, dan young 
> wrote:
>  > Hello,
>  >
>  >
>  > I'm working on migrating some flows over to a secure cluster with
>  > OIDC. When I try to start an InvokeHTTP processor, I'm getting the
>  > following errors in the logs.  Is there some permission/policy that I
>  > need to set for this to work?  Or is this something else?
>  >
>  >
>  > Nifi 1.4.0
>  >
>  >
>  > 2017-12-04 17:20:03,972 ERROR [StandardProcessScheduler Thread-8]
>  > o.a.nifi.processors.standard.InvokeHTTP
>  > InvokeHTTP[id=ae055c76-88b8-3c86-bd1e-06ca4dcb43d5]
>  > InvokeHTTP[id=ae055c76-88b8-3c86-bd1e-06ca4dcb43d5] failed to invoke
>  > @OnScheduled method due to java.lang.RuntimeException: Failed while
>  > executing one of processor's OnScheduled task.; processor will not be
>  > scheduled to run for 30 seconds: java.lang.RuntimeException: Failed while
>  > executing one of processor's OnScheduled task.
>  >
>  > java.lang.RuntimeException: Failed while executing one of processor's
>  > OnScheduled task.
>  >
>  > at org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1483)
>  > at org.apache.nifi.controller.StandardProcessorNode.access$000(StandardProcessorNode.java:103)
>  > at org.apache.nifi.controller.StandardProcessorNode$1.run(StandardProcessorNode.java:1302)
>  > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>  > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  > at java.lang.Thread.run(Thread.java:748)
>  >
>  > Caused by: java.util.concurrent.ExecutionException:
>  > java.lang.reflect.InvocationTargetException
>  >
>  > at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  

Re: unable to start InvokeHTTP processor in secure Nifi 1.4.0 cluster....

2017-12-06 Thread Joe Witt
Josh - great find and response!  Thanks.  Any chance you'd like to
make a PR for it?

On Wed, Dec 6, 2017 at 9:15 AM, dan young  wrote:
> Heya Josh,
>
> Awesome!  This seemed to get me past at least starting the InvokeHTTP.  I
> will try the flow out later this morning.  Thank you for the follow-up!
>
> Regards,
>
> Dano
>
>
> On Tue, Dec 5, 2017 at 10:39 PM Josh Anderton 
> wrote:
>>
>> Hi Dan/Joe,
>>
>> I have encountered the same issue and after a bit of digging it appears as
>> if during the update to OkHttp3 a bug was introduced in the
>> setSslFactoryMethod.  The issue is that the method attempts to prepare a
>> keystore even if properties for the keystore are not defined in the
>> SSLContextFactory.  The exception is being thrown around line 571 of
>> InvokeHTTP as a keystore is attempted to be initialized without a keystore
>> type.
>>
>> The good news is that there appears to be an easy workaround (not fully
>> tested yet) which is to define a keystore in your SSLContextFactory, you can
>> even use the same properties already defined for your truststore and I
>> believe your processor will start working.
>>
>> Please let me know if I have misdiagnosed or if there are issues with the
>> workaround.
>>
>> Thanks,
>> Josh
>>
>> On Tue, Dec 5, 2017 at 9:42 AM, dan young  wrote:
>>>
>>> Hello Joe,
>>>
>>> Here's the JIRA. LMK if you need additional details.
>>>
>>> https://issues.apache.org/jira/browse/NIFI-4655
>>>
>>> Regards,
>>>
>>> Dano
>>>
>>> On Mon, Dec 4, 2017 at 10:46 AM Joe Witt  wrote:

 Dan

 Please share as much of your config for the processor as you can.
 Also, please file a JIRA for this.  There is definitely a bug that
 needs to be addressed if you can make an NPE happen.

 Thanks

 On Mon, Dec 4, 2017 at 12:27 PM, dan young  wrote:
 > Hello,
 >
 >
 > I'm working on migrating some flows over to a secure cluster with
 > OIDC. When
 > I try to start an InvokeHTTP processor, I'm getting the following
 > errors in
 > the logs.  Is there some permission/policy that I need to set for this
 > to
 > work?  or is this something else?
 >
 >
 > Nifi 1.4.0
 >
 >
 > 2017-12-04 17:20:03,972 ERROR [StandardProcessScheduler Thread-8]
 > o.a.nifi.processors.standard.InvokeHTTP
 > InvokeHTTP[id=ae055c76-88b8-3c86-bd1e-06ca4dcb43d5]
 > InvokeHTTP[id=ae055c76-88b8-3c86-bd1e-06ca4dcb43d5] failed to invoke
 > @OnScheduled method due to java.lang.RuntimeException: Failed while
 > executing one of processor's OnScheduled task.; processor will not be
 > scheduled to run for 30 seconds: java.lang.RuntimeException: Failed
 > while
 > executing one of processor's OnScheduled task.
 >
 > java.lang.RuntimeException: Failed while executing one of processor's
 > OnScheduled task.
 >
 > at
 >
 > org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1483)
 >
 > at
 >
 > org.apache.nifi.controller.StandardProcessorNode.access$000(StandardProcessorNode.java:103)
 >
 > at
 >
 > org.apache.nifi.controller.StandardProcessorNode$1.run(StandardProcessorNode.java:1302)
 >
 > at
 >
 > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 >
 > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 >
 > at
 >
 > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 >
 > at
 >
 > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 >
 > at
 >
 > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 >
 > at
 >
 > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 >
 > at java.lang.Thread.run(Thread.java:748)
 >
 > Caused by: java.util.concurrent.ExecutionException:
 > java.lang.reflect.InvocationTargetException
 >
 > at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 >
 > at java.util.concurrent.FutureTask.get(FutureTask.java:206)
 >
 > at
 >
 > org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1466)
 >
 > ... 9 common frames omitted
 >
 > Caused by: java.lang.reflect.InvocationTargetException: null
 >
 > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 >
 > at
 >
 > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 >
 > at
 >
 > 

Re: ValidateRecord1.4.0 vs ConvertJsonToAvro1.4.0 regarding required field in nested object

2017-12-06 Thread Mark Payne
Hey Martin,

Thanks for flagging this and for providing the template! It makes it really 
easy to reproduce.
It looks like there is in fact a bug. Whenever the Schema is created from the 
AvroSchemaRegistry,
it is marking all fields except for the top-level fields as 'nullable'.

I created a JIRA [1] to track the issue and should have a PR up soon to address 
it.

Many Thanks!
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-4671
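
Until the fix lands, one standalone way to get the strict behavior is Avro's
own JsonDecoder. A sketch (not NiFi code; note that JsonDecoder expects Avro's
JSON encoding, which coincides with plain JSON for this union-free schema):

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.avro.io.DecoderFactory;

    public class StrictAvroCheck {
        public static void main(String[] args) throws Exception {
            String schemaJson = "{\"name\":\"aRecord\",\"type\":\"record\",\"namespace\":\"a\","
                + "\"fields\":[{\"name\":\"a\",\"type\":{\"name\":\"bRecord\",\"type\":\"record\","
                + "\"fields\":[{\"name\":\"b\",\"type\":\"string\"}]}}]}";
            Schema schema = new Schema.Parser().parse(schemaJson);
            GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
            // {"a":{}} lacks the required nested field "b": the decoder throws
            // org.apache.avro.AvroTypeException here, matching ConvertJsonToAvro.
            GenericRecord rec = reader.read(null,
                DecoderFactory.get().jsonDecoder(schema, "{\"a\":{}}"));
            System.out.println(rec);
        }
    }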


On Dec 6, 2017, at 5:25 AM, Martin Mucha 
> wrote:

I don't really understand what you're asking for...

In the attachment you have a NiFi template,

the Avro schema is:

{
  "name": "aRecord",
  "type": "record",
  "namespace": "a",
  "fields": [
{
  "name": "a",
  "type": {
"name": "bRecord",
"type":"record",
"fields": [
  { "name": "b", "type": "string"}
]
  }
}

  ]
}


and the incorrectly validated JSON file is:

{"a":{}}


In the given flow it's validated as valid, although the required field b is missing. 
ConvertJsonToAvro, on the other hand, rejects the very same JSON using the very 
same Avro schema.

Is this all you need? If not, what do you need from me? I probably don't have 
'reproducible repository' -- I don't even know what that is.

Martin.

2017-12-06 11:07 GMT+01:00 Juan Pablo Gardella 
>:

Could you share a reproducible repo or files?

El mié., 6 de dic. de 2017 07:00, Martin Mucha 
> escribió:
Hi,

I have JSON like:

{
  "a": {
"b": "1"
  }
}

and the corresponding Avro schema (written for the sake of this e-mail; it may 
not be 100% accurate)

{
  "name": "aRecord",
  "type": "record",
  "namespace": "a",
  "fields": [
{
  "name": "a",
  "type": {
"name": "bRecord",
"type":"record",
"fields": [
  { "name": "b", "type": "string"}
]
  }
}

  ]
}

In ConvertJsonToAvro processor, json missing field "b":

{"a":{}}

will be rejected, while in ValidateRecord it will be accepted as valid (which 
is not valid according to the schema). Is there anything I can do about it? Is 
it a bug?

thanks,
Martin.





Re: unable to start InvokeHTTP processor in secure Nifi 1.4.0 cluster....

2017-12-06 Thread dan young
Heya Josh,

Awesome!  This seemed to get me past at least starting the InvokeHTTP.  I
will try the flow out later this morning.  Thank you for the follow-up!

Regards,

Dano


On Tue, Dec 5, 2017 at 10:39 PM Josh Anderton 
wrote:

> Hi Dan/Joe,
>
> I have encountered the same issue and after a bit of digging it appears as
> if during the update to OkHttp3 a bug was introduced in the
> setSslFactoryMethod.  The issue is that the method attempts to prepare a
> keystore even if properties for the keystore are not defined in the
> SSLContextFactory.  The exception is being thrown around line 571 of
> InvokeHTTP as a keystore is attempted to be initialized without a keystore
> type.
>
> The good news is that there appears to be an easy workaround (not fully
> tested yet) which is to define a keystore in your SSLContextFactory, you
> can even use the same properties already defined for your truststore and I
> believe your processor will start working.
>
> Please let me know if I have misdiagnosed or if there are issues with the
> workaround.
>
> Thanks,
> Josh
>
> On Tue, Dec 5, 2017 at 9:42 AM, dan young  wrote:
>
>> Hello Joe,
>>
>> Here's the JIRA. LMK if you need additional details.
>>
>> https://issues.apache.org/jira/browse/NIFI-4655
>>
>> Regards,
>>
>> Dano
>>
>> On Mon, Dec 4, 2017 at 10:46 AM Joe Witt  wrote:
>>
>>> Dan
>>>
>>> Please share as much of your config for the processor as you can.
>>> Also, please file a JIRA for this.  There is definitely a bug that
>>> needs to be addressed if you can make an NPE happen.
>>>
>>> Thanks
>>>
>>> On Mon, Dec 4, 2017 at 12:27 PM, dan young  wrote:
>>> > Hello,
>>> >
>>> >
>>> > I'm working on migrating some flows over to a secure cluster with
>>> OIDC. When
>>> > I try to start an InvokeHTTP processor, I'm getting the following
>>> errors in
>>> > the logs.  Is there some permission/policy that I need to set for this
>>> to
>>> > work?  or is this something else?
>>> >
>>> >
>>> > Nifi 1.4.0
>>> >
>>> >
>>> > 2017-12-04 17:20:03,972 ERROR [StandardProcessScheduler Thread-8]
>>> > o.a.nifi.processors.standard.InvokeHTTP
>>> > InvokeHTTP[id=ae055c76-88b8-3c86-bd1e-06ca4dcb43d5]
>>> > InvokeHTTP[id=ae055c76-88b8-3c86-bd1e-06ca4dcb43d5] failed to invoke
>>> > @OnScheduled method due to java.lang.RuntimeException: Failed while
>>> > executing one of processor's OnScheduled task.; processor will not be
>>> > scheduled to run for 30 seconds: java.lang.RuntimeException: Failed
>>> while
>>> > executing one of processor's OnScheduled task.
>>> >
>>> > java.lang.RuntimeException: Failed while executing one of processor's
>>> > OnScheduled task.
>>> >
>>> > at
>>> >
>>> org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1483)
>>> >
>>> > at
>>> >
>>> org.apache.nifi.controller.StandardProcessorNode.access$000(StandardProcessorNode.java:103)
>>> >
>>> > at
>>> >
>>> org.apache.nifi.controller.StandardProcessorNode$1.run(StandardProcessorNode.java:1302)
>>> >
>>> > at
>>> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>> >
>>> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>> >
>>> > at
>>> >
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>> >
>>> > at
>>> >
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>> >
>>> > at
>>> >
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>> >
>>> > at
>>> >
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>> >
>>> > at java.lang.Thread.run(Thread.java:748)
>>> >
>>> > Caused by: java.util.concurrent.ExecutionException:
>>> > java.lang.reflect.InvocationTargetException
>>> >
>>> > at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>> >
>>> > at java.util.concurrent.FutureTask.get(FutureTask.java:206)
>>> >
>>> > at
>>> >
>>> org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1466)
>>> >
>>> > ... 9 common frames omitted
>>> >
>>> > Caused by: java.lang.reflect.InvocationTargetException: null
>>> >
>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >
>>> > at
>>> >
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> >
>>> > at
>>> >
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> >
>>> > at java.lang.reflect.Method.invoke(Method.java:498)
>>> >
>>> > at
>>> >
>>> org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137)
>>> >
>>> > at
>>> >
>>> 

Re: ValidateRecord1.4.0 vs ConvertJsonToAvro1.4.0 regarding required field in nested object

2017-12-06 Thread Martin Mucha
I don't really understand what you're asking for...

In the attachment you have a NiFi template,

the Avro schema is:

{
  "name": "aRecord",
  "type": "record",
  "namespace": "a",
  "fields": [
{
  "name": "a",
  "type": {
"name": "bRecord",
"type":"record",
"fields": [
  { "name": "b", "type": "string"}
]
  }
}

  ]
}


and the incorrectly validated JSON file is:

{"a":{}}


In the given flow it's validated as valid, although the required field b is
missing. ConvertJsonToAvro, on the other hand, rejects the very same JSON
using the very same Avro schema.

Is this all you need? If not, what do you need from me? I probably don't
have 'reproducible repository' -- I don't even know what that is.

Martin.

2017-12-06 11:07 GMT+01:00 Juan Pablo Gardella 
:

> Could you share a reproducible repo or files?
>
> El mié., 6 de dic. de 2017 07:00, Martin Mucha 
> escribió:
>
>> Hi,
>>
>> I have JSON like:
>>
>> {
>>   "a": {
>> "b": "1"
>>   }
>> }
>>
>> and the corresponding Avro schema (written for the sake of this e-mail; it
>> may not be 100% accurate)
>>
>> {
>>   "name": "aRecord",
>>   "type": "record",
>>   "namespace": "a",
>>   "fields": [
>> {
>>   "name": "a",
>>   "type": {
>> "name": "bRecord",
>> "type":"record",
>> "fields": [
>>   { "name": "b", "type": "string"}
>> ]
>>   }
>> }
>>
>>   ]
>> }
>>
>> In ConvertJsonToAvro processor, json missing field "b":
>>
>> {"a":{}}
>>
>> will be rejected, while in ValidateRecord it will be accepted as valid
>> (which is not valid according to the schema). Is there anything I can do
>> about it? Is it a bug?
>>
>> thanks,
>> Martin.
>>
>


  
[Attachment: a NiFi flow template named "fail" whose XML markup was stripped by
the archive. It contained the test flow described above: ValidateRecord
connections (valid / invalid / failure / success) plus an AvroSchemaRegistry
controller service holding the "ContactAvroSchema" shown in the message.]

ValidateRecord1.4.0 vs ConvertJsonToAvro1.4.0 regarding required field in nested object

2017-12-06 Thread Martin Mucha
Hi,

I have JSON like:

{
  "a": {
"b": "1"
  }
}

and the corresponding Avro schema (written for the sake of this e-mail; it
may not be 100% accurate)

{
  "name": "aRecord",
  "type": "record",
  "namespace": "a",
  "fields": [
{
  "name": "a",
  "type": {
"name": "bRecord",
"type":"record",
"fields": [
  { "name": "b", "type": "string"}
]
  }
}

  ]
}

In ConvertJsonToAvro processor, json missing field "b":

{"a":{}}

will be rejected, while in ValidateRecord it will be accepted as valid
(which is not valid according to the schema). Is there anything I can do
about it? Is it a bug?

thanks,
Martin.