That sounds great.
> On Jun 17, 2021, at 10:14, Iñigo Angulo <[email protected]> wrote:
>
> Hi Otto,
>
> So we went with the workaround: we re-forked the repo and added the changes
> for the nifi-record feature on a dedicated branch. I closed the previous PR
> and will open the new one in the next few days (we are running some last
> tests, just in case we forgot to copy something...)
>
> I will keep you informed.
>
> regards,
> iñigo
>
> -----------------------------------------
> Iñigo Angulo
>
> ZYLK.net :: consultoría.openSource
> telf.: 747412337
> Ribera de Axpe, 11
> Edificio A, modulo 201-203
> 48950 Erandio (Bizkaia)
> -----------------------------------------
>
> ----- Original Message -----
> From: "ottobackwards" <[email protected]>
> To: "dev" <[email protected]>
> Sent: Thursday, June 10, 2021 16:14:54
> Subject: Re: Nifi integration record oriented processor for reading
>
> Yes, that sounds great.
> Thank you
>
>> On Jun 10, 2021, at 03:43, Iñigo Angulo <[email protected]> wrote:
>>
>> Hi Otto,
>>
>> I have tried to do the rebase of our commits, but I am having difficulties
>> at it...
>>
>> The issue is: we forked the repo in September 2020 and started making tests
>> and commits on our fork (on the 'develop' branch). Now I am trying to do a
>> rebase (using git rebase -i ID), specifying the ID of our first commit. But
>> when the file opens in interactive mode, it lists ~900 commits on the
>> develop branch (belonging to members of the PLC4X community). I think this
>> happens because, before opening the pull request, I did a 'merge upstream'
>> with the actual PLC4X repo to get the updates to the code. So, I understand
>> that in the interactive-mode file I have to leave all commits as 'pick'
>> (the default) and change our commits for the nifi feature to 'squash',
>> except for the first one (which also remains as 'pick'). However, when I
>> tried this, many conflicts appeared, almost one per commit (community
>> members' commits)...
>> I may be doing something wrong (I have never done a rebase before...) and I
>> preferred just to ask, as I don't want to break or cause any conflict in
>> the repo code... If you see anything I'm missing, please let me know.
>>
>> As a workaround, I was thinking we could close the PR, re-do the fork of the
>> PLC4X repo, and add the changes to the code on a dedicated
>> 'feature-nifi-record' branch. Maybe this could make things clearer...
>> What do you think?
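For reference, a sketch of a git workflow that avoids replaying the upstream commits (the remote names 'upstream' for the apache/plc4x repo and 'origin' for the fork are assumptions; adjust to the actual setup):

```shell
# Fetch the latest upstream history first.
git fetch upstream

# Rebasing interactively onto upstream/develop lists only the commits
# that are NOT already in upstream/develop, so the ~900 community
# commits never show up in the interactive todo file.
git rebase -i upstream/develop

# Alternative without a rebase: collapse everything since
# upstream/develop into a single commit.
git reset --soft upstream/develop
git commit -m "Add record-oriented NiFi processor"

# The rewritten history then has to be force-pushed to the fork.
git push --force-with-lease origin develop
```

Passing the ID of the first commit to `git rebase -i` replays every commit after it, including the merged-in upstream ones, which is where the ~900 todo entries and the per-commit conflicts come from.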
>>
>> thank you,
>> iñigo
>>
>>
>> ----- Original Message -----
>> From: "ottobackwards" <[email protected]>
>> To: "dev" <[email protected]>
>> Sent: Thursday, May 27, 2021 16:10:19
>> Subject: Re: Nifi integration record oriented processor for reading
>>
>> Awesome.
>>
>> If you can, can I ask you to:
>>
>> 1. Mark the PR as ready to review in github
>> 2. rebase or squash it to a single commit and force push to your branch to
>> clean it up
>>
>>
>>
>>> On May 27, 2021, at 06:45, Iñigo Angulo <[email protected]> wrote:
>>>
>>> Hi Otto, Chris
>>>
>>> we have finally committed the updates to the Nifi Processor on the pull
>>> request. The changes we have made are the following:
>>> - deducing Avro datatypes from the PlcResponse. Here you may want to check
>>> the method (org.apache.plc4x.nifi.util.Plc4xCommon.createSchema()), to see
>>> if it is the best way to do it.
>>> - we have added an "ignoredDependency" in plc4j/pom.xml for
>>> org.apache.nifi:nifi-standard-nar, as it is used at runtime and was raising
>>> errors during compilation.
>>> - we have changed the onScheduled method in Plc4xSourceRecordProcessor
>>> (compared to BaseProcessor), as we have included the possibility of having
>>> an input connection into the processor and of indicating the target
>>> addressMap through flowfile attributes. The addressMap is now created in
>>> the onTrigger method.
>>> - we have tested the performance with the S7 and Modbus protocols (using a
>>> Siemens S7-1200 and a Schneider M221). We will upload an updated Nifi
>>> template for both protocols. Regarding this, do you have any testing
>>> environment to simulate PLCs? If so, we could prepare the Processors'
>>> configuration to match it (connection strings and addressMaps).
>>>
>>> Please take a look at the code, any suggestion will be very welcome.
>>>
>>> iñigo
>>>
>>>
>>> ----- Original Message -----
>>> From: "Iñigo Angulo" <[email protected]>
>>> To: "dev" <[email protected]>
>>> Sent: Wednesday, May 12, 2021 14:42:22
>>> Subject: Re: Nifi integration record oriented processor for reading
>>>
>>> Hi Otto, Chris,
>>>
>>> we have been working on the processor to include the logic for getting the
>>> values of the variables from the PLC response. We tested it with the
>>> S7-1200 and it seems to work fine; however, we would like to make some
>>> further tests before committing it.
>>>
>>> Regarding the method that derives the datatype from the response object,
>>> we did it in the following way:
>>>
>>> import org.apache.avro.SchemaBuilder;
>>> import org.apache.plc4x.java.api.value.PlcValue;
>>> import org.apache.plc4x.java.spi.values.PlcINT;
>>> import org.apache.plc4x.java.spi.values.PlcREAL;
>>>
>>> // PlcReadResponse readResponse;
>>> // builder comes from SchemaBuilder.record(...).fields() and is used to
>>> // build the Avro schema with the desired datatypes (e.g. intType()).
>>> Map<String, ? extends PlcValue> responseDataStructure =
>>>     readResponse.getAsPlcValue().getStruct();
>>> for (Map.Entry<String, ? extends PlcValue> entry :
>>>         responseDataStructure.entrySet()) {
>>>     String fieldName = entry.getKey();
>>>     if (entry.getValue() instanceof PlcINT) {
>>>         builder.name(fieldName).type().unionOf()
>>>             .nullBuilder().endNull().and().intType().endUnion().noDefault();
>>>     } else if (entry.getValue() instanceof PlcREAL) {
>>>         builder.name(fieldName).type().unionOf()
>>>             .nullBuilder().endNull().and().doubleType().endUnion().noDefault();
>>>     }
>>>     // ... and so on for the rest of the classes in the
>>>     // org.apache.plc4x.java.spi.values.* package.
>>> }
>>>
>>>
>>> Is this the solution you had in mind? If you think there is a better way to
>>> access PlcValues, please let us know and we will update it.
>>>
>>> We will upload the code soon so that you can take a deeper look.
>>>
>>> thank you!
>>> iñigo
>>>
>>>
>>> ----- Original Message -----
>>> From: "ottobackwards" <[email protected]>
>>> To: "dev" <[email protected]>
>>> Sent: Friday, April 30, 2021 14:50:11
>>> Subject: Re: Nifi integration record oriented processor for reading
>>>
>>> Sounds like a good plan!
>>>
>>>> On Apr 30, 2021, at 05:26, Iñigo Angulo <[email protected]> wrote:
>>>>
>>>> Hi Otto, Chris,
>>>>
>>>> we have been reviewing the comments on the pull request, and have started
>>>> to think about the approach of extracting values from the response
>>>> directly. Next week we will work on this and make some updates to the
>>>> code with the suggestions you made. We will keep you informed.
>>>>
>>>> thank you,
>>>>
>>>> iñigo
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: "Christofer Dutz" <[email protected]>
>>>> To: "dev" <[email protected]>
>>>> Sent: Friday, April 23, 2021 18:10:35
>>>> Subject: Re: Re: Nifi integration record oriented processor for reading
>>>>
>>>> Hi Inigo,
>>>>
>>>> This is especially true if you have a look at the KNX protocol, which
>>>> doesn't define the usual IEC datatypes we tried to use for all normal PLC
>>>> drivers. Here we have hundreds of datatypes that don't match any other
>>>> protocol. I think the PlcValue approach would be the simplest.
>>>> The one thing you have to keep in mind is that you should check whether a
>>>> PlcValue is a list (array type) or a structure (which relates to complex
>>>> types with a more sophisticated internal structure).
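As an illustration of that check, a minimal recursive walker over a decoded value. This is a hypothetical stand-in: plain Lists represent array values, plain Maps represent structures, and anything else a scalar; real code would test the PlcValue itself for its list/struct nature instead.

```java
import java.util.List;
import java.util.Map;

class ValueKindSketch {

    // Describe a decoded value: recurse into arrays and structures,
    // report the runtime type name for scalars.
    static String describe(Object value) {
        if (value instanceof List) {
            // Array type: all elements share one schema, so describing
            // the first element is enough.
            List<?> list = (List<?>) value;
            return "array<" + describe(list.isEmpty() ? null : list.get(0)) + ">";
        }
        if (value instanceof Map) {
            // Structure: recurse into each member.
            StringBuilder sb = new StringBuilder("struct{");
            ((Map<?, ?>) value).forEach((k, v) ->
                    sb.append(k).append(':').append(describe(v)).append(';'));
            return sb.append('}').toString();
        }
        return value == null ? "null" : value.getClass().getSimpleName();
    }
}
```

The point is that a schema generator cannot stop at the top level: a KNX (or any) value may nest arrays inside structures, so the dispatch has to recurse.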
>>>>
>>>> Chris
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Iñigo Angulo <[email protected]>
>>>> Sent: Friday, April 23, 2021 15:34
>>>> To: dev <[email protected]>
>>>> Subject: Re: Re: Nifi integration record oriented processor for reading
>>>>
>>>> Hi Otto, Chris,
>>>>
>>>> Yes, I think the approach you propose will be best. For now, we are
>>>> generating the schema ourselves. We have a record writer that is in
>>>> charge of reading the PLC values. The schema is defined before the values
>>>> are read. We build this schema by getting the protocol from the
>>>> 'connectionString' (S7, Modbus) and the specified variable type from the
>>>> 'PLC resource address String' containing the list of variables to read.
>>>> From this we deduce the expected Avro datatype when reading, for
>>>> instance, a word in S7 or a coil in Modbus.
>>>>
>>>> However, as you mentioned, the other approach would be much clearer and
>>>> more useful: ideally, getting the actual datatype from the PlcValue in
>>>> the response. With this in mind, we tried to keep the previously
>>>> described 'mapping' separate from the rest of the code, so that hopefully
>>>> it can be easily replaced...
>>>>
>>>> We have made the pull request; we hope you can take a look at the code
>>>> and let us know what you think. We will fill in the ICLA document too.
>>>>
>>>> thank you
>>>> iñigo
>>>>
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: "Christofer Dutz" <[email protected]>
>>>> To: "dev" <[email protected]>
>>>> Sent: Thursday, April 22, 2021 17:12:49
>>>> Subject: Re: Nifi integration record oriented processor for reading
>>>>
>>>> Hi all,
>>>>
>>>> Well, you get PlcValues from the response that wrap the different
>>>> datatypes. So generally you shouldn't care about the detail type.
>>>>
>>>> However, you can call getObject() which returns the core value the
>>>> plc-value has ... so if it's the PLCValue for a Short, getObject will
>>>> return a short value.
>>>>
>>>> Does that help?
>>>>
>>>> Chris
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Otto Fowler <[email protected]>
>>>> Sent: Thursday, April 22, 2021 15:21
>>>> To: [email protected]
>>>> Subject: Re: Nifi integration record oriented processor for reading
>>>>
>>>> So, you are generating the schema yourself, such that downstream
>>>> consumers that inherit the schema will just get what you generate? And
>>>> you are trying to do that from the connection string? If so, a different
>>>> way I could imagine doing it would be to get the 'types' of the data from
>>>> the responses themselves. This would be more generic. The flow I could
>>>> imagine (in onTrigger):
>>>>
>>>> DO READ
>>>> IF NOT HAS SCHEMA
>>>> GENERATE SCHEMA FROM RESPONSE AND CACHE IN ATOMIC WRITE WITH SCHEMA
>>>>
>>>> Maybe Chris can speak to how to get the types from the responses.
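The flow above could be sketched with a one-time schema generation cached atomically. This is a minimal stand-in: plain Maps play the roles of the PlcReadResponse and the Avro schema, and all names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class SchemaCacheSketch {

    // Cache shared across onTrigger invocations: built once from the
    // first response, then reused for every subsequent read.
    private final AtomicReference<Map<String, String>> cachedSchema =
            new AtomicReference<>();

    Map<String, String> schemaFor(Map<String, Object> response) {
        Map<String, String> schema = cachedSchema.get();
        if (schema == null) {
            // GENERATE SCHEMA FROM RESPONSE: one field per response
            // entry, typed from the runtime class of the value.
            Map<String, String> generated = new LinkedHashMap<>();
            response.forEach((name, value) ->
                    generated.put(name, value.getClass().getSimpleName()));
            // CACHE IN ATOMIC WRITE: only the first thread wins the race,
            // so all callers see one consistent schema.
            cachedSchema.compareAndSet(null, generated);
            schema = cachedSchema.get();
        }
        return schema;
    }
}
```

After the first response has been seen, later calls skip schema generation entirely, which is what makes the per-read cost acceptable in a processor that triggers continuously.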
>>>>
>>>>
>>>>> On Apr 22, 2021, at 05:48, Iñigo Angulo <[email protected]> wrote:
>>>>>
>>>>> Hi Chris, Otto,
>>>>>
>>>>> Regarding the Record Processor concept, I will try to give an overview.
>>>>> In Nifi, information packages are called Flowfiles, and these are the
>>>>> actual units of information that are exchanged between Processors along
>>>>> the dataflow. Flowfiles have two sections where we can manage data:
>>>>> Attributes and Content. In the "traditional" Nifi approach, you work
>>>>> with both sections, extracting information from the Content to the
>>>>> Attributes and vice versa to perform operations. This approach has one
>>>>> limitation when you are processing batch data (lines from a CSV file,
>>>>> for instance), where you need to split each of the lines into different
>>>>> Flowfiles. Thus, a 1000-line CSV file leads to 1000 Flowfiles to
>>>>> process, each of them containing a single record.
>>>>>
>>>>> In later versions of the product, they introduced the record-oriented
>>>>> approach. This approach allows you to manage multiple records in a
>>>>> single Flowfile's Content, as long as these records all have the same
>>>>> schema. This means that the operations defined by the Processors are
>>>>> applied to the whole content at once. Following the previous example, a
>>>>> 1000-line CSV file can produce a single Flowfile with a content of 1000
>>>>> records.
>>>>>
>>>>> To do this, Nifi uses Avro to serialize the Flowfile's Content. The
>>>>> record-oriented Processors then use Writers and Readers to present this
>>>>> information in the desired format (such as Avro, JSON, CSV, etc.).
>>>>> Basically, with the record-oriented approach, Nifi introduced multiple
>>>>> new Processors, and also included Record versions of many of the "old"
>>>>> ones. Using this Record approach, Nifi performance improves notably,
>>>>> especially when working with large amounts of structured information.
>>>>>
>>>>> The work we did was to create a record-oriented Processor, based on the
>>>>> previously existing Plc4xSourceProcessor, to read values from the
>>>>> devices. We have also included a README in the
>>>>> plc4x/plc4j/integrations/apache-nifi module explaining the Processor
>>>>> configuration and giving an example. Moreover, we added a Nifi template
>>>>> with a dataflow for testing these Processors, in case it is useful.
>>>>>
>>>>> Otto, regarding the idea behind this new Processor, that is right. We
>>>>> added the writer capability to the existing Plc4xSourceProcessor, so
>>>>> that it formats the output to the desired configuration in a record
>>>>> manner. In the current implementation, we did this "protocol adaptation"
>>>>> based on the syntax of the particular properties in the Processor's
>>>>> configuration. For example, from the connection string 's7://IP:PORT' we
>>>>> extract the S7 identifier and map variable datatypes to the actual Avro
>>>>> datatypes to build the record output schema. However, we don't have vast
>>>>> experience with the PLC4X libraries, and there will surely be better
>>>>> ways of doing this.
>>>>> Also, about the Base Processor, we were thinking that maybe the best
>>>>> approach could be to have this Base Processor, and then implement
>>>>> readers for particular protocols as Controller Services. But here too,
>>>>> it would be very helpful to have your opinion.
>>>>>
>>>>> Lastly, regarding the pull request, do you have any documentation on
>>>>> how to do this? I mean, maybe you have defined some naming conventions
>>>>> or an expected structure to facilitate later work. At present, we have a
>>>>> fork of the project where we have been working on these Nifi changes. We
>>>>> updated the content of our fork (fetch/merge upstream) about 2 weeks
>>>>> ago, and committed our changes to the 'develop' branch. Should we create
>>>>> a new branch with our commits? How do you prefer to receive the code?
>>>>> (We are not experts in git, just in case we could cause some
>>>>> problems...)
>>>>>
>>>>> thank you in advance
>>>>>
>>>>> iñigo
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Christofer Dutz" <[email protected]>
>>>>> To: "dev" <[email protected]>
>>>>> Sent: Wednesday, April 21, 2021 20:01:15
>>>>> Subject: Re: Nifi integration record oriented processor for reading
>>>>>
>>>>> The more I think of it,
>>>>>
>>>>> Perhaps we should also think of potentially providing some information on
>>>>> supported configuration options.
>>>>> Wouldn't it be cool if the driver could say: "I generally have these
>>>>> options and they have these datatypes and mean this"
>>>>> Additionally, the transports could too say: "I generally have these
>>>>> options and they have these datatypes and mean this"
>>>>>
>>>>> I bet our StreamPipes friends would love something like that, right?
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Otto Fowler <[email protected]>
>>>>> Sent: Wednesday, April 21, 2021 17:46
>>>>> To: [email protected]
>>>>> Subject: Re: Nifi integration record oriented processor for reading
>>>>>
>>>>> Hi Inigo,
>>>>>
>>>>> I’m a committer on Apache Nifi as well as PLC4X, I would be happy to
>>>>> review your processor.
>>>>> If I understand what you are saying correctly, you have a single
>>>>> processor which supports record writing output?
>>>>>
>>>>> plc4x -> records
>>>>>
>>>>> And that you have, for configuration purposes for that processor created
>>>>> support on a per protocol basis for configuration and validation?
>>>>>
>>>>> If there is per protocol configuration / validation etc, it may be better
>>>>> to have a base processor, and derived processors per protocol to handle
>>>>> those differences.
>>>>>
>>>>> I look forward to seeing the code.
>>>>>
>>>>>
>>>>>> On Apr 21, 2021, at 04:05, Iñigo Angulo <[email protected]> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am writing because we have been working on the Apache Nifi
>>>>>> integration part of the project. We have created a record-oriented
>>>>>> processor for reading PLC data. It is based on the previously existing
>>>>>> SourceProcessor, but works with records, using a Nifi Writer (such as
>>>>>> Avro, JSON, and so on) to write data to the Flowfiles' content.
>>>>>>
>>>>>> We updated the code in our fork from the actual PLC4X git repo about 2
>>>>>> weeks ago, and tested it from Nifi, reading values over S7 from an
>>>>>> S7-1200 CPU. Also, one of our customers has recently started to use it
>>>>>> for validation.
>>>>>>
>>>>>> Currently, it works with S7 and Modbus over TCP. This is because we had
>>>>>> to write some classes to map the connectionString and variableList
>>>>>> properties (syntax) of the processor to the actual protocol, in order
>>>>>> to build the Avro schema for the output flowfile, taking into account
>>>>>> variable datatypes, etc. We only did this for S7 and Modbus. I am sure
>>>>>> there is a better way to do this, so at this point you could maybe take
>>>>>> a look to find the best solution and avoid the need for this mapping.
>>>>>>
>>>>>> If you find this useful, we could make a pull request to the main
>>>>>> PLC4X repo. Let us know what you think.
>>>>>>
>>>>>> best regards,
>>>>>> iñigo
>>>>>>