Hey Andre,

Thanks for the feedback, very useful. I've been much busier than expected,
so I'm not sure when, or if, I'll get the chance to submit a PR.

Thanks,
Pierre

On Mon, 13 Nov 2023 at 00:39, Andre <andre-li...@fucs.org> wrote:

> Pierre,
>
> I will be happy to review a PR but I suspect this should be seen as a
> breaking change.
>
> The reason is that we would be deviating from the original schema, and
> downstream systems may need to have their tables updated. My suggestion is
> to bump ParCEFone and move all ambiguous int fields to bigint.
>
> We can then absorb this in NiFi.
>
> For reference:
>
> I have faced similar deviations from the schema myself and handled them by
> routing parsing failures to separate processors in a way similar to what
> Lehel suggested.
>
> My rationale was: these failures tend to be specific to vendors that
> simply failed to implement the standard properly, and a deployment-specific
> series of processors is better able to handle them (since they would
> impact downstream systems like indexers, tables, columnar formats, etc.).
>
> Why do I say this is vendor specific? Because, faced with a data point
> larger than a 32-bit integer, they could have used custom fields like
> `cn1`, which are defined as Long in the standard, to store that data.
>
> This is the same reason I disagree with what has been stated in the linked
> Graylog issue: CEF has very clearly defined long and int fields. There
> should be no ambiguity about the length of an int field; it can clearly be
> deduced to be 32 bits.
>
> That view is supported by the fact that 64-bit integer fields have been
> defined and clearly differentiated as part of CEF 1.2.
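
For illustration, a minimal Java sketch of the 32-bit versus 64-bit distinction discussed above. It uses the `out` value from the Fortigate message quoted later in this thread. The class name `IntVsLong` and the helper `fitsInInt` are hypothetical and are not part of ParCEFone or NiFi.

```java
class IntVsLong {
    // Hypothetical helper: true if the decimal string fits a 32-bit signed int.
    static boolean fitsInInt(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            // Integer.parseInt rejects anything outside the signed 32-bit range.
            return false;
        }
    }

    public static void main(String[] args) {
        String out = "3443586134"; // value of the "out" key in the reported message

        // Largest value a Java int can hold.
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);

        // The reported value does not fit an int...
        System.out.println("fits in int: " + fitsInInt(out));

        // ...but parses without error as a 64-bit long.
        System.out.println("as long: " + Long.parseLong(out));
    }
}
```

This only shows why a field typed as int rejects the value while a Long-typed field such as `cn1` would accept it; it says nothing about how ParCEFone should map fields.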
>
> Cheers
>
> On Thu, 9 Nov 2023, 02:36 Pierre Villard, <pierre.villard...@gmail.com>
> wrote:
>
>> I may be able to submit a PR against ParseCEF, as I made a few
>> improvements in the past, but I'm not sure when I'll be able to get to it
>> or how fast a new release would be made available for use in NiFi.
>>
>> Will try to block some time for this over the weekend.
>>
>> On Wed, 8 Nov 2023 at 16:22, <ma...@burkon.cz> wrote:
>>
>>> OK, sounds good, I will try it.
>>>
>>> Thank you
>>> M.
>>>
>>> ---------- Original e-mail ----------
>>> From: Lehel Boér <lehe...@hotmail.com>
>>> To: users@nifi.apache.org <users@nifi.apache.org>
>>> Date: 8 Nov 2023 15:39:43
>>> Subject: Re: CEF parsing type error
>>>
>>> I can't see a good workaround for this. The problem is that if you remove
>>> the out=[integer] from the log message, the CEF format becomes invalid.
>>> Once a solution for that is found, I'd go with text manipulation using
>>> the following processors:
>>>
>>>    - ReplaceText to remove the unwanted part
>>>    - ExtractText to get the 'out' as a FlowFile attribute
>>>    - UpdateAttribute to later update the FlowFile with the extracted
>>>    attribute
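
The text-manipulation idea above can be sketched in plain Java as the regex logic such a flow would rely on. This is only an illustration, assuming the `out` key appears as a whitespace-delimited `out=<digits>` pair; the class `OutFieldExtractor` and its methods are hypothetical and are not NiFi or ParCEFone APIs. The sketch reads the value before removing it, so it can be re-attached later.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OutFieldExtractor {
    // "out=<digits>" preceded by whitespace, so keys that merely end in "out"
    // are not matched. Values beyond the 64-bit range are not handled here.
    private static final Pattern OUT = Pattern.compile("(?<=\\s)out=(\\d+)\\s?");

    // Counterpart of the ExtractText step: capture the value of "out" as a long.
    static Optional<Long> extractOut(String cef) {
        Matcher m = OUT.matcher(cef);
        return m.find() ? Optional.of(Long.parseLong(m.group(1))) : Optional.empty();
    }

    // Counterpart of the ReplaceText step: drop the first "out=<digits>" pair.
    static String removeOut(String cef) {
        return OUT.matcher(cef).replaceFirst("");
    }

    public static void main(String[] args) {
        String fragment = "FTNTFGTduration=331878 out=3443586134 in=0";
        System.out.println(extractOut(fragment)); // the captured value, if any
        System.out.println(removeOut(fragment));  // the fragment without the "out" pair
    }
}
```

In an actual flow the same pattern would be configured on the processors themselves; the Java form is used here only to make the matching behaviour concrete.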
>>>
>>> ------------------------------
>>> *From:* ma...@burkon.cz <ma...@burkon.cz>
>>> *Sent:* Wednesday, November 8, 2023 7:22
>>> *To:* users@nifi.apache.org <users@nifi.apache.org>
>>> *Subject:* Re: CEF parsing type error
>>>
>>> Hi,
>>> I understand, and thank you for the information, but how can I solve this
>>> problem in NiFi?
>>>
>>> With my own Python script and an extra parse-failure output from the CEF
>>> parser?
>>>
>>> Marek
>>>
>>> P.S.
>>> https://github.com/fluenda/ParCEFone/issues/30
>>>
>>>
>>> ---------- Original e-mail ----------
>>> From: Lehel Boér <lehe...@hotmail.com>
>>> To: ma...@burkon.cz <ma...@burkon.cz>, users@nifi.apache.org <
>>> users@nifi.apache.org>
>>> Date: 7 Nov 2023 22:22:33
>>> Subject: Re: CEF parsing type error
>>>
>>> Hi,
>>>
>>> The official implementation suggests using Integer for the *out* key,
>>> although by definition the value can exceed the size of an integer.
>>>
>>>
>>>    - out: bytesOut Integer Number of bytes transferred outbound
>>>    relative to the source to destination relationship. For example, the byte
>>>    number of data flowing from the destination to the source.
>>>
>>> This issue also came up with Graylog here
>>> <https://github.com/Graylog2/graylog2-server/issues/7371>. They even
>>> got a reply from Fortinet indicating that the root cause of the issue was
>>> that the official CEF documentation did not specify an integer range.
>>> Later, Graylog updated their code to expand the range for bigger
>>> numerical values.
>>>
>>> Best Regards,
>>> Lehel
>>> ------------------------------
>>> *From:* Otto Fowler <ottobackwa...@gmail.com>
>>> *Sent:* Tuesday, November 7, 2023 16:35
>>> *To:* ma...@burkon.cz <ma...@burkon.cz>; users@nifi.apache.org <
>>> users@nifi.apache.org>
>>> *Subject:* Re: CEF parsing type error
>>>
>>> You should open an issue upstream:
>>> https://github.com/fluenda/ParCEFone/issues
>>>
>>>
>>> On November 7, 2023 at 9:47:06 AM, ma...@burkon.cz (ma...@burkon.cz)
>>> wrote:
>>>
>>> Hello, I'm using CEFParser and I'm new to NiFi.
>>>
>>> I have a problem: sometimes a parser error occurs when a number exceeds
>>> the Integer range.
>>> Is there any way to solve it, for example by adding a LONG type for the
>>> key "out" somewhere?
>>>
>>> Please
>>> Kind Regards
>>> Marek
>>>
>>> *### CEF message example from Fortigate (key: out was bigger than
>>> Integer) ### :*
>>> <165>Oct 23 22:10:20 FGT-DEV-FW1 CEF:
>>> 0|Fortinet|Fortigate|v7.0.12|00020|traffic:forward
>>> accept|3|deviceExternalId=FGXXXXXXX012 FTNTFGTeventtime=1698091820252030526
>>> FTNTFGTtz=+0200 FTNTFGTlogid=0000000020 cat=traffic:forward
>>> FTNTFGTsubtype=forward FTNTFGTlevel=notice FTNTFGTvd=root src=172.37.1.1
>>> spt=9004 deviceInboundInterface=VPN-DEV_Off-1 FTNTFGTsrcintfrole=undefined
>>> dst=172.30.2.180 dpt=514 deviceOutboundInterface=741_CZ_Srv
>>> FTNTFGTdstintfrole=lan FTNTFGTsrccountry=Reserved
>>> FTNTFGTdstcountry=Reserved externalId=573022232 proto=17 act=accept
>>> FTNTFGTpolicyid=527 FTNTFGTpolicytype=policy
>>> FTNTFGTpoluuid=73816fb2-6720-51ec-c859-c84211230e24
>>> FTNTFGTpolicyname=Office-2 app=udp/514 FTNTFGTtrandisp=noop
>>> FTNTFGTduration=331878 out=3443586134 in=0 FTNTFGTsentpkt=3420478
>>> FTNTFGTrcvdpkt=0 FTNTFGTvpntype=ipsecvpn FTNTFGTappcat=unscanned
>>> FTNTFGTsentdelta=959006 FTNTFGTrcvddelta=0
>>>
>>> *### CEFParser type ERROR ### :*
>>> 2023-10-23 20:10:18,127 INFO [FileSystemRepository Workers Thread-1]
>>> o.a.n.c.repository.FileSystemRepository Successfully archived
>>> 4 Resource Claims for Container default in 10 millis
>>> 2023-10-23 20:10:21,003 ERROR [Timer-Driven Process Thread-4]
>>> o.a.nifi.processors.standard.ParseCEF
>>> ParseCEF[id=100411d1-1e6d-12bc-5347-9553a96ec9a5]
>>> CEF Parsing Failed:
>>> StandardFlowFileRecord[uuid=6198fa4d-69a9-4a60-9062-21dff7a16a05,claim=StandardContentClaim
>>> [resourceClaim=StandardResourceClaim[id=1698091820924-6175,
>>> container=default, section=31], offset=13986,
>>> length=911],offset=0,name=6198fa4d-69a9-4a60-9062-21dff7a16a05,size=911]
>>> java.lang.NumberFormatException: For input string: "3443586134"
>>> at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
>>> at java.base/java.lang.Integer.parseInt(Unknown Source)
>>> at java.base/java.lang.Integer.valueOf(Unknown Source)
>>> at com.fluenda.parcefone.event.CefRev23.setExtension(CefRev23.java:660)
>>> at com.fluenda.parcefone.parser.CEFParser.parse(CEFParser.java:235)
>>> at com.fluenda.parcefone.parser.CEFParser.parse(CEFParser.java:109)
>>> at org.apache.nifi.processors.standard.ParseCEF.onTrigger(ParseCEF.java:277)
>>> at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>>> at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1361)
>>> at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:247)
>>> at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>>> at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>>> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>>> at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>>> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>>> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>> at java.base/java.lang.Thread.run(Unknown Source)
>>>
>>>
