Hey Andre,

Thanks for the feedback, very useful. I've been much busier than expected, so I'm not sure when, or if, I'll get the chance to submit a PR.
Thanks,
Pierre

On Mon, 13 Nov 2023 at 00:39, Andre <andre-li...@fucs.org> wrote:

> Pierre,
>
> I will be happy to review a PR, but I suspect this should be seen as a
> breaking change.
>
> The reason is that we would be deviating from the original schema, and downstream
> systems may need to have their tables updated. My suggestion is to bump ParCEFone
> and move all ambiguous int fields to bigint.
>
> We can then absorb this in NiFi.
>
> For reference:
>
> I have faced similar deviations from the schema myself and handled them by
> routing parsing failures to separate processors, in a way similar to what
> Lehel suggested.
>
> My rationale was: these failures tend to be specific to vendors
> that simply failed to implement the standard properly, and a deployment-specific
> series of processors is better able to handle those (since they
> would impact downstream systems like indexers, tables, columnar formats,
> etc.).
>
> Why do I say this is vendor-specific? Because, faced with a data point that
> is larger than a 32-bit integer, they could have used custom fields like
> `cn1`, which are defined as Long in the standard, for storing that data.
>
> This is the same reason I disagree with what has been stated in the linked
> Graylog issue: CEF has very clearly defined long and int fields. There
> should be no ambiguity around the length of an int; it can be clearly deduced
> to be 32 bits.
>
> That view is supported by the fact that 64-bit integer fields have been defined
> and clearly differentiated as part of CEF 1.2.
>
> Cheers
>
> On Thu, 9 Nov 2023, 02:36 Pierre Villard, <pierre.villard...@gmail.com>
> wrote:
>
>> I may be able to submit a PR against ParseCEF, as I made a few improvements
>> in the past, but I'm not sure when I'll be able to get to it or how fast a new
>> release would be made available for use in NiFi.
>>
>> Will try to block some time for this over the weekend.
>>
>> On Wed, 8 Nov 2023 at 16:22, <ma...@burkon.cz> wrote:
>>
>>> OK, sounds good, I will try it.
>>>
>>> Thank you
>>> M.
>>>
>>> ---------- Original e-mail ----------
>>> From: Lehel Boér <lehe...@hotmail.com>
>>> To: users@nifi.apache.org <users@nifi.apache.org>
>>> Date: 8 Nov 2023 15:39:43
>>> Subject: Re: CEF parsing type error
>>>
>>> I can't see a good workaround for this. The problem is that if you remove the
>>> out=[integer] from the log message, the CEF format becomes invalid. After
>>> finding a solution for this, I'd go with text manipulation with the
>>> following processors:
>>>
>>> - ReplaceText to remove the unwanted part
>>> - ExtractText to get the 'out' as a FlowFile attribute
>>> - UpdateAttribute to later update the FlowFile with the extracted
>>> attribute
>>>
>>> ------------------------------
>>> *From:* ma...@burkon.cz <ma...@burkon.cz>
>>> *Sent:* Wednesday, November 8, 2023 7:22
>>> *To:* users@nifi.apache.org <users@nifi.apache.org>
>>> *Subject:* Re: CEF parsing type error
>>>
>>> Hi,
>>> I understand, and thank you for the information, but how can I solve this
>>> problem in NiFi?
>>>
>>> With my own Python script and an extra parse-failure output of the CEF parser?
>>>
>>> Marek
>>>
>>> P.S.
>>> https://github.com/fluenda/ParCEFone/issues/30
>>>
>>>
>>> ---------- Original e-mail ----------
>>> From: Lehel Boér <lehe...@hotmail.com>
>>> To: ma...@burkon.cz <ma...@burkon.cz>, users@nifi.apache.org <users@nifi.apache.org>
>>> Date: 7 Nov 2023 22:22:33
>>> Subject: Re: CEF parsing type error
>>>
>>> Hi,
>>>
>>> The official implementation suggests using Integer for the *out* key,
>>> although by definition it can exceed the size of an integer.
>>>
>>> - out: bytesOut Integer Number of bytes transferred outbound
>>> relative to the source to destination relationship. For example, the byte
>>> number of data flowing from the destination to the source.
>>>
>>> This issue also came up with Graylog here
>>> <https://github.com/Graylog2/graylog2-server/issues/7371>.
>>> They even got a reply from Fortinet indicating that the root cause of the issue was
>>> that the official documentation of CEF did not specify the integer range. Later,
>>> Graylog updated their code to expand the range for bigger numerical
>>> values.
>>>
>>> Best Regards,
>>> Lehel
>>> ------------------------------
>>> *From:* Otto Fowler <ottobackwa...@gmail.com>
>>> *Sent:* Tuesday, November 7, 2023 16:35
>>> *To:* ma...@burkon.cz <ma...@burkon.cz>; users@nifi.apache.org <users@nifi.apache.org>
>>> *Subject:* Re: CEF parsing type error
>>>
>>> You should open an issue upstream:
>>> https://github.com/fluenda/ParCEFone/issues
>>>
>>>
>>> On November 7, 2023 at 9:47:06 AM, ma...@burkon.cz (ma...@burkon.cz)
>>> wrote:
>>>
>>> Hello, I'm using CEFParser and I'm new to NiFi.
>>>
>>> I have a problem: sometimes a parser error occurs when a number
>>> exceeds the Integer range.
>>> Is there any way to solve it, for example by adding the LONG type for the
>>> key "out" somewhere, and so on?
>>>
>>> Please
>>> Kind Regards
>>> Marek
>>>
>>> *### CEF message example from Fortigate (key "out" was bigger than
>>> Integer) ###:*
>>> <165>Oct 23 22:10:20 FGT-DEV-FW1 CEF:
>>> 0|Fortinet|Fortigate|v7.0.12|00020|traffic:forward
>>> accept|3|deviceExternalId=FGXXXXXXX012 FTNTFGTeventtime=1698091820252030526
>>> FTNTFGTtz=+0200 FTNTFGTlogid=0000000020 cat=traffic:forward
>>> FTNTFGTsubtype=forward FTNTFGTlevel=notice FTNTFGTvd=root src=172.37.1.1
>>> spt=9004 deviceInboundInterface=VPN-DEV_Off-1 FTNTFGTsrcintfrole=undefined
>>> dst=172.30.2.180 dpt=514 deviceOutboundInterface=741_CZ_Srv
>>> FTNTFGTdstintfrole=lan FTNTFGTsrccountry=Reserved
>>> FTNTFGTdstcountry=Reserved externalId=573022232 proto=17 act=accept
>>> FTNTFGTpolicyid=527 FTNTFGTpolicytype=policy
>>> FTNTFGTpoluuid=73816fb2-6720-51ec-c859-c84211230e24
>>> FTNTFGTpolicyname=Office-2 app=udp/514 FTNTFGTtrandisp=noop
>>> FTNTFGTduration=331878 out=3443586134 in=0 FTNTFGTsentpkt=3420478
>>> FTNTFGTrcvdpkt=0
>>> FTNTFGTvpntype=ipsecvpn FTNTFGTappcat=unscanned
>>> FTNTFGTsentdelta=959006 FTNTFGTrcvddelta=0
>>>
>>> *### CEFParser type ERROR ###:*
>>> 2023-10-23 20:10:18,127 INFO [FileSystemRepository Workers Thread-1]
>>> o.a.n.c.repository.FileSystemRepository Successfully archived
>>> 4 Resource Claims for Container default in 10 millis
>>> 2023-10-23 20:10:21,003 ERROR [Timer-Driven Process Thread-4]
>>> o.a.nifi.processors.standard.ParseCEF
>>> ParseCEF[id=100411d1-1e6d-12bc-5347-9553a96ec9a5]
>>> CEF Parsing Failed:
>>> StandardFlowFileRecord[uuid=6198fa4d-69a9-4a60-9062-21dff7a16a05,claim=StandardContentClaim
>>> [resourceClaim=StandardResourceClaim[id=1698091820924-6175,
>>> container=default, section=31], offset=13986,
>>> length=911],offset=0,name=6198fa4d-69a9-4a60-9062-21dff7a16a05,size=911]
>>> java.lang.NumberFormatException: For input string: "3443586134"
>>> at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
>>> at java.base/java.lang.Integer.parseInt(Unknown Source)
>>> at java.base/java.lang.Integer.valueOf(Unknown Source)
>>> at com.fluenda.parcefone.event.CefRev23.setExtension(CefRev23.java:660)
>>> at com.fluenda.parcefone.parser.CEFParser.parse(CEFParser.java:235)
>>> at com.fluenda.parcefone.parser.CEFParser.parse(CEFParser.java:109)
>>> at org.apache.nifi.processors.standard.ParseCEF.onTrigger(ParseCEF.java:277)
>>> at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>>> at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1361)
>>> at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:247)
>>> at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>>> at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>>> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>>> at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>>> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>>> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>> at java.base/java.lang.Thread.run(Unknown Source)
>>>
>>>
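
For reference, the top frames of the trace (Integer.valueOf / Integer.parseInt) can be reproduced outside NiFi. The sketch below is not ParCEFone or NiFi code; the class and method names (`OutFieldRange`, `fitsInInt`) are hypothetical and only illustrate that the sample `out` value from the message above is larger than a signed 32-bit int, while it still parses fine as a 64-bit long.

```java
public class OutFieldRange {

    /**
     * Hypothetical helper: returns true if the decimal string fits in a
     * signed 32-bit Java int, false if Integer.parseInt rejects it.
     */
    public static boolean fitsInInt(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            // Same exception type as in the stack trace above.
            return false;
        }
    }

    public static void main(String[] args) {
        // Value of the "out" key in the sample Fortigate message.
        String out = "3443586134";

        System.out.println(Integer.MAX_VALUE);   // 2147483647
        System.out.println(fitsInInt(out));      // false
        System.out.println(Long.parseLong(out)); // 3443586134
    }
}
```

A check like this could be used to decide whether a message needs the text-manipulation route (ReplaceText / ExtractText / UpdateAttribute) before it reaches ParseCEF, but that wiring is deployment-specific and not shown here.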