Re: Replacing a base64-encoded field in a JSON-document with its decoded/converted value

2020-07-01 Thread Myklebust, Bjørn Magnar
Thanks, Andy.
No, I’m sure there is no reason for that – it’s just that I’m fairly new to 
NiFi and don’t know it too well yet.

Thanks,
Bjørn


From: Andy LoPresto 
Sent: Tuesday, June 30, 2020 18:37
To: users@nifi.apache.org
Subject: Re: Replacing a base64-encoded field in a JSON-document with its 
decoded/converted value

You should not need to explicitly set the additional module directory to cover 
that location. Is there a reason you can’t use the native Groovy JSON [1] 
parsing? That way you don’t have to download any additional libraries.

[1] http://groovy-lang.org/json.html#
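
A minimal sketch of the XML-to-JSON step using only classes that ship with Groovy, so no extra jar is needed. The helper names nodeToValue and xmlToJson and the sample XML are made up for illustration, and the import assumes Groovy 3 or later. The sketch ignores XML attributes and keeps every leaf value as a string, so its output will not necessarily match what org.json.XML produces.

```groovy
import groovy.json.JsonOutput
import groovy.xml.XmlSlurper   // Groovy 3+. On Groovy 2.x remove this line: the class is groovy.util.XmlSlurper there, imported by default.

// Hypothetical helper: leaf elements become Strings, elements with children become Maps,
// and repeated child names are collected into a List. Attributes are ignored.
def nodeToValue(node) {
    def children = node.children()
    if (children.size() == 0) return node.text()

    def result = [:]
    children.each { child ->
        def value = nodeToValue(child)
        def name = child.name()
        if (result.containsKey(name)) {
            def existing = result[name]
            result[name] = (existing instanceof List) ? existing + [value] : [existing, value]
        } else {
            result[name] = value
        }
    }
    return result
}

// Hypothetical helper: parse an XML string and serialize it as JSON, keyed by the root element name.
def xmlToJson(String xml) {
    def root = new XmlSlurper().parseText(xml)
    return JsonOutput.toJson([(root.name()): nodeToValue(root)])
}

// Made-up sample, not taken from the real documents.
println xmlToJson('<bil><merke>Volvo</merke><aar>2017</aar></bil>')
```

Inside a NiFi script, xmlToJson would take the place of the XML.toJSONObject(text).toString() call.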

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Jun 29, 2020, at 7:41 AM, Myklebust, Bjørn Magnar 
<bjorn.mykleb...@skatteetaten.no> wrote:

Andy, just a quick followup on this.

I wanted to test a Groovy script with this code (far from finished). The 
script is placed in the Script Body property of an ExecuteGroovyScript 
processor in NiFi:


import org.json.JSONObject
import org.json.XML
import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets

// Fetch the next flowfile; stop if there is none
def flowFile = session.get()
if (!flowFile) return

// Read the XML content, convert it to JSON, and write it back as the new content
flowFile = session.write(flowFile,
  { inputStream, outputStream ->
    def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
    def xmlJSONObj = XML.toJSONObject(text)
    def json = xmlJSONObj.toString()
    outputStream.write(json.getBytes(StandardCharsets.UTF_8))
  } as StreamCallback)

// REL_SUCCESS is a variable bound by the scripting processor
session.transfer(flowFile, REL_SUCCESS)

But when I try to run this, I get the message «unable to resolve class 
org.json.JSONObject @ line 1».
I downloaded the jar file from:
https://repo1.maven.org/maven2/org/json/json/20200518/json-20200518.jar
and placed it in my nifi/lib directory. The contents of the jar are shown in 
the attached PNG.

Do I need to set a value for the property Additional Classpath when the 
jar-file is stored in the lib-directory?

Thanks,
Bjørn


From: Andy LoPresto <alopre...@apache.org>
Sent: Thursday, June 25, 2020 19:20
To: users@nifi.apache.org
Subject: Re: Replacing a base64-encoded field in a JSON-document with its 
decoded/converted value

Hi Bjørn,

No, XML to JSON conversion is not an Expression Language feature. You’ll need 
to either get this data into a flowfile as the complete content to perform the 
conversion with existing built-in tools, or add that step to your Groovy script.

With that additional requirement, I think using the Groovy script to perform 
those steps in tandem is probably the most performant and logical approach here.


Andy LoPresto



On Jun 24, 2020, at 11:25 PM, Myklebust, Bjørn Magnar 
<bjorn.mykleb...@skatteetaten.no> wrote:

Thanks Andy.
The XML content is around 5 kB. But I also need to convert the XML to JSON 
before putting it back into the original JSON file. Can this be done with, 
e.g., a ConvertAttribute before the ReplaceText?

Thanks,
Bjørn



From: Andy LoPresto <alopre...@apache.org>
Sent: Wednesday, June 24, 2020 17:24
To: users@nifi.apache.org
Subject: Re: Replacing a base64-encoded field in a JSON-document with its 
decoded/converted value

Hello Bjørn,

If the size of the encoded XML document is small (under ~1 KB), you can extract 
the Base64-encoded value to a flowfile attribute using EvaluateJsonPath, 
perform the decoding using the base64Decode Expression Language function [1], 
and then replace it into the flowfile JSON content using ReplaceText (using 
some regex like "content": ".*" -> "content": "${decodedXML}" where decodedXML 
is the name of the attribute you are using).
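
To illustrate what that substitution does to the text, here is a small sketch in plain Groovy rather than in ReplaceText itself. The sample document and XML snippet are made up. It shows one thing to watch for: decoded XML usually contains double quotes, so the value has to be JSON-escaped before it goes between the quotes. The Expression Language guide also lists an escapeJson function, which may be the in-flow way to do the same.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import java.util.regex.Matcher

// Made-up stand-in for the real document: data.content holds Base64 of a tiny XML snippet.
def xml = '<a id="1"/>'
def encoded = Base64.getEncoder().encodeToString(xml.getBytes('UTF-8'))
def original = '{"data": {"content": "' + encoded + '"}}'

// The value that Base64 decoding would put into the attribute.
def decodedXML = new String(Base64.getDecoder().decode(encoded), 'UTF-8')

def pattern = /"content": ".*"/

// Naive substitution: the quotes inside the XML end the JSON string early.
def naive = original.replaceAll(pattern,
        Matcher.quoteReplacement('"content": "' + decodedXML + '"'))

// Escaped substitution: JsonOutput.toJson(String) returns a quoted, escaped JSON string.
def escaped = original.replaceAll(pattern,
        Matcher.quoteReplacement('"content": ' + JsonOutput.toJson(decodedXML)))

println naive     // not valid JSON
println escaped   // valid JSON that parses back to the original XML text
```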

If the XML content could be very large, this will negatively affect your 
performance, as attributes are stored directly in memory and handling large 
amounts of data will impact the heap. In this case, I would recommend writing a 
Groovy script in an ExecuteScript processor to leverage Groovy’s very friendly 
JSON handling and extract the value, Base64 decode it, and replace it in a 
couple of lines.
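
A minimal sketch of that extract, decode, and replace step with Groovy's built-in JSON classes. The helper name decodeContentField and the sample input are hypothetical; the field path data.content follows the JSONPath $.data.content mentioned in the thread, and the content is assumed to be UTF-8 text.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Hypothetical helper: replace the Base64 text at data.content with its decoded form.
String decodeContentField(String json) {
    def doc = new JsonSlurper().parseText(json)
    byte[] decoded = Base64.getDecoder().decode(doc.data.content as String)
    doc.data.content = new String(decoded, 'UTF-8')
    return JsonOutput.toJson(doc)
}

// Made-up input that mirrors the shape of the real files.
def encoded = Base64.getEncoder().encodeToString('<a>1</a>'.getBytes('UTF-8'))
def input = JsonOutput.toJson([data: [metadata: [mimeType: 'application/xml'], content: encoded]])

println decodeContentField(input)
```

In a NiFi script this helper would be called on the flowfile text inside the session.write callback. Because the whole document is parsed and re-serialized, the JSON escaping of the decoded value is handled by JsonOutput.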

Hope this helps.


[1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#base64decode

Andy LoPresto




On Jun 24, 2020, at 4:24 AM, Myklebust, Bjørn Magnar 
<bjorn.mykleb...@skatteetaten.no> wrote:


Hi.
I have a set of JSON files which contain a Base64-encoded field (the JSONPath to 
this field is $.data.content), and this field contains an XML document.

Re: Replacing a base64-encoded field in a JSON-document with its decoded/converted value

2020-06-26 Thread Myklebust, Bjørn Magnar
Ok, I see.
Thanks, Andy.

Bjørn

On Jun 24, 2020, at 4:24 AM, Myklebust, Bjørn Magnar 
<bjorn.mykleb...@skatteetaten.no> wrote:


Hi.
I have a set of JSON files which contain a Base64-encoded field (the JSONPath to 
this field is $.data.content), and this field contains an XML document. Decoding 
the field works as expected, as does the conversion from XML to JSON, and I'm 
able to write the content of this field to a file in an S3 bucket. What I would 
like to do instead is replace the encoded value of this field in the original 
file with the decoded/converted value, rather than writing the decoded/converted 
value to a separate file. After replacing the JSON value, I can write the 
updated JSON file to a new S3 bucket.
My flow looks like this at the moment. It works fine for getting the data to a 
file, but it is missing the last part: replacing $.data.content with the 
decoded/converted data.

So how can I do the last part?



The EvaluateJsonPath looks like this:



The ReplaceText looks like this:


The Base64EncodeContent looks like this:


and finally, the ConvertRecord looks like this:




This is a test file that I'm working with:

{
  "header": {
    "dokumentidentifikator": null,
    "dokumentidentifikatorV2": "dcff985b-c652-4085-b8f1-45a2f4b6d150",
    "revisjonsnummer": 1,
    "dokumentnavn": "Engangsavgiftfastsettelse:55TEST661122334455:44BIL1:2017-10-20",
    "dokumenttype": "SKATTEMELDING_ENGANGSAVGIFT",
    "dokumenttilstand": "OPPRETTET",
    "gyldig": true,
    "gjelderInntektsaar": 2017,
    "gjelderPeriode": "2017_10",
    "gjelderPart": {
      "partsnummer": 5544332211,
      "identifiseringstype": "MASKINELL",
      "identifikator": null
    },
    "opphavspart": {
      "partsnummer": 5544332211,
      "identifikator": null
    },
    "kildereferanse": {
      "kildesystem": "ENGANGSAVGIFTFASTSETTELSE",
      "gruppe": "",
      "referanse": "aef147fb-8ce8-43ef-833b-7aa3bac1ece0",
      "tidspunkt": "2018-01-16T13:28:02.49+01:00"
    }
  },
  "data": {
    "metadata": {
      "format": "ske:fastsetting:motorvogn:motorvognavgift:v1",
      "bytes": 4420,
      "mimeType": "application/xml",
      "sha1": "c0AowOsTdNdo6VufeSsZqTphc0Y="
    },
    "content": 

Re: Replacing a base64-encoded field in a JSON-document with its decoded/converted value

2020-06-25 Thread Myklebust, Bjørn Magnar
Thanks Andy.
The XML content is around 5 kB. But I also need to convert the XML to JSON 
before putting it back into the original JSON file. Can this be done with, 
e.g., a ConvertAttribute before the ReplaceText?

Thanks,
Bjørn