
Re: Error in conversion of seconds since epoch to a date-time

2024-06-20 Thread James McMahon

After adding extensive logging I identified the problem: the combined regex
pattern was not matching the entirety of values that are unix seconds since
the epoch. I fixed that problem in the Groovy script, and it ran as
expected.

Thank you both for your comments, Paul and Christopher.
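
A minimal sketch of how a partial regex match can produce the 1970 date reported in the original question. It assumes the cause was the first alternative of the combined pattern, `\b(\d{8})`, which has no trailing `\b` and therefore matches only the first eight digits of a ten-digit timestamp. The thread does not show the exact fix that was applied; the names `loose` and `strict` are illustrative only.

```groovy
import java.time.Instant
import java.time.ZoneOffset
import java.util.regex.Pattern

def value = '1652135219'

// Two alternatives taken from the original pattern; the 8-digit branch has no trailing \b
def loose = Pattern.compile(/\b(\d{8})|\b\d{10}\b/)
def looseMatcher = loose.matcher(value)
assert looseMatcher.find()
def partial = looseMatcher.group()
assert partial == '16521352'          // only the first eight digits matched

// Treating the truncated value as epoch seconds lands in July 1970
def partialDate = Instant.ofEpochSecond(partial.toLong()).atZone(ZoneOffset.UTC).toLocalDate()
assert partialDate.toString() == '1970-07-11'

// Assumed fix: close every numeric alternative with \b so only whole tokens match
def strict = Pattern.compile(/\b\d{8}\b|\b\d{10}\b/)
def strictMatcher = strict.matcher(value)
assert strictMatcher.find()
def full = strictMatcher.group()
def fullDate = Instant.ofEpochSecond(full.toLong()).atZone(ZoneOffset.UTC).toLocalDate()
assert fullDate.toString() == '2022-05-09'
```

The sketch uses UTC rather than ZoneId.systemDefault() so that the dates do not depend on the machine it runs on.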

On Thu, Jun 20, 2024 at 6:19 PM Paul King  wrote:

> This would be my expectation:
>
> import java.time.Instant
> import java.time.ZoneId
> import groovy.json.JsonBuilder
>
> def lastModifiedView = '1652135219'.toLong()
> def zoneId = ZoneId.of('America/Los_Angeles')
> def date =
> Instant.ofEpochSecond(lastModifiedView).atZone(zoneId).toLocalDate()
> def result = [lastModifiedView: date]
> assert new JsonBuilder(result).toPrettyString() == '''{
> "lastModifiedView": {
> "year": 2022,
> "month": "MAY",
> "chronology": {
> "calendarType": "iso8601",
> "id": "ISO"
> },
> "dayOfMonth": 9,
> "dayOfWeek": "MONDAY",
> "dayOfYear": 129,
> "era": "CE",
> "leapYear": false,
> "monthValue": 5
> }
> }'''
>
> And works fine for me. It wasn't clear if you wanted different
> information in the serialization or just flagging that somewhere your
> code is differing from above because of the different values in the
> output.
>
> Paul.
>
> On Fri, Jun 21, 2024 at 7:25 AM James McMahon 
> wrote:
> >
> > Hello. I have a json key named viewLastModified. It has a value of
> 1652135219. Using an Epoch Converter manually (
> https://www.epochconverter.com/), I expect to convert this with my Groovy
> script to something in this ballpark:
> > GMT: Monday, May 9, 2022 10:26:59 PM
> > Your time zone: Monday, May 9, 2022 6:26:59 PM GMT-04:00 DST
> > Relative: 2 years ago
> >
> > But my code fails, and I'm not sure why.
> > Using the code I wrote, I process it and get this result:
> > "viewLastModified": [
> > {
> >   "chronology": {
> > "calendarType": "iso8601",
> > "id": "ISO",
> > "isoBased": true
> >   },
> >   "dayOfMonth": 11,
> >   "dayOfWeek": "SATURDAY",
> >   "dayOfYear": 192,
> >   "era": "CE",
> >   "leapYear": false,
> >   "month": "JULY",
> >   "monthValue": 7,
> >   "year": 1970
> > }
> >   ]
> >
> > Can anyone see where I have an error when I try to process a pattern
> that is seconds since the epoch?
> >
> > My code:
> > import java.util.regex.Pattern
> > import java.time.LocalDate
> > import java.time.LocalDateTime
> > import java.time.format.DateTimeFormatter
> > import java.time.format.DateTimeParseException
> > import java.time.Instant
> > import java.time.ZoneId
> > import groovy.json.JsonSlurper
> > import groovy.json.JsonBuilder
> > import org.apache.nifi.processor.io.StreamCallback
> > import org.apache.nifi.flowfile.FlowFile
> >
> > // Combined regex pattern to match various date formats including Unix
> timestamp
> > def combinedPattern = Pattern.compile(/\b(\d{8})|\b(\d{4}['
> ,-\\/]+\d{2}[' ,-\\/]+\d{2})|\b(\d{2}[' ,-\\/]+\d{2}['
> ,-\\/]+\d{4})|\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)['
> ,-\\/]+\d{2}['
> ,-\\/]+\d{4}|\b(?:January|February|March|April|May|June|July|August|September|October|November|December)['
> ,-\\/]+\d{2}[' ,-\\/]+\d{4}\b|\b\d{10}\b/)
> >
> > // Precompile date formats for faster reuse
> > def dateFormats = [
> > DateTimeFormatter.ofPattern('MMdd'),
> > DateTimeFormatter.ofPattern('dd MMM, '),
> > DateTimeFormatter.ofPattern('MMM dd, '),
> > DateTimeFormatter.ofPattern(' MMM dd'),
> > DateTimeFormatter.ofPattern(' dd, ')
> > ]
> >
> > // Helpe

Error in conversion of seconds since epoch to a date-time

2024-06-20 Thread James McMahon

Hello. I have a json key named viewLastModified. It has a value
of 1652135219. Using an Epoch Converter manually (
https://www.epochconverter.com/), I expect to convert this with my Groovy
script to something in this ballpark:
GMT: Monday, May 9, 2022 10:26:59 PM
Your time zone: Monday, May 9, 2022 6:26:59 PM GMT-04:00 DST
Relative: 2 years ago

But my code fails, and I'm not sure why.
Using the code I wrote, I process it and get this result:
"viewLastModified": [
{
  "chronology": {
"calendarType": "iso8601",
"id": "ISO",
"isoBased": true
  },
  "dayOfMonth": 11,
  "dayOfWeek": "SATURDAY",
  "dayOfYear": 192,
  "era": "CE",
  "leapYear": false,
  "month": "JULY",
  "monthValue": 7,
  "year": 1970
}
  ]

Can anyone see where I have an error when I try to process a pattern that
is seconds since the epoch?

My code:
import java.util.regex.Pattern
import java.time.LocalDate
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException
import java.time.Instant
import java.time.ZoneId
import groovy.json.JsonSlurper
import groovy.json.JsonBuilder
import org.apache.nifi.processor.io.StreamCallback
import org.apache.nifi.flowfile.FlowFile

// Combined regex pattern to match various date formats including Unix timestamp
def combinedPattern = Pattern.compile(/\b(\d{8})|\b(\d{4}[' ,-\\/]+\d{2}[' ,-\\/]+\d{2})|\b(\d{2}[' ,-\\/]+\d{2}[' ,-\\/]+\d{4})|\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[' ,-\\/]+\d{2}[' ,-\\/]+\d{4}|\b(?:January|February|March|April|May|June|July|August|September|October|November|December)[' ,-\\/]+\d{2}[' ,-\\/]+\d{4}\b|\b\d{10}\b/)

// Precompile date formats for faster reuse
def dateFormats = [
DateTimeFormatter.ofPattern('MMdd'),
DateTimeFormatter.ofPattern('dd MMM, '),
DateTimeFormatter.ofPattern('MMM dd, '),
DateTimeFormatter.ofPattern(' MMM dd'),
DateTimeFormatter.ofPattern(' dd, ')
]

// Helper function to parse a date string using predefined formats
def parseDate(String dateStr, List dateFormats) {
for (format in dateFormats) {
try {
return LocalDate.parse(dateStr, format)
} catch (DateTimeParseException e) {
// Continue trying other formats if the current one fails
}
}
return null
}

// Helper function to parse a Unix timestamp
def parseUnixTimestamp(String timestampStr) {
try {
long timestamp = Long.parseLong(timestampStr)
// Validate if the timestamp is in a reasonable range
if (timestamp >= 0 && timestamp <= Instant.now().getEpochSecond()) {
return Instant.ofEpochSecond(timestamp).atZone(ZoneId.systemDefault()).toLocalDateTime().toLocalDate()
}
} catch (NumberFormatException e) {
// If parsing fails, return null
}
return null
}

// Helper function to validate date within a specific range
boolean validateDate(LocalDate date) {
def currentYear = LocalDate.now().year
def year = date.year
return year >= currentYear - 120 && year <= currentYear + 40
}

// Function to process and normalize dates
def processDates(List dates, List dateFormats) {
dates.collect { dateStr ->
def parsedDate = parseDate(dateStr, dateFormats)
if (parsedDate == null) {
parsedDate = parseUnixTimestamp(dateStr)
}
log.info("Parsed date: ${parsedDate}")
parsedDate
}.findAll { it != null && validateDate(it) }
 .unique()
 .sort()
}

// Define the list of substrings to check in key names
def dateRelatedSubstrings = ['birth', 'death', 'dob', 'date', 'updated',
'modified', 'created', 'deleted', 'registered', 'times', 'datetime', 'day',
'month', 'year', 'week', 'epoch', 'period']

// Start of NiFi script execution
def ff = session.get()
if (!ff) return

try {
log.info("Starting processing of FlowFile: ${ff.getId()}")

// Extract JSON content for processing
String jsonKeys = ff.getAttribute('payload.json.keys')
log.info("JSON keys: ${jsonKeys}")
def keysMap = new JsonSlurper().parseText(jsonKeys)
def results = [:]

// Process each key-value pair in the JSON map
keysMap.each { key, value ->
def datesForThisKey = []
log.info("Processing key: ${key}")

// Check if the key contains any of the specified substrings
if (dateRelatedSubstrings.any { key.toLowerCase().contains(it) }) {
// Read and process the content of the FlowFile
ff = session.write(ff, { inputStream, outputStream ->
def bufferedReader = new BufferedReader(new
InputStreamReader(inputStream))
def bufferedWriter = new BufferedWriter(new
OutputStreamWriter(outputStream))
String line

// Read each line of the input stream
while ((line = bufferedReader.readLine()) !=

Re: Size of nifi flowfile in Groovy script

2024-06-17 Thread James McMahon

It appears the best way to do this is to call the Java getSize() method on
the ff object, like this:
def ffSize = ff.getSize()

This seems to output the file size to the log in bytes.

def file_size = (ffSize != null) ? ffSize : 0

if (file_size == 0) {
log.error("file_size is undefined or zero, which prevents division.")
session.transfer(ff, REL_FAILURE)
return
}

log.info("File size: ${file_size}")

On Sat, Jun 15, 2024 at 10:43 PM James McMahon  wrote:

> It turns out you can access it in a Groovy script directly through ff, as
> ff.size. It appears to return a result in bytes.
>
> import org.apache.nifi.processor.ProcessContext
> import org.apache.nifi.processor.ProcessSession
> import org.apache.nifi.processor.io.InputStreamCallback
> import org.apache.nifi.flowfile.FlowFile
>
> def ff = session.get()
> if (!ff) return  // Exit if no flow file is available
>
> try {
> // Retrieve and log all attributes
> ff.getAttributes().each { key, value ->
> log.info("Attribute: ${key} = ${value}")
> }
>
> // Log the fileSize system attribute
> log.info("System Attribute - fileSize: ${ff.size}")
>
> ff = session.putAttribute(ff, 'file_size', ff.size.toString())
>
> // Transfer the flow file to the next stage in the flow
> session.transfer(ff, REL_SUCCESS)
> } catch (Exception e) {
>     log.error("Error processing flow file", e)
> session.transfer(ff, REL_FAILURE)
> }
>
> On Sat, Jun 15, 2024 at 5:14 PM James McMahon 
> wrote:
>
>> Hello. I am trying to determine a way to get the size of a NiFi flowfile
>> within a Groovy script. Has anyone done this before, and can tell me how to
>> do this from within Groovy?
>>
>> This
>> https://jameswing.net/nifi/nifi-internal-fields.html
>> describes fileSize as a hidden field that can be accessed via expression
>> language, but I do not see how I can reference that in a Groovy script.
>>
>> Thanks in advance for any help.
>>
>

Re: Size of nifi flowfile in Groovy script

2024-06-15 Thread James McMahon

It turns out you can access it in a Groovy script directly through ff, as
ff.size. It appears to return a result in bytes.

import org.apache.nifi.processor.ProcessContext
import org.apache.nifi.processor.ProcessSession
import org.apache.nifi.processor.io.InputStreamCallback
import org.apache.nifi.flowfile.FlowFile

def ff = session.get()
if (!ff) return  // Exit if no flow file is available

try {
// Retrieve and log all attributes
ff.getAttributes().each { key, value ->
log.info("Attribute: ${key} = ${value}")
}

// Log the fileSize system attribute
log.info("System Attribute - fileSize: ${ff.size}")

ff = session.putAttribute(ff, 'file_size', ff.size.toString())

// Transfer the flow file to the next stage in the flow
session.transfer(ff, REL_SUCCESS)
} catch (Exception e) {
log.error("Error processing flow file", e)
session.transfer(ff, REL_FAILURE)
}

On Sat, Jun 15, 2024 at 5:14 PM James McMahon  wrote:

> Hello. I am trying to determine a way to get the size of a NiFi flowfile
> within a Groovy script. Has anyone done this before, and can tell me how to
> do this from within Groovy?
>
> This
> https://jameswing.net/nifi/nifi-internal-fields.html
> describes fileSize as a hidden field that can be accessed via expression
> language, but I do not see how I can reference that in a Groovy script.
>
> Thanks in advance for any help.
>

Size of nifi flowfile in Groovy script

2024-06-15 Thread James McMahon

Hello. I am trying to determine a way to get the size of a NiFi flowfile
within a Groovy script. Has anyone done this before, and can tell me how to
do this from within Groovy?

This
https://jameswing.net/nifi/nifi-internal-fields.html
describes fileSize as a hidden field that can be accessed via expression
language, but I do not see how I can reference that in a Groovy script.

Thanks in advance for any help.

Re: Where do you find a community?

2024-05-26 Thread James McMahon

I can tell you from firsthand experience that this Groovy community has
always responded very quickly to questions. I am only a modest level
programmer - a C- at best - and the members here have never looked down on
any question, and have helped me solve some very challenging ones. I have a
lot of respect for the people at groovy.apache.org and nifi.apache.org who
have helped me through the years. Any time I've faced a configuration
impediment or a coding challenge, they've been there.

On Sun, May 26, 2024 at 8:01 AM OCsite  wrote:

> Martin,
>
> I'd say the community is right here. Whenever I needed help and neither
> the (excellent, in my opinion) documentation nor sites like Groovy Goodness
> (mrhaki, definitely worth checking whenever in doubt) helped, I've asked
> here, and almost always I've got helpful and very knowledgeable answers.
>
> I would hate it if I had to use something with a terrible GUI like the
> Discord thing instead of a convenient, practical and nice maillist. Besides
> a maillist is conceptually worlds better than any kind of IRC for these
> things, for it sort of endorses thinking through before sending; both your
> questions and the answers tend to be well formulated, while IRCs endorse
> the very opposite. Even if there was an IRC with a good GUI (so far, I
> haven't seen one, but well, in theory such a thing might exist), I'd still
> strongly prefer a maillist.
>
> I suggest you try to ask those questions you need help with here and see
> whether you find this list as excellent for learning Groovy and as helpful
> as I did.
>
> All the best,
> OC
>
> On 26. 5. 2024, at 4:13, Polgár Márton  wrote:
>
> Hello,
>
> I have been experimenting with the thought of learning an accessible,
> reliable and concise scripting language and considered Groovy a worthy
> candidate. To decide whether this is the case, I started doing these little
> exercises online which usually spawns a lot of micro-questions that are
> hard to answer from the docs, no matter that they look alright. This is
> where we arrive at the elephant in the room with Groovy: the striking lack
> of living, interactive, low-barrier communities.
>
> Groovy might not be a trendy language but it has plenty of visibility and
> stakeholders compared to what I was used to with Raku. The big difference
> is that Raku has a vivid IRC network, it has a Discord server, and in
> addition it also has a blog, a subreddit, a legacy mailing list, a Mastodon
> and so on.
>
> Obviously I'm not running around investigating the communities of all
> sorts of niche languages but on Discord I've seen servers for languages
> from Pascal and Prolog to Factor and Uiua. The older languages usually have
> a dedicated IRC channel, some have both. There is also Zig with the
> principle of a distributed community which is to my understanding mostly
> about allowing and encouraging people to create spaces across various
> platforms, with a loose set of rules.
>
> For Groovy, the only real-time platform would be the Slack - if Slack
> being a hassle wasn't enough, it's hidden behind a kind of survey that
> seems to serve some sort of gatekeeping. There is a semi-active subreddit
> and this mailing list. Grails stuff operates under similar terms, except
> half dead. It seems clear that this is not how you get people involved with
> the language in 2024 - honestly, not even having good old IRC with a bunch
> of available people really raises some questions.
>
> Where is the Groovy community? Is there even one? Who are the target
> audience if there is one? Why is there no visible effort to make the
> language more accessible to newcomers, some place they could go and
> practice? Is it that the people running the business are running out of
> motivation or is this Apache project somehow uninterested in extending the
> user/contributor base, unlike most indie projects?
>
> I am really curious about an answer because for me these are questions
> that determine both the practical feasibility to learn a language and the
> overall state and potential of a community.
>
> Sincerely
> Martin Burger
>
>
>

Re: Cannot process zip file with Groovy

2024-02-17 Thread James McMahon

Here is the challenge I am trying to work around. In NiFi, a processor
called UnpackContent can be used to extract from a range of compressed
formats - tars and zips among them.

I need to access the file metadata of the extracted files. If the
compressed parent file is a tar, UnpackContent exposes the file metadata of
the extracted files. But if the file is a zip, UnpackContent does not.

As a workaround I want a Groovy script that extracts the files from a zip
preserving file metadata, placing each extracted file in the output stream
of the ExecuteGroovyScript processor with the metadata as attributes.

I cannot get the groovy script to successfully extract from the zip. Today
I will continue to try. My first change will be to switch my import to the
correct Apache lib. MG, thank you for that recommendation.

Is there a better way to do this? I would welcome any help.

Does this explain a little more clearly the what and why?

Jim
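
A minimal sketch of reading per-entry metadata from a zip, under stated assumptions: it uses only java.util.zip, builds a small archive in memory so it is self-contained, and the entry name and timestamp are made up for illustration. It is not the NiFi script from this thread, and java.util.zip may still throw "invalid compression method" for archives that use a method it does not support (DEFLATE64 is suggested further down the thread), in which case a library such as Commons Compress would be needed instead.

```groovy
import java.util.zip.ZipEntry
import java.util.zip.ZipInputStream
import java.util.zip.ZipOutputStream

// Build a one-entry zip in memory (hypothetical name and last-modified time)
long stamp = 1652135218000L
def zipBytes = new ByteArrayOutputStream()
new ZipOutputStream(zipBytes).withCloseable { zos ->
    def entry = new ZipEntry('hello.txt')
    entry.time = stamp
    zos.putNextEntry(entry)
    zos.write('hello'.getBytes('UTF-8'))
    zos.closeEntry()
}

// Walk the archive, keeping each entry's content together with its metadata
def extracted = []
new ZipInputStream(new ByteArrayInputStream(zipBytes.toByteArray())).withCloseable { zis ->
    ZipEntry entry
    while ((entry = zis.nextEntry) != null) {
        def content = new ByteArrayOutputStream()
        byte[] chunk = new byte[8192]
        int n
        // read() returns -1 at the end of the current entry, not the whole archive
        while ((n = zis.read(chunk)) != -1) {
            content.write(chunk, 0, n)
        }
        extracted << [name        : entry.name,
                      lastModified: entry.time,
                      text        : new String(content.toByteArray(), 'UTF-8')]
    }
}
```

In a NiFi script each map in `extracted` could become a child FlowFile, with `name` and `lastModified` written as attributes. Zip timestamps are stored with two-second resolution, so values can differ slightly from what was written.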

On Sat, Feb 17, 2024 at 5:10 AM Bob Brown  wrote:

> Not entirely sure that is what James is looking for…I THINK he’s more
> interested in reading than creating.
>
>
>
> Commons compress has some example code at
> https://commons.apache.org/proper/commons-compress/examples.html:
>
>
>
> ===
>
> InputStream fin = Files.newInputStream(Paths.get("some-file"));
>
> BufferedInputStream in = new BufferedInputStream(fin);
>
> OutputStream out = Files.newOutputStream(Paths.get("archive.tar"));
>
> Deflate64CompressorInputStream defIn = new
> Deflate64CompressorInputStream(in);
>
> final byte[] buffer = new byte[buffersize];
>
> int n = 0;
>
> while (-1 != (n = defIn.read(buffer))) {
>
> out.write(buffer, 0, n);
>
> }
>
> out.close();
>
> defIn.close();
>
> ===
>
>
>
> BOB
>
>
>
> *From:* MG 
> *Sent:* Saturday, February 17, 2024 10:37 AM
> *To:* users@groovy.apache.org; Bob Brown 
> *Subject:* Re: Cannot process zip file with Groovy
>
>
>
> I agree, would also recommend using Apache libs, we use e.g. the ZIP
> classes that come with the ant lib in the Groovy distribution
> (org.apache.tools.zip.*):
>
> Here is a quickly sanitized version of our code (disclaimer: Not
> compiled/tested; Zip64Mode.Always is important if you expect larger files):
>
> InputStream zipInputStream(String compressedFilename) {
> final zipFile = new ZipFile(new File(compressedFilename))
> final zipEntry = (ZipEntry) zipFile.entries.nextElement()
> if(zipEntry === null) { throw new Exception("${zipFile.name} has no entries") }
> final zis = zipFile.getInputStream(zipEntry)
> return zis
> }
>
> OutputStream zipOutputStream(String filename, String compressedFileExtension = "zip") {
> final fos = new FileOutputStream(filename + '.' + compressedFileExtension)
> final zos = new ZipOutputStream(fos)
> zos.useZip64 = Zip64Mode.Always // To avoid org.apache.tools.zip.Zip64RequiredException: ... exceeds the limit of 4GByte.
> final zipFileName = org.apache.commons.io.FilenameUtils.getName(filename)
> final zipEntry = new ZipEntry(zipFileName)
> zos.putNextEntry(zipEntry)
> return zos
> }
>
> Cheers,
> mg
>
>
> On 17/02/2024 00:52, Bob Brown wrote:
>
> My first thought was “are you SURE it is a kosher Zip file?”
>
>
>
> Sometimes one gets ‘odd’ gzip files masquerading as plain zip files.
>
>
>
> Also, apparently “java.util.Zip does not support DEFLATE64 compression
> method.” :
> https://www.ibm.com/support/pages/zip-file-fails-route-invalid-compression-method-error
>
>
>
> IF this is the case, you may need to use:
> https://commons.apache.org/proper/commons-compress/zip.html
>
> (maybe worth looking at the “Known Interoperability Problems” section of
> the above doc)
>
>
>
> May be helpful: https://stackoverflow.com/a/76321625
>
>
>
> HTH
>
>
>
> BOB
>
>
>
> *From:* James McMahon  
> *Sent:* Saturday, February 17, 2024 4:20 AM
> *To:* users@groovy.apache.org
> *Subject:* Re: Cannot process zip file with Groovy
>
>
>
> Hello Paul, and thanks again for taking a moment to look at this. I tried
> as you suggested:
>
> - - - - - - - - - -
>
> import java.util.zip.ZipInputStream
>
> def ff = session.get()
> if (!ff) return
>
> try {
> ff = session.write(ff, { inputStream, outputStream ->
> def zipInputStream = new ZipInputStream(inputStream)
> def entry = zipInputStream.getNextEntry()
> while (entry != null) {
> entry = zipInputStream.getNextEntry()
> }
> *outputStream = inputStream*
> } as Strea

Re: Cannot process zip file with Groovy

2024-02-16 Thread James McMahon

Hello Paul, and thanks again for taking a moment to look at this. I tried
as you suggested:
- - - - - - - - - -
import java.util.zip.ZipInputStream

def ff = session.get()
if (!ff) return

try {
ff = session.write(ff, { inputStream, outputStream ->
def zipInputStream = new ZipInputStream(inputStream)
def entry = zipInputStream.getNextEntry()
while (entry != null) {
entry = zipInputStream.getNextEntry()
}
*outputStream = inputStream*
} as StreamCallback)

session.transfer(ff, REL_SUCCESS)
} catch (Exception e) {
log.error('Error occurred processing FlowFile', e)
session.transfer(ff, REL_FAILURE)
}
- - - - - - - - - -

Once again it threw this error and failed:

ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086] Error occurred
processing FlowFile:
org.apache.nifi.processor.exception.ProcessException: IOException
thrown from ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086]:
java.util.zip.ZipException: invalid compression method
- Caused by: java.util.zip.ZipException: invalid compression method


It bears repeating: I am able to list and unzip the file at the linux
command line, but cannot get it to work from the script.


What is interesting (and a little frustrating) is that the NiFi
UnpackContent *will *successfully unzip the zip file. However, the
reason I am trying to do it in Groovy is that UnpackContent exposes
the file metadata for each file in a tar archive - lastModifiedDate,
for example - but it does *not* do so for files extracted from zips.
And I need that metadata. So here I be.


Can I explicitly set my (de)compression in the Groovy script? Where
would I do that, and what values does one typically encounter for zip
compression?


Jim
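
A small sketch of why the `outputStream = inputStream` line cannot change what is written: inside a closure, assigning to a parameter only rebinds the local name. It uses in-memory streams rather than a NiFi session, and the closure names are hypothetical. Copying the bytes needs an explicit write, for example with Groovy's `<<` operator on streams.

```groovy
// Rebinding the parameter: nothing reaches the caller's output stream
def copyByAssignment = { InputStream inputStream, OutputStream outputStream ->
    outputStream = inputStream
}

// Explicit copy: Groovy's leftShift pipes the input stream into the output stream
def copyByWrite = { InputStream inputStream, OutputStream outputStream ->
    outputStream << inputStream
}

def data = 'payload'.getBytes('UTF-8')

def assigned = new ByteArrayOutputStream()
copyByAssignment(new ByteArrayInputStream(data), assigned)
assert assigned.size() == 0

def written = new ByteArrayOutputStream()
copyByWrite(new ByteArrayInputStream(data), written)
assert new String(written.toByteArray(), 'UTF-8') == 'payload'
```

Neither variant affects the ZipException itself, which is raised while reading the entries, before any output is produced.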


On Thu, Feb 15, 2024 at 9:26 PM Paul King  wrote:

> What you are doing to read the zip looks okay.
>
> Just a guess, but it could be that because you haven't written to the
> output stream, it is essentially a corrupt data stream as far as NiFi
> processing is concerned. What happens if you set "outputStream =
> inputStream" as the last line of your callback?
>
> Paul.
>
> On Fri, Feb 16, 2024 at 8:48 AM James McMahon 
> wrote:
> >
> > I am struggling to build a Groovy script I can run from a NiFi
> ExecuteScript processor to extract from a zip file and stream to a tar
> archive.
> >
> > I tried to tackle it all at once and made little progress.
> > I am now just trying to read the zip file, and am getting this error:
> >
> > ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086] Error occurred
> processing FlowFile: org.apache.nifi.processor.exception.ProcessException:
> IOException thrown from
> ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086]:
> java.util.zip.ZipException: invalid compression method
> > - Caused by: java.util.zip.ZipException: invalid compression method
> >
> >
> > This is my simplified code:
> >
> >
> > import java.util.zip.ZipInputStream
> >
> > def ff = session.get()
> > if (!ff) return
> >
> > try {
> > ff = session.write(ff, { inputStream, outputStream ->
> > def zipInputStream = new ZipInputStream(inputStream)
> > def entry = zipInputStream.getNextEntry()
> > while (entry != null) {
> > entry = zipInputStream.getNextEntry()
> > }
> > } as StreamCallback)
> >
> > session.transfer(ff, REL_SUCCESS)
> > } catch (Exception e) {
> > log.error('Error occurred processing FlowFile', e)
> > session.transfer(ff, REL_FAILURE)
> > }
> >
> >
> > I am able to list and unzip the file at the linux command line, but
> cannot get it to work from the script.
> >
> >
> > Has anyone had success doing this? Can anyone help me get past this
> error?
> >
> >
> > Thanks in advance.
> >
> > Jim
> >
> >
>

Cannot process zip file with Groovy

2024-02-15 Thread James McMahon

I am struggling to build a Groovy script I can run from a NiFi
ExecuteScript processor to extract from a zip file and stream to a tar
archive.

I tried to tackle it all at once and made little progress.
I am now just trying to read the zip file, and am getting this error:

ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086] Error occurred
processing FlowFile:
org.apache.nifi.processor.exception.ProcessException: IOException
thrown from ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086]:
java.util.zip.ZipException: invalid compression method
- Caused by: java.util.zip.ZipException: invalid compression method


This is my simplified code:


import java.util.zip.ZipInputStream

def ff = session.get()
if (!ff) return

try {
ff = session.write(ff, { inputStream, outputStream ->
def zipInputStream = new ZipInputStream(inputStream)
def entry = zipInputStream.getNextEntry()
while (entry != null) {
entry = zipInputStream.getNextEntry()
}
} as StreamCallback)

session.transfer(ff, REL_SUCCESS)
} catch (Exception e) {
log.error('Error occurred processing FlowFile', e)
session.transfer(ff, REL_FAILURE)
}


I am able to list and unzip the file at the linux command line, but
cannot get it to work from the script.


Has anyone had success doing this? Can anyone help me get past this error?


Thanks in advance.

Jim

Why is my global variable not visible to my function?

2023-08-10 Thread James McMahon

Hello. I am trying to use a variable I define globally in a function that
follows within my script. Here is my code:

def currentYear = LocalDate.now().getYear()
def yearValue
def monthValue
def dayValue

boolean validateDate(String my, String myMM, String myDD) {

// Acceptable range for years is assumed to be 120 years in the past to 40 years into the future

log.error("in validateDate, my is ${my}")
log.error("in validateDate, myMM is ${myMM}")
log.error("in validateDate, myDD is ${myDD}")
log.error("in validateDate, currentYear is 2023")
log.error("in validateDate, currentYear is ${currentYear}")

yearValue = Integer.parseInt(my)
monthValue = Integer.parseInt(myMM.replaceFirst(/^0+/, ''))
dayValue = Integer.parseInt(myDD.replaceFirst(/^0+/, ''))


//if ((yearValue >= currentYear - 120 && yearValue <= currentYear + 40) && \
//(monthValue >= 1 && monthValue <= 12) && \
//(dayValue >= 1 && dayValue <= 31) \
//   ) { return true }
//   else { return false }

//log.error("about to validate in validateDate, using ${yearValue.toString()} and ${currentYear.toString()}")
log.error("about to validate in validateDate, using ${yearValue.toString()} and 2023")

//if ((yearValue >= (currentYear - 120)) && (yearValue <= (currentYear + 40))) { return true } else { return false }
if ((yearValue >= (2023 - 120)) && (yearValue <= (2023 + 40))) { return true } else { return false }
}

In my log I find the first four log.error() output statements, but I do not
see the line that I expect:
in validateDate, currentYear is ${currentYear}

currentYear does not seem to be known to my function validateDate(). What
is my error here? How can I correct this?

Thank you in advance for your help.
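
One likely explanation, offered as a sketch rather than a confirmed diagnosis: in a Groovy script, a variable declared with `def` at the top level is a local variable of the script's generated run() method, so methods defined in the same script cannot see it. Annotating the declaration with @Field turns it into a field of the script class. The method name `validateYear` below is hypothetical, and whether @Field behaves the same inside a given NiFi scripting processor is an assumption.

```groovy
import groovy.transform.Field
import java.time.LocalDate

// @Field makes this a field of the script class, visible to the methods below
@Field int currentYear = LocalDate.now().getYear()

// Accepts years from 120 in the past to 40 in the future, as in the question
boolean validateYear(String yearText) {
    int yearValue = Integer.parseInt(yearText)
    return yearValue >= currentYear - 120 && yearValue <= currentYear + 40
}

assert validateYear(String.valueOf(currentYear))
assert !validateYear('1800')
```

Dropping `def` (writing `currentYear = ...`) is another common option, since the name then lives in the script binding, which methods can read. Passing the value in as a parameter avoids shared state altogether.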

Re: matcher is failing

2023-07-10 Thread James McMahon

Thank you much, Spencer and Paul. This was very helpful and got me rolling
again. I've got some re-work and cleanup to do, but will post my final
working logic here.
Cheers,
Jim
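
A sketch of the mm-dd-yy[yy] variant that Paul mentions below, applied to the candidate from the original question. All groups are capturing, so index 1 is the month, 2 the day and 3 the year. The month and day ranges are one possible choice, not the final logic from this thread.

```groovy
def candidate = '06-14-2023'

// month (1-12), separator, day (1-31), separator, 4- or 2-digit year
def matcher = candidate =~ /^(0?[1-9]|1[0-2])[-\/]+(0?[1-9]|[12]\d|3[01])[-\/]+(\d{4}|\d{2})$/
assert matcher.matches()

// matcher[0] is a list: the whole match followed by each captured group
def (whole, month, day, year) = matcher[0]
assert whole == '06-14-2023'
assert [year, month, day] == ['2023', '06', '14']
```

The captured groups are already strings, so they can be used directly; indexing the matched substring with them, as the original code did, is not needed.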

On Sun, Jul 9, 2023 at 7:58 PM Paul King  wrote:

> Yes, Spencer's info is correct. This script gives an example for a date in
> dd-mm-yy[yy] format:
>
> candidate = '14-06-2023'
> matcher = candidate =~ /(?x) # enable whitespace and comments
>   ^  # start of line
>   (0?[1-9]|[12]\d|3[01]) # capture day, e.g. 1, 01, 12, 30
>   [\-\/]+# ignore separator
>   (\d{1,2})  # capture month, e.g. 1, 01, 12
>   [\-\/]+# ignore separator
>   (\d{4}|\d{2})  # capture year, e.g. 1975, 23
>   $  # end of line
> /
> (_, day, month, year) = matcher[0]
> assert [year, month, day] == ['2023', '06', '14']
>
> You'd need a slight tweak to instead/also handle USA dates in mm-dd-yy[yy]
> format.
>
> Cheers, Paul.
>
>
> On Mon, Jul 10, 2023 at 6:53 AM Spencer Allain via users <
> users@groovy.apache.org> wrote:
>
>> You have changed the first grouping into a non-capture group with ?:, so
>> matcher[0][3] will be null, and matcher[0][1] will be the month and
>> matcher[0][2] will be the year.
>>
>> I also believe that you should be able to reduce [\\\-\\\/] to be simply
>> [-\/] because of using slashy-string for the regex (as only forward slash
>> needs to be escaped)
>>
>> -Spencer
>>
>> On Sunday, July 9, 2023 at 12:08:17 PM EDT, James McMahon <
>> jsmcmah...@gmail.com> wrote:
>>
>>
>> Correction: this is my code...
>>
>> else if ( candidate =~
>> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
>> ) {
>>
>>   log.error("BINGO!")
>>
>>   matcher = candidate =~
>> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
>>   matchedSubstring = matcher[0][0]
>>   log.error("Matcher: ${matcher.toString()}")
>>   log.error("Matched substring: ${matchedSubstring}")
>>   day = matchedSubstring[matcher[0][1]]
>>   month = matchedSubstring[matcher[0][2]]
>>   year = matcherSubstriong[matcher[0][3]]
>>
>>   log.error("Day: ${day}")
>>   log.error("Month: ${month}")
>>   log.error("Year: ${year}")
>>
>>   log.error("Length of Day: ${day.length()}")
>>   log.error("Length of Month: ${month.length()}")
>>   log.error("Length of Year: ${year.length()}")
>>
>>}
>>
>> I suspect I need to look at how I'm setting day, month, and year from
>> matcher.
>>
>> The log output:
>> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
>> o.a.nifi.processors.script.ExecuteScript
>> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Candidate: 06-14-2023
>> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
>> o.a.nifi.processors.script.ExecuteScript
>> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] BINGO!
>> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
>> o.a.nifi.processors.script.ExecuteScript
>> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matcher:
>> java.util.regex.Matcher[pattern=^(?:0?[1-9]|[12]\d|3[01])[\\\-\\/]+(\d{2}|\d{1})[\\\-\\/]+(\d{4}|\d{2})$
>> region=0,10 lastmatch=06-14-2023]
>> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
>> o.a.nifi.processors.script.ExecuteScript
>> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matched substring:
>> 06-14-2023
>> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
>> o.a.nifi.processors.script.ExecuteScript
>> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Could not parse:
>> 06-14-2023
>>
>> On Sun, Jul 9, 2023 at 11:59 AM James McMahon 
>> wrote:
>>
>> Hello. I have a conditional clause in my Groovy script that attempts to
>> parse a date pattern of this form: 06-14-2023. It fails - I believe in the
>> matcher.
>>
>> I am running from a NiFi ExecuteScript processor. Here is my conditional:
>>
>>   } else if ( candidate =~
>> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
>> ) {
>>
>>   log.error("BINGO!")

Re: matcher is failing

2023-07-09 Thread James McMahon

Correction: this is my code...

else if ( candidate =~
/^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
) {

  log.error("BINGO!")

  matcher = candidate =~
/^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
  matchedSubstring = matcher[0][0]
  log.error("Matcher: ${matcher.toString()}")
  log.error("Matched substring: ${matchedSubstring}")
  day = matchedSubstring[matcher[0][1]]
  month = matchedSubstring[matcher[0][2]]
  year = matcherSubstriong[matcher[0][3]]

  log.error("Day: ${day}")
  log.error("Month: ${month}")
  log.error("Year: ${year}")

  log.error("Length of Day: ${day.length()}")
  log.error("Length of Month: ${month.length()}")
  log.error("Length of Year: ${year.length()}")

   }

I suspect I need to look at how I'm setting day, month, and year from
matcher.

The log output:
2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Candidate: 06-14-2023
2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] BINGO!
2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matcher:
java.util.regex.Matcher[pattern=^(?:0?[1-9]|[12]\d|3[01])[\\\-\\/]+(\d{2}|\d{1})[\\\-\\/]+(\d{4}|\d{2})$
region=0,10 lastmatch=06-14-2023]
2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matched substring:
06-14-2023
2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Could not parse:
06-14-2023

On Sun, Jul 9, 2023 at 11:59 AM James McMahon  wrote:

> Hello. I have a conditional clause in my Groovy script that attempts to
> parse a date pattern of this form: 06-14-2023. It fails - I believe in the
> matcher.
>
> I am running from a NiFi ExecuteScript processor. Here is my conditional:
>
>   } else if ( candidate =~
> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
> ) {
>
>   log.error("BINGO!")
>
>   matcher = candidate =~
> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
>   log.error("Matcher: ${matcher.toString()}")
>   log.error("Matched substring: ${matchedSubstring}")
>   day = matchedSubstring[matcher[0][1]]
>   month = matchedSubstring[matcher[0][2]]
>   year = matcherSubstriong[matcher[0][3]]
>
>   log.error("Day: ${day}")
>   log.error("Month: ${month}")
>   log.error("Year: ${year}")
>
>   log.error("Length of Day: ${day.length()}")
>   log.error("Length of Month: ${month.length()}")
>   log.error("Length of Year: ${year.length()}")
>
>}
>
> My log output tells me I make it into the conditional, but then I fail on
> the matcher:
>
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Candidate: 06-14-2023
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] BINGO!
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matcher:
> java.util.regex.Matcher[pattern=^(?:0?[1-9]|[12]\d|3[01])[\\\-\\/]+(\d{2}|\d{1})[\\\-\\/]+(\d{4}|\d{2})$
> region=0,10 lastmatch=]
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matched substring:
> *null*
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Could not parse:
> 06-14-2023
>
> Can anyone help me get this matcher to work?
>
> Thanks in advance for any help.
>
> Jim
>

matcher is failing

2023-07-09 Thread James McMahon

Hello. I have a conditional clause in my Groovy script that attempts to
parse a date pattern of this form: 06-14-2023. It fails - I believe in the
matcher.

I am running from a NiFi ExecuteScript processor. Here is my conditional:

  } else if ( candidate =~
/^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
) {

  log.error("BINGO!")

  matcher = candidate =~
/^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
  log.error("Matcher: ${matcher.toString()}")
  log.error("Matched substring: ${matchedSubstring}")
  day = matchedSubstring[matcher[0][1]]
  month = matchedSubstring[matcher[0][2]]
  year = matcherSubstriong[matcher[0][3]]

  log.error("Day: ${day}")
  log.error("Month: ${month}")
  log.error("Year: ${year}")

  log.error("Length of Day: ${day.length()}")
  log.error("Length of Month: ${month.length()}")
  log.error("Length of Year: ${year.length()}")

   }

My log output tells me I make it into the conditional, but then I fail on
the matcher:

2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Candidate: 06-14-2023
2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] BINGO!
2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matcher:
java.util.regex.Matcher[pattern=^(?:0?[1-9]|[12]\d|3[01])[\\\-\\/]+(\d{2}|\d{1})[\\\-\\/]+(\d{4}|\d{2})$
region=0,10 lastmatch=]
2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matched substring:
*null*
2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Could not parse:
06-14-2023

Can anyone help me get this matcher to work?

Thanks in advance for any help.

Jim
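
A minimal sketch of reading capture groups from a Groovy matcher, for reference. It assumes the intended field order is month-day-year (as in 06-14-2023) and uses a simplified pattern, not the one from the script above; the variable names are illustrative. In Groovy, matcher[0] is a list whose element 0 is the whole match and whose remaining elements are the captured group strings themselves, so they are used directly rather than as indexes into the matched substring. A (?:...) group is non-capturing and contributes no element.

```groovy
def candidate = '06-14-2023'

// Assumed month-day-year order; all three fields are capturing groups here.
def matcher = candidate =~ /^(0?[1-9]|1[0-2])[-\/](0?[1-9]|[12]\d|3[01])[-\/](\d{4}|\d{2})$/

def month, day, year
if (matcher.matches()) {
    // group(0) is the whole match; group(1..3) are the captured fields.
    month = matcher.group(1)
    day   = matcher.group(2)
    year  = matcher.group(3)
}

assert [month, day, year] == ['06', '14', '2023']
```

Calling matches() first makes it explicit that the whole candidate must match before any group is read.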

Re: Working with Calendar object in Groovy

2023-06-19 Thread James McMahon

If I want to set the month and day in the date|calendar object to 00 when I
only have a year, it seems that DateUtils forces those to be 01 even when
all it gets is a year such as 1999. I can't allow it to default to 19990101
in such cases - there will be legit occurrences of 19990101. I want to
force it to be 1999. The Calendar object comes close, because it sets
cal.MONTH to 0, but it does that both when the month is legitimately 01 and
when it is missing. That seems unfortunate.

This is my code so far:

// https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html
final String[] PATTERNS = new String[] {
" dd, ",
"",
"// //  dd, ",  // I am not sure this one is a legit pattern, and may drop this
"MM/dd/",
"-MM-dd", "-mm"
}

 // https://stackoverflow.com/a/54952272, reference credit: Bob Brown
datesToNormalize.split(/(?ms)::-::/, -1).collect { it.trim() }.each {
candidate ->
   try {
   parsed = DateUtils.parseDateStrictly(candidate, PATTERNS)
   def Calendar cal = Calendar.getInstance()
   cal.setTime(parsed)

   log.info("Given: ${candidate}; parsed: ${cal} month: ${cal.get(Calendar.MONTH)}")
   } catch (Exception e) {
   log.error("Could not parse: ${candidate}")
   }
}

This is what I find when I look closely at the details of the calendar
object for a year-only value such as 1991:

2023-06-19 17:21:26,235 INFO [Timer-Driven Process Thread-8]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Given: 1991; parsed:
java.util.GregorianCalendar[time=66268800,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="UTC",offset=0,dstSavings=0,useDaylight=false,transitions=0,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,
*YEAR=1991*,*MONTH=0*,WEEK_OF_YEAR=1,WEEK_OF_MONTH=1,*DAY_OF_MONTH=1*,DAY_OF_YEAR=1,DAY_OF_WEEK=3,DAY_OF_WEEK_IN_MONTH=1,AM_PM=0,HOUR=0,HOUR_OF_DAY=0,MINUTE=0,SECOND=0,MILLISECOND=0,ZONE_OFFSET=0,DST_OFFSET=0]
month: 0

It is also unfortunate that it sets DAY_OF_MONTH to 1 when we present only
a year.

I wish to get my final normalized output to be 1991 in such cases.
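
One possible way to keep "year only" distinct from "January 1st" is to parse into java.time types of different precision instead of a Calendar. The sketch below is an assumption-laden illustration, not the script from this thread: the helper name parseWithPrecision and the short pattern list are hypothetical, and it uses java.time rather than DateUtils.

```groovy
import java.time.LocalDate
import java.time.Year
import java.time.YearMonth
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Hypothetical helper: returns a LocalDate, YearMonth, or Year depending on
// how much of the date the candidate actually contains, or null if none fit.
def parseWithPrecision(String candidate) {
    def attempts = [
        { String s -> LocalDate.parse(s, DateTimeFormatter.ofPattern('uuuu-MM-dd')) },
        { String s -> LocalDate.parse(s, DateTimeFormatter.ofPattern('MM/dd/uuuu')) },
        { String s -> YearMonth.parse(s, DateTimeFormatter.ofPattern('uuuu-MM')) },
        { String s -> Year.parse(s, DateTimeFormatter.ofPattern('uuuu')) }
    ]
    for (attempt in attempts) {
        try {
            return attempt(candidate)
        } catch (DateTimeParseException ignored) {
            // This pattern did not fit; try the next one.
        }
    }
    return null
}

// A year-only input stays a Year and prints without month or day.
assert parseWithPrecision('1991').toString() == '1991'
// A genuine January 1st stays a full date.
assert parseWithPrecision('1991-01-01').toString() == '1991-01-01'
```

Because the result type records the precision, no sentinel value such as month 00 is needed.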

On Mon, Jun 19, 2023 at 8:03 PM Bob Brown  wrote:

> I guess the key thing to bear in mind is:
>
>
> https://docs.oracle.com/en/java/javase/20/docs/api/java.base/java/util/Calendar.html
> """
> The calendar field values can be set by calling the set methods. Any
> field values set in a Calendar will not be interpreted until it needs to
> calculate its time value (milliseconds from the Epoch) or values of the
> calendar fields. Calling the get, getTimeInMillis, getTime, add and roll 
> involves
> such calculation.
> """
>
> Calendar doesn't provide a way to differentiate, in other words.
>
> See default values for fields at
> https://docs.oracle.com/en/java/javase/20/docs/api/java.base/java/util/GregorianCalendar.html
>
> Calendar is often used as a 'bucket' for timey-wimey things: you stuff
> things into fields that contextually make sense and ignore those that
> don't. Not very nice OO or helpful as an API.
>
> It MAY be (speaking off the top of my head here) that you don't need/want
> Calendar...the newer java.time package has many "finer-grained" classes for
> things like Instant, Period, Duration, etc. that might be a better fit for
> a specific use-case: https://www.baeldung.com/java-8-date-time-intro
>
> BOB
> --
> *From:* James McMahon 
> *Sent:* Tuesday, 20 June 2023 8:53 AM
> *To:* users@groovy.apache.org 
> *Subject:* Working with Calendar object in Groovy
>
> If I have a Calendar object created for 1999-01-01, a get() of
> calendar.MONTH will return 0. From references I’ve found through Google, we
> have to add 1 to MONTH to get the 1 for January. The calendar object has 0
> for MONTH.
>
> Now let’s take the case where we set our calendar object from “1999”.
> When we get MONTH it is also 0 - not because our month was January, but
> because it was not present.
>
> How does the calendar instance differentiate between those two cases? Is
> there another calendar object element that tells me “hey, I could set no
> day or month from a date like 1999”?
>

Working with Calendar object in Groovy

2023-06-19 Thread James McMahon

If I have a Calendar object created for 1999-01-01, a get() of
calendar.MONTH will return 0. From references I’ve found through Google, we
have to add 1 to MONTH to get the 1 for January. The calendar object has 0
for MONTH.

Now let’s take the case where we set our calendar object from “1999”. When
we get MONTH it is also 0 - not because our month was January, but
because it was not present.

How does the calendar instance differentiate between those two cases? Is
there another calendar object element that tells me “hey, I could set no
day or month from a date like 1999”?

Re: Existing resources to seek date patterns from raw data and normalize them

2023-06-14 Thread James McMahon

BOB, Jochen, and Paul, you've given me a lot of strong suggestions to
consider. I wanted to respond and tell you I'm playing around with some
now. Paul, your thoughts have given me pause to think more about
performance. I'm going to try this first with a Groovy script in an
ExecuteScript processor, see how the performance looks, and keep your other
suggestions in mind if performance is not good. I suspect a buffered line
reader approach should suffice: my incoming files are typically on the
border of tens or hundreds of megabytes. An occasional monster of a few
gigabytes does appear, but nothing on the order of hundreds of gigabytes or
a terabyte.

I'm working now to build my regex. My first effort will be to try and
correctly parse from a test file a wide variety of date formats in string
representation. I will post that here as my starting point once I get
something working.

Jim

On Wed, Jun 14, 2023 at 5:18 AM Bob Brown  wrote:

> Just wondering if this will help you:
>
>
> https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/DateUtils.html#parseDate-java.lang.String-java.util.Locale-java.lang.String...-
>
> You'll still need to extract the candidate date strings but once you have
> them, this can parse them using various formats.
>
> Perhaps...since we are now in "The Age of AI" (:-)), you could use Apache
> OpenNLP, per this:
>
>
> https://stackoverflow.com/questions/27182040/how-to-detect-dates-with-opennlp
>
> I've used NLP in other situations...it's not popular but it does the job
> nicely.
>
> A bit more of a general discussion:
>
> https://www.baeldung.com/cs/finding-dates-addresses-in-emails
>
> Hope this helps.
>
> BOB
>
> --
> *From:* Jochen Theodorou 
> *Sent:* Wednesday, 14 June 2023 4:42 AM
> *To:* users@groovy.apache.org 
> *Subject:* Re: Existing resources to seek date patterns from raw data and
> normalize them
>
> On 13.06.23 16:52, James McMahon wrote:
> > Hello.  I have a task to parse dates out of incoming raw content. Of
> > course the date patterns can assume any number of forms -   -MM-DD,
> > /MM/DD, MMDD, MMDD, etc etc etc. I can build myself a robust
> > regex to match a broad set of such patterns in the raw data, but I
> > wonder if there is a project or library available for Groovy that
> > already offers this?
>
> I always wanted to try one time
> https://github.com/joestelmach/natty/tree/master or at least
> https://github.com/sisyphsu/dateparser... never came to it ;)
>
> > Assuming I get pattern matches parsed out of my raw data, I will have a
> > collection of strings representing year-month-days in a variety of
> > formats. I'd then like to normalize them to a standard form so that I
> > can sort and compare them. I intend to identify the range of dates in
> > the raw data as a sorted Groovy list.
>
> Once the library has identified the format, this is the easy step.
>
>   [...]
> > I intend to write a Groovy script that will run from an Apache NiFi
> > ExecuteScript processor. I'll read in my data flowfile content using a
> > buffered reader so I can handle flowfiles that may be large.
>
> what does large mean? 1TB? Then BufferedReader may not be the right
> choice ;)
>
> bye Jochen
>

Existing resources to seek date patterns from raw data and normalize them

2023-06-13 Thread James McMahon

Hello.  I have a task to parse dates out of incoming raw content. Of course
the date patterns can assume any number of forms -   -MM-DD,
/MM/DD, MMDD, MMDD, etc etc etc. I can build myself a robust
regex to match a broad set of such patterns in the raw data, but I wonder
if there is a project or library available for Groovy that already offers
this?

Assuming I get pattern matches parsed out of my raw data, I will have a
collection of strings representing year-month-days in a variety of formats.
I'd then like to normalize them to a standard form so that I can sort and
compare them. I intend to identify the range of dates in the raw data as a
sorted Groovy list.

I anticipate I will miss many pattern variations with my initial cut at
this. I do have one thing going for me: as I test through volumes of raw
data, I'll be able to improve the pattern net I cast to catch an
ever-improving percentage of year-month-day expressions.

I intend to write a Groovy script that will run from an Apache NiFi
ExecuteScript processor. I'll read in my data flowfile content using a
buffered reader so I can handle flowfiles that may be large.

Any recommendations or suggestions?
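
The plan above (read line by line, pull out candidate strings with a regex, normalize, sort) could be sketched roughly as follows. Everything here is an assumption for illustration: the candidate regex covers only two numeric layouts, the format list is a small sample, java.time is used for normalization, and a StringReader stands in for the flowfile stream.

```groovy
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Sample formats only; a real list would grow as new variants are found.
def formatters = ['uuuu-MM-dd', 'uuuu/MM/dd', 'MM/dd/uuuu', 'MM-dd-uuuu']
        .collect { DateTimeFormatter.ofPattern(it) }

// Loose candidate net: 4-2-2 or 2-2-4 digits separated by '-' or '/'.
def candidatePattern = /\b(?:\d{4}[-\/]\d{2}[-\/]\d{2}|\d{2}[-\/]\d{2}[-\/]\d{4})\b/

// Returns the dates found on one line, normalized to LocalDate.
def extractDates = { String line ->
    line.findAll(candidatePattern).collect { String text ->
        formatters.findResult { fmt ->
            try {
                return LocalDate.parse(text, fmt)
            } catch (DateTimeParseException ignored) {
                return null   // this format did not fit; try the next
            }
        }
    }.findAll { it != null }
}

def raw = '''shipped 2023-06-14, invoiced 06/20/2023
no dates on this line
archived 2021/12/01'''

// Process one line at a time so large inputs are never held in memory whole.
def dates = []
new BufferedReader(new StringReader(raw)).eachLine { line ->
    dates.addAll(extractDates(line))
}
dates.sort()

assert dates*.toString() == ['2021-12-01', '2023-06-14', '2023-06-20']
```

Candidates that match the regex but fit none of the formats are dropped silently here; logging them instead would show which pattern variations are still being missed.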

Re: Unable to properly tally all keys in nested json

2023-05-13 Thread James McMahon

I tried parsing the stream without forcing it to text, as you had done,
Paul. I thought maybe that was the source of my problem, so I changed that
to this:

def root = new JsonSlurper().setType(JsonParserType.LAX).parse(inputStream)

I continue to get this error:

ExecuteScript[id=028b1d40-33a5-1766-b659-31b3faaf13f5] Error
processing json fields: groovy.lang.MissingMethodException: No
signature of method: Script87$_run_closure3.call() is applicable for
argument types: (Node, String) values: [te=0.63847, ]
Possible solutions: any(), any(), doCall(java.util.Map,
java.lang.String), collect(), find(), dump()


I notice the content of the error statement shows this (Node, String)
values: [te=0.63847, ], which is not what I see in my input, which is
this: "te": "0.9494",


Is this perhaps the source of my problem - the fact that the key and
value are not being read in as strings?
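
One reading of the error, offered as a hedged sketch rather than a confirmed diagnosis: when the parsed root is a single JSON object, calling each on it with a one-parameter closure hands over map entries one at a time, so tally receives an entry instead of a Map. The sketch below wraps a lone object in a list before iterating. The sample json and the records variable are illustrative, and the tally closure is a simplified variant of the one in this thread.

```groovy
import groovy.json.JsonSlurper

def tallyMap = [:].withDefault { 0 }
def tally
tally = { Map json, String prefix ->
    json.each { k, v ->
        String key = prefix + k
        if (v instanceof List) {
            tallyMap[key] += 1
            // Only nested objects can contribute further keys.
            v.findAll { it instanceof Map }.each { tally(it, key + '.') }
        } else if (v instanceof Map) {
            tallyMap[key] += 1
            tally(v, key + '.')
        } else if (v?.toString()?.trim()) {
            tallyMap[key] += 1
        }
    }
}

def text = '{"te": "0.9494", "memory": 77, "tags": [{"name": "a"}, {"name": "b"}]}'
def root = new JsonSlurper().parseText(text)

// A top-level object is treated as a one-element list of records, so the
// closure always receives a whole Map and never a single map entry.
def records = (root instanceof List) ? root : [root]
records.each { tally(it as Map, '') }

assert tallyMap['te'] == 1
assert tallyMap['tags.name'] == 2
```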


On Sat, May 13, 2023 at 12:01 PM James McMahon  wrote:

> Thank you Paul. I have integrated this approach into the framework of my
> NiFi ExecuteScript code, which reads the flowfile content from the stream.
> I am seeing an undefined method error, and it seems to indicate it cannot
> digest the json. Does anything jump out at you as an obvious error in my
> approach?
>
> Here is the simple sample json in the flowfile:
>
> {"id": "20230508215236_4447cd0a-9dca-47cb-90b1-6562cf34155a_Timer-Driven
> Process Thread-9",
>
> "te": "0.9494",
>
> "diskusage": "0.2776125422110003.3 MB",
>
> "memory": 77,
>
> "cpu": 0.58,
>
> "host": "172.31.73.197/ip-172-31-73-197.ec2.internal",
>
> "temperature": "97",
>
> "macaddress": "f417ead3-4fa9-4cee-a14b-7172e9ecd3ea",
>
> "end": "61448816405795",
> "systemtime": "05/08/2023 16:52:36"}
>
>
> Here is the output error:
>
> ExecuteScript[id=028b1d40-33a5-1766-b659-31b3faaf13f5] Error processing json 
> fields: groovy.lang.MissingMethodException: No signature of method: 
> Script83$_run_closure3.call() is applicable for argument types: (Node, 
> String) values: [te=0.9494, ]
>
> Possible solutions: any(), any(), doCall(java.util.Map, java.lang.String), 
> collect(), find(), dump()
>
>
> Here is the current implementation of my code:
>
> import groovy.json.JsonSlurper
> import groovy.json.JsonParserType
> import org.apache.commons.io.IOUtils
> import java.nio.charset.StandardCharsets
>
> def keys = []
> def topValuesMap = [:].withDefault{ [:].withDefault{ 0 } }
> def tallyMap = [:].withDefault{ 0 }
> def tally
> tally = { Map json, String prefix ->
> json.each { k, v ->
> String key = prefix + k
> if (v instanceof List) {
>   tallyMap[key] += 1
>   v.each{ tally(it, key + '.') }
> } else {
> def val = v?.toString().trim()
> if (v) {
> tallyMap[key] += 1
> topValuesMap[key][v] += 1
> if (v instanceof Map) tally(v, key + '.')
> }
> }
> }
> }
>
>
> def ff = session.get()
> if (!ff) return
>
> try {
> session.read(ff, { inputStream ->
>
> def root = new
> JsonSlurper().setType(JsonParserType.LAX).parseText(IOUtils.toString(inputStream,
> StandardCharsets.UTF_8))
>
> root.each {
>  tally(it, '')
> }
> } as InputStreamCallback)
>
> keys = tallyMap.keySet().toList()
> def tallyMapString = tallyMap.collectEntries { k, v -> [(k): v]
> }.toString()
> def topValuesMapString = topValuesMap.collectEntries { k, v -> [(k):
> v.sort{ -it.value }.take(10)] }.toString()
>
> ff = session.putAttribute(ff, 'triage.json.fields', keys.join(","))
> ff = session.putAttribute(ff, 'triage.json.tallyMap', tallyMapString)
> ff = session.putAttribute(ff, 'triage.json.topValuesMap',
> topValuesMapString)
>
> session.transfer(ff, REL_SUCCESS)
> } catch (Exception e) {
> log.error('Error processing json fields', e)
> session.transfer(ff, REL_FAILURE)
> }
>
> On Fri, May 12, 2023 at 8:54 AM Paul King  wrote:
>
>> Something like this worked for me:
>>
>> def topValuesMap = [:].withDefault{ [:].withDefault{ 0 } }
>> def tallyMap = [:].withDefault{ 0 }
>> def tally
>> tally = { Map json, String prefix ->
>> json.each { k, v ->
>> String key = prefix + k
>> if (v instanceof List) {
>>   tallyMap[key] += 1
>>   v.eac

Re: Unable to properly tally all keys in nested json

2023-05-13 Thread James McMahon

Thank you Paul. I have integrated this approach into the framework of my
NiFi ExecuteScript code, which reads the flowfile content from the stream.
I am seeing an undefined method error, and it seems to indicate it cannot
digest the json. Does anything jump out at you as an obvious error in my
approach?

Here is the simple sample json in the flowfile:

{"id": "20230508215236_4447cd0a-9dca-47cb-90b1-6562cf34155a_Timer-Driven
Process Thread-9",

"te": "0.9494",

"diskusage": "0.2776125422110003.3 MB",

"memory": 77,

"cpu": 0.58,

"host": "172.31.73.197/ip-172-31-73-197.ec2.internal",

"temperature": "97",

"macaddress": "f417ead3-4fa9-4cee-a14b-7172e9ecd3ea",

"end": "61448816405795",
"systemtime": "05/08/2023 16:52:36"}


Here is the output error:

ExecuteScript[id=028b1d40-33a5-1766-b659-31b3faaf13f5] Error
processing json fields: groovy.lang.MissingMethodException: No
signature of method: Script83$_run_closure3.call() is applicable for
argument types: (Node, String) values: [te=0.9494, ]

Possible solutions: any(), any(), doCall(java.util.Map,
java.lang.String), collect(), find(), dump()


Here is the current implementation of my code:

import groovy.json.JsonSlurper
import groovy.json.JsonParserType
import org.apache.commons.io.IOUtils
import java.nio.charset.StandardCharsets

def keys = []
def topValuesMap = [:].withDefault{ [:].withDefault{ 0 } }
def tallyMap = [:].withDefault{ 0 }
def tally
tally = { Map json, String prefix ->
json.each { k, v ->
String key = prefix + k
if (v instanceof List) {
  tallyMap[key] += 1
  v.each{ tally(it, key + '.') }
} else {
def val = v?.toString().trim()
if (v) {
tallyMap[key] += 1
topValuesMap[key][v] += 1
if (v instanceof Map) tally(v, key + '.')
}
}
}
}


def ff = session.get()
if (!ff) return

try {
session.read(ff, { inputStream ->

def root = new JsonSlurper().setType(JsonParserType.LAX).parseText(IOUtils.toString(inputStream, StandardCharsets.UTF_8))

root.each {
 tally(it, '')
}
} as InputStreamCallback)

keys = tallyMap.keySet().toList()
def tallyMapString = tallyMap.collectEntries { k, v -> [(k): v]
}.toString()
def topValuesMapString = topValuesMap.collectEntries { k, v -> [(k):
v.sort{ -it.value }.take(10)] }.toString()

ff = session.putAttribute(ff, 'triage.json.fields', keys.join(","))
ff = session.putAttribute(ff, 'triage.json.tallyMap', tallyMapString)
ff = session.putAttribute(ff, 'triage.json.topValuesMap',
topValuesMapString)

session.transfer(ff, REL_SUCCESS)
} catch (Exception e) {
log.error('Error processing json fields', e)
session.transfer(ff, REL_FAILURE)
}

On Fri, May 12, 2023 at 8:54 AM Paul King  wrote:

> Something like this worked for me:
>
> def topValuesMap = [:].withDefault{ [:].withDefault{ 0 } }
> def tallyMap = [:].withDefault{ 0 }
> def tally
> tally = { Map json, String prefix ->
> json.each { k, v ->
> String key = prefix + k
> if (v instanceof List) {
>   tallyMap[key] += 1
>   v.each{ tally(it, key + '.') }
> } else {
> def val = v?.toString().trim()
> if (v) {
> tallyMap[key] += 1
> topValuesMap[key][v] += 1
> if (v instanceof Map) tally(v, key + '.')
> }
> }
> }
> }
>
> def root = new JsonSlurper().parse(inputStream)
> root.each { // each allows json to be a list, not needed if always a map
> tally(it, '')
> }
> println tallyMap
> println topValuesMap.collectEntries{ k, m -> [(k), m.sort{ _, v ->
> -v.value }] }.take(10)
>
>
>
>
> On Fri, May 12, 2023 at 8:27 PM James McMahon 
> wrote:
>
>> Thank you for the response, Paul. I will integrate and try these
>> suggestions within my Groovy code that runs in a nifi ExecuteScript
>> processor. I'll be working on this once again tonight.
>>
>> The map topValuesMap is intended to capture this: for each key identified
>> in the json, cross-tabulate for each value associated with that key how
>> many t

Re: Unable to properly tally all keys in nested json

2023-05-12 Thread James McMahon

Thank you for the response, Paul. I will integrate and try these
suggestions within my Groovy code that runs in a nifi ExecuteScript
processor. I'll be working on this once again tonight.

The map topValuesMap is intended to capture this: for each key identified
in the json, cross-tabulate how many times each value associated with that
key occurs. After the json is fully processed for a key, sort the
resulting map and retain only the top ten values found in the json. If a
set has a lastName key, the topValuesMap that results might look something
like this after all keys have been cross-tabulated:

["lastName": ["Smith" : 1023, "Jones" : 976, "Chang": 899, "Doe": 511, ...],
 "address.street": [.],
.
.
.
"a final key": [.]
]

Each key would have ten values in its value map, unless it cross-tabulates
to fewer than ten in total, in which case the values will be sorted by
count and all of them kept.
Again, many thanks.
Jim
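
The cross-tabulation described above can be sketched in a few lines. This is only an illustration under assumptions: the records list is made-up sample data, and limit is set to 2 instead of ten so the effect is visible on a small input.

```groovy
// Outer map: key name -> inner map of value -> occurrence count.
def topValuesMap = [:].withDefault { [:].withDefault { 0 } }

// Made-up sample records standing in for parsed json objects.
def records = [
    [lastName: 'Smith'], [lastName: 'Jones'], [lastName: 'Smith'],
    [lastName: 'Doe'],   [lastName: 'Smith'], [lastName: 'Jones']
]
records.each { rec ->
    rec.each { k, v -> topValuesMap[k][v] += 1 }
}

// Sort each inner map by descending count and keep the first `limit` entries.
def limit = 2
def top = topValuesMap.collectEntries { k, counts ->
    [(k): counts.sort { a, b -> b.value <=> a.value }.take(limit)]
}

assert top == [lastName: [Smith: 3, Jones: 2]]
```

A key with fewer than `limit` distinct values simply keeps all of them, since take never returns more entries than exist.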

On Fri, May 12, 2023 at 2:19 AM Paul King  wrote:

> I am not 100% sure what you are trying to capture in topValuesMap but for
> tallyMap you probably want something like:
>
> def tallyMap = [:].withDefault{ 0 }
> def tally
> tally = { Map json, String prefix ->
> json.each { k, v ->
> if (v instanceof List) {
>   tallyMap[prefix + k] += 1
>   v.each{ tally(it, "$prefix${k}.") }
> } else if (v?.toString().trim()) {
> tallyMap[prefix + k] += 1
> if (v instanceof Map) tally(v, "$prefix${k}.")
> }
> }
> }
>
> def root = new JsonSlurper().parse(inputStream)
> def initialPrefix = ''
> tally(root, initialPrefix)
> println tallyMap
>
> Output:
> [name:1, age:1, address:1, address.street:1, address.city:1,
> address.state:1, address.zip:1, phoneNumbers:1, phoneNumbers.type:2,
> phoneNumbers.number:2]
>
>
>
>
> On Fri, May 12, 2023 at 10:38 AM James McMahon 
> wrote:
>
>> I have this incoming json: { "name": "John Doe", "age": 42, "address": {
>> "street": "123 Main St", "city": "Anytown", "state": "CA", "zip": "12345"
>> }, "phoneNumbers": [ { "type": "home", "number": "555-1234" }, { "type":
>> "work", "number": "555-5678" } ] } I wish to tally all the keys in this
>> json in a map that gives me the key name as its key, and a count of the
>> number of times the key occurs in the json as its value. For this example,
>> the keys I expect in my output should include name, age, address,
>> address.street, address.city, address.state, address.zip, phoneNumbers,
>> phoneNumbers.type, and phoneNumbers.number. But I do not get that. Instead,
>> I get this for the list of fields: triage.json.fields
>> name,age,address,phoneNumbers And I get this for my tally count by key:
>> triage.json.tallyMap [name:1, age:1, address:1, phoneNumbers:1]
>>
>> I am close, but not quite there. I don't capture all the keys. Here is my
>> code. How must I modify this to get the result I require? import
>> groovy.json.JsonSlurper import org.apache.commons.io.IOUtils import
>> java.nio.charset.StandardCharsets def keys = [] def tallyMap = [:] def
>> topValuesMap = [:] def ff = session.get() if (!ff) return try {
>> session.read(ff, { inputStream -> def json = new
>> JsonSlurper().parseText(IOUtils.toString(inputStream,
>> StandardCharsets.UTF_8)) json.each { k, v -> if (v != null &&
>> !v.toString().trim().isEmpty()) { tallyMap[k] = tallyMap.containsKey(k) ?
>> tallyMap[k] + 1 : 1 if (topValuesMap.containsKey(k)) { def valuesMap =
>> topValuesMap[k] valuesMap[v] = valuesMap.containsKey(v) ? valuesMap[v] + 1
>> : 1 topValuesMap[k] = valuesMap } else { topValuesMap[k] = [:].withDefault{
>> 0 }.plus([v: 1]) } } } } as InputStreamCallback) keys =
>> tallyMap.keySet().toList() def tallyMapString = tallyMap.collectEntries {
>> k, v -> [(k): v] }.toString() def topValuesMapString =
>> topValuesMap.collectEntries { k, v -> [(k): v.sort{ -it.value }.take(10)]
>> }.toString() ff = session.putAttribute(ff, 'triage.json.fields',
>> keys.join(",")) ff = session.putAttribute(ff, 'triage.json.tallyMap',
>> tallyMapString) ff = session.putAttribute(ff, 'triage.json.topValuesMap',
>> topValuesMapString) session.transfer(ff, REL_SUCCESS) } catch (Exception e)
>> { log.error('Error processing json fields', e) session.transfer(ff,
>> REL_FAILURE) }
>>
>

Unable to properly tally all keys in nested json

2023-05-11 Thread James McMahon

I have this incoming json:

{
  "name": "John Doe",
  "age": 42,
  "address": {
    "street": "123 Main St",
    "city": "Anytown",
    "state": "CA",
    "zip": "12345"
  },
  "phoneNumbers": [
    { "type": "home", "number": "555-1234" },
    { "type": "work", "number": "555-5678" }
  ]
}

I wish to tally all the keys in this json in a map that gives me the key
name as its key, and a count of the number of times the key occurs in the
json as its value. For this example, the keys I expect in my output should
include name, age, address, address.street, address.city, address.state,
address.zip, phoneNumbers, phoneNumbers.type, and phoneNumbers.number. But
I do not get that. Instead, I get this for the list of fields:

triage.json.fields
name,age,address,phoneNumbers

And I get this for my tally count by key:

triage.json.tallyMap
[name:1, age:1, address:1, phoneNumbers:1]

I am close, but not quite there. I don't capture all the keys. Here is my
code. How must I modify this to get the result I require?

import groovy.json.JsonSlurper
import org.apache.commons.io.IOUtils
import java.nio.charset.StandardCharsets

def keys = []
def tallyMap = [:]
def topValuesMap = [:]

def ff = session.get()
if (!ff) return

try {
    session.read(ff, { inputStream ->
        def json = new JsonSlurper().parseText(IOUtils.toString(inputStream, StandardCharsets.UTF_8))
        json.each { k, v ->
            if (v != null && !v.toString().trim().isEmpty()) {
                tallyMap[k] = tallyMap.containsKey(k) ? tallyMap[k] + 1 : 1
                if (topValuesMap.containsKey(k)) {
                    def valuesMap = topValuesMap[k]
                    valuesMap[v] = valuesMap.containsKey(v) ? valuesMap[v] + 1 : 1
                    topValuesMap[k] = valuesMap
                } else {
                    topValuesMap[k] = [:].withDefault{ 0 }.plus([v: 1])
                }
            }
        }
    } as InputStreamCallback)

    keys = tallyMap.keySet().toList()
    def tallyMapString = tallyMap.collectEntries { k, v -> [(k): v] }.toString()
    def topValuesMapString = topValuesMap.collectEntries { k, v -> [(k): v.sort{ -it.value }.take(10)] }.toString()

    ff = session.putAttribute(ff, 'triage.json.fields', keys.join(","))
    ff = session.putAttribute(ff, 'triage.json.tallyMap', tallyMapString)
    ff = session.putAttribute(ff, 'triage.json.topValuesMap', topValuesMapString)
    session.transfer(ff, REL_SUCCESS)
} catch (Exception e) {
    log.error('Error processing json fields', e)
    session.transfer(ff, REL_FAILURE)
}

Re: Groovy script error, JsonSlurper

2023-05-09 Thread James McMahon

I will try this first thing when I get back on this system this evening.
Thank you for your reply, Matt.
One followup: are there any imports I am overlooking to use json.each? I
don't think so, but wanted to check now. Currently I import these json
libraries, which I believe are baked into nifi groovy in my version, 1.16.3:

import groovy.json.JsonSlurper
import groovy.json.JsonOutput
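
On the import question: each is one of the methods Groovy adds to every Map, so it needs no import of its own. As a small illustration (the sample json here is made up), a closure with one parameter receives a Map.Entry, while a closure with two parameters receives the key and the value separately:

```groovy
import groovy.json.JsonSlurper

def json = new JsonSlurper().parseText('{"te": "0.9494", "memory": 77}')

// One parameter: each call receives a Map.Entry.
def viaEntry = []
json.each { entry -> viaEntry << "${entry.key}=${entry.value}".toString() }

// Two parameters: each call receives the key and the value.
def viaPair = []
json.each { k, v -> viaPair << "${k}=${v}".toString() }

assert viaEntry == viaPair
```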


On Mon, May 8, 2023 at 10:38 PM Matt Burgess  wrote:

> I believe it's your "json.each { jsonObj ->" line, with a JSON object
> it's going to return a key/value pair so try "json.each { k,v ->"
> instead and use the key k and value v in your script.
>

Groovy script error, JsonSlurper

2023-05-08 Thread James McMahon

Hello. I have incoming data that is json. An example of one case looks like
this:

{"id": "20230508215236_4447cd0a-9dca-47cb-90b1-6562cf34155a_Timer-Driven
Process Thread-9",
"te": "0.9494",
"diskusage": "0.2776125422110003.3 MB",
"memory": 77,
"cpu": 0.58,
"host": "172.31.73.197/ip-172-31-73-197.ec2.internal",
"temperature": "97",
"macaddress": "f417ead3-4fa9-4cee-a14b-7172e9ecd3ea",
"end": "61448816405795",
"systemtime": "05/08/2023 16:52:36"}


I wrote a Groovy script running in a nifi ExecuteScript processor to tally
the keys from the json, reporting the results as flowfile attributes. I am
getting this error:

ExecuteScript[id=028b1d40-33a5-1766-b659-31b3faaf13f5] Error
processing json fields: groovy.lang.MissingMethodException: No
signature of method:
Script9$_run_closure1$_closure2$_closure4.doCall() is applicable for
argument types: (Entry) values:
[id=20230508215236_4447cd0a-9dca-47cb-90b1-6562cf34155a_Timer-Driven
Process Thread-9]
Possible solutions: doCall(java.lang.Object, java.lang.Object),
findAll(), findAll(), isCase(java.lang.Object),
isCase(java.lang.Object)



What is causing this error? Am I reading in the data incorrectly using
the JsonSlurper?



Here is my code:

import groovy.json.JsonSlurper
import groovy.json.JsonOutput
import org.apache.commons.io.IOUtils
import java.nio.charset.StandardCharsets

def originalFile = ''
def keys = []
def tallyMap = [:]
def topValuesMap = [:]
def lineCount = 0

def ff = session.get()
if (!ff) return
try {
    session.read(ff, { inputStream ->
        def json = new JsonSlurper().parseText(IOUtils.toString(inputStream, StandardCharsets.UTF_8))

        def keySet = new HashSet()

        if (json instanceof List) {
            keySet.addAll(((Map)json[0]).keySet())
        } else {
            keySet.addAll(json.keySet())
        }

        keys = keySet.toList()
        originalFile = ff.getAttribute('filename')

        json.each { jsonObj ->
            jsonObj.each { key, value ->
                if (value != null && !value.toString().trim().isEmpty()) {
                    tallyMap[key] = tallyMap.containsKey(key) ? tallyMap[key] + 1 : 1
                    if (topValuesMap.containsKey(key)) {
                        def valuesMap = topValuesMap[key]
                        valuesMap[value] = valuesMap.containsKey(value) ? valuesMap[value] + 1 : 1
                        topValuesMap[key] = valuesMap
                    } else {
                        topValuesMap[key] = [:].withDefault{ 0 }.plus([value: 1])
                    }
                    // Remove the "value: 1" entry from the topValuesMap
                    topValuesMap[key].remove("value")
                }
            }
        }

        // Sort the topValuesMap for each key based on the frequency of values
        topValuesMap.each { key, valuesMap ->
            topValuesMap[key] = valuesMap.sort{ -it.value }.take(10)
        }

        // Count the number of JSON records
        lineCount += json.size()
    } as InputStreamCallback)

    def tallyMapString = JsonOutput.toJson(tallyMap)
    def topValuesMapString = JsonOutput.toJson(topValuesMap)

    ff = session.putAttribute(ff, 'triage.json.file', originalFile)
    ff = session.putAttribute(ff, 'triage.json.fields', keys.join(","))
    ff = session.putAttribute(ff, 'triage.json.tallyMap', tallyMapString)
    ff = session.putAttribute(ff, 'triage.json.topValuesMap', topValuesMapString)
    ff = session.putAttribute(ff, 'triage.json.lineCount', lineCount.toString())
    session.transfer(ff, REL_SUCCESS)

} catch (Exception e) {
    log.error('Error processing json fields', e)
    session.transfer(ff, REL_FAILURE)
}
}
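
For reference, a minimal sketch of one way to avoid the MissingMethodException
reported above. When the incoming JSON is a single object, JsonSlurper returns
a Map, and each on a Map with a one-parameter closure passes a Map.Entry; the
inner each then runs on that single Entry, and its two-parameter closure is
called with one argument. The sketch wraps a single object in a list so that
objects and arrays of objects take the same path. The tally closure and the
sample strings are illustrative assumptions, not part of the original script,
and the NiFi session handling is left out. each comes from the Groovy GDK, so
it needs no extra import.

```groovy
import groovy.json.JsonSlurper

// A one-parameter closure on a Map receives Map.Entry objects
[cpu: 0.58].each { entry -> assert entry instanceof Map.Entry }

// Hypothetical helper: tally non-empty values per key.
// Accepts either one JSON object or an array of JSON objects.
def tally = { String text ->
    def json = new JsonSlurper().parseText(text)
    def records = (json instanceof List) ? json : [json]

    def tallyMap = [:].withDefault { 0 }
    def topValuesMap = [:].withDefault { [:].withDefault { 0 } }

    records.each { record ->
        // Two-parameter closure: Groovy splits each entry into key and value
        record.each { key, value ->
            if (value != null && !value.toString().trim().isEmpty()) {
                tallyMap[key] += 1
                topValuesMap[key][value] += 1
            }
        }
    }
    [tally: tallyMap, top: topValuesMap, count: records.size()]
}

def one = tally('{"memory": 77, "temperature": "97", "te": ""}')
def many = tally('[{"host": "a"}, {"host": "a"}, {"host": "b"}]')
```

Two details in the posted script are worth checking against this sketch. In a
Groovy map literal, [value: 1] creates the literal key "value", whereas
[(value): 1] uses the content of the variable. And json.size() on a single
object counts its keys, not records, which is why the sketch counts records
after the wrapping step.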

Improving what I extract for my xml tag

2023-04-02 Thread James McMahon

Hello. I am developing a Groovy script to work through all the tags in an
incoming text representation of an xml file. I've read my xml from the
content of a NiFi flowfile, like so:

session.read(ff, {inputStream ->
  text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
  def xml = new XmlParser().parseText(text)

Later in my code I iterate through the xml to find all the values
associated with that tag. Ultimately I will build a unique sorted list,
tagValues.

This is the code section that I have working so far:

incomingTagsMap.each {
    tagValues.clear()
    xml.'**'.each { itt ->
        log.warn('Found this tag: {}, with this value: {}', itt.name(), itt.text())
        // If itt.text() is not on tagValues list, add it...
        if ( !tagValues.contains(itt.text()) ) { tagValues.add(itt.text()) }
        log.warn('Length of tagValues list : {}', tagValues.size())
    }
    log.warn('Tag {} has values: {}', ["$it.key", tagValues.toListString()] as Object[])
}

$it.key comes from an outer iterative loop.

Where I am failing:
I was hoping someone could help me tune this, because in cases like this
 file-user-group-provider
...I only get "property" as the tag. What I really want to identify as my
tag is "User Group Provider".

I've been unable to find any examples. Can anyone show me how to accomplish
this?

Thanks very much in advance.
Cheers,
Jim
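
A possible approach, sketched under an assumption: the markup in the example
above appears to have been stripped by the list archive, and the sketch
assumes the element looked like
<property name="User Group Provider">file-user-group-provider</property>.
If so, the label is held in the name attribute rather than in the tag name,
and Node.attribute('name') can be preferred over name() when it is present.
The sample document and the tagValues layout are hypothetical.

```groovy
import groovy.xml.XmlParser  // Groovy 3+; on 2.x XmlParser is in groovy.util and needs no import

// Hypothetical sample; the real input comes from the flowfile content
def text = '''<authorizers>
  <authorizer>
    <identifier>managed-authorizer</identifier>
    <property name="User Group Provider">file-user-group-provider</property>
    <property name="Access Policy Provider">file-access-policy-provider</property>
  </authorizer>
</authorizers>'''

def xml = new XmlParser().parseText(text)

// label -> sorted set of unique values
def tagValues = [:].withDefault { new TreeSet<String>() }

xml.depthFirst().each { node ->
    // Treat an element as a leaf when it has only text children
    boolean leaf = node.children().every { it instanceof String }
    if (leaf && node.text()) {
        // Prefer the name attribute when present, else fall back to the tag name
        def label = node.attribute('name') ?: node.name().toString()
        tagValues[label] << node.text()
    }
}
```

A TreeSet keeps each list unique and sorted without the explicit contains()
check.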

Re: Dynamic assignment of list name in iterator statement?

2023-03-05 Thread James McMahon

Yes indeed. A careless error one makes in the early hours of the morning.
Thank you very much, Søren.
I am also going to look into Locale as Rachel G. encouraged 😉. And Bob B.
has replied with a sample I want to explore too. Thank you Bob and thank
you Rachel.
Jim

On Sun, Mar 5, 2023 at 6:32 AM Søren Berg Glasius  wrote:

> Hi Jim,
>
> If your switch hits "English" it will also set the rest of the cases. You
> need a "break" after "containsEnglish = true" - just like in Java
>
>
> Best regards,
> Søren Berg Glasius
>
> Hedevej 1, Gl. Rye, 8680 Ry
> Mobile: +45 40 44 91 88
> --- Press ESC once to quit - twice to save the changes.
>
>

Re: Dynamic assignment of list name in iterator statement?

2023-03-05 Thread James McMahon

I was trying to come up with a Groovy way to collapse a lengthy switch
statement by dynamically building the variable name. I've failed at that.
Instead, I've fallen back on this option:

switch("$k") {
  case "English":
    containsEnglish = true
  case "Spanish":
    containsSpanish = true
  case "French":
    containsFrench = true
  case "Japanese":
    containsJapanese = true
  case "German":
    containsGerman = true
  .
  .
  .
  default:
    break
}

I initialize each of my "containsXYZ" variables to false at the beginning
of my Groovy script. It works well, though it seems to lack elegance and
brevity to me.

Thanks again.
Jim
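
A note on the switch above: as in Java, a Groovy case without a break falls
through into the cases that follow it, so a match on "English" would also set
every later flag. The sketch below illustrates the difference; the closure
names and the three-language list are assumptions chosen to keep it short.

```groovy
// Without break: execution continues into every following case
def fallThrough = { String k ->
    def hit = []
    switch (k) {
        case 'English': hit << 'English'
        case 'Spanish': hit << 'Spanish'
        case 'French':  hit << 'French'
    }
    hit
}

// With break: only the matching case runs
def withBreak = { String k ->
    def hit = []
    switch (k) {
        case 'English':
            hit << 'English'
            break
        case 'Spanish':
            hit << 'Spanish'
            break
        case 'French':
            hit << 'French'
            break
        default:
            break
    }
    hit
}

assert fallThrough('English') == ['English', 'Spanish', 'French']
assert withBreak('English') == ['English']
```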


Re: Dynamic assignment of list name in iterator statement?

2023-03-04 Thread James McMahon

Søren,
May I ask you a follow up? I am trying what I thought I read in your reply
(thank you for that, by the way). But I continue to get this error:
"The LHS of an assignment should be a variable or a field accessing
expression @ "

This is what I currently have, attempting to set my variable name to
include the key drawn from my Groovy map. How must I change this to get it
to work?

 mapLanguages.each { k, x ->
   log.warn('mapLanguages entry is this: {} {}', ["$k", "$x"] as Object[])
   x.each { languageChar ->
     log.warn('language char in {} is this: {}', ["$k", "$languageChar"] as Object[])
   }
   "contains${k}" = true
 }

Many thanks again,
Jim
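
On the error above: a bare string such as "contains${k}" is a value, not a
variable, so it cannot appear on the left of an assignment. Two possible
workarounds are sketched below, with hypothetical sample data. The first
keeps one map of flags keyed by language, which avoids computed variable
names entirely. The second sets script-binding variables by name; it assumes
the code runs as a Groovy script (where binding is available) and that the
containsXYZ flags are binding variables rather than locals declared with def.

```groovy
// Hypothetical input: language name -> characters found for that language
def mapLanguages = [English: ['a', 'b'], French: ['é'], German: []]

// Option 1: a single map of flags instead of many containsXYZ variables
def contains = [:].withDefault { false }
mapLanguages.each { k, chars ->
    if (chars) contains[k] = true   // an empty list is falsy in Groovy
}

// Option 2: binding variables whose names are computed at run time
mapLanguages.each { k, chars ->
    binding.setVariable("contains${k}".toString(), !chars.isEmpty())
}

assert contains['English']
assert !contains['German']
assert binding.getVariable('containsFrench') == true
```

The map version is usually easier to pass around and to serialize, since all
flags live in one object.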


Re: Dynamic assignment of list name in iterator statement?

2023-02-23 Thread James McMahon

I see how this differs from my two initial attempts. Thank you very much
Søren. This will work well.
I'll visit this link and read more this morning.
Jim

On Thu, Feb 23, 2023 at 3:01 AM Søren Berg Glasius 
wrote:

> Hi Jim,
>
> It is possible:
>
> languages = ['english', 'french', 'spanish']
> englishCharsList = ['a','b']
> frenchCharsList = ['c','d']
> spanishCharsList = ['e','f']
>
> languages.each { lang ->
>     this."${lang}CharsList".each { ch ->
>         println "$lang -> $ch"
>     }
> }
>
> Check it out here:
> https://gwc-experiment.appspot.com/?g=groovy_3_0&codez=eJxVjkEKwyAQRfeeYhDBTZobtJtue4PShbVGBRmCY1fBu2e0ppBZDMN__38mGfRf4x3BFZ7aoU-Rgp5AL9mh7RetBpv4EgPfg8n0iFR6xuhJvxn-AmdmmX2YjYozdAwXhiIdP8zO2AAbNAEuNwE8JUSapdqaVv8F8rDyGsY2a45YEoJUowKUDbLjKqrYAZXRSNo
>
>
> Best regards,
> Søren Berg Glasius
>
> Hedevej 1, Gl. Rye, 8680 Ry
> Mobile: +45 40 44 91 88
> --- Press ESC once to quit - twice to save the changes.
>
>

Dynamic assignment of list name in iterator statement?

2023-02-22 Thread James McMahon

Good evening. I have a list named languageCharactersList. I begin my
iteration through elements in that list with this:

languageCharactersList.eachWithIndex{ it, i ->

I hope to make this more generic, so that I can build a variable name that
points to the appropriate list, which then allows me to keep my iteration
loop generic.

I'd like to do this:
def languages = ['english', 'french', 'spanish']
def englishCharsList = []
def frenchCharsList = [.]
def spanishCharsList = []

I'll set up an iterator to grab each of the languages. Within that
iterative loop I will set a general variable like so:
def CharsList = "english"+"CharsList" (then "french", then "spanish",.)

I was hoping I could then set up the generic iterator like so:
"$CharsList".eachWithIndex{ it, i ->
or like so
$CharsList.eachWithIndex{ it, i ->

But Groovy doesn't allow this approach, and throws a stack trace.

How can we employ a variable assignment in that list iterator statement
so it can be generalized?

Thanks in advance.
Jim
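
One way to keep the loop generic without building variable names, sketched
with hypothetical sample characters: hold the per-language lists in a single
map keyed by language, and look the list up by key. In a Groovy script,
variables declared with def are locals, so they cannot be reached through a
computed name; lookups of the form this."${lang}CharsList" only see
variables assigned without def, which live in the script binding.

```groovy
// Hypothetical sample data: one map replaces englishCharsList, frenchCharsList, ...
def charsByLanguage = [
    english: ['a', 'b'],
    french : ['c', 'd'],
    spanish: ['e', 'f']
]

// The same generic loop body serves every language
def seen = []
charsByLanguage.each { lang, chars ->
    chars.eachWithIndex { ch, i ->
        seen << "${lang}[${i}]=${ch}".toString()
    }
}

// A list can also be fetched by a name computed at run time
def name = 'fr' + 'ench'
def frenchChars = charsByLanguage[name]
```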

Re: Foreign language chars as map keys

2023-02-22 Thread James McMahon

I will be happy to do that, Guillaume. And thank you in advance for any
help.
I'll be at the site where I have this code in about seven hours. I'll send
it then.
Jim

On Wed, Feb 22, 2023 at 9:29 AM Guillaume Laforge 
wrote:

> Hi James,
>
> Perhaps you could share with us how you're building the crosstabulation
> map?
> Somehow, some spacing is being introduced, and it is probably during that
> map creation that this takes place.
>
> Guillaume
>
>
> --
> Guillaume Laforge
> Apache Groovy committer
> Developer Advocate @ Google Cloud Platform
>
> Blog: http://glaforge.appspot.com/
> Twitter: @glaforge <http://twitter.com/glaforge>
>

Re: Foreign language chars as map keys

2023-02-22 Thread James McMahon

Thank you Rachel. I will look at employing .charAt(0) on the key. I
do believe it is indeed a string. To your point, I suspect I am comparing a
character to a string.
What of the need to clean up the keys - stripping them of any extraneous
spaces so I get just the character itself in the key string? Anyone know a
way to work through the map and trim keys?


On Wed, Feb 22, 2023 at 7:57 AM Rachel Greenham  wrote:

> Are you sure you’re not comparing Strings to Characters at some point?
> Going @TypeChecked might reveal if and where that’s happening...
>
> --
> Rachel Greenham
> rac...@merus.eu
>

Foreign language chars as map keys

2023-02-22 Thread James McMahon

I have a Groovy list that holds the unicode representation of select
foreign language characters, something like this simplified version:

def myList = ['\u00E4','\u00D6','\u00F8']

I have built myself a Groovy map that is the crosstabulation of characters
by count in an incoming document, so my map looks something like this:

crossTab = ["a" : "16736", "b" : "192", " ä  " : "18"]

The foreign language characters in this map that are in the set of keys
often have extra whitespace around them, and for certain languages there is
a weird "right to left" thing going on that I don't quite fully understand.

My objective: iterate through my list, return true if the element from the
list is found as a key in the map, and return the count - the map value for
that key - if the key is found. My problem: my lookup is failing to return
any hits right now. I know that some of these foreign language characters
are in my data. I suspect my lookup is failing because the keys are not
clean representations of the foreign language characters.

How do I modify my keys using Groovy to trim them of leading and trailing
whitespace?

Since my element from my list is expressed as unicode, how would I convert
the trimmed key representation to unicode using Groovy?

Thank you in advance for any help.
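
A minimal sketch of one way to clean the keys before the lookup, with
hypothetical sample data. It assumes the keys are Strings and that the
"right to left" effect comes from invisible directional marks such as
U+200F. The cleanKey closure is an illustrative helper: it removes
whitespace and format characters, then applies NFC normalization so that a
base letter followed by a combining mark compares equal to the precomposed
character. No separate conversion to unicode is needed, because '\u00E4' in
Groovy source is already the character ä.

```groovy
import java.text.Normalizer

def myList = ['\u00E4', '\u00D6', '\u00F8']

// Hypothetical crosstab: padded key, and a key with a right-to-left mark
// plus a decomposed O + combining diaeresis
def crossTab = ['a': '16736', 'b': '192', ' \u00E4  ': '18', '\u200FO\u0308 ': '7']

// Strip whitespace (\s, \p{Z}) and invisible format characters (\p{Cf}),
// then compose to NFC
def cleanKey = { String k ->
    def stripped = k.replaceAll(/[\p{Cf}\p{Z}\s]+/, '')
    Normalizer.normalize(stripped, Normalizer.Form.NFC)
}

// Rebuild the map with cleaned keys; if two raw keys clean to the same
// key, the later entry overwrites the earlier one
def cleaned = crossTab.collectEntries { k, v -> [(cleanKey(k)): v] }

// For each list element found as a key, keep its count
def hits = myList.findAll { cleaned.containsKey(it) }
                 .collectEntries { [(it): cleaned[it].toInteger()] }
```

If the keys turn out to be Characters rather than Strings, a lookup with a
String will miss even after cleaning; calling toString() on each key while
rebuilding the map would make the two sides comparable.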

Re: Misformatting json output

2022-06-06 Thread James McMahon

fieldC is a hash value, a mix of letters and numbers. It is not shown
surrounded by quotes as incoming data, but I could use a toString() or
equivalent in Groovy to make it a string in advance of my Json formatting,
if that helps matters.

One difference I note is that in your data[ ] map, each of the fieldA and
fieldB values you show is surrounded by [ and ] , while mine are surrounded
by { and }. Could that be causing my challenge? Does that matter?

On Mon, Jun 6, 2022 at 11:46 AM Nelson, Erick 
wrote:

> import groovy.json.*
>
> def data = [
>   fieldA:[lname:"Smith", fname:"John"],
>   fieldB:[age:21, gender:"M"],
>   fieldC:12345
> ]
>
> println JsonOutput.toJson(data)
>
> ^^^ this works for me.
> Running it produces this…
>
> {"fieldA":{"lname":"Smith","fname":"John"},"fieldB":{"age":21,"gender":"M"},"fieldC":12345}
>
> What does the source of your data look like?
> Is fieldC a number or a string?
>

Misformatting json output

2022-06-06 Thread James McMahon

I am having problems properly formatting Json output to requirements.
I have three fields, similar to this:
fieldA has value {"lname":"Smith", "fname":"John"}
fieldB has value {"age":"21", "gender":"M"}
fieldC has value 12345

I put each into an empty Groovy map, result[:].

I'm required to generate Json output like this:
{
"fieldA": {"lname":"Smith", "fname":"John"},
"fieldB": {"age":"21", "gender":"M"},
"fieldC":"12345"
}

but am instead getting output with double quotes around the { } for fields
A and B, like this:
{
"fieldA": "{"lname":"Smith", "fname":"John"}",
"fieldB": "{"age":"21", "gender":"M"}",
"fieldC":"12345"
}

Is there a way I can suppress the surrounding of the values in the map when
I create the json?

Currently this is how I process my map to json:

def JsonStr = JsonOutput.prettyPrint(JsonOutput.toJson(result))

which I then write to my outputStream like this:

outputStream.write(JsonStr.getBytes(StandardCharsets.UTF_8))

Thanks in advance for any help.
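
A sketch of one likely cause, assuming fieldA and fieldB arrive as JSON text
(Strings) rather than as Groovy maps. JsonOutput.toJson serializes a String
as a quoted, escaped JSON string, which would explain the extra quotes.
Parsing each value with JsonSlurper first turns it into a Map, which is then
emitted as a nested object. The variable names below are illustrative.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Assumption: these two values are JSON text, not maps
def fieldA = '{"lname":"Smith", "fname":"John"}'
def fieldB = '{"age":"21", "gender":"M"}'
def fieldC = 12345

// Putting the raw strings in the map yields quoted, escaped values
def broken = JsonOutput.toJson([fieldA: fieldA])

// Parsing first yields nested objects
def slurper = new JsonSlurper()
def result = [:]
result.fieldA = slurper.parseText(fieldA)
result.fieldB = slurper.parseText(fieldB)
result.fieldC = fieldC.toString()   // emitted as a JSON string, as required

def fixed = JsonOutput.toJson(result)
def jsonStr = JsonOutput.prettyPrint(fixed)
```

In Groovy source a map literal is written with [ and ], for example
[lname: "Smith"], while { and } denote a closure; the braces only appear in
the JSON text that is produced.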

Re: Checking directory state using Groovy

2021-10-21 Thread James McMahon

Rachel, thanks again for weighing in. I'm a little confused and was hoping
to ask you for clarification. Earlier in this thread, you use a File
approach and it seemed to work. Why did you mention in the follow-up
comment that it doesn't actually work? Why did it work the first time, but
not the second time? Was it because in the first case you were using the
groovy command line interpreter, and in the second case that failed you
tried to run from inside a Groovy script, maybe? Or was it something else
I'm missing entirely?
- - -
Jim Mc.

On Wed, Oct 20, 2021 at 7:35 AM Rachel Greenham  wrote:

> ah sadly i did think of that when i was writing it but it didn't work. Not
> 100% sure why, but i think mostly that many of those methods in Files take
> a varargs of stuff like LinkOption... OpenOption... and the groovy category
> support isn't resolving properties past that.
>
> --
> Rachel Greenham
> rac...@merus.eu
>
> On 20 Oct 2021, at 11:55, MG  wrote:
>
> Don't know if you already know this, but using Groovy property syntax
> makes code even more readable, e.g.:
>
> println "${it}: ${it.getOwner()} ${it.getPosixFilePermissions()}"
>
> can be written as:
>
> println "$it: $it.owner $it.posixFilePermissions"
>
> In general:
> 1. Any getter can be accessed without the "get" prefix with a lowercase
> first char
> 2. A simplified string interpolation syntax without the enclosing curly
> braces can be used in these cases
> (same goes for setters)
>
> Cheers,
> mg
>
>
> On 20/10/2021 12:14, James McMahon wrote:
>
> Many thanks to each of you who offered guidance. Redirecting back to this
> today, anticipating success given your advice. Still getting a feel for
> Groovy so this helps quite a bit.
> Cheers,
> -Jim
>
> On Fri, Oct 15, 2021 at 11:22 AM Søren Berg Glasius 
> wrote:
>
>> @Rachel Rudnick  that is a very clever use of
>> *use* - good call!
>>
>> Best regards / Med venlig hilsen,
>> Søren Berg Glasius
>>
>> Hedevej 1, Gl. Rye, 8680 Ry, Denmark
>> Mobile: +45 40 44 91 88, Skype: sbglasius
>> --- Press ESC once to quit - twice to save the changes.
>>
>>
>> Den fre. 15. okt. 2021 kl. 17.12 skrev Rachel Greenham :
>>
>>> Looks like you could pretty much use Files as an extension module and/or
>>> category for Path...
>>>
>>> Hang on, does it work?
>>>
>>> groovy> import java.nio.file.*
>>> groovy> use (Files) {
>>> groovy> Path p = Path.of("src/groovy")
>>> groovy> println "is directory? ${p.isDirectory()}"
>>> groovy> p.list().each { println "${it}: ${it.getOwner()} ${it.getPosixFilePermissions()}" }
>>> groovy> }
>>>
>>> is directory? true
>>> src/groovy/benchmark: rachel [OWNER_WRITE, OTHERS_READ, OWNER_EXECUTE,
>>> GROUP_READ, GROUP_EXECUTE, OTHERS_EXECUTE, OWNER_READ]
>>> src/groovy/xdocs: rachel [OWNER_WRITE, OTHERS_READ, OWNER_EXECUTE,
>>> GROUP_READ, GROUP_EXECUTE, OTHERS_EXECUTE, OWNER_READ]
>>> src/groovy/bootstrap: rachel [OWNER_WRITE, OTHERS_READ, OWNER_EXECUTE,
>>> GROUP_READ, GROUP_EXECUTE, OTHERS_EXECUTE, OWNER_READ]
>>> src/groovy/LICENSE: rachel [OWNER_WRITE, OTHERS_READ, GROUP_READ,
>>> OWNER_READ]
>>> ...
>>>
>>> oh yeah that works 😉
>>>
>>> --
>>> Rachel Greenham
>>> rac...@merus.eu
>>>
>>> > On 15 Oct 2021, at 15:57, Nelson, Erick 
>>> wrote:
>>> >
>>> > import java.nio.file.Path
>>> > import java.nio.file.Files
>>> >
>>> > File f = new File('test')
>>> > Path p = f.toPath()
>>> > Files.isReadable(p) // boolean
>>> > Files.isWritable(p) // boolean
>>> > Files.isExecutable(p) // boolean
>>> > Files.isDirectory(p) // boolean
>>> > Files.isRegularFile(p) // boolean
>>> >
>>> >
>>> > From: James McMahon 
>>> > Date: Friday, October 15, 2021 at 4:50 AM
>>> > To: users@groovy.apache.org 
>>> > Subject: Checking directory state using Groovy
>>> >
>>> > Hello. I am trying to convert an existing script from python to
>>> Groovy. It executes a number of os.path and os.access commands, which I've
>>> not yet been able to find examples of that are written in Groovy. I have
>>> found similar implementations that employ "add on" Jenkins libraries for
>>> Groovy, but I will not have access to such libraries. Here is a brief
>>> excerpt from what I now do in python. Has anyone done similarly in Groovy?
>>> Can I impose for an example?
>>> >
>>> > Thanks very much in advance. Here is my python:
>>> >
>>> > if ( os.path.exists(result['thisURL']) and
>>> >      os.path.isfile(result['thisURL']) ) :
>>> >     if ( os.access(result['thisURL'], os.F_OK)
>>> >          and os.access(result['thisURL'], os.R_OK)
>>> >          and os.access(thisDir, os.W_OK)
>>> >          and os.access(thisDir, os.X_OK) ) :
>>> >         pass  # do some stuff
>>> >     else :
>>> >         pass  # dir and file not accessible, do some different stuff
>>>
>>>
>
>

Re: Checking directory state using Groovy

2021-10-20 Thread James McMahon

Very cool - thank you MG. No, I sure don't yet know the finer points such
as the one you mention. I think I've blindly adopted code examples from
similar Groovy scripts out on the internet, often without really
understanding them. I still have so much to learn about Groovy. As best as I
can tell so far, Groovy appears to be Java-based, with refinements and
enhancements.

One thing I've been thinking about recently is whether these refinements
always make things easier to read and maintain, or more cryptic and
difficult. Some Groovy things I've seen make it a little more difficult to
unwrap what the code is doing. But as I said, I suspect that's more a
function of my lack of experience than a Groovy problem.

This users group is very helpful. Thank you all again.
Cheers,
Jim

On Wed, Oct 20, 2021 at 6:55 AM MG  wrote:

> Don't know if you already know this, but using Groovy property syntax
> makes code even more readable, e.g.:
>
> println "${it}: ${it.getOwner()} ${it.getPosixFilePermissions()}"
>
> can be written as:
>
> println "$it: $it.owner $it.posixFilePermissions"
>
> In general:
> 1. Any getter can be accessed without the "get" prefix with a lowercase
> first char
> 2. A simplified string interpolation syntax without the enclosing curly
> braces can be used in these cases
> (same goes for setters)
>
> Cheers,
> mg
>
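
For readers following along, here is a small sketch of the property syntax MG describes, using java.io.File instead of the Files category (Rachel noted elsewhere in this thread that the property form did not resolve for her inside use (Files)). The path is a hypothetical placeholder and the file does not need to exist.

```groovy
// Hypothetical path, used only for illustration; nothing is read from disk.
def f = new File('/tmp/example/report.txt')

// Property syntax calls the matching getter.
assert f.name == f.getName()
assert f.parentFile == f.getParentFile()

// Explicit getters with ${...} versus the shorter dotted form.
String verbose = "${f.getName()} lives in ${f.getParentFile()}"
String concise = "$f.name lives in $f.parentFile"
assert verbose == concise
println concise
```

The dotted form only covers property access; a method call with arguments inside a string still needs the ${...} braces.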

Re: Checking directory state using Groovy

2021-10-20 Thread James McMahon

Many thanks to each of you who offered guidance. Redirecting back to this
today, anticipating success given your advice. Still getting a feel for
Groovy so this helps quite a bit.
Cheers,
-Jim

On Fri, Oct 15, 2021 at 11:22 AM Søren Berg Glasius 
wrote:

> @Rachel Rudnick  that is a very clever use of
> *use* - good call!
>
> Best regards / Med venlig hilsen,
> Søren Berg Glasius
>
> Hedevej 1, Gl. Rye, 8680 Ry, Denmark
> Mobile: +45 40 44 91 88, Skype: sbglasius
> --- Press ESC once to quit - twice to save the changes.

Checking directory state using Groovy

2021-10-15 Thread James McMahon

Hello. I am trying to convert an existing script from python to Groovy. It
executes a number of os.path and os.access commands, which I've not yet
been able to find examples of that are written in Groovy. I have found
similar implementations that employ "add on" Jenkins libraries for Groovy,
but I will not have access to such libraries. Here is a brief excerpt from
what I now do in python. Has anyone done similarly in Groovy? Can I impose
for an example?

Thanks very much in advance. Here is my python:

if ( os.path.exists(result['thisURL']) and
     os.path.isfile(result['thisURL']) ) :
    if ( os.access(result['thisURL'], os.F_OK)
         and os.access(result['thisURL'], os.R_OK)
         and os.access(thisDir, os.W_OK)
         and os.access(thisDir, os.X_OK) ) :
        pass  # do some stuff
    else :
        pass  # dir and file not accessible, do some different stuff
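
One way the Python checks above might be translated to Groovy is with the java.nio.file.Files methods suggested in the replies to this post. This is a minimal sketch, not a tested drop-in replacement: the helper name fileAndDirAccessible and its two parameters are hypothetical, and the temporary file and directory exist only to make the example runnable.

```groovy
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper mirroring the Python excerpt:
// the file must exist, be a regular file and be readable;
// the directory must be writable and executable (traversable).
boolean fileAndDirAccessible(String filePath, String dirPath) {
    Path file = Paths.get(filePath)
    Path dir = Paths.get(dirPath)
    Files.exists(file) && Files.isRegularFile(file) &&
        Files.isReadable(file) &&
        Files.isWritable(dir) && Files.isExecutable(dir)
}

// Demonstration with a temporary directory and file.
Path tmpDir = Files.createTempDirectory('demo')
Path tmpFile = Files.createTempFile(tmpDir, 'sample', '.txt')

if (fileAndDirAccessible(tmpFile.toString(), tmpDir.toString())) {
    println 'file and dir accessible'      // do some stuff
} else {
    println 'file or dir not accessible'   // do some different stuff
}
```

Files.exists covers the os.F_OK check, and the remaining calls correspond to os.R_OK, os.W_OK and os.X_OK in the original.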

Re: Groovy equivalent to python urllib.quote?

2021-09-23 Thread James McMahon

Addendum, these examples illustrate what the use of encode and quote is
doing for us in the legacy python...

from urllib.parse import quote
myString = "/Ted \\Las$o & Roy Kent, @ AFC Richmond, öööö !"
print(myString)  # result is /Ted \Las$o & Roy Kent, @ AFC Richmond, öööö !

myString_quoted = quote(myString)
print(myString_quoted)
# result is /Ted%20%5CLas%24o%20%26%20Roy%20Kent%2C%20%40%20AFC%20Richmond%2C%20%C3%B6%C3%B6%C3%B6%C3%B6%20%21

myString_encoded = myString.encode('utf8')
print(myString_encoded)
# result is b'/Ted \\Las$o & Roy Kent, @ AFC Richmond, \xc3\xb6\xc3\xb6\xc3\xb6\xc3\xb6 !'

myString_encoded_then_quoted = quote(myString.encode('utf8'))
print(myString_encoded_then_quoted)  # result is the same as when we just quote(myString)

myString_quoted_then_encoded = quote(myString).encode('utf8')
print(myString_quoted_then_encoded)
# result is b'/Ted%20%5CLas%24o%20%26%20Roy%20Kent%2C%20%40%20AFC%20Richmond%2C%20%C3%B6%C3%B6%C3%B6%C3%B6%20%21'

On Thu, Sep 23, 2021 at 6:24 AM James McMahon  wrote:

> Hello. I am new to groovy, assigned an effort to convert legacy python
> scripts to groovy replacements. We are doing this in an effort to decouple
> from dependencies on the jython engine in NiFi for scripts we run from its
> ExecuteScript processor.
>
> In one of these python scripts, this gets done to an encoded string that
> represents a disk file path:
>
> import urllib
> .
> .
> .
> result['fileURL'] = urllib.quote(temp_result['fileURL'].encode('utf8'))
>
> Based on my initial research (though very limited understanding), it
> appears that the urllib.quote() is intended to quote reserved characters in
> a path. I haven't been able to find the function that serves an equivalent
> purpose in Groovy. Can anyone show me how this should be accomplished?
>
> I should add that I have successfully replaced the python dictionary
> structures you see above and calls to manipulate the same with Groovy Map
> operations.
>
> Thanks in advance for any help.  - Jim
>

Groovy equivalent to python urllib.quote?

2021-09-23 Thread James McMahon

Hello. I am new to groovy, assigned an effort to convert legacy python
scripts to groovy replacements. We are doing this in an effort to decouple
from dependencies on the jython engine in NiFi for scripts we run from its
ExecuteScript processor.

In one of these python scripts, this gets done to an encoded string that
represents a disk file path:

import urllib
.
.
.
result['fileURL'] = urllib.quote(temp_result['fileURL'].encode('utf8'))

Based on my initial research (though very limited understanding), it
appears that the urllib.quote() is intended to quote reserved characters in
a path. I haven't been able to find the function that serves an equivalent
purpose in Groovy. Can anyone show me how this should be accomplished?

I should add that I have successfully replaced the python dictionary
structures you see above and calls to manipulate the same with Groovy Map
operations.

Thanks in advance for any help.  - Jim
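
A possible Groovy stand-in for the Python call, as a sketch only. java.net.URLEncoder is not a direct match, because it encodes "/" and turns spaces into "+", so the example below uses a small hand-written helper instead. The helper name quote and its safe parameter are hypothetical (they are not a Groovy or JDK API); they imitate Python's default of leaving letters, digits, "_.-~" and "/" untouched and percent-encoding every other UTF-8 byte.

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of Groovy or the JDK.
// Percent-encodes every UTF-8 byte except unreserved characters and those in 'safe'.
String quote(String s, String safe = '/') {
    String keep = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_.-~' + safe
    StringBuilder out = new StringBuilder()
    for (byte b in s.getBytes(StandardCharsets.UTF_8)) {
        int v = b & 0xFF                        // treat the byte as unsigned
        if (v < 0x80 && keep.indexOf(v) >= 0) {
            out.append((char) v)                // safe ASCII character, copy as-is
        } else {
            out.append(String.format('%%%02X', v))
        }
    }
    out.toString()
}

// \u00f6 is the letter o with umlaut, written as an escape to avoid encoding issues.
String sample = '/Ted \\Las$o & Roy Kent, @ AFC Richmond, \u00f6\u00f6\u00f6\u00f6 !'
println quote(sample)
```

Because the helper works on the UTF-8 bytes directly, it folds the .encode('utf8') step and the quoting into one call.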

Re: Determine whether a text string is valid JSON representation

2021-09-21 Thread James McMahon

Thank you Rachel and Paul. You have been a big help, and I can run with
this. I did a quick test, and it works:
import groovy.json.JsonSlurper

String goodJsonString = '''{"menu": {
"id": "file",
"tools": {
"actions": [
{"id": "new", "title": "New File"},
{"id": "open", "title": "Open File"},
{"id": "close", "title": "Close File"}
],
"errors": []
}}}'''

String badJsonString = '''{"menu": {
"id": ,
"tools": {
"actions": [
{"id": "new", "title": "New File"},
{"id": "open", "title": "Open File"},
{"id": "close", "title": "Close File"}
],
"errors": []
}}}'''

//String thisString = badJsonString;
String thisString = goodJsonString;

try {
    JsonSlurper slurper = new JsonSlurper();
    Map parsedJson = slurper.parseText(thisString);
    //assert parsedJson instanceof Map;

    String idValue = parsedJson.menu.id;
    println("The value of idValue is " + idValue +
            "\nThe value of jsonString is \n" + thisString);
    println 'Valid JSON'
} catch(error) {
    println "Invalid: $error.message"
}

On Tue, Sep 21, 2021 at 7:52 AM Paul King  wrote:

> Yes, I also agree with Rachel's comment about groovyConsole. I hit send
> before seeing her reply.
>
>
> On Tue, Sep 21, 2021 at 9:06 PM Rachel Greenham  wrote:
>
>>
>>
>> > On 21 Sep 2021, at 11:35, James McMahon  wrote:
>> >
>> > Hello. Newbie to Groovy. Have a text string that I need to verify is a
>> representation of valid JSON, or not. What is an effective means to do this
>> in Groovy? I'm having difficulty determining what a method like JSONSlurper
>> will return to me if the string can't be parsed because it is not valid
>> JSON? I've found plenty of examples of results it returns when it works,
>> but nothing yet showing me what it returns when the string isn't valid
>> JSON.
>> > Hope this is a valid way to post such a question. As I said, this is my
>> first time trying this or any Groovy forum. Thanks in advance for any help.
>> -Jim
>> >
>>
>> It doesn’t return a value, it throws groovy.json.JsonException, which is
>> a RuntimeException.
>>
>> groovyConsole is your friend in such times…
>>
>> groovy> import groovy.json.*
>> groovy> def slurp = new JsonSlurper()
>> groovy> def parsed = slurp.parseText("}{")
>>
>> Exception thrown
>>
>> groovy.json.JsonException: Unable to determine the current character, it
>> is not a string, number, array, or object
>>
>> The current character read is '}' with an int value of 125
>> Unable to determine the current character, it is not a string, number,
>> array, or object
>> line number 1
>> index number 0
>> }{
>> ^
>>
>> --
>> Rachel Greenham
>> rac...@merus.eu
>>
>>

Determine whether a text string is valid JSON representation

2021-09-21 Thread James McMahon

Hello. Newbie to Groovy. I have a text string that I need to verify is a
representation of valid JSON, or not. What is an effective means to do this
in Groovy? I'm having difficulty determining what a class like JsonSlurper
will return to me if the string can't be parsed because it is not valid
JSON. I've found plenty of examples of the results it returns when it works,
but nothing yet showing me what it returns when the string isn't valid
JSON.
Hope this is a valid way to post such a question. As I said, this is my
first time trying this or any Groovy forum. Thanks in advance for any help.
-Jim
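
As the replies in this thread point out, JsonSlurper does not return a value for invalid input; it throws groovy.json.JsonException. A minimal sketch of a yes/no check built on that behaviour follows. The helper name isValidJson is hypothetical, and treating null or empty text as invalid (by also catching IllegalArgumentException) is an assumption about what a caller would want.

```groovy
import groovy.json.JsonException
import groovy.json.JsonSlurper

// Hypothetical helper: true if the text parses as JSON, false otherwise.
boolean isValidJson(String text) {
    try {
        new JsonSlurper().parseText(text)
        return true
    } catch (JsonException | IllegalArgumentException ignored) {
        // JsonException: the text could not be parsed.
        // IllegalArgumentException: assumed to cover null or empty input.
        return false
    }
}

println isValidJson('{"menu": {"id": "file"}}')
println isValidJson('}{')
```

Note that this only reports whether the default parser accepts the text; it does not validate the content against any schema.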
