I should mention that I also tried to read fileSize as a FlowFile attribute,
without success:
//def file_size = (ff.size != null) ? (ff.size as Integer) : 0
def file_size = ff.getAttribute('fileSize')
This returns null.
I tried this approach based on a Cloudera post, which said:
|The default FlowFile attributes include:
entryDate
lineageStartDate
fileSize
filename
path
uuid
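
One possible explanation for the number in my original message below, offered as a sketch and not a confirmed diagnosis: ff.size is a long (FlowFile.getSize() returns long), and narrowing a long above 2^31 - 1 to a 32-bit integer keeps only the low 32 bits. The byte count 8,605,343,189 used here is an assumption chosen because it matches both the displayed 8.01 GB and the logged 15408597; it was not measured. The class and method names are hypothetical. It is written as plain Java to isolate the arithmetic.

```java
import java.util.Locale;

public class FileSizeCheck {
    // Narrowing a long to int keeps only the low 32 bits of the value.
    static int truncateToInt(long bytes) {
        return (int) bytes;
    }

    // Convert a byte count to binary gigabytes (1 GB taken as 1024^3 bytes).
    static double toGigabytes(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // ASSUMED size: 2 * 2^32 + 15408597. Not taken from the real flowfile.
        long assumedBytes = 8_605_343_189L;

        System.out.println(truncateToInt(assumedBytes));   // prints 15408597
        System.out.println(String.format(Locale.ROOT, "%.2f GB",
                toGigabytes(assumedBytes)));                // prints 8.01 GB
    }
}
```

If that is what is happening, keeping the value as a long (for example "def file_size = ff.size" with no Integer cast) should preserve the full byte count.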
On Mon, Jun 17, 2024 at 7:50 AM James McMahon <[email protected]> wrote:
> I am trying to use the file size. On the Details tab for my flowfile in
> the queue, I see that my File Size is 8.01 GB.
>
> I log the following from this section of a Groovy script, running in an
> ExecuteGroovyScript processor:
>
> def ff = session.get()
> if (!ff) return
>
> def jsonFactory = new JsonFactory()
> log.info("JsonFactory created successfully.")
>
> def numberOfObjects = 0
> def lineCounter = 0 // Initialize a counter to track the number of lines
>
> // Ensure file_size is not null and cast to Integer
> def file_size = (ff.size != null) ? (ff.size as Integer) : 0
>
> if (file_size == 0) {
> log.error("file_size is undefined or zero, which prevents division.")
> session.transfer(ff, REL_FAILURE)
> return
> }
>
> log.info("File size: ${file_size}")
>
>
> I want file_size to always be the number of bytes in the file so that I am
> always working with a consistent representation.
>
> But in my log, the result is this:
> ExecuteGroovyScript[id=1110134d-1ea1-1565-f962-eee47a3fc654] File size:
> 15408597
>
> That isn't 8.01 GB. Where am I making my error?
>
>