Re: ExecuteStreamCommand failing to unzip incoming flowfiles

2024-01-31 Thread James McMahon
If anyone can show me how to get my ExecuteStreamCommand configured
properly as a workaround, I am still interested in that.
Jim


Re: ExecuteStreamCommand failing to unzip incoming flowfiles

2024-01-31 Thread James McMahon
I tried to find a Create option for tickets here:
https://issues.apache.org/jira/projects/NIFI/issues/NIFI-11859?filter=allopenissues
I did not find one, and suspect I may not have that privilege.
In any case, thank you for creating that.
Jim


Re: ExecuteStreamCommand failing to unzip incoming flowfiles

2024-01-31 Thread Joe Witt
I went ahead and wrote it up here
https://issues.apache.org/jira/browse/NIFI-12709

Thanks


Re: ExecuteStreamCommand failing to unzip incoming flowfiles

2024-01-31 Thread James McMahon
Happy to do that Joe. How do I create and submit a JIRA for consideration?
I have not done one - at least, not for years.
If you get me started, I will do a concise and thorough description in the
ticket.
Sincerely,
Jim


Re: ExecuteStreamCommand failing to unzip incoming flowfiles

2024-01-31 Thread Joe Witt
James,

Makes sense to create a JIRA to improve UnpackContent to extract these
attributes in the event of a zip file that happens to present them.  The
concept of lastModifiedDate does appear easily accessed if available in the
metadata.  Owner/Creator/Creation information looks less standard in the
case of a Zip but perhaps still capturable as extra fields.

Thanks


ExecuteStreamCommand failing to unzip incoming flowfiles

2024-01-31 Thread James McMahon
I tried to use UnpackContent to extract the files within a zip file named
ABC DEF (1).zip (the filename contains spaces).

UnpackContent seemed to work, but it did not preserve file attributes from
the files in the zip. For example, lastModifiedTime is not available, so
downstream I am unable to do this:
${file.lastModifiedTime:toDate("yyyy-MM-dd'T'HH:mm:ssZ"):format("MMddHHmmss")}

I did some digging and found that on the UnpackContent page, it says:
file.lastModifiedTime  "The date and time that the unpacked file was last
modified (*tar only*)."

I need these file attributes for those files I extract from the zip. So as
an alternative I tried configuring an ExecuteStreamCommand processor like
this:
Command Arguments:   -c;"unzip -p -q < -"
Command Path:        /bin/bash
Argument Delimiter:  ;

It throws these errors:

16:41:30 UTC ERROR 13023d28-6154-17fd-b4e8-7a30b35980ca
ExecuteStreamCommand[id=13023d28-6154-17fd-b4e8-7a30b35980ca] Failed to
write flow file to stdin due to Broken pipe: java.io.IOException: Broken
pipe

16:41:30 UTC ERROR 13023d28-6154-17fd-b4e8-7a30b35980ca
ExecuteStreamCommand[id=13023d28-6154-17fd-b4e8-7a30b35980ca] Transferring
flow file FlowFile[filename=ABC DEF (1).zip] to nonzero status. Executable
command /bin/bash ended in an error: /bin/bash: -: No such file or directory

It does not seem to be applying the unzip to the stdin of the ESC
processor. None of the files in the zip archive are output from ESC.

What needs to be changed in my ESC configuration?

Thank you in advance for any help.
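
On the ExecuteStreamCommand question: the "No such file or directory"
error is consistent with how bash treats "< -", which opens a file
literally named "-" rather than stdin. The Info-ZIP unzip man page also
notes that archives read from standard input are not supported (except by
funzip). If bash exits before reading the flowfile, that would also
explain the "Broken pipe" on the NiFi side. One possible workaround,
sketched below under stated assumptions, is to have ExecuteStreamCommand
run a small Python script that buffers the flowfile from stdin and reports
each entry's name and last-modified time. The function names, the JSON
field names, and the idea of pointing Command Path at a Python interpreter
are illustrative assumptions, not something taken from the NiFi
documentation or from this thread.

```python
import io
import json
import sys
import zipfile
from datetime import datetime


def list_zip_entries(data):
    """Return name, size, and last-modified time for each file in a zip."""
    entries = []
    # zipfile needs a seekable source, so wrap the buffered bytes in BytesIO.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # date_time is (year, month, day, hour, minute, second), no time zone.
            modified = datetime(*info.date_time)
            entries.append({
                "filename": info.filename,
                "size": info.file_size,
                "lastModifiedTime": modified.strftime("%Y-%m-%dT%H:%M:%S"),
            })
    return entries


def main(stdin=None, stdout=None):
    """Read a whole zip from stdin and write one JSON object per entry."""
    stdin = stdin or sys.stdin.buffer
    stdout = stdout or sys.stdout
    for entry in list_zip_entries(stdin.read()):
        stdout.write(json.dumps(entry) + "\n")
```

To run this as a command, the script would need to end with a call to
main(). Zip timestamps carry no time zone, so the reported value is
whatever local time the archive's creator recorded, and the whole archive
is held in memory, which may not suit very large flowfiles.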