Re: some processors runs only once in NiFi
Hi Koji,

Thanks for your explanation.

Many thanks,
prabhu

On Fri, May 26, 2017 at 9:40 AM, Koji Kawamura wrote:
> Hi Prabhu,
>
> As with ListHDFS, GetHTTP uses the ETag HTTP header, and if the server
> returns NOT_MODIFIED (304), it doesn't create an output FlowFile. The
> screenshot indicates that GetHTTP ran 61 times but created an output
> FlowFile only once, because the resource was not modified.
>
> I believe that is what's happening.
>
> Thanks,
> Koji
Re: some processors runs only once in NiFi
Hi Prabhu,

As with ListHDFS, GetHTTP uses the ETag HTTP header, and if the server returns NOT_MODIFIED (304), it doesn't create an output FlowFile. The screenshot indicates that GetHTTP ran 61 times but created an output FlowFile only once, because the resource was not modified.

I believe that is what's happening.

Thanks,
Koji

On Wed, May 24, 2017 at 2:30 PM, prabhu Mahendran wrote:
> Pierre,
>
> Thanks for your mail.
>
> I may have been trying to list the same files over and over, so that
> may be the problem I faced. I just modified existing files in HDFS and
> then listed those files using ListHDFS.
>
> I may have been listing the same files as in the last execution of the
> processor; that may be the problem.
>
> Many thanks
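The ETag behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not NiFi's actual GetHTTP code: `FakeServer` and `ConditionalFetcher` are hypothetical names, and the server is modelled in memory so the example runs without a network. It shows how a poller can execute many times yet produce output only once while the resource is unchanged.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FakeServer:
    """Hypothetical in-memory stand-in for an HTTP server that supports ETags."""
    body: bytes
    etag: str

    def get(self, if_none_match: Optional[str]) -> Tuple[int, Optional[str], Optional[bytes]]:
        # If the client's cached ETag still matches, answer 304 with no body.
        if if_none_match == self.etag:
            return 304, self.etag, None
        return 200, self.etag, self.body


class ConditionalFetcher:
    """Hypothetical model of a GetHTTP-like poller; not NiFi's implementation."""

    def __init__(self, server: FakeServer) -> None:
        self.server = server
        self.last_etag: Optional[str] = None  # ETag remembered from the last 200
        self.runs = 0

    def run_once(self) -> Optional[bytes]:
        self.runs += 1
        status, etag, body = self.server.get(self.last_etag)
        if status == 304:
            return None  # the run happened, but there is nothing new to emit
        self.last_etag = etag
        return body


server = FakeServer(body=b"payload", etag='"v1"')
fetcher = ConditionalFetcher(server)
outputs = [out for out in (fetcher.run_once() for _ in range(61)) if out is not None]
print(fetcher.runs, len(outputs))  # 61 1
```

In this model the run counter keeps increasing while the output count stays at one, which matches the pattern reported from the screenshot: the processor is running, it just has nothing new to emit.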
Re: some processors runs only once in NiFi
Pierre,

Thanks for your mail.

I may have been trying to list the same files over and over, so that may be the problem I faced. I just modified existing files in HDFS and then listed those files using ListHDFS.

I may have been listing the same files as in the last execution of the processor; that may be the problem.

Many thanks

On Wed, May 24, 2017 at 10:45 AM, Pierre Villard <pierre.villard...@gmail.com> wrote:
> Just a quick remark: the ListHDFS processor won't list files over and
> over; it'll only list new files since the last execution of the
> processor. Do you know if new files are generated in the directory you
> are listing?
>
> Screenshots of your configurations would definitely help.
Re: some processors runs only once in NiFi
Just a quick remark: the ListHDFS processor won't list files over and over; it'll only list new files since the last execution of the processor. Do you know if new files are generated in the directory you are listing?

Screenshots of your configurations would definitely help.

2017-05-24 6:55 GMT+02:00 Joe Witt:
> prabhu - can you please share screenshots and/or logs showing that it
> is running only once?
>
> Thanks
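The "only new files since the last execution" behaviour can be sketched as follows. This is a hypothetical model, not NiFi's ListHDFS code: `IncrementalLister` is an invented name, the directory is a plain dictionary of file names to modification times, and tracking the newest modification time seen so far is an assumption about how such state could be kept.

```python
from typing import Dict, List


class IncrementalLister:
    """Hypothetical model of a ListHDFS-like lister; not NiFi's implementation."""

    def __init__(self) -> None:
        # Assumption: state is the newest modification time already listed.
        self.latest_seen = -1

    def list_new(self, directory: Dict[str, int]) -> List[str]:
        # directory maps file name -> modification time (epoch milliseconds)
        new_files = sorted(
            name for name, mtime in directory.items() if mtime > self.latest_seen
        )
        if directory:
            self.latest_seen = max(self.latest_seen, max(directory.values()))
        return new_files


hdfs_dir = {f"file{i:02d}.txt": 1000 + i for i in range(12)}
lister = IncrementalLister()
first = lister.list_new(hdfs_dir)   # first run: all 12 files are new
second = lister.list_new(hdfs_dir)  # second run: nothing has changed
hdfs_dir["file12.txt"] = 2000       # a new file arrives
third = lister.list_new(hdfs_dir)   # third run: only the new file
print(len(first), len(second), third)  # 12 0 ['file12.txt']
```

Under these assumptions a scheduled run over an unchanged directory returns an empty listing, which can look as if the processor ran only once even though it keeps executing.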
Re: some processors runs only once in NiFi
prabhu - can you please share screenshots and/or logs showing that it is running only once?

Thanks

On Wed, May 24, 2017 at 12:42 AM, prabhu Mahendran wrote:
> Aldrin,
>
> Thanks for your response.
>
> For GetHTTP, I have tried downloading different files, but even then
> it does not run more than once.
>
> ListHDFS: I am using NiFi 1.2.0, with the properties "Hadoop
> Configuration Resources", "Directory", and "RecurseSubDirectories"
> configured correctly for Hadoop 2.5.2. It runs only once and does not
> run again.
>
> Note: I have checked those processors on Windows.
>
> Can you give any suggestions to solve this?
>
> Many thanks,
Re: some processors runs only once in NiFi
Aldrin,

Thanks for your response.

For GetHTTP, I have tried downloading different files, but even then it does not run more than once.

ListHDFS: I am using NiFi 1.2.0, with the properties "Hadoop Configuration Resources", "Directory", and "RecurseSubDirectories" configured correctly for Hadoop 2.5.2. It runs only once and does not run again.

Note: I have checked those processors on Windows.

Can you give any suggestions to solve this?

Many thanks,

On Tue, May 23, 2017 at 7:03 PM, Aldrin Piri wrote:
> For GetHTTP, this processor makes use of ETags [1] to prevent
> downloading the same resource repeatedly. I would speculate that this
> is the case for the resource you are specifying.
>
> As for ListHDFS, could you specify what version you are using? There
> have been some bugs concerning how this was handled. If the version is
> the latest, could you please provide some more details in terms of the
> structure and timestamps of the associated files causing the issue you
> are describing?
>
> [1] https://en.wikipedia.org/wiki/HTTP_ETag
Re: some processors runs only once in NiFi
For GetHTTP, this processor makes use of ETags [1] to prevent downloading the same resource repeatedly. I would speculate that this is the case for the resource you are specifying.

As for ListHDFS, could you specify what version you are using? There have been some bugs concerning how this was handled. If the version is the latest, could you please provide some more details in terms of the structure and timestamps of the associated files causing the issue you are describing?

[1] https://en.wikipedia.org/wiki/HTTP_ETag

On Tue, May 23, 2017 at 3:22 AM, prabhu Mahendran wrote:
> I have run into some unexpected behaviour in NiFi: I don't know why
> some processors don't run more than once.
>
> For example:
>
> 1. GetHTTP:
>
> I have used the GetHTTP processor to download files from an HTTP URL.
> Initially I scheduled it at 0 sec.
>
> If I run the processor, it runs only once and does not run again. If I
> copy the same processor, paste it in the UI, and click run, that
> processor again runs only once.
>
> If I schedule it, it still does not run more than once.
>
> 2. ListHDFS:
>
> I have configured local cluster properties in ListHDFS.
>
> I have 12 files in an HDFS directory. If I run it without scheduling,
> it lists the 12 files correctly; after scheduling, it returns only 11
> files, missing 1 file, and does not run again after the first run.
>
> Can anyone explain the behaviour of those processors with 1-day
> scheduling in TimerDriven?
some processors runs only once in NiFi
I have run into some unexpected behaviour in NiFi: I don't know why some processors don't run more than once.

For example:

1. GetHTTP:

I have used the GetHTTP processor to download files from an HTTP URL. Initially I scheduled it at 0 sec.

If I run the processor, it runs only once and does not run again. If I copy the same processor, paste it in the UI, and click run, that processor again runs only once.

If I schedule it, it still does not run more than once.

2. ListHDFS:

I have configured local cluster properties in ListHDFS.

I have 12 files in an HDFS directory. If I run it without scheduling, it lists the 12 files correctly; after scheduling, it returns only 11 files, missing 1 file, and does not run again after the first run.

Can anyone explain the behaviour of those processors with 1-day scheduling in TimerDriven?