Thanks, Joe, for checking. Yes, I got past it and was able to demo it to the team successfully :) Now the next challenge is tuning NiFi for high throughput.
On Mon, Oct 31, 2016 at 7:08 PM, Joe Witt <joe.w...@gmail.com> wrote:
> Krish,
>
> Did you ever get past this?
>
> Thanks
> Joe
>
> On Fri, Oct 28, 2016 at 2:36 PM, Gop Krr <gop....@gmail.com> wrote:
> > James, the permission issue got resolved. I still don't see any writes.
> >
> > On Fri, Oct 28, 2016 at 10:34 AM, Gop Krr <gop....@gmail.com> wrote:
> >>
> >> Thanks, James. I am looking into the permission issue and will update
> >> the thread. I will also make the changes as per your recommendation.
> >>
> >> On Fri, Oct 28, 2016 at 10:23 AM, James Wing <jvw...@gmail.com> wrote:
> >>>
> >>> From the screenshot and the error message, I interpret the sequence of
> >>> events to be something like this:
> >>>
> >>> 1.) ListS3 succeeds and generates flowfiles with attributes referencing
> >>> S3 objects, but no content (0 bytes)
> >>> 2.) FetchS3Object fails to pull the S3 object content with an Access
> >>> Denied error, but the failed flowfiles are routed on to PutS3Object
> >>> (35,179 files / 0 bytes in the "putconnector" queue)
> >>> 3.) PutS3Object is succeeding, writing the 0-byte content from ListS3
> >>>
> >>> I recommend a couple of things for FetchS3Object:
> >>>
> >>> * Only allow the "success" relationship to continue to PutS3Object.
> >>> Route the "failure" relationship separately, either looping back to
> >>> FetchS3Object or going to a LogAttribute processor or another handling
> >>> path.
> >>> * It looks like the permissions aren't working; you might want to
> >>> double-check the access keys or try a sample file with the AWS CLI.
> >>>
> >>> Thanks,
> >>>
> >>> James
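A concrete form of James's AWS CLI suggestion: try copying one of the listed
objects with the same access keys NiFi is configured with. The bucket, key,
and profile names below are placeholders:

    aws s3 cp s3://source-bucket/path/to/sample.gz /tmp/sample.gz --profile nifi-keys

If this also fails with AccessDenied, the problem is in the credentials or
bucket policy rather than in the flow itself.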
> >>>
> >>> On Fri, Oct 28, 2016 at 10:01 AM, Gop Krr <gop....@gmail.com> wrote:
> >>>>
> >>>> This is what my NiFi flow looks like.
> >>>>
> >>>> On Fri, Oct 28, 2016 at 9:57 AM, Gop Krr <gop....@gmail.com> wrote:
> >>>>>
> >>>>> Thanks Bryan, Joe, Adam and Pierre. I got past this issue by
> >>>>> switching to 0.7.1. Now it is able to list the files from the bucket
> >>>>> and create those files in the other bucket, but the write is not
> >>>>> happening and I am getting a permission error (attached below for
> >>>>> reference). Could this be the bucket settings, or does it have more
> >>>>> to do with the access key? All the files created in the new bucket
> >>>>> are 0 bytes.
> >>>>> Thanks
> >>>>> Rai
> >>>>>
> >>>>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
> >>>>> o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=xxxxx]
> >>>>> Failed to retrieve S3 Object for
> >>>>> StandardFlowFileRecord[uuid=yyyyy,claim=,offset=0,name=xxxxx.gz,size=0];
> >>>>> routing to failure: com.amazonaws.services.s3.model.AmazonS3Exception:
> >>>>> Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
> >>>>> AccessDenied; Request ID: xxxxxxx), S3 Extended Request ID:
> >>>>> lu8tAqRxu+ouinnVvJleHkUUyK6J6rIQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=
> >>>>>
> >>>>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
> >>>>> o.a.nifi.processors.aws.s3.FetchS3Object
> >>>>>
> >>>>> com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied
> >>>>> (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;
> >>>>> Request ID: 0F34E71C0697B1D8)
> >>>>> at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1219) ~[aws-java-sdk-core-1.10.32.jar:na]
> >>>>> at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:803) ~[aws-java-sdk-core-1.10.32.jar:na]
> >>>>> at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505) ~[aws-java-sdk-core-1.10.32.jar:na]
> >>>>> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317) ~[aws-java-sdk-core-1.10.32.jar:na]
> >>>>> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595) ~[aws-java-sdk-s3-1.10.32.jar:na]
> >>>>> at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1116) ~[aws-java-sdk-s3-1.10.32.jar:na]
> >>>>> at org.apache.nifi.processors.aws.s3.FetchS3Object.onTrigger(FetchS3Object.java:106) ~[nifi-aws-processors-0.7.1.jar:0.7.1]
> >>>>> at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.1.jar:0.7.1]
> >>>>> at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.1.jar:0.7.1]
> >>>>> at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.1.jar:0.7.1]
> >>>>> at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.1.jar:0.7.1]
> >>>>> at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.1.jar:0.7.1]
> >>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
> >>>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
> >>>>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
> >>>>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
> >>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
> >>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
> >>>>> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
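The trace above fails inside AmazonS3Client.getObject, the same call
FetchS3Object makes at FetchS3Object.java:106. A minimal standalone sketch of
that call can separate a bad-key problem from a bucket-policy problem; the
credentials, bucket, and key below are placeholders:

    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.S3Object;

    public class S3ReadCheck {
        public static void main(String[] args) throws Exception {
            // Use the same credentials FetchS3Object is configured with (placeholders)
            AmazonS3Client s3 = new AmazonS3Client(
                    new BasicAWSCredentials("ACCESS_KEY_ID", "SECRET_ACCESS_KEY"));
            // Throws AmazonS3Exception (403 AccessDenied) if s3:GetObject is
            // denied on the source object, mirroring the failure in the trace
            S3Object object = s3.getObject("source-bucket", "path/to/sample.gz");
            System.out.println("Read OK: "
                    + object.getObjectMetadata().getContentLength() + " bytes");
            object.close();
        }
    }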
> >>>>>
> >>>>> On Fri, Oct 28, 2016 at 6:31 AM, Pierre Villard
> >>>>> <pierre.villard...@gmail.com> wrote:
> >>>>>>
> >>>>>> Quick remark: the fix has also been merged into master and will be
> >>>>>> in release 1.1.0.
> >>>>>>
> >>>>>> Pierre
> >>>>>>
> >>>>>> 2016-10-28 15:22 GMT+02:00 Gop Krr <gop....@gmail.com>:
> >>>>>>>
> >>>>>>> Thanks, Adam. I will try 0.7.1 and update the community on the
> >>>>>>> outcome. If it works, then I can create a patch for 1.x.
> >>>>>>> Thanks
> >>>>>>> Rai
> >>>>>>>
> >>>>>>> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar <adamond...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hey All,
> >>>>>>>>
> >>>>>>>> I believe OP is running into a bug fixed here:
> >>>>>>>> https://issues.apache.org/jira/browse/NIFI-2631
> >>>>>>>>
> >>>>>>>> Basically, ListS3 attempts to commit all the files it finds
> >>>>>>>> (potentially 100k+) at once, rather than in batches. NIFI-2631
> >>>>>>>> addresses the issue. Looks like the fix is out in 0.7.1 but not
> >>>>>>>> yet in a 1.x release.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Adam
> >>>>>>>>
> >>>>>>>> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt <joe.w...@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>> > Looking at this line [1] makes me think the FetchS3 processor is
> >>>>>>>> > properly streaming the bytes directly to the content repository.
> >>>>>>>> >
> >>>>>>>> > Looking at the screenshot showing nothing out of the ListS3
> >>>>>>>> > processor makes me think the bucket has so many things in it
> >>>>>>>> > that the processor or associated library isn't handling it well
> >>>>>>>> > and is just listing everything with no mechanism for a max
> >>>>>>>> > buffer size. Krish, please try with the largest heap you can
> >>>>>>>> > and let us know what you see.
> >>>>>>>> >
> >>>>>>>> > [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
> >>>>>>>> >
> >>>>>>>> > On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt <joe.w...@gmail.com>
> >>>>>>>> > wrote:
> >>>>>>>> >> moving dev to bcc
> >>>>>>>> >>
> >>>>>>>> >> Yes, I believe the issue here is that FetchS3 doesn't do
> >>>>>>>> >> chunked transfers and so is loading everything into memory.
> >>>>>>>> >> I've not verified this in the code yet, but it seems quite
> >>>>>>>> >> likely. Krish, if you can verify that going with a larger heap
> >>>>>>>> >> gets you in the game, can you please file a JIRA?
> >>>>>>>> >>
> >>>>>>>> >> Thanks
> >>>>>>>> >> Joe
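To make Adam's description of NIFI-2631 concrete, here is a rough sketch of
per-batch committing inside a ListS3-style onTrigger. It assumes listing,
session, and REL_SUCCESS are in scope as in the 0.7.x processor; the batch
size is an arbitrary illustration, not the value from the actual fix:

    // Sketch only, not the actual NIFI-2631 patch: transfer and commit listed
    // keys in batches instead of holding the entire listing in one session.
    final int BATCH_SIZE = 1000; // assumed value, for illustration
    int pending = 0;
    for (final S3ObjectSummary summary : listing.getObjectSummaries()) {
        FlowFile flowFile = session.create();
        flowFile = session.putAttribute(flowFile, "filename", summary.getKey());
        session.transfer(flowFile, REL_SUCCESS);
        if (++pending >= BATCH_SIZE) {
            // Release this batch downstream instead of buffering 100k+ flowfiles
            session.commit();
            pending = 0;
        }
    }
    if (pending > 0) {
        session.commit();
    }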
> >>>>>>>> >>
> >>>>>>>> >> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bbe...@gmail.com>
> >>>>>>>> >> wrote:
> >>>>>>>> >>> Hello,
> >>>>>>>> >>>
> >>>>>>>> >>> Are you running with all of the default settings?
> >>>>>>>> >>>
> >>>>>>>> >>> If so, you would probably want to try increasing the memory
> >>>>>>>> >>> settings in conf/bootstrap.conf.
> >>>>>>>> >>>
> >>>>>>>> >>> They default to 512mb; you may want to try bumping it up to
> >>>>>>>> >>> 1024mb.
> >>>>>>>> >>>
> >>>>>>>> >>> -Bryan
> >>>>>>>> >>>
> >>>>>>>> >>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <gop....@gmail.com>
> >>>>>>>> >>> wrote:
> >>>>>>>> >>>>
> >>>>>>>> >>>> Hi All,
> >>>>>>>> >>>>
> >>>>>>>> >>>> I have a very simple data flow, where I need to move S3 data
> >>>>>>>> >>>> from one bucket in one account to another bucket under
> >>>>>>>> >>>> another account. I have attached my processor configuration.
> >>>>>>>> >>>>
> >>>>>>>> >>>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
> >>>>>>>> >>>> org.apache.nifi.NiFi An Unknown Error Occurred in Thread
> >>>>>>>> >>>> Thread[Flow Service Tasks Thread-2,5,main]:
> >>>>>>>> >>>> java.lang.OutOfMemoryError: Java heap space
> >>>>>>>> >>>>
> >>>>>>>> >>>> I am very new to NiFi and trying to get a few of the use
> >>>>>>>> >>>> cases going. I need help from the community.
> >>>>>>>> >>>>
> >>>>>>>> >>>> Thanks again
> >>>>>>>> >>>>
> >>>>>>>> >>>> Rai
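For reference on Bryan's suggestion: the heap settings live in
conf/bootstrap.conf. The defaults look like the following (the java.arg
numbering can vary between NiFi versions):

    # JVM memory settings (defaults)
    java.arg.2=-Xms512m
    java.arg.3=-Xmx512m

    # Bumped, e.g. for a very large ListS3 listing
    java.arg.2=-Xms1024m
    java.arg.3=-Xmx1024m

NiFi has to be restarted for bootstrap.conf changes to take effect.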