By "relaunch option" I'm assuming you mean "launch -originalAppId ...". Looks like Jim does not want to use that option. He wants a new launch to automatically detect data from an earlier launch and, if present, use it.
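One way to get that behaviour is to keep the recovery data in a directory that does not depend on the application id, along the lines of the setup() override suggested earlier in this thread. Below is a minimal, self-contained sketch of just the path decision. The class name, the `fixedBase` parameter and the example directories are all hypothetical, and nothing here is Apex or Malhar API; in a real subclass the result would feed whatever setup() uses as its appPath.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Sketch only: decides where recovery data should live.
 * Every name in this class is hypothetical; it does not call Apex or Malhar.
 */
final class RecoveryPathResolver {

    private RecoveryPathResolver() {}

    /**
     * @param appPath      per-launch application directory (differs for every new app id)
     * @param recoveryPath relative name of the recovery directory
     * @param fixedBase    optional directory that stays the same across launches;
     *                     null or empty means "fall back to appPath"
     */
    static Path resolve(String appPath, String recoveryPath, String fixedBase) {
        String base = (fixedBase == null || fixedBase.isEmpty()) ? appPath : fixedBase;
        return Paths.get(base, recoveryPath);
    }

    public static void main(String[] args) {
        // Hypothetical directories, for illustration only.
        String firstLaunch = "/apps/application_1453741656046_0520";
        String secondLaunch = "/apps/application_1453741656046_0521";

        // Without a fixed base, each launch looks in its own directory.
        System.out.println(resolve(firstLaunch, "recovery", null));
        System.out.println(resolve(secondLaunch, "recovery", null));

        // With a fixed base, both launches resolve to the same directory,
        // so the second launch can find what the first one wrote.
        System.out.println(resolve(firstLaunch, "recovery", "/shared/kinesis-reader"));
        System.out.println(resolve(secondLaunch, "recovery", "/shared/kinesis-reader"));
    }
}
```

The trade-off is that two applications configured with the same fixed base would share recovery state, so the base would need to be unique per logical application.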
Ram

On Tue, Feb 16, 2016 at 8:29 PM, Thomas Weise <thomas.we...@gmail.com> wrote:

> Ram,
>
> The recovery path, when under the application directory, will be
> automatically copied to the new app directory when the relaunch option is
> used. This is how the previous instance's data is available to the new app.
>
> Thomas
>
> On Tue, Feb 16, 2016 at 5:23 PM, Munagala Ramanath <r...@datatorrent.com>
> wrote:
>
>> Ah, I understand now.
>>
>> The path is set in
>> IdempotentStorageManager.FSIdempotentStorageManager.setup() near line 146:
>>
>>   appPath = new Path(context.getValue(DAG.APPLICATION_PATH) +
>>       Path.SEPARATOR + recoveryPath);
>>
>> You can try creating a new class that extends FSIdempotentStorageManager
>> and overrides setup() to use a local property for the appPath, and simply
>> duplicate the rest of the code.
>>
>> Ram
>>
>> On Tue, Feb 16, 2016 at 3:59 PM, Jim <jim@facility.supplies> wrote:
>>
>>> Ram,
>>>
>>> I am not 100% fluent in the details of the base Kinesis operator and how
>>> it interacts with Hadoop (hence my posting); if it would support that,
>>> then yes, you could.
>>>
>>> My goal is to make it so one can easily pick up where they left off
>>> reading the Kinesis stream, regardless of whether you kill the
>>> application and re-launch it, etc., without needing to go out to the cli
>>> to run some commands (because at some point some operator will forget
>>> and then we will reprocess a bunch of transactions; that would not be
>>> good!).
>>>
>>> Jim
>>>
>>> *From:* Munagala Ramanath [mailto:r...@datatorrent.com]
>>> *Sent:* Tuesday, February 16, 2016 5:21 PM
>>> *To:* users@apex.incubator.apache.org
>>> *Subject:* Re: Kinesis Operator Help
>>>
>>> Why use the application id? Could you generate and use a java.util.UUID,
>>> for example, and save it in HDFS?
>>>
>>> Ram
>>>
>>> On Tue, Feb 16, 2016 at 11:40 AM, Jim <jim@facility.supplies> wrote:
>>>
>>> Good morning,
>>>
>>> I am new to Apex, Hadoop and Yarn (nothing like tackling something new,
>>> is there?).
>>>
>>> I have my first Apex apps working that are edi processors that read new
>>> edi transactions from an Amazon Kinesis stream, look at the data, and
>>> route the edi data to an appropriate handler for processing (note the
>>> operatorEs pushes the data to ElasticSearch for logging). Here is a
>>> diagram:
>>>
>>> Everything launches, and is working fine with the above diagram from the
>>> edi router through the transaction operators.
>>>
>>> The final challenge I am having, being new to all of this, is that the
>>> Kinesis operator, by default, stores its app id in
>>> IdempotentStorageManager (aka WindowDataManager) when it is launched, so
>>> if the app is shut down and restarted this same app id is used by
>>> default with the checkpoint so you don’t reprocess the same records
>>> again when the application is restarted.
>>>
>>> You can see this id immediately to the right of the Operations / apps in
>>> gray lettering ‘application_1453741656046_0520’ in the image from the
>>> datatorrent console below:
>>>
>>> [image: cid:image004.png@01D168BA.5FE56550]
>>>
>>> However, if you kill the application and re-launch, this id changes, and
>>> it starts reading from the Kinesis stream back from the beginning; the
>>> only way to restart it so it starts where it left off is using the cli
>>> as follows:
>>>
>>> 1.) Run ‘dtcli’ from the command line.
>>>
>>> 2.) Run ‘launch -originalAppId “application_1453741656046_0520”
>>>     <path to .apa file>’
>>>
>>> This will launch the application using the same app id identified in the
>>> console screen above.
>>>
>>> I want to make this easier, but need some expert help in tweaking this
>>> so it works.
>>>
>>> I am thinking that there should be a way with Kinesis to:
>>>
>>> 1.) Define in the properties a Kinesis app id string value.
>>>
>>> 2.) If this value is defined, it will use that, when launching the
>>>     application, to check if a Hadoop app id has already been assigned
>>>     to that identifier.
>>>
>>> 3.) If that value is not yet stored in the database, it will launch the
>>>     app, creating a new app id, and store the app id under the
>>>     identifier key value.
>>>
>>> 4.) Now if I kill the app, or install new software, it will always pick
>>>     up where it left off by using the identifier key value to retrieve
>>>     and assign the app id.
>>>
>>> Sounds simple, right? :)
>>>
>>> Can one of the experts out there help me figure this out, as I don’t
>>> want to reprocess already processed edi transactions?
>>>
>>> Thanks,
>>>
>>> Jim
>>>
>>> jim@facility.supplies (414) 760-7711
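
The four steps Jim lists in his original message can be sketched as a small lookup keyed by a stable, user-chosen identifier. This is only an illustration under stated assumptions: `AppIdRegistry`, the key `edi-kinesis-reader`, and the use of a local properties file (standing in for HDFS or a database) are all hypothetical, and the sketch does not launch anything or call Apex; it only shows the check-then-store logic.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.Properties;

/**
 * Sketch only: remembers which application id was last used for a stable key.
 * A local properties file stands in for HDFS or a database.
 */
final class AppIdRegistry {

    private final Path file;

    AppIdRegistry(Path file) {
        this.file = file;
    }

    /** Step 2: check whether an app id has already been recorded for this key. */
    Optional<String> lookup(String key) throws IOException {
        return Optional.ofNullable(load().getProperty(key));
    }

    /** Step 3: record the app id under the key, replacing any earlier value. */
    void store(String key, String appId) throws IOException {
        Properties props = load();
        props.setProperty(key, appId);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "stable key -> last application id");
        }
    }

    private Properties load() throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("app-id-registry", ".properties");
        AppIdRegistry registry = new AppIdRegistry(file);
        String key = "edi-kinesis-reader"; // step 1: hypothetical stable identifier

        // First launch: nothing recorded yet, so a fresh launch would happen
        // and the id it was given would be stored afterwards.
        Optional<String> previous = registry.lookup(key);
        System.out.println(previous.isPresent() ? "reuse " + previous.get() : "fresh launch");
        registry.store(key, "application_1453741656046_0520");

        // Step 4: a later launch finds the recorded id and can resume from it.
        System.out.println("reuse " + registry.lookup(key).orElse("nothing"));
    }
}
```

After each successful relaunch the stored value would need to be updated to the newest app id, since that is the one holding the latest recovery data.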