Hello Charlie,

Sorry no one has gotten back to you yet; everyone is busy getting 0.4.0 finished up, and of course there's Thanksgiving. Have you made any more progress?

Since it is a continuous task, it is well within NiFi's wheelhouse. In your original message you mentioned that you already had them merged into a single flowfile but had trouble creating the path to do a PutFile. Have you tried using expression language [1] to create the path? Assuming you have attributes for the category and date, you should be able to write an expression that evaluates to the path you need.

If you need help with creating the proper expression, just reply with the attribute names for the category and dates and I'd be happy to help.

[1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html

Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: [email protected]

On Monday, November 23, 2015 11:37 AM, Charlie Frasure <[email protected]> wrote:

Joe,

This is a continuous task. The main intent is to keep a version of the file prior to conversions etc. Ideally, it would be highly compressed, and easy to locate. Best case scenario, the archive files are the contents of highly structured nested directories. File sizes range from a few bytes to < 1GB. It wouldn't have to run real time (updating archives seems to be a fairly intensive task), but would probably run at least every few days.

Thanks,
Charlie

On Mon, Nov 23, 2015 at 11:08 AM, Joe Witt <[email protected]> wrote:

Charlie,
>
>Can give some pointers on how to get in the ballpark with this but
>want to make sure we have a good alignment of purpose here. NiFi has
>from time to time come up as an intuitive way to build an archive
>management tool and it is always "not quite right" because of the
>subtle differences between continuous streams of information and
>ad-hoc sort of one-time tasks.
>
>Would this be a continuous task (always running) even if it is slow
>(every few minutes, hours, days) or would it be a one-time thing to
>move a bunch of data from one place to another?
>
>The difference sounds very minor but it will help me to understand how
>best to respond.
>
>Thanks
>Joe
>
>
>On Mon, Nov 23, 2015 at 10:54 AM, Charlie Frasure
><[email protected]> wrote:
>> Use case: Archive and compress files by category and month, store like files
>> in a common directory.
>>
>> I'm already processing the files, and have extracted the interesting
>> attributes from each. I ran them through MergeContent, but have not been
>> able to produce a logical directory structure to store the results. I would
>> prefer something like archive/categoryA/201511/somefilename.tar.gz where
>> somefilename is made up of all the categoryA files received in November
>> 2015.
>>
>> I switched gears, and used PutFile to store the files in the preferred
>> directory structure, but at a loss of how to archive them within their
>> folders given hundreds of dynamic categories, and date additions every
>> month.
>>
>> I'm playing with MergeContent's Correlation Attribute Name, but am also
>> considering trying the "Defragment" merge strategy by correlating the files
>> earlier in the process.
>>
>> Any suggestions would be appreciated.
>
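
For reference, a minimal sketch of the expression-language approach suggested in the reply at the top of this thread. The attribute names `category` and `receivedDate` are hypothetical (the thread never states the real ones), and `receivedDate` is assumed to hold a string such as 2015-11-23. The value would go in PutFile's Directory property and is meant to evaluate to something like archive/categoryA/201511:

```
# Hypothetical attribute names: category, receivedDate (a 'yyyy-MM-dd' string)
# PutFile -> Directory
archive/${category}/${receivedDate:toDate('yyyy-MM-dd'):format('yyyyMM')}
```

If no date attribute is available, `${now():format('yyyyMM')}` could stand in for the second part, at the cost of using the processing time rather than the time the file was received.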
