What is the error you are getting from TransformXML?

On Tue, Jun 14, 2016 at 9:38 AM, Anuj Handa <anujha...@gmail.com> wrote:
> Does anybody have any thoughts on UTF-8 flow files with XML transformation
> and other processors?
>
> Anuj
>
> On Mon, Jun 13, 2016 at 4:45 PM, Anuj Handa <anujha...@gmail.com> wrote:
>
>> So it seems like it's a UTF-8 issue. When I changed the string to use Hex
>> instead of Text, and used the hex code with 00 (2 bytes per character),
>> SplitContent worked.
>>
>> "<POSTransaction xmlns" is the string I was looking to split on, which
>> translates into the following hex code:
>>
>> *3c0050004f0053005400720061006e00730061006300740069006f006e00200078006d006c006e007300*
>>
>> TransformXML is now failing, I think because of the UTF-8. I know I
>> had it working with a normal ASCII file.
>>
>> Do I need to specify someplace that the flow files are UTF-8, or is it
>> smart enough to figure it out on its own?
>> Based on some reading, I see that some processors expect UTF-8, so the next
>> question would be: do all processors support UTF-8?
>>
>> Anuj
>>
>> On Mon, Jun 13, 2016 at 3:01 PM, Anuj Handa <anujha...@gmail.com> wrote:
>>
>>> Thanks Joe. Unfortunately, since my XML has namespaces (xmlns), that
>>> approach won't work.
>>> Any thoughts on why the split doesn't work using the tag? Does it accept
>>> UTF-8 flow files?
>>>
>>> Anuj
>>>
>>> On Mon, Jun 13, 2016 at 2:50 PM, ski n <raymondmees...@gmail.com> wrote:
>>>
>>>> You can also make your input XML well-formed by creating a custom root
>>>> element (e.g. <PostTransactions>...xmldocuments</PostTransactions>)
>>>> and then use the SplitXml processor (or just the transformation step).
>>>>
>>>> 2016-06-13 18:04 GMT+02:00 Anuj Handa <anujha...@gmail.com>:
>>>>
>>>>> I have a text file which has multiple XML documents, each of which
>>>>> starts with
>>>>> <POSTransaction xmlns
>>>>> I am trying to break each one of the XML docs into 1 flow file so I
>>>>> can then use evaluate XML, then convert into JSON, and then load into a
>>>>> database.
>>>>>
>>>>> I tried just SplitContent and that didn't work.
>>>>> The file is UTF-8;
>>>>> not sure if that plays into it. I am running NiFi on Linux and the
>>>>> file is also local on Linux.
>>>>>
>>>>> [image: Inline image 1]
>>>>>
>>>>> This is my entire workflow.
>>>>>
>>>>> [image: Inline image 2]
>>>>>
>>>>> On Mon, Jun 13, 2016 at 11:43 AM, Joe Percivall <joeperciv...@yahoo.com> wrote:
>>>>>
>>>>>> Awesome, and what processor were you planning to use to split on
>>>>>> "#|#|#"? The SplitContent processor[1] can be used to split the content
>>>>>> on a sequence of text characters, so it could split on
>>>>>> "<POSTransaction xmlns" without needing to add "#|#|#".
>>>>>>
>>>>>> Also, I see "xmlns" and think this is an XML file you are trying to
>>>>>> split. If so, are you by chance trying to split evenly on each child?
>>>>>> If so, the SplitXml processor[2] would easily take care of that.
>>>>>>
>>>>>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html
>>>>>> [2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitXml/index.html
>>>>>>
>>>>>> Joe
>>>>>> - - - - - -
>>>>>> Joseph Percivall
>>>>>> linkedin.com/in/Percivall
>>>>>> e: joeperciv...@yahoo.com
>>>>>>
>>>>>> On Monday, June 13, 2016 11:26 AM, Anuj Handa <anujha...@gmail.com> wrote:
>>>>>> Yes, that's exactly correct.
>>>>>>
>>>>>> > On Jun 13, 2016, at 11:14 AM, Joe Percivall <joeperciv...@yahoo.com> wrote:
>>>>>> >
>>>>>> > Sorry, I got a bit confused. In your original question you said that
>>>>>> > you wanted to append the value, and I took it that you just wanted to
>>>>>> > append the value to the end of the line or text.
>>>>>> >
>>>>>> > Let me try and restate your goal so I'm sure I understand:
>>>>>> > ultimately you want to split the incoming FlowFile on each occurrence of
>>>>>> > "<POSTransaction xmlns", and you are planning on using ReplaceText to add
>>>>>> > "#|#|#" before each occurrence so that it will be easy to split?
>>>>>> >
>>>>>> > Joe
>>>>>> > - - - - - -
>>>>>> > Joseph Percivall
>>>>>> > linkedin.com/in/Percivall
>>>>>> > e: joeperciv...@yahoo.com
>>>>>> >
>>>>>> > On Monday, June 13, 2016 11:05 AM, Anuj Handa <anujha...@gmail.com> wrote:
>>>>>> >
>>>>>> > Anuj
>>>>>> > Hi Joe,
>>>>>> >
>>>>>> > I modified the process per your suggestion, but it only replaces
>>>>>> > the first occurrence. There are multiple such tags which it doesn't
>>>>>> > replace.
>>>>>> > When I used the Line-by-Line evaluation mode, it appended the value to
>>>>>> > every line in the file and not just to the one I wanted.
>>>>>> >
>>>>>> > On Mon, Jun 13, 2016 at 10:40 AM, Joe Percivall <joeperciv...@yahoo.com> wrote:
>>>>>> >
>>>>>> >> Hello,
>>>>>> >>
>>>>>> >> In order to use ReplaceText[1] to solely append a value to the end
>>>>>> >> of the entire text, change the "Replacement Strategy" to "Append" and
>>>>>> >> leave "Evaluation Mode" as "Entire Text". This will take whatever is the
>>>>>> >> "Replacement Value" and append it as a literal (without interpreting
>>>>>> >> back-references) to the end of the text.
>>>>>> >>
>>>>>> >> Alternatively, if you want to append to the end of each line, then
>>>>>> >> change "Evaluation Mode" to "Line-by-Line".
>>>>>> >>
>>>>>> >> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ReplaceText/index.html
>>>>>> >>
>>>>>> >> Hope that helps,
>>>>>> >> Joe
>>>>>> >> - - - - - -
>>>>>> >> Joseph Percivall
>>>>>> >> linkedin.com/in/Percivall
>>>>>> >> e: joeperciv...@yahoo.com
>>>>>> >>
>>>>>> >> On Monday, June 13, 2016 10:05 AM, Anuj Handa <anujha...@gmail.com> wrote:
>>>>>> >>
>>>>>> >> Hi,
>>>>>> >>
>>>>>> >> I am trying to read a file and then use ReplaceText to append a
>>>>>> >> string so I can split the line in the next step. I am unable to make
>>>>>> >> ReplaceText work.
>>>>>> >> The flow file goes through as success without the string being
>>>>>> >> appended or replaced.
>>>>>> >>
>>>>>> >> Any thoughts on what I could be doing wrong?
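
A side note on the hex string quoted in the thread: every character is followed by a 00 byte, which is what UTF-16LE looks like for ASCII text. UTF-8 would use one byte per character here. The Python sketch below is only a way to check that reading outside NiFi; it is not NiFi code. The helper names `marker_hex` and `split_on_marker` and the `urn:a` sample documents are made up for illustration. That the file is really UTF-16LE is an inference from the hex, not something confirmed in the thread.

```python
from typing import List

# The split marker and the hex string, both copied from the thread.
MARKER = "<POSTransaction xmlns"
HEX_FROM_THREAD = (
    "3c0050004f0053005400720061006e00730061006300740069006f006e00"
    "200078006d006c006e007300"
)


def marker_hex(encoding: str) -> str:
    """Return the marker's bytes in the given encoding, as lowercase hex."""
    return MARKER.encode(encoding).hex()


def split_on_marker(data: bytes, encoding: str) -> List[bytes]:
    """Split raw bytes so that each piece starts with the encoded marker.

    Hypothetical helper: it only mimics a byte-sequence split and is not
    how any NiFi processor is implemented.
    """
    needle = MARKER.encode(encoding)
    pieces = []
    start = data.find(needle)
    while start != -1:
        nxt = data.find(needle, start + len(needle))
        pieces.append(data[start:] if nxt == -1 else data[start:nxt])
        start = nxt
    return pieces


if __name__ == "__main__":
    # Compare the two candidate encodings against the hex from the thread.
    print("utf-8 matches:    ", marker_hex("utf-8") == HEX_FROM_THREAD)
    print("utf-16-le matches:", marker_hex("utf-16-le") == HEX_FROM_THREAD)

    # Made-up sample: two tiny documents stored as UTF-16LE bytes.
    sample = (
        '<POSTransaction xmlns="urn:a">1</POSTransaction>'
        '<POSTransaction xmlns="urn:a">2</POSTransaction>'
    ).encode("utf-16-le")
    print("pieces with utf-8 marker:    ", len(split_on_marker(sample, "utf-8")))
    print("pieces with utf-16-le marker:", len(split_on_marker(sample, "utf-16-le")))
```

If the file does turn out to be UTF-16, that would be consistent with a text-mode split finding nothing while the 2-byte hex pattern works. Whether it also explains the TransformXML failure would still need the actual error message.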