Ultimately I gave up on the non-printing character because it breaks. So now I'm trying to use tabs instead. I've gone through my source file and ensured that it contains no tabs, and then I convert my multi-character delimiter to tabs.
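For illustration, the tab conversion described above can be sketched in plain Java, outside of NiFi. This is only a minimal sketch: the class name `DelimiterToTab` and the delimiter `|~|` are hypothetical, since the real delimiter is not given in this thread.

```java
final class DelimiterToTab {
    // Replace every occurrence of a multi-character delimiter with a tab.
    // Rejects lines that already contain a tab, because the converted
    // output would then be ambiguous.
    static String convert(String line, String delimiter) {
        if (line.indexOf('\t') >= 0) {
            throw new IllegalArgumentException("line already contains a tab");
        }
        // String.replace does literal (non-regex) replacement, so delimiter
        // characters such as '|' need no escaping.
        return line.replace(delimiter, "\t");
    }

    public static void main(String[] args) {
        // "|~|" is a hypothetical delimiter chosen only for this example.
        // The unterminated double quote is passed through untouched.
        System.out.println(convert("id|~|say \"hi|~|42", "|~|"));
    }
}
```

Note that this step does nothing about quote characters in the data; it only swaps the delimiter.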
For the CSV reader, how do you set the quote and escape characters to null, or otherwise tell it not to use them? I've already set the CSV Format to Tab-Delimited, but both Apache Commons and Jackson are choking on unterminated double quotes in the data. I have no way to guarantee that any given valid UTF-8 character won't be in the file, so I'm not sure what to set those two properties to so that they aren't used.

Thanks
Shawn

From: Shawn Weeks <swe...@weeksconsulting.us>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Thursday, November 7, 2019 at 7:41 AM
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: How to replace multi character delimiter with ASCII 001

So that worked, but now I can't figure out how to set the Value Separator in the CSVReader service, as it doesn't accept expressions in 1.9.2. I've tried setting it to "\u0001" or "\001" with no change.

Thanks

From: Andy LoPresto <alopre...@apache.org>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Wednesday, November 6, 2019 at 5:40 PM
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: How to replace multi character delimiter with ASCII 001

I haven't tried this, but you might be able to use ${"AQ==":base64Decode()}, as AQ== is the Base64 encoding of \u0001?

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69

On Nov 6, 2019, at 12:25 PM, Shawn Weeks <swe...@weeksconsulting.us> wrote:

I'm trying to process a delimited file with a multi-character delimiter, which is not supported by the CSV Record Reader. To get around that, I'm trying to replace the delimiter with ASCII 001, the same delimiter used by Hive and one I know isn't in the data. Here is my current configuration, but NiFi isn't interpreting \u0001. I've also tried \001 and ${literal('\u0001')}, none of which worked. What is the correct way to do this?

Thanks
Shawn Weeks

<Outlook-asqiibgt.png>
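As a sanity check on two points raised in this thread, here is a minimal Java sketch using only the standard library. It is not NiFi's CSVReader and not Apache Commons CSV; the class name `RawDelimitedSplit` and its methods are hypothetical. It shows that the Base64 string AQ== decodes to the single character U+0001, and that a split which treats the delimiter literally and has no notion of quote or escape characters leaves an unterminated double quote as ordinary data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;
import java.util.regex.Pattern;

final class RawDelimitedSplit {
    // Decode a Base64 string into the delimiter text it represents.
    static String decodeDelimiter(String base64) {
        byte[] bytes = Base64.getDecoder().decode(base64);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Split a line on a literal delimiter with no quote or escape handling.
    // Pattern.quote makes the delimiter literal; the limit of -1 keeps
    // trailing empty fields instead of dropping them.
    static List<String> split(String line, String delimiter) {
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        String delimiter = decodeDelimiter("AQ==");
        // Prints the code point of the decoded delimiter.
        System.out.println((int) delimiter.charAt(0));
        // The second field starts with a double quote that is never closed.
        String line = "a" + delimiter + "\"b" + delimiter + "c";
        System.out.println(split(line, delimiter));
    }
}
```

Whether the record readers themselves can be configured to behave this way is the open question in the message at the top of this thread; the sketch only illustrates the behavior being asked for.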