Converting long string to JSON format.

2019-11-06 Thread Rivasa
Hi, so I'm having a bit of trouble where I can't really figure out how to
accomplish what I need, since I'm fairly new to NiFi in general.

So I have a string in the following format:
item1=value1|item2=value2|item3=value3|item4=value4... so on and so forth. 

What I would like to do is convert this into a valid JSON object of the
format of:
{
"item1":"value1", 
"item2":"value2",
"item3":"value3",
"item4":"value4"
...
etc
}

I think I need to use ExtractText or something similar, maybe? But I'm not
exactly sure what to do. Could someone point me in the right direction?



--
Sent from: http://apache-nifi-users-list.2361937.n4.nabble.com/


Re: invokehttp timeout

2019-11-06 Thread Michael Di Domenico
On Mon, Nov 4, 2019 at 10:28 AM Vijay Chhipa  wrote:
>
> Just to rule out timeouts set along the network hops, can you use cURL or 
> Postman to ensure that you can make this call successfully from the host 
> that's running NiFi?

Yes, using curl seems to work fine.


> > On Oct 30, 2019, at 1:52 PM, Michael Di Domenico  
> > wrote:
> >
> > I'm trying to use InvokeHTTP to post a file to a webserver. For
> > smaller files this seems to work just fine, but when I try to send a
> > larger file, I get snagged with this:
> >
> > 2019-10-30 14:47:59,188 ERROR [Timer-Driven Process Thread-10]
> > o.a.nifi.processors.standard.InvokeHTTP
> > InvokeHTTP[id=a2046302-016a-1000-940e-07f6f5992610] Routing to Failure
> > due to exception:
> > org.apache.nifi.processor.exception.ProcessException: IOException
> > thrown from InvokeHTTP[id=a2046302-016a-1000-940e-07f6f5992610]:
> > java.net.SocketTimeoutException: timeout:
> > org.apache.nifi.processor.exception.ProcessException: IOException
> > thrown from InvokeHTTP[id=a2046302-016a-1000-940e-07f6f5992610]:
> > java.net.SocketTimeoutException: timeout
> > org.apache.nifi.processor.exception.ProcessException: IOException
> > thrown from InvokeHTTP[id=a2046302-016a-1000-940e-07f6f5992610]:
> > java.net.SocketTimeoutException: timeout
> >
> > I've lengthened the connect and read timeouts in the properties to 1 min
> > and 1 hr respectively, but it doesn't seem to help.
> >
> > Is there a setting I'm missing? Also, semi-related: I need to ensure I
> > get a response from the webserver. However, the web server might take
> > up to 30 minutes to respond. Will InvokeHTTP hold the connection open,
> > and if not, is there a timeout setting somewhere?
>
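For what it's worth, the difference between the two timeouts can be illustrated with plain Java's HttpURLConnection (a minimal sketch, not NiFi code; the URL is hypothetical, and InvokeHTTP's own properties are configured in the processor rather than like this):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutSketch {
    public static void main(String[] args) throws Exception {
        // openConnection() does not contact the server yet.
        HttpURLConnection conn =
                (HttpURLConnection) new URL("http://example.com/upload").openConnection();
        // Connect timeout: how long to wait for the TCP connection to be established.
        conn.setConnectTimeout(60_000);        // 1 minute
        // Read timeout: how long to wait for the server to send (more) data;
        // this is the setting that matters for a slow-responding server.
        conn.setReadTimeout(30 * 60 * 1000);   // 30 minutes
        System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
        // prints "60000 1800000"
    }
}
```

The point being: a long read timeout is what keeps the connection open while the server thinks, independently of how quickly the connection itself was established.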


Re: Converting long string to JSON format.

2019-11-06 Thread Chandrashekhar Kotekar
I would prefer to write a custom processor which first converts those key/value 
pairs into a Properties object, then maps that object to a POJO and uses 
json4s to convert the POJO to JSON. 

Sent from my iPhone
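Not NiFi-specific, but the parsing step being described might look roughly like this in plain Java (a minimal sketch; the class and method names are made up, and the JSON rendering does no escaping, so real code should use a JSON library):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class PairsToJson {
    // Parse "item1=value1|item2=value2|..." into an ordered map.
    static Map<String, String> parsePairs(String input) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String pair : input.split("\\|")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) map.put(kv[0], kv[1]);
        }
        return map;
    }

    // Naive JSON rendering: does not escape quotes or backslashes in values.
    static String toJson(Map<String, String> map) {
        return map.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        System.out.println(toJson(parsePairs("item1=value1|item2=value2")));
        // prints {"item1":"value1","item2":"value2"}
    }
}
```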



Re: Converting long string to JSON format.

2019-11-06 Thread Andy LoPresto
I think you could accomplish this using ConvertRecord. For the Record Reader, 
use a CSVReader with the delimiter character set to |, and for the Record 
Writer, use a JsonRecordSetWriter. You may have to use a 
ScriptedRecordSetWriter to parse the key/value pair tokens out individually. 


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69




Re: Converting long string to JSON format.

2019-11-06 Thread Rivasa
That is a good idea, thanks. Is there any documentation on making a script
that will work for this processor? (i.e., are there any requirements for the
script itself?)





How to replace multi character delimiter with ASCII 001

2019-11-06 Thread Shawn Weeks
I'm trying to process a delimited file with a multi-character delimiter, which 
is not supported by the CSV Record Reader. To get around that, I'm trying to 
replace the delimiter with ASCII 001, the same delimiter used by Hive and one I 
know isn't in the data. Here is my current configuration, but NiFi isn't 
interpreting \u0001. I've also tried \001 and ${literal('\u0001')}, none of 
which worked. What is the correct way to do this?

Thanks
Shawn Weeks

[inline screenshot of the processor configuration]


Re: How to replace multi character delimiter with ASCII 001

2019-11-06 Thread Andy LoPresto
I haven’t tried this, but you might be able to use ${"AQ==":base64Decode()}, as 
AQ== is the Base64-encoded \u0001?
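That decoding is easy to sanity-check in plain Java: AQ== does decode to the single byte 0x01 (ASCII SOH, i.e. Ctrl+A):

```java
import java.util.Base64;

public class DecodeCheck {
    public static void main(String[] args) {
        byte[] decoded = Base64.getDecoder().decode("AQ==");
        // One byte, value 1 (ASCII SOH, Ctrl+A).
        System.out.println(decoded.length + " " + decoded[0]);  // prints "1 1"
    }
}
```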

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69




RE: [EXT] Re: How to replace multi character delimiter with ASCII 001

2019-11-06 Thread Peter Wicks (pwicks)
Shawn,

We had the same issue, and we use special and multi-character delimiters here. I 
have not been able to find a CSV library that supports multi-character 
delimiters; otherwise, I would have updated the CSV Record Reader to support it. 
Instead, I created a special Record Reader that supports multi-character 
delimiters. We use this in ConvertRecord to convert to a different format as 
soon as possible 😊. I don't know if you're up for using custom code… but just 
in case you are, here is my personal implementation that we use in house.

Thanks,
  Peter

--- Class 1 ---

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.commons.text.StringEscapeUtils;  // or org.apache.commons.lang3.StringEscapeUtils
import org.apache.nifi.annotation.lifecycle.OnEnabled;
import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.controller.AbstractControllerService;
import org.apache.nifi.controller.ConfigurationContext;
import org.apache.nifi.logging.ComponentLog;
import org.apache.nifi.processor.util.StandardValidators;
import org.apache.nifi.schema.access.SchemaNotFoundException;
import org.apache.nifi.serialization.MalformedRecordException;
import org.apache.nifi.serialization.RecordReader;
import org.apache.nifi.serialization.RecordReaderFactory;

public class CSVReader extends AbstractControllerService implements RecordReaderFactory {

    static final PropertyDescriptor COLUMN_DELIMITER = new PropertyDescriptor.Builder()
            .name("pt-column-delimiter")
            .displayName("Column Delimiter")
            .description("The character(s) to use to separate columns of data. "
                    + "Special characters should use the '\\u' notation. "
                    + "If not specified, the Ctrl+A delimiter is used.")
            .required(false)
            .defaultValue("\\u0001")
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .build();

    static final PropertyDescriptor RECORD_DELIMITER = new PropertyDescriptor.Builder()
            .name("pt-record-delimiter")
            .displayName("Record Delimiter")
            .description("The character(s) to use to separate rows of data. "
                    + "For a line return, press 'Shift+Enter' in this field. "
                    + "Special characters should use the '\\u' notation.")
            .required(false)
            .defaultValue("\n")
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .build();

    static final PropertyDescriptor SKIP_HEADER_ROW = new PropertyDescriptor.Builder()
            .name("pt-skip-header")
            .displayName("Skip First Row")
            .description("Specifies whether or not the first row of data will be skipped.")
            .allowableValues("true", "false")
            .defaultValue("true")
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .build();

    private volatile String colDelimiter;
    private volatile String recDelimiter;
    private volatile boolean skipHeader;

    @OnEnabled
    public void storeCsvFormat(final ConfigurationContext context) {
        // Unescape so a configured "\\u0001" becomes the actual control character.
        this.colDelimiter = StringEscapeUtils.unescapeJava(context.getProperty(COLUMN_DELIMITER).getValue());
        this.recDelimiter = StringEscapeUtils.unescapeJava(context.getProperty(RECORD_DELIMITER).getValue());
        this.skipHeader = context.getProperty(SKIP_HEADER_ROW).asBoolean();
    }

    @Override
    protected List<PropertyDescriptor> getSupportedPropertyDescriptors() {
        List<PropertyDescriptor> propertyDescriptors = new ArrayList<>();
        propertyDescriptors.add(COLUMN_DELIMITER);
        propertyDescriptors.add(RECORD_DELIMITER);
        propertyDescriptors.add(SKIP_HEADER_ROW);
        return propertyDescriptors;
    }

    @Override
    public RecordReader createRecordReader(Map<String, String> variables, InputStream inputStream,
            ComponentLog componentLog) throws MalformedRecordException, IOException, SchemaNotFoundException {
        return new CSVRecordReader(inputStream, componentLog, this.skipHeader,
                this.colDelimiter, this.recDelimiter);
    }
}
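One detail worth noting about the reader class that follows: it splits rows with row.split(colDelimiter, -1), and the -1 limit matters, because split() without a limit silently drops trailing empty columns. A quick standalone illustration:

```java
public class SplitLimitDemo {
    public static void main(String[] args) {
        // A row with two trailing empty columns, delimited by \u0001 (Ctrl+A).
        String row = "a\u0001b\u0001\u0001";
        System.out.println(row.split("\u0001").length);     // prints 2: trailing empties dropped
        System.out.println(row.split("\u0001", -1).length); // prints 4: all columns kept
    }
}
```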


--- Class 2 ---

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

import org.apache.nifi.logging.ComponentLog;
import org.apache.nifi.serialization.MalformedRecordException;
import org.apache.nifi.serialization.RecordReader;
import org.apache.nifi.serialization.SimpleRecordSchema;
import org.apache.nifi.serialization.record.Record;
import org.apache.nifi.serialization.record.RecordField;
import org.apache.nifi.serialization.record.RecordFieldType;
import org.apache.nifi.serialization.record.RecordSchema;

public class CSVRecordReader implements RecordReader {
    // PeekableScanner is a small Scanner wrapper (not included in this message)
    // that supports peeking at the next token without consuming it.
    private final PeekableScanner s;
    private final RecordSchema schema;
    private final String colDelimiter;
    private final String recordDelimiter;

    public CSVRecordReader(final InputStream in, final ComponentLog logger, final boolean hasHeader,
            final String colDelimiter, final String recordDelimiter) throws IOException {
        this.recordDelimiter = recordDelimiter;
        this.colDelimiter = colDelimiter;

        s = new PeekableScanner(new Scanner(in, "UTF-8").useDelimiter(recordDelimiter));

        // Build a basic schema based on the column count of the first row.
        final String forRowCount = s.peek();
        final List<RecordField> fields = new ArrayList<>();

        if (forRowCount != null) {
            final String[] columns = forRowCount.split(colDelimiter, -1);
            for (int nColumnIndex = 0; nColumnIndex < columns.length; nColumnIndex++) {
                fields.add(new RecordField("Column_" + String.valueOf(nColumnIndex),
                        RecordFieldType.STRING.getDataType(), true));
            }
            schema = new SimpleRecordSchema(fields);
        } else {
            schema = null;
        }

        // Skip the header line, if there is one.
        if (hasHeader && s.hasNext()) s.next();
    }

    @Override
    public Record nextRecord(boolean b, boolean b1) throws IOException, MalformedRecordException {
        if (!s.hasNext()) return null;

        final String row = s.next();
        final List<RecordField> recordFields = getSchema().getFields();

        final Map<String, Object> values = new LinkedHashMap<>(recordFields.size() * 2);
        final String[] columns = row.split(colDelimiter, -1);

        for (int i = 0; i < columns.length; i++) {