{"name":"flag_s","type":["int","boolean"]}

We have a lot of type massaging baked into the Record API. If the int
version is meant to be used as a boolean equivalent (0 = false,
anything else is truthy) then this is something that probably already
is or should be covered by that type massaging.

On Fri, Mar 18, 2022 at 5:50 AM <stephen.hindma...@bt.com> wrote:
>
> Mark,
>
>
>
> Thank you for your response. I thought that was probably the case, but I 
> tried a cast and it did not work. I got this error.
>
>
>
> Query:
>
> select *
>
> from flowfile
>
> where cast(flag_s as boolean) = true
>
>
>
> Error:
>
> org.apache.calcite.sql.validate.SqlValidatorException: Cast function cannot 
> convert value of type JavaType(class java.lang.Object) to type BOOLEAN
>
>
>
> By taking the union out of the input schema I could get the query to work, 
> but I did find myself getting tangled up in managing various schemas so I am 
> trying to use infer/inherit read/write services instead. I have inherited a 
> very complex flow from a team that have long departed and am looking to 
> simplify it to improve performance and maintainability. I need to convert 
> from CSV/TSV to JSON, normalise fields, filter unwanted records, enrich with 
> more JSON and finally publish to a customer defined schema, so I do need a 
> few steps along the way. I am exploring each step in order to validate my 
> redesign so I take your point about minimising the number of processes and 
> will look again at combining steps in the query process, although I am also a 
> fan of the JOLT transform as I have used that often in previous projects.
>
>
>
> Regards
>
> Steve Hindmarch
>
>
>
> From: Mark Payne <marka...@hotmail.com>
> Sent: 17 March 2022 14:17
> To: users <users@nifi.apache.org>
> Subject: Re: QueryRecord with Union type
>
>
>
> Steve,
>
>
>
> Because your schema has a union, the SQL engine doesn’t really know how to 
> interpret the data. So it interprets it as a “Java Object.” Essentially,
>
> it could be anything. But you can’t compare just anything to true - you need 
> to compare a boolean to true. So you need to tell the SQL engine that the
>
> value you’re looking at is, in fact, a boolean.
>
>
>
> You can do that with a simple CAST() function in your SQL:
>
>
>
> SELECT *
>
> FROM FLOWFILE
>
> WHERE CAST(flag_s AS BOOLEAN) = true
>
>
>
> That should give you what you’re looking for.
>
>
>
> Also worth nothing - you mentioned that you’re using ConvertRecord and 
> UpdateRecord before QueryRecord.
>
> 99% of the time, you should not be using ConvertRecord in conjunction with 
> any other Record processor. Because the Record processors like UpdateRecord
>
> allow you to use any Record Reader, it doesn’t make sense to convert the data 
> first using ConvertRecord - it’s just extra overhead.
>
> And, in fact, you may be able to eliminated the UpdateRecord, as well, as 
> just use the SQL within QueryRecord to perform the transformation needed on 
> the fly,
>
> rather than having another step to update the data, which requires reading 
> the data, parsing it, updating it, serializing the data, writing the data. 
> This may not
>
> be possible, depends on what you’re updating. But QueryRecord does support 
> RecordPath expressions so it’s worth considering.
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
>
>
> On Mar 15, 2022, at 8:35 AM, stephen.hindma...@bt.com wrote:
>
>
>
> I am having a play with QueryRecord to do some filtering but I have run 
> across this problem. I have a schema for my records which includes a union 
> type, so the relevant part of the schema is
>
>
>
> {
>
>   "type":"record",
>
>   "namespace":"blah",
>
>   "name":"SimpleTraffic",
>
>   "fields":[
>
>     {"name":"src_address","type":"string"},
>
>     {"name":"flag_s","type":["int","boolean"]}
>
>   ]
>
> }
>
>
>
> This is because I am processing CSV records that look this, where 1 is true 
> and 0 is false.
>
>
>
> 192.168.0.1,1
>
>
>
> Into JSON that looks like this, using a ConvertRecord and an Update Record.
>
>
>
> {"src_address":"192.168.0.1","flag_s":true}
>
>
>
> Then I create a QueryRecord so I can filter out the cases where the flag is 
> false. So I use this query.
>
>
>
> select * from flowfile where flag_s = true
>
>
>
> But I get this error
>
>
>
> org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply '=' to 
> arguments of type '<JAVATYPE(CLASS JAVA.LANG.OBJECT)> = <BOOLEAN>'
>
>
>
> Is this because the type is a Union type and the Calcite processor cannot 
> work out which subtype it should be? Can I do anything to persuade the query 
> to use an operator or a function on this field to make it usable? I have 
> tried casting to Boolean or Char but no success. Or do I need to use two 
> separate “before” and “after” schemas to eliminate the union?
>
>
>
> Regards
>
>
>
> Steve Hindmarch
>
>

Reply via email to