{"name":"flag_s","type":["int","boolean"]}
We have a lot of type massaging baked into the Record API. If the int version is meant to be used as a boolean equivalent (0 = false, anything else is truthy) then this is something that probably already is or should be covered by that type massaging. On Fri, Mar 18, 2022 at 5:50 AM <stephen.hindma...@bt.com> wrote: > > Mark, > > > > Thank you for your response. I thought that was probably the case, but I > tried a cast and it did not work. I got this error. > > > > Query: > > select * > > from flowfile > > where cast(flag_s as boolean) = true > > > > Error: > > org.apache.calcite.sql.validate.SqlValidatorException: Cast function cannot > convert value of type JavaType(class java.lang.Object) to type BOOLEAN > > > > By taking the union out of the input schema I could get the query to work, > but I did find myself getting tangled up in managing various schemas so I am > trying to use infer/inherit read/write services instead. I have inherited a > very complex flow from a team that have long departed and am looking to > simplify it to improve performance and maintainability. I need to convert > from CSV/TSV to JSON, normalise fields, filter unwanted records, enrich with > more JSON and finally publish to a customer defined schema, so I do need a > few steps along the way. I am exploring each step in order to validate my > redesign so I take your point about minimising the number of processes and > will look again at combining steps in the query process, although I am also a > fan of the JOLT transform as I have used that often in previous projects. > > > > Regards > > Steve Hindmarch > > > > From: Mark Payne <marka...@hotmail.com> > Sent: 17 March 2022 14:17 > To: users <users@nifi.apache.org> > Subject: Re: QueryRecord with Union type > > > > Steve, > > > > Because your schema has a union, the SQL engine doesn’t really know how to > interpret the data. So it interprets it as a “Java Object.” Essentially, > > it could be anything. But you can’t compare just anything to true - you need > to compare a boolean to true. So you need to tell the SQL engine that the > > value you’re looking at is, in fact, a boolean. > > > > You can do that with a simple CAST() function in your SQL: > > > > SELECT * > > FROM FLOWFILE > > WHERE CAST(flag_s AS BOOLEAN) = true > > > > That should give you what you’re looking for. > > > > Also worth nothing - you mentioned that you’re using ConvertRecord and > UpdateRecord before QueryRecord. > > 99% of the time, you should not be using ConvertRecord in conjunction with > any other Record processor. Because the Record processors like UpdateRecord > > allow you to use any Record Reader, it doesn’t make sense to convert the data > first using ConvertRecord - it’s just extra overhead. > > And, in fact, you may be able to eliminated the UpdateRecord, as well, as > just use the SQL within QueryRecord to perform the transformation needed on > the fly, > > rather than having another step to update the data, which requires reading > the data, parsing it, updating it, serializing the data, writing the data. > This may not > > be possible, depends on what you’re updating. But QueryRecord does support > RecordPath expressions so it’s worth considering. > > > > Thanks > > -Mark > > > > > > > > On Mar 15, 2022, at 8:35 AM, stephen.hindma...@bt.com wrote: > > > > I am having a play with QueryRecord to do some filtering but I have run > across this problem. I have a schema for my records which includes a union > type, so the relevant part of the schema is > > > > { > > "type":"record", > > "namespace":"blah", > > "name":"SimpleTraffic", > > "fields":[ > > {"name":"src_address","type":"string"}, > > {"name":"flag_s","type":["int","boolean"]} > > ] > > } > > > > This is because I am processing CSV records that look this, where 1 is true > and 0 is false. > > > > 192.168.0.1,1 > > > > Into JSON that looks like this, using a ConvertRecord and an Update Record. > > > > {"src_address":"192.168.0.1","flag_s":true} > > > > Then I create a QueryRecord so I can filter out the cases where the flag is > false. So I use this query. > > > > select * from flowfile where flag_s = true > > > > But I get this error > > > > org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply '=' to > arguments of type '<JAVATYPE(CLASS JAVA.LANG.OBJECT)> = <BOOLEAN>' > > > > Is this because the type is a Union type and the Calcite processor cannot > work out which subtype it should be? Can I do anything to persuade the query > to use an operator or a function on this field to make it usable? I have > tried casting to Boolean or Char but no success. Or do I need to use two > separate “before” and “after” schemas to eliminate the union? > > > > Regards > > > > Steve Hindmarch > >