Actually, they could use Drools (and, in theory, JavaScript or Groovy rule engines), right? Even if these are not directly supported, they could be added through some of the Scripting modules.
Thad
https://www.linkedin.com/in/thadguidry/
https://calendly.com/thadguidry/

On Wed, Jun 15, 2022 at 10:27 AM Matt Casters <[email protected]> wrote:

> Great topic!
>
> So to add to that, data profiling can be done right now using the
> common statistical features but it's geared towards operational data
> profiling. Everything you can aggregate (min, max, count all, count
> non-null, ...) or checksum can be kept track of and I would consider it
> "best-practice" to do this in scenarios where data is being staged before
> processing for example. That way you can set alerts on the profiling data
> (counts) to see that not too many records are being rejected. Another
> example would be to put alerts on certain fields in data sets to see that
> they're not over 80% null (again as an example).
>
> I kept this sort of "operational data profiling" in mind when I was
> architecting the new monitoring and logging functionality for the post 2.0
> versions.
>
> As far as the user interfaces are concerned for "online data profiling",
> usually used to profile input data sources and so on, I'm going to join
> Bart in inviting the community to submit requirements. I'm convinced
> there's a lot we can do with little effort but I still think it's always
> better to start from those fresh requirements.
>
> Thanks in advance!
> Matt
>
> On Wed, Jun 15, 2022 at 4:58 PM Bart Maertens <[email protected]>
> wrote:
>
>> Hi Kevin,
>>
>> There are no dedicated data profiling/quality transforms in Hop (yet),
>> while simultaneously, everything can be used to build data
>> profiling/quality checks.
>> You can build your own data quality checks and profiling in a Hop project
>> or framework. We'll probably do more on both quality and profiling in
>> future releases, but that functionality is not available yet.
>> Feel free to create an improvement ticket in JIRA so we can keep track of
>> it.
>>
>> Regards,
>> Bart
>>
>> On Wed, Jun 15, 2022 at 4:48 PM Kevin L Kitts <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> I saw in the documentation “Getting Started” section a reference to
>>> “Data Profiling”. I’d like to find more information on how data profiling
>>> and data quality related tasks are accomplished in hop. Is there a section
>>> of the documentation that describes data profiling/data quality features of
>>> hop?
>>>
>>> Thanks!
>>>
>>
>
> --
> Neo4j Chief Solutions Architect
> ✉ [email protected]
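
For anyone who wants to see the "operational data profiling" idea from the thread in concrete form, here is a minimal sketch in plain Python. It is not Hop code and does not use any Hop API. It only illustrates the aggregates mentioned above (count all, count non-null, min, max) and an alert when a field is more than 80% null. The names FieldProfile, profile_field and null_alert, the sample rows, and the "amount" field are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class FieldProfile:
    """Running aggregates for one field (hypothetical helper, not a Hop class)."""
    count_all: int = 0
    count_non_null: int = 0
    minimum: Optional[Any] = None
    maximum: Optional[Any] = None

    def add(self, value: Any) -> None:
        # Every row counts towards count_all; only non-null values
        # update the other aggregates.
        self.count_all += 1
        if value is None:
            return
        self.count_non_null += 1
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if self.maximum is None or value > self.maximum:
            self.maximum = value

    @property
    def null_ratio(self) -> float:
        # Fraction of rows where the field was null or missing.
        if self.count_all == 0:
            return 0.0
        return (self.count_all - self.count_non_null) / self.count_all


def profile_field(rows: Iterable[Dict[str, Any]], field: str) -> FieldProfile:
    """Profile a single field over staged rows; a missing key counts as null."""
    profile = FieldProfile()
    for row in rows:
        profile.add(row.get(field))
    return profile


def null_alert(profile: FieldProfile, threshold: float = 0.8) -> bool:
    """True when the null ratio is strictly above the threshold (80% by default)."""
    return profile.null_ratio > threshold


if __name__ == "__main__":
    # Made-up staged rows, purely for illustration.
    staged = [{"amount": 10}, {"amount": None}, {"amount": 3}, {}]
    p = profile_field(staged, "amount")
    print(p, p.null_ratio, null_alert(p))
```

In a real pipeline the same counts would be collected by the tool's own aggregation features and stored per run, so that alerts can be set on them over time, as described in the quoted message.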
