Re: Dealing with bad data when trying to do date computations

2017-05-02 Thread John Omernik
I just want to say, there is a great JIRA already opened here: https://issues.apache.org/jira/browse/DRILL-4258. I added a comment, and I would encourage others to add comments if they think this idea would be beneficial. On Wed, Mar 1, 2017 at 8:50 AM, John Omernik wrote: > So what would need to b

Re: Dealing with bad data when trying to do date computations

2017-03-01 Thread John Omernik
So what would need to be done to get this process kick-started? I see a few components here: 1. Develop the table in sys (sys.functions) that stores the information about the function. For this I propose the following for discussion: name - The name of the function; description - The Description of the f

Re: Dealing with bad data when trying to do date computations

2017-02-28 Thread John Omernik
You could also generate documentation updates via query at each release. This would be a great feature, moving the information closer to the analysts' hands; I love how that would work. (I think I remember some talk about extending sys.options to be self-documenting as well.) On Tue, Feb 28, 20

Re: Dealing with bad data when trying to do date computations

2017-02-28 Thread Jinfeng Ni
Regarding the list of functions (built-in or UDF), someone once suggested that we make the functions self-documenting by adding a sys.functions table. select * from sys.functions where name like '%SPLIT%'; would return function_name, parameter_list, description, etc. This way, users could simply query sys

Re: Dealing with bad data when trying to do date computations

2017-02-28 Thread Jinfeng Ni
> 4. I think that, as part of developer review, pull requests that add > functions/functionality should be required to also provide a > documentation update. This helps to ensure that the docs keep up to date, > as well as keeping users apprised of what is happening... i.e. it's a good >

Re: Dealing with bad data when trying to do date computations

2017-02-28 Thread John Omernik
Thanks Charles, that worked even on my 1.8. Drill folks: We need to do some documentation updates. We've added functions (like REGEXP_MATCHES, and it's in 1.8, so I am not sure where it was added) and other functions like SPLIT and yet no mention in https://drill.apache.org/docs/string-manipulat

Re: Dealing with bad data when trying to do date computations

2017-02-28 Thread Charles Givre
Hi John, I believe that Drill 1.9 includes a REGEXP_MATCHES( , ) function which does what you'd expect it to. I'm not sure when this was introduced, so it may be in earlier versions of Drill. Best, -- C On Tue, Feb 28, 2017 at 11:03 AM, John Omernik wrote: > I have a data set that has birthdays

Dealing with bad data when trying to do date computations

2017-02-28 Thread John Omernik
I have a data set that has birthdays in YYYY-MM-DD format. Most of this data is great. I am trying to compute the age using EXTRACT(year from age(dob)). But some of my data is crapola... let's call it alternative data... When I try to run the EXTRACT function, I get Error: SYSTEM ERROR: Illeg