So my question on the double escape: is there no way to handle it so that users can write single-escaped regexes? I know many people who use big data platforms to test large, complex regexes for things like security appliances, and having to convert each regex is a lot of work when every user has to do it. If Drill could handle this itself, it would save countless people hours and prevent many mistakes.
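To make the point concrete, here is a minimal Java sketch. The class and method names (`EscapeDemo`, `matches`) are hypothetical and are not part of Drill or Hive, and the sketch assumes the query layer hands the pattern text to the function exactly as the user typed it. Under that assumption, java.util.regex accepts the single-escaped form directly; the doubled backslashes in the code below are only the Java source spelling of that text.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Drill or Hive.
class EscapeDemo {

    // Searches `input` for the pattern text exactly as it was received.
    static boolean matches(String patternText, String input) {
        return Pattern.compile(patternText).matcher(input).find();
    }

    public static void main(String[] args) {
        // Java source spelling of the 18-character text \d\d\d\d-\d\d-\d\d
        String singleEscaped = "\\d\\d\\d\\d-\\d\\d-\\d\\d";

        // Java source spelling of the text \\d\\d\\d\\d-\\d\\d-\\d\\d
        String doubleEscaped = "\\\\d\\\\d\\\\d\\\\d-\\\\d\\\\d-\\\\d\\\\d";

        // The single-escaped text is already a valid date pattern.
        System.out.println(matches(singleEscaped, "2016-02-04")); // true

        // Passed through verbatim, the double-escaped text means
        // "a literal backslash followed by the letter d", so it no
        // longer matches a date.
        System.out.println(matches(doubleEscaped, "2016-02-04")); // false

        // It matches the pattern text itself instead.
        System.out.println(matches(doubleEscaped, singleEscaped)); // true
    }
}
```

So the doubling is not something the regex library asks for. It only becomes necessary when something in front of it (a SQL string-literal parser, or a Java source literal) consumes one level of backslashes before the pattern is compiled.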

On Thu, Feb 4, 2016 at 12:03 PM, Nicolas Paris <[email protected]> wrote:

> John, Jason,
>
> 2016-02-04 18:47 GMT+01:00 John Omernik <[email protected]>:
>
> > I'd be curious on how you are implementing the regex... using Java's regex
> > libraries? etc.
>
> Yeah, I use java.util.regex
>
> > I know one thing with Hive that always bothered me was the need to double
> > escape things.
> >
> > '\d\d\d\d-\d\d-\d\d' needed to be '\\d\\d\\d\\d-\\d\\d-\\d\\d'. If we can
> > avoid that it would be AWESOME.
>
> My guess is this comes from the Java way of handling strings. All languages I
> have used need to double escape.
>
> > On Thu, Feb 4, 2016 at 11:37 AM, Jason Altekruse <[email protected]> wrote:
>
> code is here: https://github.com/parisni/drill-simple-contains
> It's disturbing how simple it is...
>
> > > I think you should actually just put the function in Drill itself. System
> > > native functions are implemented in the same interface as UDFs, because our
> > > mechanism for evaluating them is very efficient (we code generate code
> > > blocks by linking together the bodies of the individual functions to
> > > evaluate a complete expression).
>
> well the folder tree is quite impressive (https://github.com/apache/drill).
> what folder is supposed to be "Drill itself"?
>
> > > You can open a JIRA, marking it a feature request. You can open a pull
> > > request against the apache github repo, making sure you follow the standard
> > > format for your commit message, prefixing with the JIRA number in the
> > > format
> > > Example:
> > > DRILL-XXXX: Feature description
> > >
> > > This will automatically link the PR to your JIRA.
>
> Ok I will try, thanks a lot
>
> > > - Jason
> > >
> > > On Thu, Feb 4, 2016 at 8:44 AM, Nicolas Paris <[email protected]> wrote:
> > >
> > > > Jason, I have it working,
> > > >
> > > > Just tell me the way to proceed to PR.
> > > > 1. where do I put my maven project ? Which folder in my drill github fork?
> > > > 2. do I need a jira ? how proceed ?
> > > >
> > > > For now, I only published it on my github account in a separate project
> > > >
> > > > Thanks
> > > >
> > > > 2016-02-04 16:52 GMT+01:00 Jason Altekruse <[email protected]>:
> > > >
> > > > > Awesome, thanks!
> > > > >
> > > > > On Thu, Feb 4, 2016 at 7:44 AM, Nicolas Paris <[email protected]> wrote:
> > > > >
> > > > > > Well I am creating a udf
> > > > > > good exercise
> > > > > > I hope a PR soon
> > > > > >
> > > > > > 2016-02-04 16:37 GMT+01:00 Jason Altekruse <[email protected]>:
> > > > > >
> > > > > > > I didn't realize that we were lacking this functionality. As the
> > > > > > > repeated_contains operator handles wildcards it makes sense to add such a
> > > > > > > function to drill.
> > > > > > >
> > > > > > > It should be simple to implement, would someone like to open a JIRA and
> > > > > > > submit a PR for this?
> > > > > > >
> > > > > > > - Jason
> > > > > > >
> > > > > > > On Tue, Feb 2, 2016 at 8:56 AM, John Omernik <[email protected]> wrote:
> > > > > > >
> > > > > > > > I would like to see something like this as well, even if it's an included
> > > > > > > > UDF like REGEX(field, pattern) using Java's library for regex like Hive
> > > > > > > > does. That would be EXTREMELY helpful.
> > > > > > > >
> > > > > > > > On Tue, Feb 2, 2016 at 6:55 AM, Nicolas Paris <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > > ANSI SQL doesn't define regex operator.
> > > > > > > > > > Drill neither.
> > > > > > > > >
> > > > > > > > > Drill has SQL function extensions like "REPEATED_CONTAINS" that look to
> > > > > > > > > handle regex. regex operator could be replaced with one new SQL extension ?
> > > > > > > > > I guess I could create my own functions in java, right ? Maybe push it into
> > > > > > > > > github then ?
> > > > > > > > >
> > > > > > > > > > Isn't the 'LIKE' operator enough?
> > > > > > > > >
> > > > > > > > > Sadly not, I'm looking for complex pattern matching.
> > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Miura, Masahide
> > > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Nicolas Paris [mailto:[email protected]]
> > > > > > > > > > Sent: Tuesday, February 02, 2016 9:04 PM
> > > > > > > > > > To: [email protected]
> > > > > > > > > > Subject: REGEX search Operator
> > > > > > > > > >
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > I can't find any reference in the documentation about a regex operator.
> > > > > > > > > >
> > > > > > > > > > I would like to be able to query this way :
> > > > > > > > > >
> > > > > > > > > > SELECT *
> > > > > > > > > > FROM xxx
> > > > > > > > > > WHERE text_field regexOperator 'regex_pattern';
> > > > > > > > > >
> > > > > > > > > > Thanks for helping,
