Andries Engelbrecht created DRILL-2456:
------------------------------------------

             Summary: regexp_replace using hex codes fails on larger JSON data sets
                 Key: DRILL-2456
                 URL: https://issues.apache.org/jira/browse/DRILL-2456
             Project: Apache Drill
          Issue Type: Bug
          Components: Functions - Drill
    Affects Versions: 0.7.0
         Environment: Drill 0.7 MapR 4.0.1 CentOS
            Reporter: Andries Engelbrecht
            Assignee: Daniel Barclay (Drill)
         Attachments: drillbit.log

This query works when it reads only 1 file:

select regexp_replace(`text`, '[^\x20-\xad]', '°'), count(id) from dfs.twitter.`/feed/2015/03/13/17/FlumeData.1426267859699.json` group by `text` order by count(id) desc limit 10;

This one fails when it reads multiple files:

select regexp_replace(`text`, '[^\x20-\xad]', '°'), count(id) from dfs.twitter.`/feed/2015/03/13` group by `text` order by count(id) desc limit 10;

Query failed: Query failed: Failure while trying to start remote fragment, Encountered an illegal char on line 1, column 31: '' [ 43ff1aa4-4a71-455d-b817-ec5eb8d179bb on twitternode:31010 ]

Using literal characters instead of hex codes in regexp_replace does work for the same data set. This query works fine on the full data set:

select regexp_replace(`text`, '[^ -~¡-ÿ]', '°'), count(id) from dfs.twitter.`/feed/2015/03/13` group by `text` order by count(id) desc limit 10;

Attached a snippet of drillbit.log showing the error.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
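
For anyone comparing the two patterns in the report, here is a minimal sketch in plain Java of what each character class matches. It assumes regexp_replace follows java.util.regex semantics, which the report does not confirm. The class name HexRangeCheck and the sample string are hypothetical. It does not reproduce the remote fragment failure; it only shows that the two patterns are not exact equivalents: the hex range ends at \xad, while the literal range skips \x7f-\xa0 and extends to ÿ (\xff).

```java
import java.util.regex.Pattern;

public class HexRangeCheck {
    // Pattern from the failing query, written with hex escapes (0x20 to 0xAD).
    static final Pattern HEX = Pattern.compile("[^\\x20-\\xad]");

    // Pattern from the working query, written with literal characters:
    // space to '~' (0x20-0x7E) and U+00A1 to U+00FF.
    static final Pattern LITERAL = Pattern.compile("[^ -~\u00a1-\u00ff]");

    // Replace every character outside the hex-escaped range with a degree sign.
    static String replaceHex(String s) {
        return HEX.matcher(s).replaceAll("\u00b0");
    }

    // Replace every character outside the literal ranges with a degree sign.
    static String replaceLiteral(String s) {
        return LITERAL.matcher(s).replaceAll("\u00b0");
    }

    public static void main(String[] args) {
        // Hypothetical sample: 'e' with acute accent (U+00E9) and a snowman (U+2603).
        String sample = "caf\u00e9 \u2603 ok";
        System.out.println(replaceHex(sample));     // U+00E9 is above 0xAD, so it is replaced
        System.out.println(replaceLiteral(sample)); // U+00E9 is inside the literal range, so it is kept
    }
}
```

If the hex form is meant to match the literal form exactly, a class such as [^\x20-\x7e\xa1-\xff] would be the closer translation, again under the same assumption about regex semantics.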