Hi Divya, I presume that “sample_data.csv” is your file? The default CSV configuration reads files without headers and puts all columns into a single array called “columns”. Do a SELECT * and you’ll see it. You’ll see an array that contains your data:
[“Fred”, “Flintstone”] So, the correct query would be: SELECT REGEXP_REPLACE(CAST(columns[0] AS VARCHAR(100)), '[,".]', '') FROM dfs.`installedsoftwares/ApacheDrill/apache-drill-1.10.0.tar/apache-drill-1.10.0/sample-data/sample_data.csv` I notice the regex is messing with brackets. Are you trying to convert the array form shown above to a cleaner list? Won’t work: those brackets are not in the data; they are the textual sugar added to show the array when printing. Maybe what you want is: SELECT columns[0] as `a`, columns[1] as `b` … Or, if your file actually contains headers, use a table function (or storage plugin config) to specify to use the headings to create individual columns. See the example at [1] under “Using the Formats Attributes as Table Function Parameters”. - Paul [1] https://drill.apache.org/docs/plugin-configuration-basics/ > On Jul 26, 2017, at 8:22 PM, Divya Gehlot <divya.htco...@gmail.com> wrote: > > The another thing which I observed is when I run below query > SELECT REGEXP_REPLACE('"This, col7 data yes."', '[,".]', '') FROM > (VALUES(1)) > EXPR$0 > This col7 data yes > > > Same when I run the csv file it gives me empty result set : > SELECT REGEXP_REPLACE(CAST(`Column1` AS VARCHAR(100)), '[,".]', '') FROM > dfs.`installedsoftwares/ApacheDrill/apache-drill-1.10.0.tar/apache-drill-1.10.0/sample-data/sample_data.csv` > > EXPR$0 > > > P.S. As advised all the above queries I ran on Drill 1.11 > > Appreciate the help . > > Thanks, > Divya > > On 27 July 2017 at 09:54, Divya Gehlot <divya.htco...@gmail.com> wrote: > >> Hi, >> Please find attached the sample_data.csv file >> Pasting the content of the csv file below , in case attachment doesn't >> reach >> >>> Column1,Column2,Column3,Column4,Column5 >>> colonedata1,coltwodata1,-35.924476,138.5987123, >>> colonedata2,coltwodata2,-27.4372536,153.0304583,137 >>> colonedata3,coltwodata3,-35.2793885,149.1233503,134 >>> colonedata4,coltwodata4,-33.8724176,151.2067579, >>> colonedata5,coltwodata5,,, >>> "This, col6 data",coltwodata6,-33.869732,151.2055553,351 >>> "This, col7 data yes.",coltwodata7,1.2845045,103.8482739,80 >>> Chifley,coltwodata5,,, >> >> >> Error : >> >>> Query Failed: An Error Occurred >>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: >>> IllegalArgumentException: reallocation size must be non-negative Fragment >>> 0:0 >> >> >> >> Thanks all for the help. >> >> Thanks , >> Divya >> >> >> On 26 July 2017 at 23:31, Paul Rogers <prog...@mapr.com> wrote: >> >>> Hi Divya, >>> >>> We found a couple of issues in CSV files that would lead to the kind of >>> errors you encountered. These issues will be fixed in the upcoming Drill >>> 1.11 release. >>> >>> Sharing a sample CSV file will let us check the issue. Even better, >>> voting is open for the 1.11 release. Please go ahead and download it and >>> try your file with that release. Let us know if you still have a problem. >>> >>> Thanks, >>> >>> - Paul >>> >>>> On Jul 26, 2017, at 6:14 AM, Khurram Faraaz <kfar...@mapr.com> wrote: >>>> >>>> Can you please share your CSV file, the SQL query and the version of >>> Drill that you are on. So someone can take a look and try to reproduce the >>> error that you are seeing. >>>> >>>> >>>> Thanks, >>>> >>>> Khurram >>>> >>>> ________________________________ >>>> From: Divya Gehlot <divya.htco...@gmail.com> >>>> Sent: Wednesday, July 26, 2017 3:18:08 PM >>>> To: user@drill.apache.org >>>> Subject: regex replace in string >>>> >>>> Hi, >>>> I have a CSV file where column values are >>>> "This is the column,one " >>>> "This is column , two" >>>> column3 >>>> column4 >>>> >>>> When I try to regex_replace it throws error >>>> >>>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: >>>>> IllegalArgumentException: reallocation size must be non-negative >>> Fragment >>>>> 0:0 >>>> >>>> >>>> How can I read the above columns as one string like This is the column >>>> one >>>> Appreciate the help >>>> >>>> Thanks, >>>> Divyab >>> >>> >>