Vishal,
Does the output have to be DEC? Could you try FLOAT? Another option would be to use the TO_NUMBER function. Null values might also be causing the issue. In any event, can you query the Parquet output that Drill is generating? If so, you could look at the last entry in the Parquet file and then find that entry in your CSV data to see what is near it, which might help you figure out what is breaking.
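
If finding the last row by hand is painful, a rough sketch like the one below could scan the CSV files directly, outside of Drill, and report values that would not parse as numbers. It is only a sketch under a few assumptions: the files have a header row, the column is named `price` (taken from your CAST example), and `find_bad_values` is a hypothetical helper name, not anything Drill provides. Python's `Decimal` parsing is also not identical to Drill's, so treat the output as a list of suspects rather than a definitive answer.

```python
import csv
from decimal import Decimal, InvalidOperation
from pathlib import Path


def find_bad_values(csv_dir, column="price"):
    """Yield (file name, line number, raw value) for values that do not parse as decimals.

    Assumes every CSV in csv_dir has a header row containing `column`.
    """
    for path in sorted(Path(csv_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            for row in reader:
                raw = row.get(column)  # None if the row is short or the column is absent
                try:
                    # Empty strings and text such as "N/A" raise InvalidOperation;
                    # NaN and Infinity parse, so they are rejected explicitly.
                    if raw is None or not Decimal(raw.strip()).is_finite():
                        raise InvalidOperation
                except InvalidOperation:
                    yield path.name, reader.line_num, raw


# Example call (the directory is a placeholder):
# for name, line, value in find_bad_values("/path/to/csv/dir"):
#     print(name, line, repr(value))
```

Empty values are reported too, since nulls are one of the suspects here.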
Best,
-- C

> On Feb 14, 2020, at 2:40 PM, Vishal Jadhav (BLOOMBERG/ 731 LEX) <[email protected]> wrote:
>
> Yes, that's what I am doing and it seems to work, I am casting the data as
> e.g. CAST (price as DEC(a,b)).
>
> I have about 40,000 csv, each with about 1000+ rows, it fails after 5 mins of
> conversion and I do see some parquet files are produced. So, it would be nice to
> know how far we went through the logs, what record is having an issue.
>
> Error does say, look at the logs, but not able to find anything meaningful in
> there.
>
>
> From: [email protected] At: 02/14/20 12:18:36
> To: Vishal Jadhav (BLOOMBERG/ 731 LEX), [email protected]
> Subject: Re: data issue
>
> Hi Vishal,
> This one is an easy one (I think)... All columns in CSV are read as VARCHAR.
> So if you are trying to convert anything in CSV to a Numeric format, you will
> first have to CAST it via one of Drill's data conversion functions to the
> appropriate numeric type.
> -- C
>
>> On Feb 14, 2020, at 10:44 AM, Vishal Jadhav (BLOOMBERG/ 731 LEX) <[email protected]> wrote:
>>
>> During my select statement on conversion of csv file to parquet file, I get
>> the NumberFormatException exception, I am running drill in the embedded mode.
>> Is there a way to find out which csv file or row in that file is causing the
>> issue?
>> I checked the logs with trace verbosity, but not able to find the 'data' which
>> has the issue.
>>
>> Error: SYSTEM ERROR: NumberFormatException
>>
>> Fragment 1:5
>>
>> Please, refer to logs for more information.
>>
>> Thanks!
>> - Vishal
