Hi Milan,

Thanks for your advice.

I found one corrupted line in a smaller sample of 3000 lines; once that 
was fixed, the sample loaded.

Then I tried a larger sample of 10000 lines and it gave the following:
Saw 10000 rows, 4 columns (correct) and 40022 fields
 * Line 1 has 6 columns
(I'm not sure where "Line 1" is counted from; the first line of the file 
was fine when I used only the 3000-line file.)

How do I find the corruptions using the above message? Clearly it detected 
6 columns in some "Line 1", but it is not the first line.

Are there any Julia functions or packages I can use to clean up the data, 
or that will highlight corrupted lines in the data?
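
To show the kind of check I mean, here is a rough sketch. It is not a 
DataFrames function: find_bad_lines, expected and delim are made-up names, 
and it assumes a recent Julia (1.x, rather than the 0.3.5 mentioned in this 
thread), a comma-delimited file, and no quoted fields that contain commas. 
It only counts the fields on each line and collects the lines that do not 
have the expected 4.

```julia
# Hypothetical helper: list every line whose field count is unexpected.
# Assumes a plain delimited file with no quoted fields containing `delim`.
function find_bad_lines(path::AbstractString; expected::Int=4, delim::Char=',')
    bad = Tuple{Int,Int}[]                  # (line number, fields found)
    for (lineno, line) in enumerate(eachline(path))
        n = length(split(line, delim))      # naive split, ignores quoting
        if n != expected
            push!(bad, (lineno, n))
        end
    end
    return bad
end
```

Calling find_bad_lines("EURUSD.CSV") would then return the suspect lines, 
numbered from the top of the file, each with the number of fields found.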

I did try loading the 15,000-line CSV file into Excel and it worked fine 
there.

Looking forward to your expert advice.

Thanks.

Keith  

On Friday, 6 February 2015 12:19:55 UTC-8, Milan Bouchet-Valat wrote:
>
> On Friday, 6 February 2015 at 11:12 -0800, Keith Kee wrote: 
> > Hi 
> > 
> > 
> > Using DataFrames ( v"0.6.0" ) and Win32 julia 0.3.5 
> > 
> > 
> > ds = readtable("EURUSD.CSV", header=false) 
> > 
> > 
> > 
> > results in 
> > 
> > 
> > 
> > BoundsError() 
> > in findcorruption at io.jl:698 
> > in readtable! at io.jl:779 
> > in readtable at io.jl:893 
> > 
> > 
> > The original file has 15000 lines, works when I cut it down to 10 
> > lines. 
> > 
> > 
> > Please advise as to whether there are limits to readtable on win32 
> > setups? 
> 15000 sounds quite small even for 32-bit. More likely, the file contains 
> something readtable() doesn't like, and which does not appear in the 
> first 10 lines. You could try removing half of the file, see if it 
> works, and go on like that until you (possibly) find out which line 
> creates a bug. 
>
>
> Regards 
>
