Re: [Numpy-discussion] Bug in genfromtxt with usecols and converters

Derek Homeier Tue, 26 Aug 2014 09:57:29 -0700

Hi Adrian,

>> not sure whether to call it a bug; the error seems to arise before reading
>> any actual data (even on reading from an empty string); when genfromtxt is
>> checking the filling_values used to substitute missing or invalid data it is
>> apparently testing on default testing values of 1 or -1 which your
>> conversion scheme does not know about. Although I think it is rather the
>> user’s responsibility to provide valid converters, probably the
>> documentation should at least be updated to make them aware of this
>> requirement.
>> I see two possible fixes/workarounds:
>> 
>> provide a keyword argument filling_values=[0,0,'1:1']
> This workaround seems to work, but I doubt that the actual problem is
> the converter function I pass. The '-1', which is used as the testing
> value, is the first_values entry from the 3rd column (line 1574 in
> npyio.py), but the converter is defined for column 4. Setting the
> filling_values to an array of length 3 obviously makes the problem
> disappear. But I think if the first row is used, it should also use the
> values from the column for which the converter is defined.


It is certainly related to the converter function, because a KeyError for the
dictionary you provide is raised:
  File "test.py", line 13, in <module>
    3: lambda rel: relEnum[rel.decode()]})
  File "/sw/lib/python3.4/site-packages/numpy/lib/npyio.py", line 1581, in genfromtxt
    missing_values=missing_values[i],)
  File "/sw/lib/python3.4/site-packages/numpy/lib/_iotools.py", line 784, in update
    tester = func(testing_value or asbytes('1'))
  File "test.py", line 13, in <lambda>
    3: lambda rel: relEnum[rel.decode()]})
KeyError: '-1'

But you are right that the problem with using the first_values, which should of
course be valid, somehow stems from the use of usecols. It seems that in the loop

    for (i, conv) in user_converters.items():

the index i in user_converters and the one in usecols get out of sync. This
certainly looks like a bug; the entire way of modifying i inside the loop
appears a bit dangerous to me. I’ll have a look to see if I can make this safer.

As long as your data don’t actually contain any missing values, you might also
simply use np.loadtxt.
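
A rough sketch of what that could look like follows. The sample data, the
REL_ENUM mapping and the rel_to_int helper are made up for illustration, since I
don't have your file. The converter accepts both bytes and str, because which of
the two loadtxt passes in depends on the NumPy version and the encoding setting.

```python
import io

import numpy as np

# Hypothetical mapping and data, standing in for the real relEnum and file.
REL_ENUM = {"1:1": 0, "1:n": 1, "m:n": 2}


def rel_to_int(rel):
    """Map a relation label to its integer code; accepts bytes or str."""
    if isinstance(rel, bytes):
        rel = rel.decode()
    return REL_ENUM[rel.strip()]


sample = io.StringIO("1 2 -1 1:1\n3 4 -1 m:n\n")

# The converter key refers to the column in the file (here the fourth one),
# matching the entry given in usecols.
data = np.loadtxt(sample, usecols=(0, 1, 3), converters={3: rel_to_int})
```

The result is a float array with one row per input line and the three selected
columns; the skipped third column never reaches the converter.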

Cheers,
                                                Derek

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
