Probably not in most, you're right.

Can't you get generic code as long as a method to convert to OrderedDict is 
supplied, though?

When you don't need anything more specific, convert the dataframe row to an 
OrderedDict, then either work with that object or convert it into a more 
appropriate internal format. But if you want to write specific algorithms 
for different storage types, that's still an option (e.g. either work with 
immutable DBI rows, or use a custom convert method to a more appropriate 
format, skipping the OrderedDict intermediate step).
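
To make that concrete, here is a rough sketch of what I have in mind. FrameRow, DBRow and describe are hypothetical stand-ins I made up for illustration; they are not the actual DataFrames or DBI types. I'm also assuming OrderedDict comes from the OrderedCollections package.

```julia
using OrderedCollections  # assumed installed; provides OrderedDict

# Hypothetical stand-ins for the two row types discussed in this thread.
struct FrameRow                      # plays the role of a mutable DataFrame row
    names::Vector{Symbol}
    values::Vector{Any}
end

struct DBRow                         # plays the role of an immutable DBI row
    names::Tuple{Vararg{Symbol}}
    values::Tuple
end

# Each storage type supplies its own conversion to the shared format.
Base.convert(::Type{OrderedDict}, r::FrameRow) =
    OrderedDict{Symbol,Any}(zip(r.names, r.values))
Base.convert(::Type{OrderedDict}, r::DBRow) =
    OrderedDict{Symbol,Any}(zip(r.names, r.values))

# Generic code: works for any row type that defines the conversion above.
function describe(row)
    d = convert(OrderedDict, row)
    return join(("$k=$v" for (k, v) in d), ", ")
end

# Specialized code is still an option: this method skips the
# OrderedDict intermediate step for DBRow.
describe(row::DBRow) =
    join(("$k=$v" for (k, v) in zip(row.names, row.values)), ", ")
```

With this, describe(FrameRow([:a, :b], Any[1, "x"])) goes through the OrderedDict, while describe(DBRow((:a, :b), (1, "x"))) dispatches to the specialized method. The caller does not need to know which path was taken.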

On Friday, September 12, 2014 3:26:47 PM UTC-5, John Myles White wrote:
>
> I'm not sure that losing zero copy semantics is actually a big performance 
> hit in most pipelines.
>
> I think much more important is that you can't write generic code right now 
> because the abstractions aren't linked in any way. The rows you fetch from 
> a database using DBI aren't mutable, whereas the rows you fetch using 
> eachrow(df) are.
>
>  -- John
>
> On Sep 12, 2014, at 1:08 PM, Gray Calhoun <gcal...@iastate.edu> wrote:
>
> It seems like standardizing on "convert" would be a natural approach when 
> one needs to go from one to the other. I don't know the DBI semantics, but
>
>   myrow = convert(Dict, mydataframerow)
>   myrow2 = convert(OrderedDict, mydataframerow), 
>
> etc. is transparent and lets different data storage objects use efficient 
> representations internally (losing "zero copy semantics" is a huge 
> sacrifice.)
>
> It's also easier to enforce in future packages: much simpler to add 
> convert methods than to re-represent rows as OrderedDicts (or whatever 
> datatype).
>
> On Friday, September 12, 2014 12:19:47 PM UTC-5, John Myles White wrote:
>>
>> We really need to standardize on a single type that reflects a single row 
>> of a tabular data structure that gets used both by DBI and by DataFrames.
>>
>> DataFrameRow is really nice because it's a zero-copy operation for 
>> DataFrames, but we can't provide zero-copy semantics when pulling rows out 
>> of a database.
>>
>> I tend to think we should have all tabular data systems use an 
>> OrderedDict to represent a single row of data.
>>
> [...] 
>
>
>