For testing/evaluation purposes, I don't actually need to use any of the 
fixed-length string fields. They are in the data, but I have numerical 
encodings for most of the important ones in other fields. So in playing 
around I found that I could just create a bitstype of the appropriate size 
to consume them when using test data. 

The caveat to this strategy is that numpy lays out a record array as 
packed structs, both in memory and when it's saved to a file. So a record 
with fields float64, int32, float32, int16 has an itemsize of 18 bytes. If 
I build an immutable composite type in Julia:

immutable TestType
    a::Float64
    b::Int32
    c::Float32
    d::Int16
end

And then query the size of each field and of the aggregate as a whole:

println(sizeof(Float64))
println(sizeof(Int32))
println(sizeof(Float32))
println(sizeof(Int16))
println(sizeof(TestType))

I get:

8
4
4
2
24

So it looks like Julia is padding the internal layout of `TestType` (I was 
hoping this wasn't the case based on some of the language in:
http://julialang.org/blog/2013/03/efficient-aggregates/). 
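
For reference, the 18 vs 24 byte arithmetic can be reproduced with Python's 
standard `struct` module. This is only a sketch of the size calculation, not 
Julia code, and the aligned size is an assumption about typical 64-bit 
platforms where a double is 8-byte aligned:

```python
import struct

# Field order from the example record: float64, int32, float32, int16.
FIELDS = "difh"

# "=" selects standard sizes with no alignment padding (a packed layout).
packed_size = struct.calcsize("=" + FIELDS)

# "@" selects native alignment; the trailing "0d" pads the end of the record
# to the alignment of a double, as a C compiler does for an array of structs.
aligned_size = struct.calcsize("@" + FIELDS + "0d")

print(packed_size)   # 18
print(aligned_size)  # typically 24 on 64-bit platforms
```

The fields themselves fall on naturally aligned offsets here (0, 8, 12, 16), 
so the difference comes entirely from trailing padding after the int16.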

If I add dummy fields to mimic the padding when I create some test data, I 
can get everything to work just fine on the Julia side. However, for real 
data I have to assume that I'm handed a file and can't pick packed vs 
padded. Is there any way on the Julia side to specify that my immutable 
type's memory layout should be packed? Or perhaps some other workaround? I 
couldn't find any promising leads on this from reading the documentation 
and searching through the GitHub issues. In Cython this is trivial to do, 
since you can define an actual struct as `cdef packed struct`. 
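
To make the kind of workaround I have in mind concrete: one option that 
avoids depending on the in-memory struct layout at all is to decode each 
packed record field by field. Below is a minimal Python sketch of that idea 
using the standard `struct` module. The little-endian byte order, the 
`read_records` helper, and the sample values are all assumptions for 
illustration, and this says nothing about how Julia itself would do it:

```python
import struct

# Hypothetical packed record: float64, int32, float32, int16.
# "<" means little-endian, standard sizes, and no padding between fields.
RECORD = struct.Struct("<difh")


def read_records(buf):
    """Decode a buffer of back-to-back packed records into tuples."""
    if len(buf) % RECORD.size != 0:
        raise ValueError("buffer length is not a multiple of the record size")
    return list(RECORD.iter_unpack(buf))


# Build two records of test data in the packed layout, then decode them.
raw = RECORD.pack(1.5, 7, 0.25, -3) + RECORD.pack(2.5, 8, 0.5, 4)
records = read_records(raw)

print(RECORD.size)  # 18
print(records)      # [(1.5, 7, 0.25, -3), (2.5, 8, 0.5, 4)]
```

This trades the zero-copy convenience of a memory-mapped array of structs 
for an explicit decode step, so it is a fallback rather than a replacement.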

I forgot to mention it when I posted originally, but I'm currently using 
Julia v0.3.2 on OS X.

Any suggestions would be appreciated.

Josh


On Friday, November 21, 2014 3:11:32 PM UTC-5, Tim Holy wrote:
>
> You'll see why if you type `methods(mmap_array)`: the dims has to be 
> represented as a tuple. 
>
> Currently, the only way I know of to create a fixed-sized buffer as an 
> element of a "struct" in julia is via immutables with one field per 
> object. Here's one example: 
>
> https://github.com/JuliaGPU/CUDArt.jl/blob/1742a19b35a52ecec4ee14cfbec823f8bcb22e0f/gen/gen_libcudart_h.jl#L403-L660
>  
>
> It has not escaped notice that this is less than ideal :-). 
>
> --Tim 
>
> On Friday, November 21, 2014 11:57:10 AM Joshua Adelman wrote: 
> > I'm playing around with Julia for the first time in an attempt to see if 
> > I can replace a Python + Cython component of a system I'm building. 
> > Basically I have a file of bytes representing a numpy 
> > structured/recarray (in memory this is an array of structs). This gets 
> > memory mapped into a numpy array as (Python code): 
> > 
> > f = open(data_file, 'r+') 
> > cmap = mmap.mmap(f.fileno(), nbytes) 
> > data_array = np.ndarray(size, dtype=dtype, buffer=cmap) 
> > 
> > 
> > where dtype=[('x', np.int32), ('y', np.float64), ('name', 'S17')]. 
> > 
> > In Cython I would create a C packed struct, and to deal with the 
> > fixed-length string elements I would specify them as char[N] arrays: 
> > 
> > cdef packed struct atype: 
> >     np.int32_t x 
> >     np.float64_t y 
> >     char[17] name 
> > 
> > I'm trying to figure out how I would accomplish something similar in 
> > Julia. Setting aside the issue of the fixed length strings for a moment, 
> > I thought to initially create a composite type: 
> > 
> > immutable AType 
> >     x::Int32 
> >     y::Float64 
> >     name::??? 
> > end 
> > 
> > and then if I had a file containing 20 records use: 
> > 
> > f = open("test1.dat", "r") 
> > data = mmap_array(AType, 20, f) 
> > 
> > but I get an error: 
> > 
> > ERROR: `mmap_array` has no method matching mmap_array(::Type{AType}, 
> > ::Int64, ::IOStream) 
> > 
> > Is there a way to memory map a file into an array of custom 
> > records/composite types in Julia? And if there is, how should one 
> > represent the fixed length string fields? 
> > 
> > Any suggestions would be much appreciated. 
> > 
> > Josh 
>
>
