Re: [julia-users] Re: zero-allocation reinterpretation of bytes

Jameson Nash Wed, 25 Mar 2015 21:24:30 -0700

> Given the performance difference and the different behavior, I'm tempted
> to just deprecate the two-argument form of pointer.


Let's try to be aware of the fact that there is no performance
difference before we throw out any wild claims about function calls being
problematic or slow:

julia> g(x) = for i = 1:1e6 pointer(x,12) end
g (generic function with 1 method)

julia> h(x) = for i = 1:1e6 pointer(x)+12*sizeof(x) end
h (generic function with 1 method)

julia> @time g(Int8[])
elapsed time: 0.451235329 seconds (144 bytes allocated)

julia> @time h(Int8[])
elapsed time: 0.450592699 seconds (144 bytes allocated)

> There's a branch in eltype, which is probably causing this difference.

That branch is of the form `if true`, so it will get optimized away. (there
is a performance gap still to calling sizeof, but it stems from a current
limitation of the julia codegen/inference, and not anything major)

> To more closely follow the principle of pointer arithmetic long ago
> established by C

C needed to define pointer arithmetic to be equivalent to array access,
because it decided that `a[x]` was defined to be just syntactic sugar for
`*(a+x)`. I don't see how that is really a feature, since it throws away
perfectly good syntax and instead gives you something harder to use. So
instead, Julia defines math-like operations to generally work like math (so
x+1 gives you the pointer to the next byte), and array-like operations work
like array operations (so unsafe_load, pointer, getindex, pointer_to_array,
etc. all operate based on elements). FWIW though, Wikipedia seems to note
that most languages don't define pointer arithmetic at all:
http://en.wikipedia.org/wiki/Pointer_(computer_programming)
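
To make the byte-versus-element distinction concrete, here is a minimal sketch. It assumes current Julia 1.x syntax (`where {T}`, `GC.@preserve`) rather than the 0.3/0.4 syntax used elsewhere in this thread, and `byte_distance` and `pointer_steps` are hypothetical helper names, not anything from Base.

```julia
# Byte distance between two pointers (Ptr - Ptr yields an unsigned integer).
byte_distance(p::Ptr, q::Ptr) = Int(q - p)

# Compare math-like and array-like operations on the same array.
function pointer_steps(a::Vector{T}) where {T}
    # Keep `a` rooted while raw pointers into it are in use.
    GC.@preserve a begin
        p = pointer(a)
        plus_one    = byte_distance(p, p + 1)          # math-like: one byte
        second_elem = byte_distance(p, pointer(a, 2))  # array-like: one element
        loaded      = unsafe_load(p, 2)                # array-like: reads a[2]
    end
    return (plus_one, second_elem, loaded)
end

a = UInt64[10, 20, 30]
pointer_steps(a)
```

For a `Vector{UInt64}` the first two entries come out as 1 and 8: `p + 1` moves one byte, while `pointer(a, 2)` and `unsafe_load(p, 2)` move by `sizeof(UInt64)`.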

For your purposes, I believe you should be able to dispense with pointers
entirely by reading the data from a file (or IOBuffer) and using StrPack.jl
to deal with any specific alignment issues you may encounter.
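
A rough sketch of the pointer-free route, using only Base (it does not use StrPack.jl, whose API is not shown here). It assumes current Julia 1.x syntax and native byte order; `read_at` is a hypothetical name. Note that wrapping the bytes in a fresh IOBuffer on every call allocates, so in a hot loop one would more likely create the buffer once and `seek` within it.

```julia
# Read one value of primitive type T starting at a zero-based byte offset.
function read_at(::Type{T}, data::Vector{UInt8}, offset::Integer) where {T}
    io = IOBuffer(data)   # read-only view over the bytes, no copy of `data`
    seek(io, offset)      # IOBuffer positions are zero-based byte offsets
    return read(io, T)    # native-endian read of sizeof(T) bytes
end

# Build some sample bytes: an Int32 followed by a Float32.
buf = IOBuffer()
write(buf, Int32(7))
write(buf, Float32(2.5))
bytes = take!(buf)

read_at(Int32, bytes, 0)
read_at(Float32, bytes, 4)
```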

On Wed, Mar 25, 2015 at 9:07 AM Stefan Karpinski <ste...@karpinski.org>
wrote:

> Given the performance difference and the different behavior, I'm tempted
> to just deprecate the two-argument form of pointer.
>
> On Wed, Mar 25, 2015 at 12:53 PM, Sebastian Good <
> sebast...@palladiumconsulting.com> wrote:
>
>> I guess what I find most confusing is that there would be a difference,
>> since adding 1 to a pointer only adds one byte, not one element size.
>>
>> > p1 = pointer(zeros(UInt64));
>> Ptr{UInt64} @0x000000010b28c360
>> > p1 + 1
>> Ptr{UInt64} @0x000000010b28c361
>>
>> I would have expected the latter to end in 68. The two-argument pointer
>> function gets this “right”.
>>
>> > a=zeros(UInt64);
>> > pointer(a,1)
>> Ptr{Int64} @0x000000010b9c72e0
>> > pointer(a,2)
>> Ptr{Int64} @0x000000010b9c72e8
>>
>> I can see arguments multiple ways, but when I’m given a strongly typed
>> pointer (Ptr{T}), I would expect it to participate in arithmetic in
>> increments of sizeof(T).
>>
>> On March 25, 2015 at 6:36:37 AM, Stefan Karpinski (ste...@karpinski.org)
>> wrote:
>>
>> That does seem to be the issue. It's tricky to fix since you can't
>> evaluate sizeof(Ptr) unless the condition is true.
>>
>> On Tue, Mar 24, 2015 at 7:13 PM, Stefan Karpinski <ste...@karpinski.org>
>> wrote:
>>
>>> There's a branch in eltype, which is probably causing this difference.
>>>
>>> On Tue, Mar 24, 2015 at 7:00 PM, Sebastian Good <
>>> sebast...@palladiumconsulting.com> wrote:
>>>
>>>>  Yep, that’s done it. The only difference I can see in the code I
>>>> wrote before and this code is that previously I had
>>>>
>>>> convert(Ptr{T}, pointer(raw, byte_number))
>>>>
>>>>  whereas here we have
>>>>
>>>> convert(Ptr{T}, pointer(raw) + byte_number - 1)
>>>>
>>>> The former construction seems to emit a call to a Julia-intrinsic
>>>> function, while the latter executes the more expected simple machine loads.
>>>> Is there a subtle difference between the two calls to pointer?
>>>>
>>>> Thanks all for your help!
>>>>
>>>> On March 24, 2015 at 12:19:00 PM, Matt Bauman (mbau...@gmail.com)
>>>> wrote:
>>>>
>>>>  (The key is to ensure that the method gets specialized for different
>>>> types with the parametric `::Type{T}` in the signature instead of
>>>> `T::DataType`).
>>>>
>>>> On Tuesday, March 24, 2015 at 12:10:59 PM UTC-4, Stefan Karpinski
>>>> wrote:
>>>>>
>>>>> This seems like it works fine to me (on both 0.3 and 0.4):
>>>>>
>>>>>  immutable Test
>>>>> x::Float32
>>>>> y::Int64
>>>>> z::Int8
>>>>> end
>>>>>
>>>>>  julia> a = [Test(1,2,3)]
>>>>> 1-element Array{Test,1}:
>>>>>  Test(1.0f0,2,3)
>>>>>
>>>>> julia> b = copy(reinterpret(UInt8, a))
>>>>> 24-element Array{UInt8,1}:
>>>>>  0x00
>>>>>  0x00
>>>>>  0x80
>>>>>  0x3f
>>>>>  0x03
>>>>>  0x00
>>>>>  0x00
>>>>>  0x00
>>>>>  0x02
>>>>>  0x00
>>>>>  0x00
>>>>>  0x00
>>>>>  0x00
>>>>>  0x00
>>>>>  0x00
>>>>>  0x00
>>>>>  0x03
>>>>>  0xe0
>>>>>  0x82
>>>>>  0x10
>>>>>  0x01
>>>>>  0x00
>>>>>  0x00
>>>>>  0x00
>>>>>
>>>>> julia> prim_read{T}(::Type{T}, data::Array{Uint8,1}, offset::Int) =
>>>>> unsafe_load(convert(Ptr{T}, pointer(data) + offset))
>>>>> prim_read (generic function with 1 method)
>>>>>
>>>>> julia> prim_read(Test, b, 0)
>>>>> Test(1.0f0,2,3)
>>>>>
>>>>> julia> @code_native prim_read(Test, b, 0)
>>>>> .section __TEXT,__text,regular,pure_instructions
>>>>> Filename: none
>>>>> Source line: 1
>>>>> push RBP
>>>>> mov RBP, RSP
>>>>> Source line: 1
>>>>> mov RCX, QWORD PTR [RSI + 8]
>>>>> vmovss XMM0, DWORD PTR [RCX + RDX]
>>>>> mov RAX, QWORD PTR [RCX + RDX + 8]
>>>>> mov DL, BYTE PTR [RCX + RDX + 16]
>>>>> pop RBP
>>>>> ret
>>>>>
>>>>>
>>>>> On Tue, Mar 24, 2015 at 5:04 PM, Simon Danisch <sdan...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> There is a high chance that I simply don't understand llvmcall well
>>>>>> enough, though ;)
>>>>>>
>>>>>> Am Montag, 23. März 2015 20:20:09 UTC+1 schrieb Sebastian Good:
>>>>>>>
>>>>>>> I'm trying to read some binary formatted data. In C, I would define
>>>>>>> an appropriately padded struct and cast away. Is it possible to do
>>>>>>> something similar in Julia, though for only one value at a time?
>>>>>>> Philosophically, I'd like to approximate the following, for some simple
>>>>>>> bittypes T (Int32, Float32, etc.)
>>>>>>>
>>>>>>> T read<T>(char* data, size_t offset) { return *(T*)(data + offset); }
>>>>>>>
>>>>>>> The transliteration of this brain-dead approach results in the
>>>>>>> following, which seems to allocate a boxed Pointer object on every
>>>>>>> invocation. The pointer function comes with ample warnings about how it
>>>>>>> shouldn't be used, and I imagine that it's not polite to the garbage
>>>>>>> collector.
>>>>>>>
>>>>>>>  prim_read{T}(::Type{T}, data::AbstractArray{Uint8, 1}, byte_number)
>>>>>>> = unsafe_load(convert(Ptr{T}, pointer(data, byte_number)))
>>>>>>>
>>>>>>> I can reinterpret the whole array, but this will involve a division
>>>>>>> of the offset to calculate the new offset relative to the reinterpreted
>>>>>>> array, and it allocates an array object.
>>>>>>>
>>>>>>> Is there a better way to simply read the machine word at a
>>>>>>> particular offset in a byte array? I would think it should inline to a
>>>>>>> single assembly instruction if done right.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>
