On Monday, September 26, 2016 at 1:59:15 PM UTC, David Smith wrote:
>
> Hi, Isaiah. This is a valid question.
>
> 0. As a preface, I'd like to say I'm not trying to replace anything. I 
> wrote RawArray to solve a problem we have in magnetic resonance imaging 
> (quickly saving and loading large complex float arrays), and then I decided 
> to share it so if other people like it and find it useful, then cool beans.
>
> Now for the mild stumping...
>
> 1. I don't think NRRD is as substantially used as you might think. I've 
> worked in imaging science for years on the data processing/file format end, 
> and I've never seen anyone use it, and I've never even heard of it.  (Pity, 
> because it looks nice enough. :-\)
>
> 2. RawArray is simpler to handle and trivial to understand. I believe all 
> you need from an I/O library is I/O.* I don't want my file I/O library 
> performing transformations on my data. 
>
> I also don't need it to read image formats. Part of the reason behind 
> RawArray is to avoid standard image formats because they are not optimized 
> for large complex-float arrays. I just want to save multi-GB data arrays to 
> disk quickly and read them back quickly on a different machine, five years 
> later. 
>
> I have other implementations (https://github.com/davidssmith/ra), and all 
> are super short and platform agnostic.
>
> 3. RawArray is surely faster. All it does is read. It doesn't perform any 
> transformations or encoding, so it can't possibly be slower than NRRD.
>

Maybe not compared to NRRD, but it can be slower than a format with 
lossless image compression, since fewer bytes have to be read from disk.
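
A rough way to check this on your own data is to compare the raw byte count 
with the losslessly compressed byte count. The sketch below is only an 
illustration under assumptions: it uses zlib from the Python standard library 
as a stand-in for FLIF (which has no standard-library binding), and the test 
data is a made-up smooth 16-bit ramp. Noisy complex-float MRI data will 
likely compress much less well, so the trade-off may go the other way.

```python
import array
import zlib

# Hypothetical test data: a smooth 16-bit ramp standing in for image-like data.
# Real measurement data (e.g. noisy complex floats) usually compresses far less.
samples = array.array("H", (i // 64 for i in range(1_000_000)))
raw = samples.tobytes()

# zlib (DEFLATE) is lossless; it is used here only as a stand-in for FLIF.
packed = zlib.compress(raw, 6)
restored = zlib.decompress(packed)

# Lossless means the round trip reproduces the input byte for byte.
assert restored == raw

ratio = len(packed) / len(raw)
print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes, ratio: {ratio:.3f}")
```

Whether the smaller file actually loads faster depends on how the decode 
time compares with the disk or network time saved, so it has to be measured 
on the target machine.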

I did read this (short, and good):
https://github.com/davidssmith/ra/blob/master/doc/ra-sedona-abstract.pdf

https://en.wikipedia.org/wiki/Free_Lossless_Image_Format

FLIF is not a replacement for all uses (it is not multidimensional; it would 
be interesting to know whether it could be extended to that), but it seems 
to be the best option for lossless image compression:

http://flif.info/index.html
"
    53% smaller than lossless JPEG 2000 compression,
    74% smaller than lossless JPEG XR compression.

Even if the best image format was picked out of PNG, JPEG 2000, WebP or BPG 
for a given image corpus, depending on the type of images (photograph, line 
art, 8 bit or higher bit depth, etc), then FLIF still beats that by 12% on 
a median corpus 
[..]
    FLIF does away with knowing what image format performs the best at any 
given task.
[..]
Other lossless formats also support progressive decoding (e.g. PNG with 
Adam7 interlacing), but FLIF is better at it. Here is a simple 
demonstration video, which shows an image as it is slowly being downloaded:
[..]
No patents, Free

    Unlike some other image formats (e.g. BPG and JPEG 2000), FLIF is 
completely royalty-free and it is not known to be encumbered by software 
patents. At least as far as we know. FLIF uses arithmetic coding, just 
like FFV1 (which inspired FLIF), but as far as we know, all patents related 
to arithmetic coding are expired. Other than that, we do not think FLIF 
uses any techniques on which patents are claimed. However, we are not 
lawyers. There are a stunning number of software patents, some of which are 
very broad and vague; it is impossible to read them all, let alone 
guarantee that nobody will ever claim part of FLIF to be covered by some 
patent. All we know is that we did not knowingly use any technique which is 
(still) patented, and we did not patent FLIF ourselves either.

    The reference implementation of FLIF is Free Software. It is released 
under the terms of the GNU Lesser General Public License (LGPL), version 3 
or any later version.
[..]
    The reference FLIF decoder is also available as a shared library, 
released under the more permissive (non-copyleft) terms of the Apache 2.0 
license. Public domain example code is available to illustrate how to use 
the decoder library.

    Moreover, the reference implementation is available free of charge 
(gratis) under these terms.
[..]
FLIF currently has the following features:

    Lossless compression
    Lossy compression (encoder preprocessing option, format itself is 
lossless so no generation loss)
    Greyscale, RGB, RGBA (also palette and color-bucket modes)
    Color depth: up to 16 bits per channel (high bit depth)"

-- 
Palli.

> There is a C library at (https://github.com/davidssmith/ra) if you think a 
> pure Julia implementation isn't fast enough. 
>
> Cheers,
> Dave
>
> [*] That said, I'm not completely ruling out having transformations 
> available in RawArray between the RAM and disk. For example, when I first 
> wrote it, I had included Blosc compression as an option, signaled by a flag 
> in the header. But in general most transformations are best made in RAM 
> after reading or on disk with already existing, battle-proven tools, such 
> as gzip, uuencode, tar, etc. 
>
>
> On Sunday, September 25, 2016 at 9:59:45 PM UTC-5, Isaiah wrote:
>>
>> Is there a reason to use this file format over NRRD [1]? To borrow a wise 
>> phrasing: I wonder if the world needs another lightweight raw data format ;)
>>
>> For what it's worth, NRRD is already supported by JuliaIO/Images.jl, and 
>> I believe addresses the use-cases identified in your readme, but with a 
>> number of technical and non-technical advantages (not least: a number of 
>> independent implementations, and a substantial user base, at least as far 
>> as these things go).
>>
>> I say this -- very selfishly I admit -- as someone who has been on the 
>> receiving end of far too many files in home-brewed formats.
>>
>> [1] http://teem.sourceforge.net/nrrd/descformat.html
>>
>> On Sunday, September 25, 2016, David Smith <david...@gmail.com> wrote:
>>
>>> Hi, all:
>>>
>>> I finally pushed this out, and it might satisfy some of your needs for a 
>>> simple way to store N-d arrays to disk. Hope you enjoy it.
>>>
>>> RawArray (.ra) is a simple file format for storing n-dimensional arrays. 
>>> RawArray was designed to be portable, fast, storage efficient, and future 
>>> proof. Basically it writes the binary array data directly to disk with a 
>>> short header that is used to recreate type and dimension information. 
>>>
>>> RawArray is faster than HDF5 and supports complex numbers out of the 
>>> box, which HDF5 does not. RawArray supports all basic `Int`, `UInt`, 
>>> `Float`, and `Complex{}` types, and more can be easily added in the future, 
>>> such as Rational or Big*. It can also handle derived types, but the 
>>> serialization of them is currently left up to the user.
>>>
>>> A system of version numbers and flags is implemented to future-proof 
>>> the data files as well, in case the implementation needs to change for some 
>>> reason.
>>>
>>> You can grab it with `Pkg.add("RawArray")`. A minimum of Julia 0.4 is 
>>> required.
>>>
>>> Repository: https://github.com/davidssmith/RawArray.jl
>>>
>>> Cheers,
>>> Dave
>>>
>>
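
For readers new to the idea in the original announcement (a short header 
followed by the raw binary array data), here is a minimal sketch. Everything 
in it is an assumption made for illustration: the tag, the header fields, 
and the names write_raw and read_raw are hypothetical, and this is NOT the 
actual .ra layout. See the RawArray repository for the real format.

```python
import io
import math
import struct

MAGIC = b"RAWSKTCH"  # hypothetical 8-byte tag, not the real .ra magic number


def write_raw(f, dims, values):
    """Write float64 values as: tag, number of dims, dims, then raw data."""
    if math.prod(dims) != len(values):
        raise ValueError("dims do not match the number of values")
    f.write(MAGIC)
    f.write(struct.pack("<Q", len(dims)))            # number of dimensions
    f.write(struct.pack(f"<{len(dims)}Q", *dims))    # size of each dimension
    f.write(struct.pack(f"<{len(values)}d", *values))  # little-endian float64


def read_raw(f):
    """Read back what write_raw wrote; returns (dims, values)."""
    if f.read(8) != MAGIC:
        raise ValueError("not a file written by write_raw")
    (ndims,) = struct.unpack("<Q", f.read(8))
    dims = struct.unpack(f"<{ndims}Q", f.read(8 * ndims))
    count = math.prod(dims)
    values = struct.unpack(f"<{count}d", f.read(8 * count))
    return dims, list(values)


# Usage: round-trip a 2x3 array through an in-memory buffer.
buf = io.BytesIO()
write_raw(buf, (2, 3), [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
buf.seek(0)
dims, values = read_raw(buf)
print(dims, values)
```

Because the header states the element count up front, the reader can pull 
the whole payload in one read with no per-element decoding, which is the 
property the thread is arguing about.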
