Hi Seth:

On 15/1/25 15:06, 'Seth Hillbrand' via KiCad Developers wrote:
The data for embedded files follows the SEXPR format (https://datatracker.ietf.org/doc/draft-rivest-sexp/). Base64 is supposed to be bracketed by the pipes.  This allows third-party sexpr parsers to more easily handle our data format when we follow conventions.  We did not do this for the images and that was an oversight.  Eventually, images will be added to the embedded files format and the distinction will go away.
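For readers following along, here is a minimal sketch of what reading such a pipe-bracketed token could look like in Python. The sample fragment is made up for illustration (its payload is just the base64 of "hello world", not real KiCad data), the helper name is hypothetical, and it assumes that whitespace inside the pipes is only line wrapping. Any decompression step that would follow is left out.

```python
import base64
import re

# Hypothetical fragment, NOT real KiCad output: the payload is the base64
# of b"hello world", wrapped over two lines as a long token might be.
SAMPLE = """(data |aGVsbG8g
    d29ybGQ=|)"""


def extract_pipe_token(text: str) -> bytes:
    """Return the decoded bytes of the first |...| token found in text."""
    match = re.search(r"\|([^|]*)\|", text)
    if match is None:
        raise ValueError("no pipe-delimited token found")
    # Assumption: whitespace inside the token is only line wrapping,
    # so it is dropped before decoding.
    payload = "".join(match.group(1).split())
    return base64.b64decode(payload)


if __name__ == "__main__":
    print(extract_pipe_token(SAMPLE))
```

The pipes make the token unambiguous: it is neither a quoted string nor a bare keyword, so a generic parser can treat it as binary data without guessing.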


OK, but note that the image format already changed in the past; a pity it didn't get the correct format then. (Note: data was "xxxx" and changed to xxxx)

This is not the only thing that keeps changing: things like "hide -> (hide yes)" or "(uuid xxxx) -> (uuid "xxxx")" pop up quite often. I guess somebody should be in charge of approving how things are implemented in the file formats, not to mention documenting them before a release, and I mean documenting the new release, not the previous one.

BTW: This is related to the popularity issue. If the format change had been from custom to JSON (IMHO far more popular than Sexp) instead of from custom to Sexp, these errors would not have happened. There are plenty of libs and tools to implement and verify JSON.


We use MurMur3 hash -- unmodified from the source at https://github.com/aappleby/smhasher. You might look at things like https://stackoverflow.com/questions/75921577/murmur3-hash-compatibility-between-go-and-python to determine why your method is different.  Yes it is fast. No it is not worst.  We do have robust and popular hashes. This is one of them.  That is why we use it.


I see we have quite a different idea of what is popular. Let me clarify: if you get a minimal Linux core, let's say the docker image for "debian:bookworm-slim" (a slim version of Debian Bookworm intended to be the base for other docker images), you'll find MD5, SHA256, SHA512, SHA224, SHA384 and a few more hashes available as command line tools. If you take a language like Python (included in KiCad) and take a look at the standard hashlib module you'll find SHA1, SHA224, SHA256, SHA384, SHA512, SHA-3 and MD5. These are popular hash algorithms.
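As a quick way to check the Python side of this, a small sketch using only the standard library follows. The `WANTED` tuple and the `digests` helper are names made up for this example. It assumes a regular CPython build; on restricted (e.g. FIPS) builds MD5 may be refused even though it is listed.

```python
import hashlib

# Algorithm names as hashlib spells them; chosen to match the list above.
WANTED = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512", "sha3_256")


def digests(data: bytes) -> dict[str, str]:
    """Hex digest of data for each wanted algorithm that hashlib guarantees."""
    return {
        name: hashlib.new(name, data).hexdigest()
        for name in WANTED
        if name in hashlib.algorithms_guaranteed
    }


if __name__ == "__main__":
    for name, value in digests(b"example payload").items():
        print(f"{name:9} {value}")
```

No extra package is needed for any of these, which is the point being made.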

Now if you take a look at MMH3 ... even the command line tool is rare and hard to find! It isn't supported by core Python, there is more than one competing module on PyPI, and the most popular one implements MMH2, not MMH3. The one that implements MMH3 isn't popular enough to be part of Debian. For me this isn't a popular hash.

The compression used (Zstandard) is becoming popular, but isn't really popular yet. If you use Base64 + GZip + MD5 your data can be processed by a shell script on most (if not all) modern Unix-style OSs and you don't need extra dependencies for Python.
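To make the suggestion concrete, here is a minimal sketch of the Base64 + GZip + MD5 alternative with only the Python standard library. This is the scheme proposed in this message, not what KiCad does, and the `pack`/`unpack` names are hypothetical.

```python
import base64
import gzip
import hashlib


def pack(raw: bytes) -> tuple[str, str]:
    """Return (base64 text of the gzip-compressed data, MD5 hex digest of raw)."""
    encoded = base64.b64encode(gzip.compress(raw)).decode("ascii")
    return encoded, hashlib.md5(raw).hexdigest()


def unpack(encoded: str, checksum: str) -> bytes:
    """Decode and decompress, then verify the MD5 of the recovered bytes."""
    raw = gzip.decompress(base64.b64decode(encoded))
    if hashlib.md5(raw).hexdigest() != checksum:
        raise ValueError("checksum mismatch")
    return raw


if __name__ == "__main__":
    text, md5sum = pack(b"some embedded file contents")
    print(md5sum)
    print(unpack(text, md5sum))
```

The same steps map onto common command line tools (base64, gunzip, md5sum), which is why a shell script would be enough.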


Bug reports for preferred behavior are great to receive at GitLab.


You mean the image data vs embedded file inconsistency?


Regards, SET



Seth Hillbrand
*Lead Developer*
+1-530-302-5483‬
Long Beach, CA
www.kipro-pcb.com <https://www.kipro-pcb.com/> [email protected]



On Wed, Jan 15, 2025 at 5:03 AM Salvador E. Tropea <[email protected]> wrote:

    Hi All!

    Given the lack of documentation, I have some questions:

    1) Why does the data for embedded files seem to be so different from the
    data for images? I mean, images are base64 encoded and stored as (data
    STRING) with the string separated in chunks; before KiCad 8 it wasn't a
    string, so it looked like keywords. KiCad 8 fixed it. And now embedded
    files are (data |KEYWORDS|) ... Why the |? Why not strings? Can someone
    explain it?

    2) The checksum seems to be a really rare one, is it MurMur Hash 3 with
    seed 0xABBA2345? I can't find a popular command line tool to verify it.
    I tried the "mmh3" Python module using `mmh3.hash128(c,
    seed=0xABBA2345)` (with c as the bytes from the file decoded and
    uncompressed) and couldn't reproduce the checksum. I guess this rare
    hash is fast, is it worst? I mean: we have various robust and popular
    hashes, why this?

    BTW: I find it strange that after choosing to embed fonts the dialog
    doesn't immediately show them. They are there when I save, but I think
    they should be there before.

    Regards, SET

-- You received this message because you are subscribed to the Google
    Groups "KiCad Developers" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected]
    <mailto:devlist%[email protected]>.
    To view this discussion visit
    
https://groups.google.com/a/kicad.org/d/msgid/devlist/b4479a17-148b-490d-8058-4c82225e4e11%40inti.gob.ar.


