Thank you all for the great info! I've opened up a ticket here:
https://issues.apache.org/jira/browse/AVRO-2644

In the meantime, I've passed the files in explicitly, which works well.
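
For anyone who lands on this thread later, here is a minimal Java sketch of one way to build that explicit file list in a reproducible order. It is only an illustration, not what avro-tools does internally: the class name SortedSchemaArgs and the method sortedSchemaFiles are made up for this example, and it assumes all the .avsc files sit directly in one directory. Sorting by name removes the dependence on the filesystem's listing order, but it is not dependency-aware; it works for the example in this thread only because Component happens to sort before Parent.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of avro-tools.
public class SortedSchemaArgs {

    // Returns the .avsc files directly under dir, ordered by file name.
    // String.compareTo compares UTF-16 code units, so the result does not
    // depend on the OS, the filesystem, or LANG/LC_COLLATE.
    static List<String> sortedSchemaFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(p -> p.getFileName().toString().endsWith(".avsc"))
                .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
                .map(Path::toString)
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Directory defaults to "schemas" (an assumption matching this thread).
        Path dir = Paths.get(args.length > 0 ? args[0] : "schemas");
        // Print the files space-separated, ready to place before out-dir/.
        System.out.println(String.join(" ", sortedSchemaFiles(dir)));
    }
}
```

The printed list can be passed to the compile command ahead of the output directory. Because the comparison is by code unit, names starting with an uppercase letter sort before names starting with a lowercase one, much like LANG=C.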

Thanks again,
Austin

On Thu, Nov 28, 2019 at 10:43 AM Lee Hambley <lee.hamb...@gmail.com> wrote:

> `info sort` (which is mostly talking about the `sort` POSIX command)
> claims that LC_COLLATE is what affects sort order on POSIX-like systems,
> though LC_ALL, when set, overrides it, and LANG is the fallback when
> neither is set.
>
> "C" is, as Michael says, the "safest" and/or most predictable one: it sorts
> by a naive reading of the underlying bytes, with no lexical sorting of
> numbers, and uppercase letters sort earlier than their lowercase
> counterparts (because they have smaller byte values).
>
> Hope this helps
>
> Lee Hambley
> http://lee.hambley.name/
> +49 (0) 170 298 5667
>
>
> On Thu, 28 Nov 2019 at 15:56, Michael A. Smith <mich...@smith-li.com>
> wrote:
>
>> On unixish systems it probably depends on the locale, as in LANG and
>> LC_COLLATE. In my experience, the least surprising behavior comes with
>> LANG=C, except when you're dealing with file names containing a lot of
>> non-ascii text.
>>
>> On Thu, Nov 28, 2019 at 04:29 Ryan Skraba <r...@skraba.com> wrote:
>>
>>> Effectively, the schemas are added in the order that the file system
>>> lists files:
>>> https://github.com/apache/avro/blob/f310ac8db5ab962a49d448f41b7b953488cdb033/lang/java/tools/src/main/java/org/apache/avro/tool/SpecificCompilerTool.java#L149
>>>
>>> As you observed, this depends on the operating system and/or
>>> filesystem... I've experienced this in the past (with an unrelated
>>> tool that generated a classpath from a list of JARs, and seeing an
>>> unreliable order on Windows vs. Linux).
>>>
>>> Just reading the code, it should be deterministic if you explicitly
>>> list the avsc files (or at least the "problem" file) with the
>>> required order:
>>>
>>> java -jar avro-tools-1.9.1.jar compile schemas/Component.avsc
>>> schemas/Parent.avsc out-dir/
>>>
>>> or
>>>
>>> java -jar avro-tools-1.9.1.jar compile schemas/Component.avsc schemas/
>>> out-dir/
>>>
>>> Would it be possible to give this workaround a try?
>>>
>>> I took a quick look at the avro-maven-plugin; it doesn't use
>>> listFiles() directly to discover files, but uses FileSetManager from
>>> the Maven project. I'm hoping they've taken this into account!
>>>
>>> Thanks for the well-described, well-defined email!  It would make an
>>> excellent bug report :D  https://issues.apache.org/jira/browse/AVRO
>>>
>>> Ryan
>>>
>>>
>>> On Thu, Nov 28, 2019 at 12:05 AM Austin Cawley-Edwards
>>> <austin.caw...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > We're trying to use the `compile {src dir} {output dir}` command in
>>> > `avro-tools` and finding that there are some non-deterministic
>>> > behaviors between systems, depending on how the OS sorts files.
>>> >
>>> > Example:
>>> > schemas/Component.avsc
>>> >   - defines Component record type in the namespace `com.test`
>>> >
>>> > schemas/Parent.avsc
>>> >   - defines a Parent record in the same `com.test` namespace, with a
>>> > field of type `com.test.Component`
>>> >
>>> >
>>> > With the command, `java -jar avro-tools-1.9.1.jar compile schemas/
>>> > out-dir/`, some systems compile the directory in the order Component,
>>> > Parent while others compile in the order Parent, Component. The latter
>>> > fails as Component has not been defined when it is referenced by
>>> > Parent.
>>> >
>>> > We have also tried using the IDL and importing the dependency types,
>>> > and then converting them to avsc, and finally compiling the entire
>>> > directory, but that fails as the generated avsc files embed/duplicate
>>> > the "Component" type each time it is used.
>>> >
>>> >
>>> > Is there a way to deterministically compile a directory? Or compile
>>> > directly from IDL to java?
>>> >
>>> >
>>> > OS:
>>> > Linux 857aaf92e059 4.15.0-70-generic #79-Ubuntu SMP Tue Nov 12
>>> > 10:36:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
>>> >
>>> > Avro:
>>> > version 1.9.1
>>> >
>>> >
>>> >
>>> > Thank you!
>>> > Austin Cawley-Edwards
>>>
>>
