[ 
https://issues.apache.org/jira/browse/AVRO-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901050#action_12901050
 ] 

Thiruvalluvan M. G. commented on AVRO-533:
------------------------------------------

Hi Jeremy,

I could get the code to compile on my machine. I was trying to understand the 
code from the bottom. Since the schema and encoding/decoding at the very 
bottom, the following comments apply to those parts of the code. Please bear 
with me, it'll take a while to go through the whole codebase. Since we wish and 
hope that the code will live for several years, a couple of weeks spent in 
getting it right should be worth it.

Most of the tests passed on my machine, some failed. One test even crashes 
NTest UI.

The class hierarchy of Schema is:
{code}
Schema
    ArraySchema
    MapSchema
    NamedSchema
        EnumSchema
        FixedSchema        
        RecordSchema
            ErrorSchema
    PrimitiveSchema
        BooleanSchema
        NullSchema
    UnionSchema
{code}

It's not clear why only NullSchema and BooleanSchema are specified and the rest 
of the primitive schemas are left alone. I think a a more appropriate hierarchy 
would be:

{code}
Schema
    NamedSchema
        EnumSchema
        FixedSchema        
        RecordSchema/ErrorSchema
    UnnamedSchema
        ArraySchema
        MapSchema
        PrimitiveSchema
        UnionSchema
{code}

Instead of having SchemaType as a string, it'd be better to have it as an enum, 
which, unlike string, is close-ended.

Testing appears inadequate. In TestSchema, for example the test only covers 
parsing.

In the Encoder and Decoder interfaces, we are accepting the Stream in every 
function. Typically, a lot of encoding/decoding happens on the same stream. So 
attaching the stream the interface once will be better, the way we do in the 
Java implementation.

I see some files have both Windows and Unix line-endings. I don't know where 
the problem is. Do you think keeping Windows line-ending throughout will keep 
us from these troubles?

Most of the places the tabs are expanded into spaces, which is perfect. But 
certain tabs seem to have escaped. We try to avoid hard-tabs.

We can save quite a bit of code if we use parameterized tests. There is attempt 
to achieve parameterization in TestSchema. But it leads to some problems, I 
think. When tests fail, it becomes hard to find which test has failed.

I tired to capture all the above ideas (regarding the Schema not 
encoder/decoder) by refactoring the code. I kept it at 
g...@github.com:thirumg/Avro.NET.git in the master branch.
Please feel free to pull it and try. I have commented out certain tests and 
code in Avrogen because of this refactroing. Please don't mistake me, my 
intention was not to arbitrarily change your code. I thought some ideas are 
better shown by demonstration rather than pages and pages of text.

Please let me know what you think.

Do you have any suggestion for a code coverage tool? It'll be nice to know how 
much coverage we get with our tests.

Thanks

Thiru

   

> .NET implementation of Avro
> ---------------------------
>
>                 Key: AVRO-533
>                 URL: https://issues.apache.org/jira/browse/AVRO-533
>             Project: Avro
>          Issue Type: New Feature
>    Affects Versions: 1.4.0
>            Reporter: Jeff Hammerbacher
>         Attachments: AVRO-533.zip, dotnet.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to