[ 
https://issues.apache.org/jira/browse/AVRO-659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907288#action_12907288
 ] 

Doug Cutting commented on AVRO-659:
-----------------------------------

Jeff, I'm still trying to understand the use case you have in mind.

Most folks writing data to files should use an Avro data file, which includes 
the schema.  If folks are doing RPC, then the protocol they use to write data 
is typically a file in their source code tree, and the protocol they use to 
read data is determined through the handshake.   If folks are writing 
individual records to a database then a best practice is to maintain a registry 
of schemas used in the database as a separate table, and have each instance 
refer to its schema in the registry via its MD5 hash.  The application would 
still probably store or create the schemas it uses for new database records 
with the source code.  The registry is updated when writing records and 
accessed when reading them.
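
As a rough illustration of the database case, here is a minimal sketch of such a registry. It is not part of Avro: the table names, the function names, and the use of SQLite are all assumptions made for the example. The fingerprint is the MD5 of a key-sorted JSON dump, which is a simplification and not Avro's own canonical form. The payload is treated as opaque bytes standing in for Avro-encoded data.

```python
import hashlib
import json
import sqlite3


def schema_fingerprint(schema: dict) -> str:
    """Return the MD5 hex digest of a normalized JSON rendering of the schema."""
    # Assumption: key-sorted, whitespace-free JSON is "canonical enough" here.
    # This is NOT Avro's Parsing Canonical Form.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with a schema registry table next to the records table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schemas ("
        "md5 TEXT PRIMARY KEY, schema_json TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id INTEGER PRIMARY KEY, "
        "schema_md5 TEXT NOT NULL REFERENCES schemas(md5), "
        "payload BLOB NOT NULL)"
    )
    return conn


def write_record(conn: sqlite3.Connection, schema: dict, payload: bytes) -> int:
    """Register the writer's schema if it is new, then store the record."""
    md5 = schema_fingerprint(schema)
    # The registry is updated on write; a schema already present is left alone.
    conn.execute(
        "INSERT OR IGNORE INTO schemas (md5, schema_json) VALUES (?, ?)",
        (md5, json.dumps(schema)),
    )
    cur = conn.execute(
        "INSERT INTO records (schema_md5, payload) VALUES (?, ?)",
        (md5, payload),
    )
    conn.commit()
    return cur.lastrowid


def read_record(conn: sqlite3.Connection, record_id: int):
    """Return (writer_schema, payload) by following the record's MD5 reference."""
    row = conn.execute(
        "SELECT s.schema_json, r.payload FROM records r "
        "JOIN schemas s ON s.md5 = r.schema_md5 WHERE r.id = ?",
        (record_id,),
    ).fetchone()
    if row is None:
        raise KeyError(record_id)
    return json.loads(row[0]), row[1]
```

Two records written with the same schema share a single registry row, and a reader recovers the writer's schema from the database alone, without needing to know where a schema file lives on disk.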

We do not want to encourage folks to write data without also storing the schema 
used to write that data in the same repository as the data.  I don't feel a 
path-based schema registry is a good idea.  Keeping a copy of the schema with 
source code that writes data is a good practice: the schema is part of the 
writing code and should be versioned with it.  Generating schemas on the fly 
when writing data is a fine practice too.  But whenever data is persisted, its 
schema should be stored with it.

> Portable specification of the location of schema and protocol files
> -------------------------------------------------------------------
>
>                 Key: AVRO-659
>                 URL: https://issues.apache.org/jira/browse/AVRO-659
>             Project: Avro
>          Issue Type: New Feature
>            Reporter: Jeff Hammerbacher
>
> Avro doesn't require code generation, which is great. However, if you want to 
> use a protocol or a schema, your code needs to know where to find it. When 
> your code is ported to new systems, the protocol or schema file must be 
> placed in the same place as on the previous system for things to work 
> correctly.
> For importing modules in a portable fashion, Python provides a default set of 
> places it will look for modules and an environment variable called PYTHONPATH 
> that programs can use to override these defaults. It may be useful to explore 
> similar constructs for Avro implementations that don't do code generation. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
