Re: AVRO schema evolution: adding optional column with default fails deserialization
First of all, you can use Confluent's schema registry as you wish - it's not in the paid bundle, as long as you are not hosting Kafka as a service (i.e. Amazon et al.). And I would recommend you do: it's good and trivial to operate. Second, take a look at the serializer in my pet project at https://github.com/bitbouncer/kspp/blob/master/include/kspp/avro/avro_serdes.h, line 96. Note that this encoder/decoder does not support schema evolution, but it discovers the actual writer schema and gets an "avro::ValidSchema" from the schema registry on read - and this is what you need. This is of course C++, but you can probably figure out what you need to do. In the end you will need a REST/gRPC service somewhere that your serializer can use to get an id that you can refer to across your infrastructure. I did write one some years ago but reverted to Confluent's, since most people use that.

/svante

On Thu, Aug 1, 2019 at 18:05, Martin Mucha wrote:

> Thanks for the answer!
>
> Ad: "which byte[] are we talking about?" - actually I don't know. Please
> let's break it down together.
>
> I'm pretty sure that we're not using the Confluent platform (IIUC the paid
> bundle, right?). I shared a serializer before [1], so you're saying
> that this won't include either the schema ID or the schema itself, OK? Let's assume
> that. Next: we're using the Spring Kafka project to take this serialized data
> and send it over Kafka. So we don't have any schema registry, but in
> principle it could be possible to include the schema within each message. But I
> cannot see how that could be done. Spring Kafka requires us to provide
> org.apache.kafka.clients.producer.ProducerConfig#VALUE_SERIALIZER_CLASS_CONFIG,
> which we did, but it's just a class calling the serializer [1], and from that
> point on I have no idea how it could figure out which schema was used. The question
> I'm asking here is whether, when sending Avro bytes (obtained by the provided
> serializer [1]), they are or can be somehow paired with the schema used to
> serialize the data. Is this what Kafka senders do, or can do? Include the ID/whole
> schema somewhere in the headers, or ...? And when I read Kafka messages, will
> the schema be (or could it be) stored somewhere in the ConsumerRecord or somewhere
> like that?
>
> Sorry for the confused questions, but I'm really missing the knowledge to even ask
> properly.
>
> thanks,
> Martin.
>
> [1]
> public static <T extends GenericContainer> byte[] serialize(T data,
>         boolean useBinaryDecoder, boolean pretty) {
>     try {
>         if (data == null) {
>             return new byte[0];
>         }
>
>         log.debug("data='{}'", data);
>         Schema schema = data.getSchema();
>         ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
>         Encoder binaryEncoder = useBinaryDecoder
>                 ? EncoderFactory.get().binaryEncoder(byteArrayOutputStream, null)
>                 : EncoderFactory.get().jsonEncoder(schema, byteArrayOutputStream, pretty);
>
>         DatumWriter<T> datumWriter = new GenericDatumWriter<>(schema);
>         datumWriter.write(data, binaryEncoder);
>
>         binaryEncoder.flush();
>         byteArrayOutputStream.close();
>
>         byte[] result = byteArrayOutputStream.toByteArray();
>         log.debug("serialized data='{}'", DatatypeConverter.printHexBinary(result));
>         return result;
>     } catch (IOException ex) {
>         throw new SerializationException("Can't serialize data='" + data, ex);
>     }
> }
>
> On Thu, Aug 1, 2019 at 17:06, Svante Karlsson wrote:
>
>> For clarity: What byte[] are we talking about?
>>
>> You are slightly missing my point if we are speaking about Kafka.
>>
>> Confluent encoding: [magic byte 0][schema id (int32, big-endian)][avro binary payload]
>>
>> The avro_binary_payload does not in any case contain the schema or the schema id.
>> The schema id is a Confluent thing. (In an Avro file the schema is prepended
>> by value in the file.)
>>
>> While it's trivial to build a schema registry that, for example, instead
>> gives you an MD5 hash of the schema, you have to use it throughout your
>> infrastructure OR use known reader and writer schemas (i.e. hardcoded).
>>
>> In the Confluent world, id=N is the N+1'th registered schema in the
>> database (a Kafka topic), if I remember right. Lose that database and you
>> cannot read your Kafka topics.
>>
>> So you have to use some other encoder, homegrown or not, that embeds
>> either the full schema in every message (expensive) or some id. Does this
>> make sense?
>>
>> /svante
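A minimal Java sketch of the read path described above: strip the Confluent-style framing, fetch the writer schema by id, and hand both writer and reader schemas to the datum reader. The SchemaRegistry interface and fetchSchemaById() here are placeholders for whatever REST/gRPC lookup service you end up running; only the 5-byte framing and the two-schema reader come from the thread itself.

import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.avro.Schema;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificRecordBase;

public class RegistryAwareDeserializer {

    /** Hypothetical lookup service: schema id -> writer schema (REST/gRPC/homegrown). */
    public interface SchemaRegistry {
        Schema fetchSchemaById(int id);
    }

    public static <T extends SpecificRecordBase> T deserialize(
            byte[] message, Schema readerSchema, SchemaRegistry registry) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(message);
        if (buf.get() != 0x00) { // Confluent-style framing starts with magic byte 0
            throw new IllegalArgumentException("unknown magic byte");
        }
        // The next four bytes are the big-endian int32 schema id.
        Schema writerSchema = registry.fetchSchemaById(buf.getInt());
        // Schema evolution happens here: writer schema first, reader schema second.
        DatumReader<T> reader = new SpecificDatumReader<>(writerSchema, readerSchema);
        Decoder decoder = DecoderFactory.get()
                .binaryDecoder(message, buf.position(), buf.remaining(), null);
        return reader.read(null, decoder);
    }
}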
Re: AVRO schema evolution: adding optional column with default fails deserialization
For clarity: What byte[] are we talking about? You are slightly missing my point if we are speaking about Kafka.

Confluent encoding: [magic byte 0][schema id (int32, big-endian)][avro binary payload]

The avro_binary_payload does not in any case contain the schema or the schema id. The schema id is a Confluent thing. (In an Avro file the schema is prepended by value in the file.)

While it's trivial to build a schema registry that, for example, instead gives you an MD5 hash of the schema, you have to use it throughout your infrastructure OR use known reader and writer schemas (i.e. hardcoded).

In the Confluent world, id=N is the N+1'th registered schema in the database (a Kafka topic), if I remember right. Lose that database and you cannot read your Kafka topics.

So you have to use some other encoder, homegrown or not, that embeds either the full schema in every message (expensive) or some id. Does this make sense?

/svante

On Thu, Aug 1, 2019 at 16:38, Martin Mucha wrote:

> Thanks for the answer.
>
> What I knew already is that in each message there is _somehow_ present
> either _some_ schema ID or the full schema. I saw some byte array manipulations
> to get a _somehow_ defined schema ID out of the byte[], which worked, but that's
> definitely not how it should be done. What I'm looking for is some
> documentation on _how_ to do these things right. I really cannot find a
> single thing, yet there must be some util functions, or anything. Is there
> some devel-first-steps page where I can find answers to:
>
> * How to test whether a byte[] contains the full schema or just an id?
> * How to control whether a message is serialized with the ID or with the full
> schema?
> * How to get the ID from the byte[]?
> * How to get the full schema from the byte[]?
>
> I don't have the Confluent platform, and cannot have it, but implementing "get
> schema by ID" should be an easy task, provided that I have that ID. In my
> scenario I know that messages will be written using one schema, just
> different versions of it. So I just need to know which version it is, so
> that I can configure the deserializer to enable schema evolution.
>
> thanks in advance,
> Martin
>
> On Thu, Aug 1, 2019 at 15:55, Svante Karlsson wrote:
>
>> In an Avro file the schema is at the beginning, but if you refer to single
>> record serialization like Kafka, then you have to add something that you can
>> use to get hold of the schema. Confluent's Avro encoder for Kafka uses
>> Confluent's schema registry, which uses an int32 as the schema id. This is prepended
>> (+ a magic byte) to the binary Avro. Thus, using the schema registry again,
>> you can get the writer schema.
>>
>> /Svante
>>
>> On Thu, Aug 1, 2019, 15:30 Martin Mucha wrote:
>>
>>> Hi,
>>>
>>> just one more question, not strictly related to the subject.
>>>
>>> Initially I thought I'd be OK with using some initial version of the schema
>>> in place of the writer schema. That works, but all columns from schemas older
>>> than this initial one would just be ignored. So I need to know EXACTLY the
>>> schema which the writer used. I know that Avro messages contain either the full
>>> schema or at least its ID. Can you point me to the documentation where
>>> this is discussed? In my deserializer I have a byte[] as input, from
>>> which I need to get the schema information first, in order to be able to
>>> deserialize the record. I really do not know how to do that; I'm pretty
>>> sure I never saw this anywhere, and I cannot find it anywhere. But in
>>> principle it must be possible, since the reader does not necessarily have any
>>> knowledge of which schema the writer used.
>>>
>>> thanks a lot.
>>> M.
>>>
>>> On Tue, Jul 30, 2019 at 18:16, Martin Mucha wrote:
>>>
>>>> Thank you very much for the in-depth answer. I understand how it works
>>>> better now; will test it shortly.
>>>> Thank you for your time.
>>>>
>>>> Martin.
>>>>
>>>> On Tue, Jul 30, 2019 at 17:09, Ryan Skraba wrote:
>>>>
>>>>> Hello! It's the same issue in your example code as in allegro, even with
>>>>> the SpecificDatumReader.
>>>>>
>>>>> This line: datumReader = new SpecificDatumReader<>(schema)
>>>>> should be: datumReader = new SpecificDatumReader<>(originalSchema, schema)
>>>>>
>>>>> In Avro, the original schema is commonly known as the writer schema
>>>>> (the instance that originally wrote the binary data). Schema
>>>>> evolution applies when you are using the constructor of the
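To make the framing above concrete (and to answer "how to get the ID from the byte[]?"): with the Confluent-style encoding there is no flag in the bytes telling you whether an id or a full schema follows - you have to know which serializer produced the message. Assuming that framing, extracting the id is just a matter of reading the header; a sketch in Java:

import java.nio.ByteBuffer;

public class ConfluentFraming {
    /**
     * Extracts the schema id from a Confluent-framed Kafka message:
     * [magic byte 0][schema id, big-endian int32][avro binary payload].
     */
    public static int schemaId(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message); // ByteBuffer is big-endian by default
        if (buf.get() != 0x00) {
            throw new IllegalArgumentException("not a Confluent-framed Avro message");
        }
        return buf.getInt();
    }
}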
Re: AVRO schema evolution: adding optional column with default fails deserialization
In an Avro file the schema is at the beginning, but if you refer to single record serialization like Kafka, then you have to add something that you can use to get hold of the schema. Confluent's Avro encoder for Kafka uses Confluent's schema registry, which uses an int32 as the schema id. This is prepended (+ a magic byte) to the binary Avro. Thus, using the schema registry again, you can get the writer schema.

/Svante

On Thu, Aug 1, 2019, 15:30 Martin Mucha wrote:

> Hi,
>
> just one more question, not strictly related to the subject.
>
> Initially I thought I'd be OK with using some initial version of the schema in
> place of the writer schema. That works, but all columns from schemas older than
> this initial one would just be ignored. So I need to know EXACTLY the
> schema which the writer used. I know that Avro messages contain either the full
> schema or at least its ID. Can you point me to the documentation where
> this is discussed? In my deserializer I have a byte[] as input, from
> which I need to get the schema information first, in order to be able to
> deserialize the record. I really do not know how to do that; I'm pretty
> sure I never saw this anywhere, and I cannot find it anywhere. But in
> principle it must be possible, since the reader does not necessarily have any
> knowledge of which schema the writer used.
>
> thanks a lot.
> M.
>
> On Tue, Jul 30, 2019 at 18:16, Martin Mucha wrote:
>
>> Thank you very much for the in-depth answer. I understand how it works
>> better now; will test it shortly.
>> Thank you for your time.
>>
>> Martin.
>>
>> On Tue, Jul 30, 2019 at 17:09, Ryan Skraba wrote:
>>
>>> Hello! It's the same issue in your example code as in allegro, even with
>>> the SpecificDatumReader.
>>>
>>> This line: datumReader = new SpecificDatumReader<>(schema)
>>> should be: datumReader = new SpecificDatumReader<>(originalSchema, schema)
>>>
>>> In Avro, the original schema is commonly known as the writer schema
>>> (the instance that originally wrote the binary data). Schema
>>> evolution applies when you are using the constructor of the
>>> SpecificDatumReader that takes *both* reader and writer schemas.
>>>
>>> As a concrete example, if your original schema was:
>>>
>>> {
>>>   "type": "record",
>>>   "name": "Simple",
>>>   "fields": [
>>>     {"name": "id", "type": "int"},
>>>     {"name": "name", "type": "string"}
>>>   ]
>>> }
>>>
>>> And you added a field:
>>>
>>> {
>>>   "type": "record",
>>>   "name": "SimpleV2",
>>>   "fields": [
>>>     {"name": "id", "type": "int"},
>>>     {"name": "name", "type": "string"},
>>>     {"name": "description", "type": ["null", "string"]}
>>>   ]
>>> }
>>>
>>> You could do the following safely, assuming that the Simple and SimpleV2
>>> classes are generated by the avro-maven-plugin:
>>>
>>> @Test
>>> public void testSerializeDeserializeEvolution() throws IOException {
>>>   // Write a Simple v1 to bytes using your exact method.
>>>   byte[] v1AsBytes = serialize(new Simple(1, "name1"), true, false);
>>>
>>>   // Read as Simple v2, same as your method but with the writer and
>>>   // reader schema.
>>>   DatumReader<SimpleV2> datumReader =
>>>       new SpecificDatumReader<>(Simple.getClassSchema(), SimpleV2.getClassSchema());
>>>   Decoder decoder = DecoderFactory.get().binaryDecoder(v1AsBytes, null);
>>>   SimpleV2 v2 = datumReader.read(null, decoder);
>>>
>>>   assertThat(v2.getId(), is(1));
>>>   assertThat(v2.getName(), is(new Utf8("name1")));
>>>   assertThat(v2.getDescription(), nullValue());
>>> }
>>>
>>> This demonstrates with two different schemas and SpecificRecords in
>>> the same test, but the same principle applies if it's the same record
>>> that has evolved - you need to know the original schema that wrote
>>> the data in order to apply the schema that you're now using for
>>> reading.
>>>
>>> I hope this clarifies what you are looking for!
>>>
>>> All my best, Ryan
>>>
>>>
>>> On Tue, Jul 30, 2019 at 3:30 PM Martin Mucha wrote:
>>> >
>>> > Thanks for the answer.
>>> >
>>> > Actually I have exactly the same behavior with Avro 1.9.0 and the
>>> following deserializer in our other app, which uses strictly the Avro codebase
>>> and fails with the same exceptions. So let's leave the "allegro" library and lots
>>> of other tools out of our discussion.
>>> > I can use whichever approach. All I need is a single way to
>>> deserialize a byte[] into a class generated by the avro-maven-plugin, which
>>> will respect the documentation regarding schema evolution. Currently we're
>>> using the following deserializer and serializer, and these do not work when
>>> it comes to schema evolution. What is the correct way to serialize and
>>> deserialize Avro data?
>>> >
>>> > I probably don't understand your mention of GenericRecord or
>>> GenericDatumReader. I tried to use GenericDatumReader in the deserializer
>>> below, but then it seems I got back just a GenericData$Record instance, which
>>> I can then use to access an array of instances, which is not what I'm looking
>>> for (IIUC), since
Re: Avro C++: How to serialize data as Data Object File to send in Kafka?
This is maybe not the nicest implementation, since it feels way too complicated, but it's the only one I found. Check out encode() starting at line 95. Note that the example encodes data using Confluent's schema registry format (i.e. 5 extra bytes) and does a double copy - I have not found a way to get rid of that.

https://github.com/bitbouncer/kspp/blob/master/include/kspp/avro/avro_serdes.h

/svante

On Fri, Jul 12, 2019 at 14:42, steinio wrote:

> I am trying to serialize some data, created from a .json definition, and I
> would like to send the data that DataFileWriter writes.
> DataFileWriter takes a file name and writes the data to a binary file.
> I can get around this by reading the file back into a string and sending the
> string over by stream (kafka.producer).
> This is not really a viable solution for a high-speed producer application,
> and by looking at DataFileWriter, it looks like it should also be able to
> take a std::unique_ptr<avro::OutputStream> instead of a file name, and write
> to a stream. But this gives an error when trying to build the application.
>
> error C2280:
> 'std::unique_ptr<_Ty,std::default_delete<_Ty>>::unique_ptr(const
> std::unique_ptr<_Ty,std::default_delete<_Ty>> &)': attempting to reference a
> deleted function
> with
> [
>     _Ty=avro::OutputStream
> ]
> c:\program files (x86)\microsoft visual studio 14.0\vc\include\memory(1435):
> note: see declaration of
> 'std::unique_ptr<_Ty,std::default_delete<_Ty>>::unique_ptr'
> with
> [
>     _Ty=avro::OutputStream
> ]
>
> The error, I guess, is that I am trying to pass a std::unique_ptr by value,
> which is not possible since they cannot be copied, so I should rather pass
> std::move(myUniquePtr) as the argument instead.
> But this gives me another error:
>
> error C2664: 'avro::DataFileWriter<T>::DataFileWriter(const
> avro::DataFileWriter<T> &)': cannot convert argument 1 from
> 'std::shared_ptr<avro::OutputStream>' to 'const char *'
>
> There are no examples on how to send data as a data object file that includes
> the header data, so I am just trying and failing here. Is there a "correct" way
> of doing this?
> I see that this is really easy to do in the Java library; it is just:
>
> ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
> DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<>(writer);
> dataFileWriter.create(schema, outputStream);
> dataFileWriter.append(data);
> dataFileWriter.close();
>
> What I have done so far is this:
>
> std::ifstream ifs("data.json");
> avro::ValidSchema dataSchm;
> avro::compileJsonSchema(ifs, dataSchm);
> const char* file = "data.bin";
> std::shared_ptr<avro::OutputStream> out = avro::memoryOutputStream();
> avro::DataFileWriter<MyRecord> dfw(file, dataSchm); // MyRecord: the generated type of "data"
> dfw.write(data);
> dfw.close();
> std::ifstream bin_ifs("data.bin");
> std::string str((std::istreambuf_iterator<char>(bin_ifs)),
>                 std::istreambuf_iterator<char>());
> builder.payload(str);
> producer.produce(builder);
>
> How can I avoid having to write this to a file, and instead just write the
> binary data directly to an output stream that I can encode and send?
>
>
>
> --
> Sent from: http://apache-avro.679487.n3.nabble.com/Avro-Users-f679479.html
Re: C++ How to get the length of the encoded data
You need to call flush(). See a common use case below:

auto bin_os = avro::memoryOutputStream();
avro::EncoderPtr bin_encoder = avro::binaryEncoder();
bin_encoder->init(*bin_os.get());
avro::encode(*bin_encoder, src);
/* flush() pushes the unused characters back to the output stream;
   without it, the content length will be a multiple of 4096 */
bin_encoder->flush();
size_t content_length = bin_os->byteCount(); // now the exact encoded size

On Sat, Oct 27, 2018, 20:34 Olivier Delbeke wrote:

> Hi all,
>
> I've just started with Avro (in C++) and I'm a bit stuck (so this will be an easy
> question).
> All the examples show how to encode data and decode it right afterwards, so you
> can immediately connect the OutputStream to the InputStream,
> completely ignoring the size of the encoded data. However, what I'd like to
> do is send the data to Kafka, so I need to know how many bytes I need to
> read from the stream. After encoding, byteCount() on the OutputStream
> always returns 4096, which looks huge for 3 doubles and a short string. When
> I read back those 4096 bytes of data, I can indeed see that only the first
> 40 bytes are non-zero. I cannot find any method on the Encoder or the
> OutputStream that returns the length of the encoded data. Am I
> overlooking something?
>
> Thanks,
>
> Olivier
Re: fromJson is failing with null as uniontype
The problem is that Avro has its own representation of union encoding in JSON: a non-null union value is wrapped in an object keyed by the branch type. So your record would have to be written as {"experience": {"string": "da"}, "age": 50}. In a recent project we ended up writing a slightly modded JSON parser to be able to use Avro schemas on existing JSON REST calls.

2016-02-29 9:20 GMT+01:00 Chris Miller:

> Did you ever figure this out? I was having the same problem.
>
>
> --
> Chris Miller
>
> On Fri, Feb 19, 2016 at 2:53 AM, Siva wrote:
>
>> Can someone help on this? Has anyone faced a similar issue?
>>
>> Thanks,
>> Sivakumar Bhavanari.
>>
>> On Wed, Feb 17, 2016 at 4:21 PM, Siva wrote:
>>
>>> Hi Everyone,
>>>
>>> I'm new to Avro, and I'm running into issues when a type is combined with "null",
>>> like ["null","int"] or ["null", "string"].
>>>
>>> I have a schema like below:
>>>
>>> {
>>>   "type": "record", "namespace": "tutorialspoint",
>>>   "name": "empdetails", "fields": [
>>>     { "name": "experience", "type": ["null","string"], "default": null },
>>>     { "name": "age", "type": "int" }
>>>   ]
>>> }
>>>
>>> Below is the JSON dataset:
>>>
>>> {"experience" : "da", "age": 50}
>>>
>>> java -jar avro-tools-1.7.7.jar fromjson --schema-file test.avsc test.json > test.avro
>>>
>>> If I have a "null" value in the "experience" column it goes through, but if it
>>> has some string it gives the error below. Similar errors occur with int types as
>>> well (VALUE_NUMBER_INT).
>>>
>>> Exception in thread "main" org.apache.avro.AvroTypeException: Expected
>>> start-union. Got VALUE_STRING
>>>   at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
>>>   at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
>>>   at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
>>>   at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>>>
>>> I have columns with strings or nulls in the JSON. Is there a workaround for
>>> this error without changing the JSON data?
>>>
>>> Thanks in advance.
>>>
>>> Thanks,
>>> Sivakumar Bhavanari.
Re: Using Avro for encoding messages
I had the same problem a while ago, and for the same reasons as you mention we decided to use fingerprints (an MD5 hash of the schema). However, there are some catches here.

First, I believe that the normalisation of the schema is incomplete, so you might end up with different hashes of the same schema. Second, using a 128-bit integer prepended to both keys and values takes more space than using 32 bits. Not a big issue for values, but for keys this doubles our size. Third, we already started to use Confluent's registry as well, because of the already existing integration with other pieces of infrastructure (camus, bottledwater etc.).

What would be useful, given this perspective, is a byte or two prepended to the schema id - defining the registry namespace.

I've added the fingerprint schema registry as an example in the C++ Kafka library at https://github.com/bitbouncer/csi-kafka/tree/master/examples/schema-registry
We run a couple of those in a mesos cluster and use HAProxy to find them.

/svante

2015-07-09 10:36 GMT+02:00 Daniel Schierbeck daniel.schierb...@gmail.com:

I'm working on a system that will store Avro-encoded messages in Kafka. The system will have both producers and consumers in different languages, including Ruby (not JRuby) and Java. At the moment I'm encoding each message as a data file, which means that the full schema is included in each encoded message. This is obviously suboptimal, but it doesn't seem like there's a standardized format for single-message Avro encodings. I've reviewed Confluent's schema-registry offering, but that seems to be overkill for my needs, and would require me to run and maintain yet another piece of infrastructure. Ideally, I wouldn't have to use anything besides Kafka.

Is this something that other people have experience with? I've come up with a scheme that would seem to work well independently of what kind of infrastructure you're using: whenever a writer process is asked to encode a message m with schema s for the first time, it broadcasts (s', s) to a schema registry, where s' is the fingerprint of s. The schema registry in this case can be pluggable, and can be any mechanism that allows different processes to access the schemas. The writer then encodes the message as (s', m), i.e. only includes the schema fingerprint. A reader, when first encountering a message with a schema fingerprint s', looks up s from the schema registry and uses s to decode the message. Here, the concept of a schema registry has been abstracted away and is not tied to the concept of schema ids and versions. Furthermore, there are some desirable traits:

1. Schemas are identified by their fingerprints, so there's no need for an external system to issue schema ids.
2. Writing (s', s) pairs is idempotent, so there's no need to coordinate that task. If you've got a system with many writers, you can let all of them broadcast their schemas when they boot or when they need to encode data using the schemas.
3. It would work using a range of different backends for the schema registry. Simple key-value stores would obviously work, but for my case I'd probably want to use Kafka itself. If the schemas are written to a topic with key-based compaction, where s' is the message key and s is the message value, then Kafka would automatically clean up duplicates over time. This would save me from having to add more pieces to my infrastructure.

Has this problem been solved already? If not, would it make sense to define a common message format that defines the structure of (s', m) pairs?
Cheers, Daniel Schierbeck
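For reference, a minimal Java sketch of the (s', m) framing Daniel proposes, using Avro's own schema normalisation to compute the MD5 fingerprint (SchemaNormalization ships with the Java library since 1.7). The 16-bytes-of-fingerprint-then-body layout is just the convention from this thread, not a standard:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import org.apache.avro.Schema;
import org.apache.avro.SchemaNormalization;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class FingerprintFraming {
    /** Encode a record as (s', m): 16 bytes of MD5 over the normalised
     *  writer schema, followed by the plain Avro binary encoding. */
    public static byte[] encode(GenericRecord record) throws IOException, NoSuchAlgorithmException {
        Schema schema = record.getSchema();
        byte[] fingerprint = SchemaNormalization.parsingFingerprint("MD5", schema); // s'
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(fingerprint);
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);      // m
        encoder.flush();
        return out.toByteArray();
    }
}

A reader would take the first 16 bytes, look s up from whatever backend holds the (s', s) pairs, and decode the remainder with a datum reader constructed from both the writer and reader schemas.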
Re: Using Avro for encoding messages
> What causes the schema normalization to be incomplete?

Bad implementation. I use C++ Avro and it's not complete and not very active.

> And is that a problem? As long as the reader can get the schema, it shouldn't matter that there are duplicates - as long as the differences between the duplicates do not affect decoding.

Not really a problem; we tend to use machine-generated schemas and they are always identical. I think there are holes in the simplification of types, if I remember correctly. Namespaces should be collapsed, {"type": "string"} should become "string", etc. The current implementation can't reliably decide whether two types are identical. If you correct the problem later, then a registered schema would actually change its hash, since it now can be simplified. Whether this is a problem depends on your application.

We currently encode this as you suggest:

schema_type (byte) | schema_id (32/128 bits) | avro (binary)

The binary fields should probably have a defined endianness as well. I agree that a de facto way of encoding this would be nice. Currently I would say that the Confluent/LinkedIn way is the norm
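The normalisation being discussed is exposed in the Java library as SchemaNormalization.toParsingForm(); a small sketch showing the collapse mentioned above, where {"type": "string"} and "string" normalise, and therefore fingerprint, identically:

import org.apache.avro.Schema;
import org.apache.avro.SchemaNormalization;

public class ParsingFormDemo {
    public static void main(String[] args) throws Exception {
        // Two spellings of the same type...
        Schema wrapped = new Schema.Parser().parse("{\"type\": \"string\"}");
        Schema plain = new Schema.Parser().parse("\"string\"");
        // ...collapse to the same parsing canonical form:
        System.out.println(SchemaNormalization.toParsingForm(wrapped)); // "string"
        System.out.println(SchemaNormalization.toParsingForm(plain));   // "string"
        // So their MD5 fingerprints are equal:
        byte[] fp1 = SchemaNormalization.parsingFingerprint("MD5", wrapped);
        byte[] fp2 = SchemaNormalization.parsingFingerprint("MD5", plain);
        System.out.println(java.util.Arrays.equals(fp1, fp2)); // true
    }
}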
Re: Schema default values in C++ implementation
I think you are hit by https://issues.apache.org/jira/browse/AVRO-1335

I recently extended the avrogencpp thing so it also generates the following members in your class:

...
static inline const boost::uuids::uuid schema_hash() {
    static const boost::uuids::uuid _hash(boost::uuids::string_generator()("eea03bf9-7719-1af0-1dfb-e8049f677f7d"));
    return _hash;
}

static inline const char* schema_as_string() {
    return "{\"type\":\"record\",\"name\":\"cpx\",\"fields\":[{\"name\":\"numbername\",\"type\":\"string\"},{\"name\":\"re\",\"type\":\"double\"},{\"name\":\"im\",\"type\":\"double\"}]}";
}

static const avro::ValidSchema valid_schema() {
    static const avro::ValidSchema _validSchema(avro::compileJsonSchemaFromString(schema_as_string()));
    return _validSchema;
}
...

As you can see, the existing C++ code seems to lose the default values somewhere, and this of course also makes the schema hash unusable (in my use case).

/svante
Re: Deserialize Avro Object Without Schema
The schema is written inside an Avro file. That's why you don't need to provide it. You really need the schema to decode Avro data: either by providing a schema from somewhere and using a generic datum reader, or by generating a hardcoded decoder that knows the schema at compile time.

regards,
svante

2015-03-25 11:10 GMT+01:00 Alexander Zenger a.zen...@cetec.cc:

Hi,

-----Original Message-----
From: Rendy Bambang Junior [mailto:rendy.b.jun...@gmail.com]
Sent: Wednesday, March 25, 2015 10:08
To: user@avro.apache.org
Subject: Deserialize Avro Object Without Schema

It should be possible right? Since the schema itself is embedded in the data.

Yes, and it is working for me. Although I'm reading data from a file: I create a DataFileReader from the GenericDatumReader, which then reads the deserialized data. On a quick look, I didn't find a FileReader for streams. Here is my example:

DatumReader<GenericRecord> datumReader = new GenericDatumReader<GenericRecord>();
DataFileReader<GenericRecord> dataFileReader = null;
try {
    dataFileReader = new DataFileReader<>(DATA_FILE, datumReader);
} catch (IOException exp) {
    System.out.println("Could not read file " + DATA_FILE.getName());
    System.exit(1);
}

GenericRecord person = null;
try {
    person = dataFileReader.next(person);
} catch (IOException exp) {
    System.out.println("Could not read user from file " + DATA_FILE.getName());
    System.exit(1);
}

System.out.println("Id: " + person.get("id"));
System.out.println("Name: " + person.get("name"));
System.out.println("Email: " + person.get("email"));

--
Regards
Alexander Zenger
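On the "FileReader for streams" point: the Java library does ship a reader for arbitrary streams, DataFileStream, which reads the same object-container format from any InputStream (the writer schema is taken from the stream's header). A minimal sketch:

import java.io.IOException;
import java.io.InputStream;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class StreamReadingDemo {
    /** Reads object-container (data file) bytes from any InputStream;
     *  no schema needs to be supplied, it is embedded in the header. */
    public static void readFromStream(InputStream in) throws IOException {
        try (DataFileStream<GenericRecord> reader =
                 new DataFileStream<>(in, new GenericDatumReader<GenericRecord>())) {
            while (reader.hasNext()) {
                GenericRecord person = reader.next();
                System.out.println(person.get("id") + " " + person.get("name"));
            }
        }
    }
}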
Re: Error building/running tests on avro c++
I had some issues with the cmakefile when I built Avro C++ for Windows a month or two ago. If I remember correctly, it did not find, or possibly could not figure out, the configuration of boost. I ended up doing some small hacks in the CMakeLists.txt file to get it to compile. This was on Windows, so the changes are not relevant to you, but after that the tests compile fine. I think the changes are as follows:

...
#windows fix
SET(EXECUTABLE_OUTPUT_PATH ${CMAKE_SOURCE_DIR}/bin/$(Platform))
SET(LIBRARY_OUTPUT_PATH ${CMAKE_SOURCE_DIR}/lib/$(Platform))
set(Boost_INCLUDE_DIRS ${CMAKE_SOURCE_DIR}/../boost_1_55_0)
#set(Boost_USE_STATIC_LIBS ON)
#set(Boost_USE_MULTITHREADED ON)
#set(Boost_LIBRARIES boost_filesystem-vc120-mt-1_55 boost_system-vc120-mt-1_55 boost_program_options-vc120-mt-1_55 boost_iostreams-vc120-mt-1_55)
#boost_filesystem-vc120-mt-1_55.lib
# boost_filesystem-vc120-mt-1_55
set(BOOST_LIBRARYDIR ${Boost_INCLUDE_DIRS}/lib/$(Platform)/lib)
#add_definitions (-DHAVE_BOOST_ASIO)
link_directories(${BOOST_LIBRARYDIR})
#find_package (Boost 1.55 REQUIRED
#COMPONENTS filesystem system program_options iostreams)
add_definitions (${Boost_LIB_DIAGNOSTIC_DEFINITIONS})
include_directories (api ${CMAKE_CURRENT_BINARY_DIR} ${Boost_INCLUDE_DIRS})
...

As you can see, I removed the find_package and basically pointed things out myself. Sooner or later I'll have to fix this on Linux as well, as that is my final target, but I prefer the Visual Studio development environment for debugging purposes.

If you can't figure this out I might give you a hand. If it is of any help, there is a github repo with cross-compilation directives for, among others, avro:
https://github.com/bitbouncer/csi-build-scripts/blob/master/raspberry_rebuild_ia32.sh

The relevant portion is:

...
export BOOST_VERSION=1_55_0
export AVRO_VERSION=1.7.6
cd avro-cpp-$AVRO_VERSION
export BOOST_ROOT=$PWD/../boost_$BOOST_VERSION
export Boost_INCLUDE_DIR=$PWD/../boost_$BOOST_VERSION/boost
export PI_TOOLS_HOME=~/xtools/tools
rm -rf avro
rm -rf build
mkdir build
cd build
cmake -DCMAKE_TOOLCHAIN_FILE=../csi-build-scripts/toolchains/raspberry.ia32.cmake ..
make -j4
cd ..
mkdir avro
cp -r api/*.* avro
cd ..
...

Skip the pi tools part and give it a try. There are a lot of other missing features in Avro C++ that are on my todo list.

/svante

2014-07-31 22:40 GMT+02:00 jeff saremi jeffsar...@hotmail.com:

Does anyone know what the problem might be? Any help appreciated:

[ 97%] Building CXX object CMakeFiles/buffertest.dir/test/buffertest.cc.o
In file included from /temp/boost/boost/thread/detail/platform.hpp:17,
                 from /temp/boost/boost/thread/thread_only.hpp:12,
                 from /temp/boost/boost/thread/thread.hpp:12,
                 from /temp/boost/boost/thread.hpp:13,
                 from /temp/avro/test/buffertest.cc:21:
/temp/boost/boost/config/requires_threads.hpp:47:5: error: #error Compiler threading support is not turned on. Please set the correct command line options for threading: -pthread (Linux), -pthreads (Solaris) or -mthreads (Mingw32)

and 100's of similar messages follow.
Or errors like:

/temp/boost/boost/thread/detail/thread.hpp:93: error: expected class-name before '{' token
/temp/boost/boost/thread/detail/thread.hpp:127: error: expected class-name before '{' token
/temp/boost/boost/thread/detail/thread.hpp:144: error: expected class-name before '{' token
/temp/boost/boost/thread/detail/thread.hpp:163: error: 'thread_attributes' does not name a type
/temp/boost/boost/thread/detail/thread.hpp:172: error: 'thread_data_ptr' in namespace 'boost::detail' does not name a type
/temp/boost/boost/thread/detail/thread.hpp:176: error: expected ',' or '...' before '&' token
/temp/boost/boost/thread/detail/thread.hpp:176: error: ISO C++ forbids declaration of 'attributes' with no type
/temp/boost/boost/thread/detail/thread.hpp:185: error: expected ',' or '...' before '&' token
/temp/boost/boost/thread/detail/thread.hpp:185: error: ISO C++ forbids declaration of 'attributes' with no
Re: Error building/running tests on avro c++
Since you're using Solaris, check the ticket below (it describes a bug in boost 1.53 that was fixed in 1.54):

https://svn.boost.org/trac/boost/ticket/8212

/svante

2014-08-01 15:28 GMT+02:00 jeff saremi jeffsar...@hotmail.com:

Svante, thanks very much for the info. I looked at the shell file as well. I'm not doing much different than that. So I believe this has to do with the boost compilation on my platform: Solaris. The shell file in the link did the default invocation of the boost build, which is what I did, but I think some flags are needed with respect to a multi-threaded build. If I figure it out I'll share it with everyone.

--
Date: Fri, 1 Aug 2014 14:37:36 +0200
Subject: Re: Error building/running tests on avro c++
From: s...@csi.se
To: user@avro.apache.org

I had some issues with the cmakefile when I built Avro C++ for Windows a month or two ago. If I remember correctly, it did not find, or possibly could not figure out, the configuration of boost. I ended up doing some small hacks in the CMakeLists.txt file to get it to compile. This was on Windows, so the changes are not relevant to you, but after that the tests compile fine. I think the changes are as follows:

...
#windows fix
SET(EXECUTABLE_OUTPUT_PATH ${CMAKE_SOURCE_DIR}/bin/$(Platform))
SET(LIBRARY_OUTPUT_PATH ${CMAKE_SOURCE_DIR}/lib/$(Platform))
set(Boost_INCLUDE_DIRS ${CMAKE_SOURCE_DIR}/../boost_1_55_0)
#set(Boost_USE_STATIC_LIBS ON)
#set(Boost_USE_MULTITHREADED ON)
#set(Boost_LIBRARIES boost_filesystem-vc120-mt-1_55 boost_system-vc120-mt-1_55 boost_program_options-vc120-mt-1_55 boost_iostreams-vc120-mt-1_55)
#boost_filesystem-vc120-mt-1_55.lib
# boost_filesystem-vc120-mt-1_55
set(BOOST_LIBRARYDIR ${Boost_INCLUDE_DIRS}/lib/$(Platform)/lib)
#add_definitions (-DHAVE_BOOST_ASIO)
link_directories(${BOOST_LIBRARYDIR})
#find_package (Boost 1.55 REQUIRED
#COMPONENTS filesystem system program_options iostreams)
add_definitions (${Boost_LIB_DIAGNOSTIC_DEFINITIONS})
include_directories (api ${CMAKE_CURRENT_BINARY_DIR} ${Boost_INCLUDE_DIRS})
...

As you can see, I removed the find_package and basically pointed things out myself. Sooner or later I'll have to fix this on Linux as well, as that is my final target, but I prefer the Visual Studio development environment for debugging purposes.

If you can't figure this out I might give you a hand. If it is of any help, there is a github repo with cross-compilation directives for, among others, avro:
https://github.com/bitbouncer/csi-build-scripts/blob/master/raspberry_rebuild_ia32.sh

The relevant portion is:

...
export BOOST_VERSION=1_55_0
export AVRO_VERSION=1.7.6
cd avro-cpp-$AVRO_VERSION
export BOOST_ROOT=$PWD/../boost_$BOOST_VERSION
export Boost_INCLUDE_DIR=$PWD/../boost_$BOOST_VERSION/boost
export PI_TOOLS_HOME=~/xtools/tools
rm -rf avro
rm -rf build
mkdir build
cd build
cmake -DCMAKE_TOOLCHAIN_FILE=../csi-build-scripts/toolchains/raspberry.ia32.cmake ..
make -j4
cd ..
mkdir avro
cp -r api/*.* avro
cd ..
...

Skip the pi tools part and give it a try. There are a lot of other missing features in Avro C++ that are on my todo list.

/svante

2014-07-31 22:40 GMT+02:00 jeff saremi jeffsar...@hotmail.com:

Does anyone know what the problem might be?
Any help appreciated:

[ 97%] Building CXX object CMakeFiles/buffertest.dir/test/buffertest.cc.o
In file included from /temp/boost/boost/thread/detail/platform.hpp:17,
                 from /temp/boost/boost/thread/thread_only.hpp:12,
                 from /temp/boost/boost/thread/thread.hpp:12,
                 from /temp/boost/boost/thread.hpp:13,
                 from /temp/avro/test/buffertest.cc:21:
/temp/boost/boost/config/requires_threads.hpp:47:5: error: #error Compiler threading support is not turned on. Please set the correct command line options for threading: -pthread (Linux), -pthreads (Solaris) or -mthreads (Mingw32)

Or errors like:

/temp/boost/boost/thread/detail/thread.hpp:93: error: expected class-name before '{' token
/temp/boost/boost/thread/detail/thread.hpp:127: error: expected class-name before '{' token
/temp/boost/boost/thread/detail/thread.hpp:144: error: expected class-name before '{' token
/temp/boost/boost/thread/detail/thread.hpp:163: error: 'thread_attributes' does not name a type
/temp/boost/boost/thread/detail/thread.hpp:172: error: 'thread_data_ptr' in namespace 'boost::detail' does not name a type
/temp/boost/boost/thread/detail/thread.hpp:176: error: expected ',' or '...' before '&' token
/temp/boost/boost/thread/detail/thread.hpp:176: error: ISO C++ forbids declaration of 'attributes' with no type
/temp/boost/boost/thread/detail/thread.hpp:185: error: expected ',' or '...' before '&' token
/temp/boost/boost/thread/detail/thread.hpp:185: error: ISO C++ forbids declaration of 'attributes' with no
128 bit integers
I'm having issues with endian conversion of 128-bit integers (uuids in my case), but the problem is generic. I currently encode them as fixed, but that leaves the swapping of bytes (for endianness) up to the user.

I had not given the matter any thought until we stretched some existing 64-bit ids in an existing database to 128 bits by simply adding 0 in the upper 64 bits. It turns out that the C/C++ and Java versions are, of course, not compatible.

I think we have the same issue in the spec. The spec exemplifies serverHash in Avro RPC as an MD5:

{"type": "fixed", "name": "MD5", "size": 16}

Java to Java this works fine... What's the best way to tackle this?

/svante
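One way to sidestep the problem is to fix a byte order in the convention around the fixed, since Avro itself treats a fixed as 16 opaque bytes. A Java sketch, assuming you standardise on network byte order (big-endian, which is also ByteBuffer's default); a C++ side that keeps the id as two 64-bit words then has to byte-swap on little-endian hosts:

import java.nio.ByteBuffer;
import java.util.UUID;

public class Uuid128 {
    /** Pack a 128-bit id into the 16 bytes of an Avro fixed,
     *  most significant byte first (big-endian, unambiguous across languages). */
    public static byte[] toFixed16(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Inverse: rebuild the id from the 16 fixed bytes. */
    public static UUID fromFixed16(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        return new UUID(buf.getLong(), buf.getLong());
    }
}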
c++11 http client and server library
I've started to work on a C++ library that I intend to use for performing Avro-encoded REST calls. If/when I understand how to implement Avro RPC, it should be simple enough to extend the existing code base to that as well.

The client is implemented using libcurl and boost asio. The server is based on some of the boost asio samples, but uses Joyent's/NGINX's HTTP parser. The client has both async and sync methods, and HTTP 1.1 is (partially?) supported; connection: keep-alive is implemented. The code that I based this on supported OpenSSL as well, but I have not yet completed that part - fragments are there.

I've noticed that the existing avrogencpp can't be used for avro-rpc specs - anyone who can shed some light on how it's supposed to be implemented?

It should be portable; it currently runs on Ubuntu 13.10 and Windows (currently adding Raspberry Pi support). All included code should be in various states of open source, and my own contributions are distributed under the Boost license. The documentation is sparse, but there are some examples that are rather simple.

Dependencies: boost, avrocpp, libcurl, cmake, C++11 (tested with gcc and Visual Studio 2013).

The code can be found here: https://github.com/bitbouncer/csi-http

Comments most welcome.

/svante