Re: Avro schema properties contention on multithread read

2017-07-08 Thread Zoltan Farkas
The order of attributes in Json might matter as far as I can remember, so 
LinkedHashMap might not be replaceable with a concurrenthashmap.
Plus concurrenthashmap is not exactly without concurrency overhead…
I wrote a util that creates a immutable schema 
https://github.com/zolyfarkas/spf4j/blob/master/spf4j-avro/src/main/java/org/spf4j/avro/schema/Schemas.java#L26
 

 
But you would have to use it it conjunction with a unsynchronized avro 
implementation. (which I do in my fork, and you can do as well).

I wonder if there is interest in merging this into the avro lib someday.

—Z


> On Jul 6, 2017, at 12:20 PM, f...@legsem.com wrote:
> 
> On 05.07.2017 21:53, Zoltan Farkas wrote:
> 
>> The synchronization in JsonProperties is curently inconsistent (see 
>> getObjectProps()) which makes current implementation @NotThreadSafe
>>  
>> I think it would be probably best to remove synchronization from those 
>> methods... and add @NotThreadSafe to the class...
>> Utilities like Schemas.synchronizedSchema(...) and 
>> Schemas.unmodifiableSchema(...) could be added to help with various use 
>> cases...
>>  
>>  
>> —Z
>>  
> Thank you for your reply. I like your Schemas.unmodifiableSchema(...) a lot.
> 
> While what you are describing would be ideal, a simpler solution might be to 
> change the LinkedHashMap that backs jsonProperties into something like a 
> ConcurrentHashMap, avoiding the need for synchronization.
> 
> This being said ConcurrentHashMap itself does not preserve insertion order, 
> so its not a mere replacement to LinkedHashMap.
> 



Re: Avro schema properties contention on multithread read

2017-07-06 Thread fady
On 05.07.2017 21:53, Zoltan Farkas wrote:

> The synchronization in JsonProperties is curently inconsistent (see 
> getObjectProps()) which makes current implementation @NotThreadSafe 
> 
> I think it would be probably best to remove synchronization from those 
> methods... and add @NotThreadSafe to the class... 
> Utilities like Schemas.synchronizedSchema(...) and 
> Schemas.unmodifiableSchema(...) could be added to help with various use 
> cases... 
> 
> --Z

Thank you for your reply. I like your Schemas.unmodifiableSchema(...) a
lot. 

While what you are describing would be ideal, a simpler solution might
be to change the LinkedHashMap that backs jsonProperties into something
like a ConcurrentHashMap, avoiding the need for synchronization. 

This being said ConcurrentHashMap itself does not preserve insertion
order, so its not a mere replacement to LinkedHashMap.

Re: Avro schema properties contention on multithread read

2017-07-05 Thread Zoltan Farkas
The synchronization in JsonProperties is curently inconsistent (see 
getObjectProps()) which makes current implementation @NotThreadSafe

I think it would be probably best to remove synchronization from those methods… 
and add @NotThreadSafe to the class…
Utilities like Schemas.synchronizedSchema(…) and Schemas.unmodifiableSchema(…) 
could be added to help with various use cases…


—Z


> On Jun 29, 2017, at 2:21 AM, f...@legsem.com wrote:
> 
> Hello,
> 
> We are using Avro Schema properties and while running concurrent tests, we 
> noticed a lot of contentions on org.apache.avro.JsonProperties#getJsonProp.
> 
> In the attached screen shot, we have 4 concurrent threads all sharing the 
> same avro schema and reading from it simultaneously.
> 
> On this screen shot each red period is a contention between threads. Most of 
> these contentions are on getJsonProp.
> 
> This is due to getJsonProp being a synchronized method.
> 
> We have tried avro 1.7.7, 1.8.1 and 1.8.2. All have this problem (getJsonProp 
> is deprecated in 1.8 but the replacement method is also synchronized).
> 
> We can work around this by not sharing the avro schemas between threads 
> (using ThreadLocal for instance) but this is ugly.
> 
> It seems that avro schemas are mostly immutable, which is great for 
> multithread read access, but it turns out Properties within these schemas are 
> mutable and, since they are stored in a LinkedHashMap, synchronization is 
> necessary.
> 
> Anyone having a similar issue?
> 
> Thank you