Re: [infinispan-dev] Store as binary
On Jan 21, 2014, at 1:36 PM, Sanne Grinovero sa...@infinispan.org wrote:

What's the point for these tests?

+1

On 20 Jan 2014 15:48, Radim Vansa rva...@redhat.com wrote:

OK, I have results for the dist-udp-no-tx and local-no-tx modes on 8 nodes (in local mode the nodes don't communicate, naturally):

Dist mode: 3 % down for reads, 1 % for writes
Local mode: 19 % down for reads, 16 % for writes

Details in [1]; ^ is for both keys and values stored as binary.

Radim

[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/4/artifact/report/All_report.html

On 01/20/2014 11:14 AM, Pedro Ruivo wrote:

On 01/20/2014 10:07 AM, Mircea Markus wrote:

Would be interesting to see as well, though the performance figures would not include the network latency, hence they would not tell much about the benefit of using this on a real-life system.

That's my point. I'm interested to see the worst scenario, since all other cluster modes will have a lower (or no) impact on performance. Of course, the best scenario would be if each node only had access to remote keys...

Pedro

On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:

IMO, we should try the worst scenario: Local Mode + Single thread. This will show us the highest impact on performance.

Cheers,

--
Radim Vansa rva...@redhat.com
JBoss DataGrid QA

--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz
Project Lead, Escalante
http://escalante.io
Engineer, Infinispan
http://infinispan.org

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Store as binary
On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:

On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:

What's the point for these tests?

+1

To validate whether storing the data in binary format yields better performance than storing it as a POJO.

That will highly depend on the scenarios you want to test for. AFAIK this started after Paul described how session replication works in WildFly, and we already know that both strategies are suboptimal with the current options available: in his case the active node will always write on the POJO, while the backup node will essentially only need to store the buffer just in case it might need to take over. Sure, one will be slower, but if you want to make a suggestion to him about which configuration he should be using, we should measure his use case, not a different one.

Even then, as discussed in Palma, an in-memory String representation might be way more compact because of pooling of strings and a very high likelihood of repeated headers (as is common in web frameworks), so you might want to measure the CPU vs storage cost on the receiving side.. but then again your results will definitely depend on the input data and on assumptions about the likelihood of failover, how often it is being written on the owner node vs on the other node (since he uses locality), etc.. many factors I'm not seeing being considered here and which could make a significant difference.

As of now, it doesn't, so I need to check why.

You could play with the test parameters until it produces an output you like better, but I still see no point?
This is not a realistic scenario; at best it could help us document suggestions about which scenarios you'd want to keep the option enabled vs disabled. But then again I think we're wasting time, as we could implement a better strategy for Paul's use case: one which never deserializes a value received from a remote node until it's been requested as a POJO, but keeps the POJO as-is when it's stored locally. I believe that would make sense also for OGM and probably most other users of Embedded. Basically, that would re-implement something similar to the previous design, but simplifying it a bit so that it doesn't allow for a back-and-forth conversion between storage types but rather dynamically favors a specific storage strategy.

Cheers,
Sanne

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
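[Editor's note: a minimal sketch of the strategy Sanne describes above - keep the serialized buffer received from a remote node and only deserialize on first POJO access, while locally stored values stay POJOs. All class and method names here are hypothetical illustrations, not Infinispan API; plain JDK serialization stands in for Infinispan's marshalling.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical value holder: a remotely-received value stays as bytes
// until someone actually asks for the POJO; a locally-stored value is
// never serialized at all.
final class LazyValue<T extends Serializable> {
    private byte[] bytes; // set when the value arrived as a marshalled buffer
    private T pojo;       // set when the value was stored locally (or after lazy unmarshal)

    static <T extends Serializable> LazyValue<T> fromBytes(byte[] b) {
        LazyValue<T> v = new LazyValue<>();
        v.bytes = b;
        return v;
    }

    static <T extends Serializable> LazyValue<T> fromPojo(T p) {
        LazyValue<T> v = new LazyValue<>();
        v.pojo = p;
        return v;
    }

    // Deserializes at most once, on first POJO access.
    @SuppressWarnings("unchecked")
    synchronized T get() {
        if (pojo == null) {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                pojo = (T) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("unmarshalling failed", e);
            }
        }
        return pojo;
    }

    // Helper standing in for the marshaller on the sending side.
    static byte[] marshal(Serializable s) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

A backup node would hold `fromBytes(...)` entries and pay the deserialization cost only on failover, while the active node keeps `fromPojo(...)` entries and pays nothing - which is the asymmetry the session-replication discussion hinges on.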
Re: [infinispan-dev] Integration between HotRod and OGM
Hi Emmanuel,

Just had a good chat with Davide on this, and one solution to overcome the shortcoming you mentioned in the above email would be to enhance the Hot Rod client to support grouping:

RemoteClient.put(G g, K k, V v); // first param is the group
RemoteClient.getGroup(G g) : Map<K,V>;

It requires an enhancement on our local grouping API: EmbeddedCache.getGroup(G). This is something useful for us in a broader context, as it is the step needed to be able to deprecate AtomicMaps and suggest replacing them with Grouping.

This approach still has some limitations compared to the current embedded integration:
- performance, caused by the lack of transactions: this means increased TCP chattiness between the Hot Rod client and the server.
- you'd have to handle atomicity, potentially by retrying an operation

What do you think?

On Dec 3, 2013, at 3:10 AM, Mircea Markus mmar...@redhat.com wrote:

On Nov 19, 2013, at 10:22 AM, Emmanuel Bernard emman...@hibernate.org wrote:

It's an interesting approach that would work fine-ish for entities, assuming the Hot Rod client is multi-threaded and assuming the client uses Futures to parallelize the calls.

The Java Hot Rod client is both multithreaded and exposes an Async API.

But it won't work for associations as we have them designed today. Each association - or more precisely the query results to go from an entity A1 to the list of entities B associated to it - is represented by an AtomicMap. Each entry in this map corresponds to an entry in the association. While we can guess the column names and build from the metadata the list of composed keys for entities, we cannot do the same for associations, as the key is literally the (composite) id of the association and we cannot guess that most of the time (we can in very pathological cases). We could imagine listing the association row keys in a special entry to work around that, but this approach is just as problematic and is conceptually the same.
The only solution would be to lock the whole association for each operation and, I guess, impose some versioning / optimistic lock. That is not a pattern that scales sufficiently, from my experience.

I think so too :-) That's the problem with interconnected data :)

Emmanuel

On Mon 2013-11-18 23:05, Mircea Markus wrote:

Neither the grouping API nor the AtomicMap work over Hot Rod. Between the grouping API and AtomicMap, I think the one that would make more sense migrating is the grouping API. One way or the other, I think the Hot Rod protocol would require an enhancement - mind raising a JIRA for that? For now I guess you can sacrifice performance and always send the entire object across on every update instead of only the deltas?

On Nov 18, 2013, at 9:56 AM, Emmanuel Bernard emman...@hibernate.org wrote:

Someone mentioned the grouping API as some sort of alternative to AtomicMap. Maybe we should use that? Note that if we don't have a fine-grained approach, we will need to make sure we *copy* the complex data structure upon reads to mimic proper transaction isolation.

On Tue 2013-11-12 15:14, Sanne Grinovero wrote:

On 12 November 2013 14:54, Emmanuel Bernard emman...@hibernate.org wrote:

On the transaction side, we can start without them.

+1 on omitting transactions for now. And on the missing AtomicMaps, I hope Infinispan will want to implement it? Would be good to eventually converge on similar feature sets in the remote vs embedded APIs. I know the embedded version relies on batching/transactions, but I guess we could obtain a similar effect with some ad-hoc commands in Hot Rod?

Sanne

On Tue 2013-11-12 14:34, Davide D'Alto wrote:

Hi,
I'm working on the integration between HotRod and OGM. We already have a dialect for Infinispan and I'm trying to follow the same logic. At the moment I'm having two problems:

1) In the Infinispan dialect we are using the AtomicMap and the AtomicMapLookup, but these classes don't work with the RemoteCache.
Is there an equivalent for HotRod?

2) As far as I know, HotRod does not support transactions. I've found a link to a branch on Mircea's repository: https://github.com/mmarkus/ops_over_hotrod/wiki/Usage-guide
Is this something I could/should use?

Any help is appreciated.

Thanks,
Davide
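[Editor's note: the proposal above only names two signatures, put(G, K, V) and getGroup(G). As a sketch of the intended semantics, here is a hypothetical interface plus a trivial in-memory stand-in - GroupedRemoteCache and InMemoryGroupedCache are illustration names, not part of the Hot Rod client.]

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape of the proposed grouping-aware remote API:
// writes carry a group, and a whole group can be read back as a Map.
interface GroupedRemoteCache<G, K, V> {
    void put(G group, K key, V value);      // first param is the group
    Map<K, V> getGroup(G group);            // all entries of the group
}

// In-memory stand-in so the semantics can be exercised without a server.
final class InMemoryGroupedCache<G, K, V> implements GroupedRemoteCache<G, K, V> {
    private final Map<G, Map<K, V>> data = new HashMap<>();

    public void put(G g, K k, V v) {
        data.computeIfAbsent(g, x -> new HashMap<>()).put(k, v);
    }

    public Map<K, V> getGroup(G g) {
        return Collections.unmodifiableMap(data.getOrDefault(g, Collections.emptyMap()));
    }
}
```

Note what the thread says is deliberately missing from this shape: no transactions (each put is an independent round trip, hence the TCP chattiness) and no atomicity across the group (a reader may observe a partially updated group unless the caller retries).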
Re: [infinispan-dev] Integration between HotRod and OGM
Hi Mircea,
could you explain how Grouping is different from AtomicMaps? I understand you're all suggesting to move to AtomicMaps as the implementation is better, but is that an implementation detail, or how is it inherently different so that we can build something more reliable on it? From the limited knowledge I have in this area, I have been assuming - since they have very similar properties - that this was essentially a different syntax to get to the same semantics, but obviously I'm wrong.

It would be especially helpful to have a clear comparison of the different semantics in terms of transactions, atomicity and visibility of state across the three kinds: AtomicMaps, FineGrainedAtomicMaps, Grouping.

Let's also keep in mind that Hibernate OGM uses a carefully selected combination of *both* AtomicMap and FGAM instances - depending on the desired semantics we want to achieve. So, since those two were clearly different and we actually build on those differences, I'm not seeing how we could migrate two different things to the same construct without having to move fishy locking details out of Infinispan and into OGM, and I wouldn't be too happy with that, as such logic would belong in Infinispan to provide.

- Sanne

On 21 January 2014 15:07, Mircea Markus mmar...@redhat.com wrote:
[...]
Re: [infinispan-dev] Store as binary
On Jan 21, 2014, at 2:13 PM, Sanne Grinovero sa...@infinispan.org wrote:

On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:

On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:

What's the point for these tests?

+1

To validate whether storing the data in binary format yields better performance than storing it as a POJO.

That will highly depend on the scenarios you want to test for. AFAIK this started after Paul described how session replication works in WildFly, and we already know that both strategies are suboptimal with the current options available: in his case the active node will always write on the POJO, while the backup node will essentially only need to store the buffer just in case it might need to take over.

Indeed, as it is today it doesn't make sense for WildFly's session replication.

Sure, one will be slower, but if you want to make a suggestion to him about which configuration he should be using, we should measure his use case, not a different one. Even then, as discussed in Palma, an in-memory String representation might be way more compact because of pooling of strings and a very high likelihood of repeated headers (as is common in web frameworks),

Pooling like in String.intern()? Even so, if most of your access to the String is to serialize it and send it remotely, then you have a serialization cost (CPU) to pay for the reduced size.

so you might want to measure the CPU vs storage cost on the receiving side.. but then again your results will definitely depend on the input data and on assumptions about the likelihood of failover, how often it is being written on the owner node vs on the other node (since he uses locality), etc.. many factors I'm not seeing being considered here and which could make a significant difference.

I'm looking for the default setting of storeAsBinary in the configurations we ship.
I think the default configs should be optimized for distribution with random key access (a read/write for any key executes on any node of the cluster with the same probability), for both reads and writes.

As of now, it doesn't, so I need to check why.

You could play with the test parameters until it produces an output you like better, but I still see no point?

The point is to provide the best default params for the default config, and see what the usefulness of storeAsBinary is.

This is not a realistic scenario; at best it could help us document suggestions about which scenarios you'd want to keep the option enabled vs disabled. But then again I think we're wasting time, as we could implement a better strategy for Paul's use case: one which never deserializes a value received from a remote node until it's been requested as a POJO, but keeps the POJO as-is when it's stored locally.

I disagree: Paul's scenario, whilst very important, is quite specific. For what I consider the general case (random key access, see above), your approach is suboptimal.

I believe that would make sense also for OGM and probably most other users of Embedded. Basically, that would re-implement something similar to the previous design, but simplifying it a bit so that it doesn't allow for a back-and-forth conversion between storage types but rather dynamically favors a specific storage strategy.

It all boils down to what we want to optimize for: random key access or some degree of affinity. I think the former is the default.
One way or the other, from the test Radim ran with random key access, storeAsBinary doesn't bring any benefit, and it should: http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html

Cheers,
Sanne

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
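[Editor's note: for readers unfamiliar with the option under discussion, this is roughly how storeAsBinary is toggled per cache in Infinispan 6.x declarative configuration. Attribute and element names should be verified against the shipped 6.0 XSD; the cache name is a placeholder.]

```xml
<!-- Sketch: enable binary storage for keys and values on one cache.
     Verify attribute names against the Infinispan 6.0 configuration schema. -->
<namedCache name="sessionCache">
   <storeAsBinary enabled="true"
                  storeKeysAsBinary="true"
                  storeValuesAsBinary="true"/>
</namedCache>
```

The thread's open question is whether shipping this enabled by default helps the random-key-access case, which Radim's numbers above suggest it currently does not.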
Re: [infinispan-dev] Integration between HotRod and OGM
On Jan 21, 2014, at 4:08 PM, Sanne Grinovero sa...@infinispan.org wrote:

Hi Mircea,
could you explain how Grouping is different from AtomicMaps?

Here's the original thread where this has been discussed: http://goo.gl/WNs6KY
I would add to that that the AtomicMap requires transactions, which grouping doesn't. Also, in the context of Hot Rod (i.e. this email thread), FGAM is a structure that would be harder to migrate over.

I understand you're all suggesting to move to AtomicMaps as the implementation is better

We're suggesting to move from the AM to grouping.

but is that an implementation detail, or how is it inherently different so that we can build something more reliable on it?

They both do pretty much the same thing, so it's more a matter of choosing one instead of the other. Grouping fits way more nicely into the picture, both as a concept and as an implementation.

From the limited knowledge I have in this area, I have been assuming - since they have very similar properties - that this was essentially a different syntax to get to the same semantics, but obviously I'm wrong. It would be especially helpful to have a clear comparison of the different semantics in terms of transactions, atomicity and visibility of state across the three kinds: AtomicMaps, FineGrainedAtomicMaps, Grouping. Let's also keep in mind that Hibernate OGM uses a carefully selected combination of *both* AtomicMap and FGAM instances - depending on the desired semantics we want to achieve. So, since those two were clearly different and we actually build on those differences, I'm not seeing how we could migrate two different things to the same construct without having to move fishy locking details out of Infinispan and into OGM, and I wouldn't be too happy with that, as such logic would belong in Infinispan to provide.

I wasn't aware that OGM still uses AtomicMap, but the only case in which I imagine that would be useful is in order to force a lock on the whole AtomicMap.
Is that so, or is there some other aspect that I'm missing?

- Sanne

On 21 January 2014 15:07, Mircea Markus mmar...@redhat.com wrote:
[...]
Re: [infinispan-dev] Dropping AtomicMap/FineGrainedAtomicMap
On Jan 20, 2014, at 11:28 AM, Galder Zamarreño gal...@redhat.com wrote:

Hi all,

Dropping AtomicMap and FineGrainedAtomicMap was discussed last week in the F2F meeting [1]. They're complex and buggy, and we'd recommend people use the Grouping API instead [2]. The Grouping API would allow data to reside together, while the standard map API would apply per-key locking. We don't have a timeline for this yet, but we want to get as much feedback on the topic as possible so that we can evaluate the options.

+1

There's been a good discussion on this topic: http://goo.gl/WNs6KY

Cheers,

[1] https://issues.jboss.org/browse/ISPN-3901
[2] http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_the_grouping_api

--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz
Project Lead, Escalante
http://escalante.io
Engineer, Infinispan
http://infinispan.org

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
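[Editor's note: the reason grouping makes "data reside together" is that the owner of an entry is computed from the group rather than from the key itself. A minimal self-contained sketch of that routing idea, with a simplified placement function standing in for Infinispan's consistent hash - GroupPlacement and the prefix-based grouper are illustration names, not Infinispan API.]

```java
import java.util.function.Function;

// Sketch: route by group instead of by key, so every key that maps to
// the same group lands on the same node. Real Infinispan uses a
// segment-based ConsistentHash; plain modular hashing is used here
// only to make the co-location property visible.
final class GroupPlacement {
    static int owner(Object key, Function<Object, Object> grouper, int numNodes) {
        Object routing = grouper.apply(key); // the group replaces the key for routing
        return Math.floorMod(routing.hashCode(), numNodes);
    }
}
```

With a grouper that, say, takes everything before the first ':' in the key, "cart:1" and "cart:2" compute the same owner; per-key locking then still applies to each entry individually, which is the trade-off the recommendation above describes.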