Re: [infinispan-dev] Store as binary

2014-01-21 Thread Galder Zamarreño

On Jan 21, 2014, at 1:36 PM, Sanne Grinovero sa...@infinispan.org wrote:

 What's the point of these tests?

+1

 On 20 Jan 2014 15:48, Radim Vansa rva...@redhat.com wrote:
 OK, I have results for the dist-udp-no-tx and local-no-tx modes on 8 nodes
 (in local mode the nodes don't communicate, naturally):
 Dist mode: 3% down for reads, 1% for writes
 Local mode: 19% down for reads, 16% for writes
 
 Details in [1]; the figures above are for both keys and values stored as binary.
 
 Radim
 
 [1]
 https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/4/artifact/report/All_report.html
 
 On 01/20/2014 11:14 AM, Pedro Ruivo wrote:
 
  On 01/20/2014 10:07 AM, Mircea Markus wrote:
  Would be interesting to see as well, though the performance figures would not
  include the network latency, hence they would not tell much about the
  benefit of using this on a real-life system.
  That's my point. I'm interested in seeing the worst scenario, since all
  other cluster modes will have a lower (or no) impact on performance.
 
  Of course, the best scenario would be if each node only had access to
  remote keys...
 
  Pedro
 
  On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:
 
  IMO, we should try the worst scenario: Local Mode + Single thread.
 
  This will show us the highest impact on performance.
  Cheers,
 
 
 
 --
 Radim Vansa rva...@redhat.com
 JBoss DataGrid QA
 


--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org




Re: [infinispan-dev] Store as binary

2014-01-21 Thread Sanne Grinovero
On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:

 On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:

 What's the point of these tests?

 +1

 To validate whether storing the data in binary format yields better performance
 than storing it as a POJO.

That will depend heavily on the scenarios you want to test for. AFAIK
this started after Paul described how session replication works in
WildFly, and we already know that both strategies are suboptimal with
the current options available: in his case the active node will always
write to the POJO, while the backup node essentially only needs to
store the buffer, just in case it needs to take over.

Sure, one will be slower, but if you want to make a suggestion to him
about which configuration he should be using, we should measure his
use case, not a different one.

Even then, as discussed in Palma, an in-memory String representation
might be far more compact because of string pooling and the very high
likelihood of repeated headers (common in web frameworks), so you
might want to measure the CPU vs. storage cost on the receiving side.
But then again, your results will depend heavily on the input data and
on assumptions about the likelihood of failover, how often writes
happen on the owner node vs. the other node (since he uses locality),
etc. There are many factors I'm not seeing considered here which could
make a significant difference.

 As of now, it doesn't, so I need to check why.

You could play with the test parameters until they produce an output
you like better, but I still see no point. This is not a realistic
scenario; at best it could help us document in which scenarios you'd
want to keep the option enabled vs. disabled. But then again, I think
we're wasting time, as we could implement a better strategy for Paul's
use case: one which never deserializes a value received from a remote
node until it's requested as a POJO, but keeps the POJO as-is when
it's stored locally. I believe that would also make sense for OGM and
probably most other users of Embedded. Basically, that would
re-implement something similar to the previous design, but simplified
a bit so that it doesn't allow back-and-forth conversion between
storage types and instead dynamically favors a specific storage
strategy.
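
To sketch the idea (a hypothetical holder, not an existing Infinispan class;
it only assumes the org.infinispan.commons.marshall.Marshaller interface):

import java.io.IOException;

import org.infinispan.commons.marshall.Marshaller;

// Keeps whichever representation the value was created with and converts
// lazily, one way only.
public final class LazyValue<V> {

   private final Marshaller marshaller;
   private final byte[] bytes; // non-null when the value arrived serialized from a remote node
   private volatile V pojo;    // non-null when stored locally, or after the first local read

   private LazyValue(Marshaller marshaller, byte[] bytes, V pojo) {
      this.marshaller = marshaller;
      this.bytes = bytes;
      this.pojo = pojo;
   }

   // Owner node: keep the POJO as-is, with no eager serialization.
   public static <V> LazyValue<V> fromPojo(Marshaller m, V pojo) {
      return new LazyValue<V>(m, null, pojo);
   }

   // Backup node: keep the buffer; it may never be deserialized at all.
   public static <V> LazyValue<V> fromBytes(Marshaller m, byte[] bytes) {
      return new LazyValue<V>(m, bytes, null);
   }

   // Deserialize only when the value is actually requested as a POJO,
   // e.g. after the backup takes over; there is no conversion back to bytes.
   @SuppressWarnings("unchecked")
   public V get() throws IOException, ClassNotFoundException {
      V v = pojo;
      if (v == null) {
         v = (V) marshaller.objectFromByteBuffer(bytes);
         pojo = v;
      }
      return v;
   }
}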

Cheers,
Sanne


 Cheers,
 --
 Mircea Markus
 Infinispan lead (www.infinispan.org)






Re: [infinispan-dev] Integration between HotRod and OGM

2014-01-21 Thread Mircea Markus
Hi Emmanuel,

Just had a good chat with Davide about this, and one solution to overcome the
shortcoming you mentioned in the above email would be to enhance the Hot Rod
client to support grouping:

RemoteClient.put(G g, K k, V v); // first param is the group
RemoteClient.getGroup(G g) : Map<K,V>;

It requires an enhancement to our local grouping API:
EmbeddedCache.getGroup(G). This is something useful for us in a broader
context, as it is the step needed to be able to deprecate AtomicMaps and
suggest replacing them with Grouping (see the sketch below).
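
To make it concrete, here is a sketch of the proposed client-side additions
(hypothetical - neither method exists on RemoteCache today):

import java.util.Map;

// Proposed Hot Rod client additions; G is the group type.
public interface GroupedRemoteCache<G, K, V> {

   // First param is the group: the entry is stored so that it co-locates
   // with the rest of its group, e.g. put("Author:42#books", "row-7", book).
   void put(G group, K key, V value);

   // Fetches all entries of a group in one call, mirroring the proposed
   // EmbeddedCache.getGroup(G) enhancement.
   Map<K, V> getGroup(G group);
}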

This approach still has some limitations compared to the current embedded
integration:
- performance, due to the lack of transactions: this means increased TCP
chattiness between the Hot Rod client and the server;
- you'd have to handle atomicity yourself, potentially by retrying an operation.

What do you think?


On Dec 3, 2013, at 3:10 AM, Mircea Markus mmar...@redhat.com wrote:

 
 On Nov 19, 2013, at 10:22 AM, Emmanuel Bernard emman...@hibernate.org wrote:
 
  It's an interesting approach that would work fine-ish for entities,
  assuming the Hot Rod client is multi-threaded and the client
  uses Futures to parallelize the calls.
 
  The Java Hot Rod client is both multi-threaded and exposes an async API.
 
 
  But it won't work for associations as we have them designed today.
  Each association - or, more precisely, the query results to go from an
  entity A1 to the list of entities B associated with it - is represented by
  an AtomicMap.
  Each entry in this map corresponds to an entry in the association.

  While we can guess the column names and build the list of composite keys
  for entities from the metadata, we cannot do the same for associations,
  as the key is literally the (composite) id of the association and we
  cannot guess that most of the time (we can only in very pathological
  cases).
  We could imagine listing the association row keys in a special entry to
  work around that, but this approach is just as problematic and
  conceptually the same.
  The only solution would be to lock the whole association for each
  operation and, I guess, impose some versioning / optimistic locking.

  That is not a pattern that scales sufficiently, in my experience.
 
 I think so too :-)
 
 That's the problem with interconnected data :)
 
 Emmanuel
 
 On Mon 2013-11-18 23:05, Mircea Markus wrote:
  Neither the grouping API nor the AtomicMap works over Hot Rod.
  Between the grouping API and AtomicMap, I think the one that makes
  more sense to migrate is the grouping API.
  One way or the other, I think the Hot Rod protocol would require an
  enhancement - mind raising a JIRA for that?
  For now I guess you can sacrifice performance and always send the entire
  object across on every update instead of only the deltas? (See the sketch
  at the end of this message.)
 
 On Nov 18, 2013, at 9:56 AM, Emmanuel Bernard emman...@hibernate.org 
 wrote:
 
 Someone mentioned the grouping API as some sort of alternative to
 AtomicMap. Maybe we should use that?
 Note that if we don't have a fine-grained approach we will need to
 make sure we *copy* the complex data structure upon reads to mimic
 proper transaction isolation.
 
 On Tue 2013-11-12 15:14, Sanne Grinovero wrote:
 On 12 November 2013 14:54, Emmanuel Bernard emman...@hibernate.org 
 wrote:
 On the transaction side, we can start without them.
 
 +1 on omitting transactions for now.
 
  And on the missing AtomicMaps, I hope Infinispan will want to
  implement them?
  It would be good to eventually converge on similar feature sets for the
  remote vs. embedded APIs.
 
 I know the embedded version relies on batching/transactions, but I
 guess we could obtain a similar effect with some ad-hoc commands in
 Hot Rod?
 
 Sanne
 
 
 On Tue 2013-11-12 14:34, Davide D'Alto wrote:
 Hi,
 I'm working on the integration between HotRod and OGM.
 
  We already have a dialect for Infinispan, and I'm trying to follow the same
  logic.
 At the moment I'm having two problems:
 
  1) In the Infinispan dialect we are using AtomicMap and
  AtomicMapLookup, but these classes don't work with the RemoteCache. Is
  there an equivalent for Hot Rod?
 
  2) As far as I know, Hot Rod does not support transactions. I've found a
  link to a branch on Mircea's repository:
  https://github.com/mmarkus/ops_over_hotrod/wiki/Usage-guide
  Is this something I could/should use?
 
 Any help is appreciated.
 
 Thanks,
 Davide
 
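P.S. A minimal sketch of the whole-object fallback mentioned above, using the
current Hot Rod Java client (the cache keys and values are made up; the value
type must be serializable):

import java.util.HashMap;

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class WholeObjectFallback {
   public static void main(String[] args) {
      RemoteCacheManager rcm = new RemoteCacheManager(
            new ConfigurationBuilder().addServer().host("localhost").port(11222).build());
      RemoteCache<String, HashMap<String, String>> cache = rcm.getCache();

      // Read-modify-write of the entire association map: every update
      // re-sends the whole structure instead of a delta.
      HashMap<String, String> association = cache.get("Author:42#books");
      if (association == null) {
         association = new HashMap<String, String>();
      }
      association.put("row-7", "Book:7");
      cache.put("Author:42#books", association); // the full object crosses the wire

      rcm.stop();
   }
}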
 

Re: [infinispan-dev] Integration between HotRod and OGM

2014-01-21 Thread Sanne Grinovero
Hi Mircea,
could you explain how Grouping is different from AtomicMaps?
I understand you're all suggesting to move to AtomicMaps as the
implementation is better, but is that an implementation detail, or is
it inherently different in a way that lets us build something more
reliable on it?

From the limited knowledge I have in this area, I had been assuming -
since they have very similar properties - that this was essentially a
different syntax to get to the same semantics, but obviously I'm wrong.

It would be especially helpful to have a clear comparison of the
different semantics in terms of transactions, atomicity and visibility
of state across the three kinds: AtomicMaps, FineGrainedAtomicMaps and
Grouping.

Let's also keep in mind that Hibernate OGM uses a carefully selected
combination of *both* AtomicMap and FGAM instances, depending on the
semantics we want to achieve. Since those two were clearly different
and we actually build on those differences, I'm not seeing how we
could migrate two different things to the same construct without
moving fishy locking details out of Infinispan and into OGM, and I
wouldn't be too happy with that, as such logic belongs in Infinispan.

- Sanne


On 21 January 2014 15:07, Mircea Markus mmar...@redhat.com wrote:
 Hi Emmanuel,

 Just had a good chat with Davide about this, and one solution to overcome the
 shortcoming you mentioned in the above email would be to enhance the Hot Rod
 client to support grouping:

 RemoteClient.put(G g, K k, V v); // first param is the group
 RemoteClient.getGroup(G g) : Map<K,V>;

 It requires an enhancement to our local grouping API:
 EmbeddedCache.getGroup(G). This is something useful for us in a broader
 context, as it is the step needed to be able to deprecate AtomicMaps and
 suggest replacing them with Grouping.

 This approach still has some limitations compared to the current embedded
 integration:
 - performance, due to the lack of transactions: this means increased TCP
 chattiness between the Hot Rod client and the server;
 - you'd have to handle atomicity yourself, potentially by retrying an operation.

 What do you think?




Re: [infinispan-dev] Store as binary

2014-01-21 Thread Mircea Markus

On Jan 21, 2014, at 2:13 PM, Sanne Grinovero sa...@infinispan.org wrote:

 On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:
 
 On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:
 
  What's the point of these tests?
 
 +1
 
  To validate whether storing the data in binary format yields better performance
  than storing it as a POJO.
 
 That will depend heavily on the scenarios you want to test for. AFAIK
 this started after Paul described how session replication works in
 WildFly, and we already know that both strategies are suboptimal with
 the current options available: in his case the active node will always
 write to the POJO, while the backup node essentially only needs to
 store the buffer, just in case it needs to take over.

Indeed, as it is today it doesn't make sense for WildFly's session replication.

 
 Sure, one will be slower, but if you want to make a suggestion to him
 about which configuration he should be using, we should measure his
 use case, not a different one.
 
 Even then, as discussed in Palma, an in-memory String representation
 might be far more compact because of string pooling and the very high
 likelihood of repeated headers (common in web frameworks),

Pooling as in String.intern()?
Even so, if most of your access to the String is to serialize it and send it
remotely, then you have a serialization (CPU) cost to pay for the reduced size.

 so you might want to measure the CPU vs. storage cost on the receiving
 side. But then again, your results will depend heavily on the input
 data and on assumptions about the likelihood of failover, how often
 writes happen on the owner node vs. the other node (since he uses
 locality), etc. There are many factors I'm not seeing considered here
 which could make a significant difference.

I'm looking at the default setting of storeAsBinary in the configurations we
ship. I think the default configs should be optimized for distribution with
random key access (reads/writes for any key execute on every node of the
cluster with the same probability), for both reads and writes.
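
For reference, a sketch of how that option is toggled with the programmatic
configuration (Infinispan 6.x API; the values shown are illustrative, not a
statement about what we ship today):

import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class StoreAsBinaryDefaults {
   public static void main(String[] args) {
      // The option under discussion, enabled for both keys and values --
      // the variant Radim benchmarked.
      Configuration cfg = new ConfigurationBuilder()
            .storeAsBinary()
               .enable()
               .storeKeysAsBinary(true)
               .storeValuesAsBinary(true)
            .build();
      System.out.println("storeAsBinary enabled: " + cfg.storeAsBinary().enabled());
   }
}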

 
  As of now, it doesn't, so I need to check why.
 
 You could play with the test parameters until they produce an output
 you like better, but I still see no point.

The point is to provide the best default params for the default config, and to
see how useful storeAsBinary actually is.

 This is not a realistic
 scenario; at best it could help us document in which scenarios you'd
 want to keep the option enabled vs. disabled. But then again, I think
 we're wasting time, as we could implement a better strategy for Paul's
 use case: one which never deserializes a value received from a remote
 node until it's requested as a POJO, but keeps the POJO as-is when
 it's stored locally.

I disagree: Paul's scenario, whilst very important, is quite specific. For what 
I consider the general case (random key access, see above), your approach is 
suboptimal.  


 I believe that would
 also make sense for OGM and probably most other users of Embedded.
 Basically, that would re-implement something similar to the previous
 design, but simplified a bit so that it doesn't allow back-and-forth
 conversion between storage types and instead dynamically favors a
 specific storage strategy.

It all boils down to what we want to optimize for: random key access or some
degree of affinity. I think the former is the default.
One way or the other, in the test Radim ran with random key access,
storeAsBinary doesn't bring any benefit, though it should:
http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html

 
 Cheers,
 Sanne
 
 
 Cheers,
 --
 Mircea Markus
 Infinispan lead (www.infinispan.org)
 
 
 
 
 

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)







Re: [infinispan-dev] Integration between HotRod and OGM

2014-01-21 Thread Mircea Markus

On Jan 21, 2014, at 4:08 PM, Sanne Grinovero sa...@infinispan.org wrote:

 Hi Mircea,
 could you explain how Grouping is different from AtomicMaps?

Here's the original thread where this was discussed: http://goo.gl/WNs6KY
I would add that AtomicMap requires transactions, which grouping
doesn't. Also, in the context of Hot Rod (i.e. this email thread), FGAM is a
structure that would be harder to migrate over.

 I understand you're all suggesting to move to AtomicMaps as the
 implementation is better

We're suggesting to move from the AtomicMap to grouping.

 but is that an implementation detail, or is
 it inherently different in a way that lets us build something more
 reliable on it?

They both do pretty much the same thing, so it's more a matter of choosing
one over the other. Grouping fits much more nicely into the picture, both as a
concept and as an implementation.
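
To make the comparison concrete, a sketch of the two embedded idioms side by
side (it assumes a transactional cache for the AtomicMap case and groups
enabled via clustering().hash().groups().enabled(); the key/value names are
made up):

import java.io.Serializable;

import org.infinispan.Cache;
import org.infinispan.atomic.AtomicMap;
import org.infinispan.atomic.AtomicMapLookup;
import org.infinispan.distribution.group.Group;

public class AtomicMapVsGrouping {

   // Grouping: co-location is derived from the key itself; each row stays an
   // ordinary cache entry with plain per-key locking and no transaction.
   public static final class AssociationKey implements Serializable {
      final String association;
      final String row;

      AssociationKey(String association, String row) {
         this.association = association;
         this.row = row;
      }

      @Group
      public String group() {
         return association; // all rows of one association land on the same owner
      }

      @Override
      public boolean equals(Object o) {
         return o instanceof AssociationKey
               && association.equals(((AssociationKey) o).association)
               && row.equals(((AssociationKey) o).row);
      }

      @Override
      public int hashCode() {
         return 31 * association.hashCode() + row.hashCode();
      }
   }

   public static void demo(Cache<Object, Object> cache) {
      // AtomicMap: the whole association lives in one cache entry and is
      // updated under a transaction.
      AtomicMap<String, String> books = AtomicMapLookup.getAtomicMap(cache, "Author:42#books");
      books.put("row-7", "Book:7");

      // Grouping: one cache entry per association row, grouped by association id.
      cache.put(new AssociationKey("Author:42#books", "row-7"), "Book:7");
   }
}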

 
 From the limited knowledge I have in this area, I had been assuming -
 since they have very similar properties - that this was essentially a
 different syntax to get to the same semantics, but obviously I'm wrong.
 
 It would be especially helpful to have a clear comparison of the
 different semantics in terms of transactions, atomicity and visibility
 of state across the three kinds: AtomicMaps, FineGrainedAtomicMaps and
 Grouping.
 
 Let's also keep in mind that Hibernate OGM uses a carefully selected
 combination of *both* AtomicMap and FGAM instances, depending on the
 semantics we want to achieve. Since those two were clearly different
 and we actually build on those differences, I'm not seeing how we
 could migrate two different things to the same construct without
 moving fishy locking details out of Infinispan and into OGM, and I
 wouldn't be too happy with that, as such logic belongs in Infinispan.

I wasn't aware that OGM still uses AtomicMap, but the only case in which I
imagine that would be useful is to force a lock on the whole AtomicMap. Is
that so, or is there some other aspect I'm missing?

 
 - Sanne
 
 

Re: [infinispan-dev] Dropping AtomicMap/FineGrainedAtomicMap

2014-01-21 Thread Mircea Markus

On Jan 20, 2014, at 11:28 AM, Galder Zamarreño gal...@redhat.com wrote:

 Hi all,
 
 Dropping AtomicMap and FineGrainedAtomicMap was discussed last week at the
 F2F meeting [1]. It's complex and buggy, and we'd recommend people use the
 Grouping API instead [2]. The Grouping API would allow data to reside together,
 while the standard map API would apply per-key locking.
 
 We don't have a timeline for this yet, but we want to get as much feedback on 
 the topic as possible so that we can evaluate the options.

+1
There's been a good discussion on this topic: http://goo.gl/WNs6KY


 
 Cheers,
 
 [1] https://issues.jboss.org/browse/ISPN-3901
 [2] 
 http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_the_grouping_api
 --
 Galder Zamarreño
 gal...@redhat.com
 twitter.com/galderz
 
 Project Lead, Escalante
 http://escalante.io
 
 Engineer, Infinispan
 http://infinispan.org
 
 

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)




