Re: [infinispan-dev] Infinispan and change data capture

2016-12-15 Thread Randall Hauch

> On Dec 15, 2016, at 9:09 AM, Sanne Grinovero  wrote:
> 
> Thanks Randall,
> those clarifications have been great.
> 
> Emmanuel: some of your statements conflict with Randall's
> clarifications and with the feasibility points I've been pointing at.
> You say "collect *all* changes". I've been questioning that Infinispan
> can not keep *all* changes around for a given single key; I understand
> we'd allow clients to retrieve streams of changes persisted into
> Kafka, but we need to be clear that we won't be handling *all* changes
> to Kafka (nor to Debezium), so the magic these can do is somewhat
> limited. They can certainly expand on the capabilities that Infinispan
> would provide on its own, but some of the use cases which Gustavo
> mentioned would not be suitable.

I’m not sure we were saying conflicting things. I was saying what is possible: 
Debezium would capture whatever it can from Infinispan via a client listener 
API and record it as a stream of change events. I think Emmanuel was arguing 
that the stream will be (far?) more useful to a wider range of consumers if it 
has every change made by Infinispan, compared to a stream that contains only 
some of the changes made in Infinispan.

> 
> I don't think this is a big problem in practice though; take the
> example of monitoring fluctuations in the value of some stock symbol:
> it wouldn't be possible to investigate derivative numbers from these
> fluctuations just from the key/value pair "stock name" / "value";
> however, people can store such events in a different way, for example
> by using a composite key "stock name" + "timestamp" (see the sketch
> below). People just need clarity on how this works, including us
> modelling the storage appropriately.
> 
> Thanks,
> Sanne
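
For illustration, one way such a composite key could be modelled (this is not an existing Infinispan API; the class and field names are made up):

import java.io.Serializable;
import java.util.Objects;

// Hypothetical composite key: one cache entry per (symbol, timestamp) observation,
// instead of a single "stock name" key whose value is overwritten on every tick.
public final class StockTickKey implements Serializable {
    private final String symbol;
    private final long timestampMs;

    public StockTickKey(String symbol, long timestampMs) {
        this.symbol = symbol;
        this.timestampMs = timestampMs;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StockTickKey)) return false;
        StockTickKey k = (StockTickKey) o;
        return timestampMs == k.timestampMs && symbol.equals(k.symbol);
    }

    @Override
    public int hashCode() {
        return Objects.hash(symbol, timestampMs);
    }
}

With cache.put(new StockTickKey("RHT", now), price) every observation stays addressable, so fluctuations can be analysed without Infinispan having to retain per-key history.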
> 
> 
> On 15 December 2016 at 09:54, Emmanuel Bernard  wrote:
>> The goal is as follows: allow all changes to be collected and pushed to 
>> Debezium and thus Kafka.
>> 
>> This need does not require remembering all changes since the beginning of 
>> time in Infinispan. Just enough to:
>> - let Kafka catch up, assuming it is the bottleneck
>> - not lose a change in Kafka when it happened in Infinispan 
>> (coordinator, owner, or replicas dying)
>> 
>> The ability to read back history would then be handled by the Debezium / 
>> Kafka tail, not Infinispan itself.
>> 
>> Check my email on this thread from Dec 9th.
>> 
>>> On 12 Dec 2016, at 16:13, Sanne Grinovero  wrote:
>>> 
>>> I'm reading many clever suggestions for various aspects of such a
>>> system, but I fail to see a clear definition of the goal.
>>> 
>>> From Randall's opening email I understand how MySQL does this, but
>>> it's an example and I'm not sure which aspects are implementation
>>> details of how MySQL happens to accomplish this, and which aspects are
>>> requirements for the Infinispan enhancement proposals.
>>> 
>>> I remember a meeting with Manik Surtani, Jonathan Halliday and Mark
>>> Little, whose outcome was a general agreement that Infinispan would
>>> eventually need both tombstones and versioned entries, not just for
>>> change data capture but to improve several other aspects;
>>> unfortunately that was in December 2010 and never became a priority,
>>> but the benefits are clear.
>>> The complexities which have put off such plans lie in the "garbage
>>> collection", i.e. the need to keep the history from growing without
>>> bounds and to drop or compact old history.
>>> 
>>> So I'm definitely sold on the need to add a certain amount of history,
>>> but we need to define how much of this history is expected to be held.
>>> 
>>> In short, what's the ultimate goal? I see two main but different
>>> options intertwined:
>>> - allow to synchronize the *final state* of a replica
>>> - inspect specific changes
>>> 
>>> For the first case, it would be enough for us to be able to provide a
>>> "squashed history" (as in Git squash), but we'd need to keep versioned
>>> snapshots around and someone needs to tell you which ones can be
>>> garbage collected.
>>> For example when a key is: written, updated, updated, deleted since
>>> the snapshot, we'll send only "deleted" as the intermediary states are
>>> irrelevant.
>>> For the second case, say the goal is to inspect fluctuations of price
>>> variations of some item, then the intermediary states are not
>>> irrelevant.
>>> 
>>> Which one will we want to solve? Both?
>>> Personally, attempting to solve the second one seems like a huge
>>> pivot for the project; the current data structures and storage are not
>>> designed for this. I see the value of such benefits, but maybe
>>> Infinispan is not the right tool for such a problem.
>>> 
>>> I'd prefer to focus on the benefits of the squashed history, and have
>>> versioned entries soon, but even in that case we need to define which
>>> versions need to be kept around, and how garbage collection /
>>> vacuuming is handled.
>>> This can be designed to be transparent to the client: handled as an
>>> internal implementation detail which 

Re: [infinispan-dev] Infinispan and change data capture

2016-12-14 Thread Randall Hauch

> On Dec 14, 2016, at 7:58 AM, Sanne Grinovero  wrote:
> 
> On 12 December 2016 at 17:56, Gustavo Fernandes wrote:
>> On Mon, Dec 12, 2016 at 3:13 PM, Sanne Grinovero wrote:
>> 
>>> 
>>> In short, what's the ultimate goal? I see two main but different
>>> options intertwined:
>>> - allow to synchronize the *final state* of a replica
>> 
>> 
>> I'm assuming this case is already in place when using remote listeners and
>> includeCurrentState=true and we are
>> discussing how to improve it, as described in the proposal in the wiki and
>> on the 5th email of this thread.
>> 
>>> 
>>> - inspect specific changes
>>> 
>>> For the first case, it would be enough for us to be able to provide a
>>> "squashed history" (as in Git squash), but we'd need to keep versioned
>>> snapshots around and someone needs to tell you which ones can be
>>> garbage collected.
>>> For example when a key is: written, updated, updated, deleted since
>>> the snapshot, we'll send only "deleted" as the intermediary states are
>>> irrelevant.
>>> For the second case, say the goal is to inspect fluctuations of price
>>> variations of some item, then the intermediary states are not
>>> irrelevant.
>>> 
>>> Which one will we want to solve? Both?
>> 
>> 
>> 
>> Looking at http://debezium.io/, it implies the second case.
> 
> That's what I'm asking which needs to be clarified.
> 
> If it's the second case, then while I appreciate the value of such a
> system I don't see it as a good fit for Infinispan.

If Infinispan were to allow a client to consume (within a reasonable amount of 
time) an event for every change, then Debezium would certainly be able to 
capture these into a stream that is persisted for a much longer period of time.

OTOH, I think it’s reasonable for Infinispan to squash history as long as this 
doesn’t reorder changes and at least the last change is kept. Debezium can 
still work with this.

> 
>> 
>> "[...] Start it up, point it at your databases, and your apps can start
>> responding to all of the inserts, updates,
>> and deletes that other apps commit to your databases. [...] your apps can
>> respond quickly and never miss an event,
>> even when things go wrong."
>> 
>> IMO the choice between squashed/full history, and even retention time is
>> highly application specific. Deletes might
>> not even be involved; one may be interested in answering "what is the peak
>> value of a certain key during the day?"
> 
> Absolutely. And Infinispan might need to draw a line and clarify which
> problems it is meant to solve, and which problems are better solved
> with a different solution.

+1. Just be clear in what the listeners will see and what they won’t see.

And I guess we need to clarify what “never miss an event” means for Debezium: 
we capture every event that a source system exposes to us and will not lose any 
of them, but if Kafka compaction is used then on replay you’re only guaranteed to 
see at least the most recent change for every key.
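
For reference, compaction is just per-topic configuration in Kafka; a minimal sketch with the plain Java admin client (broker address and topic name are assumptions):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateCompactedChangeTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092"); // assumed broker address
        try (AdminClient admin = AdminClient.create(props)) {
            // Compaction keeps at least the latest record per key; older records for
            // the same key may be discarded, which is the guarantee described above.
            NewTopic topic = new NewTopic("infinispan.mycache.changes", 1, (short) 1)
                    .configs(Collections.singletonMap("cleanup.policy", "compact"));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}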

> 
> 
>>> Personally, attempting to solve the second one seems like a huge
>>> pivot for the project; the current data structures and storage are not
>>> designed for this.
>> 
>> 
>> +1, as I wrote earlier about ditching the idea of event cache storage in
>> favor of Lucene.
> 
> Yes that's a great idea, but I'd like to discuss first where we want to get.
> 
>>> I see the value of such benefits, but maybe
>>> Infinispan is not the right tool for such a problem.
>>> 
>>> I'd prefer to focus on the benefits of the squashed history, and have
>>> versioned entries soon, but even in that case we need to define which
>>> versions need to be kept around, and how garbage collection /
>>> vacuuming is handled.
>> 
>> 
>> Is that proposal written/recorded somewhere? It'd be interesting to know how
>> a client interested in data
>> changes would consume those multi-versioned entries (push/pull with offset?,
>> sorted/unsorted?, client tracking/per key/per version?),
>> as it seems there is some storage impedance as well.
>> 
>>> 
>>> 
>>> In short, I'd like to see an agreement that analyzing e.g.
>>> fluctuations in stock prices would be a non-goal, if these are stored
>>> as {"stock name", value} key/value pairs. One could still implement
>>> such a thing by using a more sophisticated model, just don't expect to
>>> be able to see all intermediary values each entry has ever had since
>>> the key was first used.
>> 
>> 
>> 
>> Continuous Queries listen to key/value data using a query; should it
>> not be expected to
>> see all the intermediary values when a change on the server causes an entry
>> to start/stop matching
>> the query?
> 
> That's exactly the doubt I'm raising: I'm not sure we set that
> expectation, and if we did then I don't agree with that choice, and I
> remember voicing concerns about the feasibility of such aspects of CQ during
> early design.
> I might be a minority, but whatever the decision was I don't think
> this is now clear nor pro

Re: [infinispan-dev] Infinispan and change data capture

2016-12-09 Thread Randall Hauch
> On Dec 9, 2016, at 10:08 AM, Randall Hauch  wrote:
> 
>> 
>> On Dec 9, 2016, at 3:13 AM, Radim Vansa  wrote:
>> 
>> On 12/08/2016 10:13 AM, Gustavo Fernandes wrote:
>>> 
>>> I recently updated a proposal [1] based on several discussions we had 
>>> in the past that is essentially about introducing an event storage 
>>> mechanism (write ahead log) in order to improve reliability, failover 
>>> and "replayability" for the remote listeners, any feedback greatly 
>>> appreciated.
>> 
>> Hi Gustavo,
>> 
>> while I really like the pull-style architecture and reliable events, I 
>> see some problematic parts here:
>> 
>> 1) 'cache that would persist the events with a monotonically increasing id'
>> 
>> I assume that you mean globally (for all entries) monotonic. How will 
>> you obtain such an ID? Currently, commands have unique IDs that combine 
>> the node address with a number, where the number part is monotonic per node. That's 
>> easy to achieve. But introducing a globally monotonic counter means that 
>> there will be a single contention point. (You can introduce other 
>> contention points by adding backups, but this is probably unnecessary as 
>> you can find out the last id from the indexed cache data.) Per-segment 
>> monotonic counters would probably be more scalable, though that increases complexity.
> 
> It is complicated, but one way to do this is to have one “primary” node 
> maintain the log and to have the others replicate from it. The cluster does need 
> to use consensus to agree which is the primary, and to know which secondary 
> becomes the primary if the primary is failing. Consensus is not trivial, but 
> JGroups Raft (http://belaban.github.io/jgroups-raft/) may be an option. However, this 
> approach ensures that the replica logs are identical to the primary since 
> they are simply recording the primary’s log as-is. Of course, another 
> challenge is what happens during a failure of the primary log node, and can 
> any transactions be performed/completed while the primary is unavailable.
> 
> Another option is to have each node maintain their own log, and to have an 
> aggregator log that merges/combines the various logs into one. Not sure how 
> feasible it is to merge logs by getting rid of duplicates and determining a 
> total order, but if it is then it may have better fault tolerance 
> characteristics.
> 
> Of course, it is possible to have node-specific monotonic IDs. For example, 
> MySQL Global Transaction IDs (GTIDs) use a unique UUID for each node, and 
> then each GTID consists of the node’s UUID plus a monotonically-increasing value 
> (e.g., “31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001”). The transaction log 
> contains a mix of GTIDs, and MySQL replication uses a “GTID set” to describe 
> the ranges of transactions known by a server (e.g., 
> “u1:1-100,u2:1-1,u3:3-5” where “u1”, “u2”, and “u3” are actually UUIDs). 
> So, when a MySQL replica connects, it says “I know about this GTID set", and 
> this tells the master where that client wants to start reading.

Emmanuel and I were talking offline. Another approach entirely is to have each 
node (optionally) write the changes it is making as a leader directly to Kafka, 
meaning that Kafka becomes the event log and delivery mechanism. Upon failure 
of that node, the node that becomes the new leader would write any of its 
events not already written by the former leader, and then continue writing new 
changes it is making as a leader. Thus, Infinispan would not be producing a 
single log with a total order of all changes to a cache (something Infinispan 
does not have anyway), but rather a total order per key. (Kafka does this very 
nicely via topic partitions, where all changes for each key always get written 
to the same partition, and each partition has a total order.) This approach may 
still need separate “commit” events to reflect how Infinispan currently works 
internally.

Obviously Infinispan wouldn’t require this to be done, but when it’s enabled it 
might provide a much simpler way of capturing the history of changes to the 
entries in an Infinispan cache. The HotRod client could consume the events 
directly from Kafka, or that could be left to a completely different 
client/utility. It does add a dependency on Kafka, but it means the Infinispan 
community doesn’t need to build much of the same functionality.
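
A rough sketch of what that could look like on the producing side, assuming the change event has already been serialized (the topic name and serializers are assumptions; this is not an existing Infinispan feature):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ChangeEventPublisher implements AutoCloseable {
    private final Producer<String, byte[]> producer;
    private final String topic;

    public ChangeEventPublisher(String bootstrapServers, String topic) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        this.producer = new KafkaProducer<>(props);
        this.topic = topic;
    }

    // All events for the same cache key carry the same record key, so they land in
    // the same partition and are therefore totally ordered relative to each other.
    public void publish(String cacheKey, byte[] serializedEvent) {
        producer.send(new ProducerRecord<>(topic, cacheKey, serializedEvent));
    }

    @Override
    public void close() {
        producer.close();
    }
}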

> 
>> 
>> 2) 'The write to the event log would be async in order to not affect 
>> normal data writes'
>> 
>> Who should write to the cache?
>> a) originator - what if originator crashes (despite the change has been 
>> a

Re: [infinispan-dev] Infinispan and change data capture

2016-12-09 Thread Randall Hauch

> On Dec 9, 2016, at 3:13 AM, Radim Vansa  wrote:
> 
> On 12/08/2016 10:13 AM, Gustavo Fernandes wrote:
>> 
>> I recently updated a proposal [1] based on several discussions we had 
>> in the past that is essentially about introducing an event storage 
>> mechanism (write ahead log) in order to improve reliability, failover 
>> and "replayability" for the remote listeners, any feedback greatly 
>> appreciated.
> 
> Hi Gustavo,
> 
> while I really like the pull-style architecture and reliable events, I 
> see some problematic parts here:
> 
> 1) 'cache that would persist the events with a monotonically increasing id'
> 
> I assume that you mean globally (for all entries) monotonic. How will 
> you obtain such an ID? Currently, commands have unique IDs that combine 
> the node address with a number, where the number part is monotonic per node. That's 
> easy to achieve. But introducing a globally monotonic counter means that 
> there will be a single contention point. (You can introduce other 
> contention points by adding backups, but this is probably unnecessary as 
> you can find out the last id from the indexed cache data.) Per-segment 
> monotonic counters would probably be more scalable, though that increases complexity.

It is complicated, but one way to do this is to have one “primary” node 
maintain the log and to have the others replicate from it. The cluster does need to 
use consensus to agree which is the primary, and to know which secondary 
becomes the primary if the primary is failing. Consensus is not trivial, but 
JGroups Raft (http://belaban.github.io/jgroups-raft/) may be an option. However, this 
approach ensures that the replica logs are identical to the primary since they 
are simply recording the primary’s log as-is. Of course, another challenge is 
what happens during a failure of the primary log node, and can any transactions 
be performed/completed while the primary is unavailable.

Another option is to have each node maintain their own log, and to have an 
aggregator log that merges/combines the various logs into one. Not sure how 
feasible it is to merge logs by getting rid of duplicates and determining a 
total order, but if it is then it may have better fault tolerance 
characteristics.

Of course, it is possible to have node-specific monotonic IDs. For example, 
MySQL Global Transaction IDs (GTIDs) use a unique UUID for each node, and then 
each GTID consists of the node’s UUID plus a monotonically-increasing value (e.g., 
“31fc48cd-ecd4-46ad-b0a9-f515fc9497c4:1001”). The transaction log contains a 
mix of GTIDs, and MySQL replication uses a “GTID set” to describe the ranges of 
transactions known by a server (e.g., “u1:1-100,u2:1-1,u3:3-5” where “u1”, 
“u2”, and “u3” are actually UUIDs). So, when a MySQL replica connects, it says 
“I know about this GTID set", and this tells the master where that client wants 
to start reading.
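
If Infinispan adopted the same idea, the “offset” a client hands back on reconnect could be a GTID-set-like map; a purely illustrative sketch, not an existing API:

import java.util.HashMap;
import java.util.Map;

// Tracks, per source node UUID, the highest per-node sequence number seen so far.
// A reconnecting client would send this map to the server, analogous to a MySQL
// replica sending its GTID set to the master. Not thread-safe; just a sketch.
public class NodeOffsetSet {
    private final Map<String, Long> highestSeen = new HashMap<>();

    // Record an event ID of the form (nodeUuid, sequence); returns false if the
    // event was already seen (e.g. a retried operation) and should be skipped.
    public boolean record(String nodeUuid, long sequence) {
        long previous = highestSeen.getOrDefault(nodeUuid, 0L);
        if (sequence <= previous) {
            return false;
        }
        highestSeen.put(nodeUuid, sequence);
        return true;
    }

    public Map<String, Long> snapshot() {
        return new HashMap<>(highestSeen);
    }
}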

> 
> 2) 'The write to the event log would be async in order to not affect 
> normal data writes'
> 
> Who should write to the cache?
> a) originator - what if the originator crashes (even though the change has been 
> added)? Besides, the originator would have to do an (async) RPC to the primary 
> owner (which will be the primary owner of the event, too).
> b) primary owner - with triangle, the primary does not really know if the 
> change has been written on the backup. Piggybacking that info won't be 
> trivial - we don't want to send another message explicitly. But even if 
> we get the confirmation, since the write to the event cache is async, if the 
> primary owner crashes before replicating the event to the backup, we lose 
> the event.
> c) all owners, but locally - that will require more complex 
> reconciliation of whether the event really happened on all surviving nodes or 
> not. And backups could have some trouble resolving order, too.
> 
> IIUC clustered listeners are called from primary owner before the change 
> is really confirmed on backups (@Pedro correct me if I am wrong, 
> please), but for this reliable event cache you need a higher level of 
> consistency.

This could be handled by writing a confirmation or “commit” event to the log 
when the write is confirmed or the transaction is committed. Then, only those 
confirmed events/transactions would be exposed to client listeners. This 
requires some buffering, but this could be done in each HotRod client.
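
A rough sketch of that client-side buffering, with hypothetical event and transaction-id types, just to show the shape:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Buffers change events per transaction and only hands them to the application
// listener once the corresponding "commit" marker event has been observed.
public class TransactionBuffer<E> {
    private final Map<String, List<E>> pending = new HashMap<>();
    private final Consumer<List<E>> committedListener;

    public TransactionBuffer(Consumer<List<E>> committedListener) {
        this.committedListener = committedListener;
    }

    public synchronized void onEvent(String txId, E event) {
        pending.computeIfAbsent(txId, id -> new ArrayList<>()).add(event);
    }

    public synchronized void onCommit(String txId) {
        List<E> events = pending.remove(txId);
        if (events != null) {
            committedListener.accept(events); // expose only confirmed events
        }
    }

    public synchronized void onRollback(String txId) {
        pending.remove(txId); // never expose aborted changes
    }
}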

> 
> 3) The log will also have to filter out retried operations (based on 
> command ID - though this can be indexed, too). Though, I would prefer to 
> see a per-event command-id log to deal with retries properly.

IIUC, a “commit” event would work here, too.

> 
> 4) Client should pull data, but I would keep push notifications that 
> 'something happened' (throttled on the server). There could be a use case for 
> rarely updated caches, and polling the servers would be excessive there.

IMO the clients should poll, but if the server has nothing to return it blocks 
until there is something or until a timeout occurs. This makes it e
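
That poll-but-block idea might look roughly like this on the client side; the EventClient/Batch types are hypothetical, only there to show the shape of the loop:

import java.time.Duration;
import java.util.List;

// Hypothetical client-side API, only to illustrate the long-poll idea: the call
// blocks until events are available or the timeout elapses, so an idle cache
// costs one held connection rather than a stream of empty polls.
interface EventClient {
    Batch poll(long fromOffset, Duration timeout); // blocks until events or timeout
    void acknowledge(long offset);                 // lets the server trim its log
}

interface Batch {
    boolean isEmpty();
    long lastOffset();
    List<byte[]> events();
}

class PollingLoop {
    static void run(EventClient client, long startOffset) {
        long offset = startOffset;
        while (true) {
            Batch batch = client.poll(offset, Duration.ofSeconds(30));
            if (batch.isEmpty()) {
                continue; // timeout with nothing new: just poll again
            }
            batch.events().forEach(e -> { /* application-specific handling */ });
            offset = batch.lastOffset();
            client.acknowledge(offset);
        }
    }
}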

Re: [infinispan-dev] Infinispan and change data capture

2016-12-08 Thread Randall Hauch
> On Dec 8, 2016, at 3:13 AM, Gustavo Fernandes  wrote:
> 
> On Wed, Dec 7, 2016 at 9:20 PM, Randall Hauch  wrote:
> Reviving this old thread, and as before I appreciate any help the Infinispan 
> community might provide. There definitely is interest in Debezium capturing 
> the changes being made to an Infinispan cluster. This isn’t as important when 
> Infinispan is used as a cache, but when Infinispan is used as a store then it 
> is important for other apps/services to be able to accurately keep up with 
> the changes being made in the store.
> 
>> On Jul 29, 2016, at 8:47 AM, Galder Zamarreño  wrote:
>> 
>> 
>> --
>> Galder Zamarreño
>> Infinispan, Red Hat
>> 
>>> On 11 Jul 2016, at 16:41, Randall Hauch  wrote:
>>> 
>>>> 
>>>> On Jul 11, 2016, at 3:42 AM, Adrian Nistor  wrote:
>>>> 
>>>> Hi Randall,
>>>> 
>>>> Infinispan supports both push and pull access models. The push model is 
>>>> supported by events (and listeners), which are cluster wide and are 
>>>> available in both library and remote mode (hotrod). The notification 
>>>> system is pretty advanced as there is a filtering mechanism available that 
>>>> can use a hand coded filter / converter or one specified in jpql 
>>>> (experimental atm). Getting a snapshot of the initial data is also 
>>>> possible. But infinispan does not produce a transaction log to be used for 
>>>> determining all changes that happened since a previous connection time, so 
>>>> you'll always have to get a new full snapshot when re-connecting. 
>>>> 
>>>> So if Infinispan is the data store I would base the Debezium connector 
>>>> implementation on Infinispan's event notification system. Not sure about 
>>>> the other use case though.
>>>> 
>>> 
>>> Thanks, Adrian, for the feedback. A couple of questions.
>>> 
>>> You mentioned Infinispan has a pull model — is this just using the normal 
>>> API to read the entries?
>>> 
>>> With event listeners, a single connection will receive all of the events 
>>> that occur in the cluster, correct? Is it possible (e.g., a very 
>>> unfortunately timed crash) for a change to be made to the cache without an 
>>> event being produced and sent to listeners?
>> 
>> ^ Yeah, that can happen due to the async nature of remote events. However, 
>> there's the possibility for clients, upon receiving a new topology, to 
>> receive the current state of the server as events, see [1] and [2]
>> 
>> [1] 
>> http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_state_consumption
>> [2] 
>> http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_failure_handling
> 
> It is critical that any change event stream is consistent with the store, and 
> the change event stream is worthless without it. Only when the change event 
> stream is an accurate representation of what changed can downstream consumers 
> use the stream to rebuild their own perfect copy of the upstream store and to 
> keep those copies consistent with the upstream store.
> 
> So, given that the events are handled asynchronously, in a cluster how are 
> multiple changes to a single entry handled? For example, if a client sets 
> entry <K,V1>, then a short time after that (or another) client sets entry 
> <K,V2>, is it guaranteed that a client listening to events will see <K,V1> 
> first and <K,V2> some time later? Or is it possible that a client listening 
> might first see <K,V2> and then <K,V1>?
> 
>> 
>>> What happens if the network fails or partitions? How does cross site 
>>> replication address this?
>> 
>> In terms of cross-site, it depends what the client is connected to. Clients can 
>> now fail over between sites, so they should be able to deal with events too 
>> in the same way as explained above.
>> 
>>> 
>>> Has there been any thought about adding to Infinispan a write ahead log or 
>>> transaction log to each node or, better yet, for the whole cluster?
>> 
>> Not that I'm aware of but we've recently added security audit log, so a 
>> transaction log might make sense too.

Re: [infinispan-dev] Infinispan and change data capture

2016-12-07 Thread Randall Hauch
Reviving this old thread, and as before I appreciate any help the Infinispan 
community might provide. There definitely is interest in Debezium capturing the 
changes being made to an Infinispan cluster. This isn’t as important when 
Infinispan is used as a cache, but when Infinispan is used as a store then it 
is important for other apps/services to be able to accurately keep up with the 
changes being made in the store.

> On Jul 29, 2016, at 8:47 AM, Galder Zamarreño  wrote:
> 
> 
> --
> Galder Zamarreño
> Infinispan, Red Hat
> 
>> On 11 Jul 2016, at 16:41, Randall Hauch  wrote:
>> 
>>> 
>>> On Jul 11, 2016, at 3:42 AM, Adrian Nistor  wrote:
>>> 
>>> Hi Randall,
>>> 
>>> Infinispan supports both push and pull access models. The push model is 
>>> supported by events (and listeners), which are cluster wide and are 
>>> available in both library and remote mode (hotrod). The notification system 
>>> is pretty advanced as there is a filtering mechanism available that can use 
>>> a hand coded filter / converter or one specified in jpql (experimental 
>>> atm). Getting a snapshot of the initial data is also possible. But 
>>> infinispan does not produce a transaction log to be used for determining 
>>> all changes that happened since a previous connection time, so you'll 
>>> always have to get a new full snapshot when re-connecting. 
>>> 
>>> So if Infinispan is the data store I would base the Debezium connector 
>>> implementation on Infinispan's event notification system. Not sure about 
>>> the other use case though.
>>> 
>> 
>> Thanks, Adrian, for the feedback. A couple of questions.
>> 
>> You mentioned Infinispan has a pull model — is this just using the normal 
>> API to read the entries?
>> 
>> With event listeners, a single connection will receive all of the events 
>> that occur in the cluster, correct? Is it possible (e.g., a very 
>> unfortunately timed crash) for a change to be made to the cache without an 
>> event being produced and sent to listeners?
> 
> ^ Yeah, that can happen due to the async nature of remote events. However, 
> there's the possibility for clients, upon receiving a new topology, to 
> receive the current state of the server as events, see [1] and [2]
> 
> [1] 
> http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_state_consumption
> [2] 
> http://infinispan.org/docs/dev/user_guide/user_guide.html#client_event_listener_failure_handling

It is critical that any change event stream is consistent with the store, and 
the change event stream is worthless without it. Only when the change event 
stream is an accurate representation of what changed can downstream consumers 
use the stream to rebuild their own perfect copy of the upstream store and to 
keep those copies consistent with the upstream store.

So, given that the events are handled asynchronously, in a cluster how are 
multiple changes to a single entry handled? For example, if a client sets entry 
<K,V1>, then a short time after that (or another) client sets entry <K,V2>, 
is it guaranteed that a client listening to events will see <K,V1> first and 
<K,V2> some time later? Or is it possible that a client listening might first 
see <K,V2> and then <K,V1>?

> 
>> What happens if the network fails or partitions? How does cross site 
>> replication address this?
> 
> In terms of cross-site, it depends what the client is connected to. Clients can 
> now fail over between sites, so they should be able to deal with events too in 
> the same way as explained above.
> 
>> 
>> Has there been any thought about adding to Infinispan a write ahead log or 
>> transaction log to each node or, better yet, for the whole cluster?
> 
> Not that I'm aware of but we've recently added security audit log, so a 
> transaction log might make sense too.

Without a transaction log, Debezium would have to use a client listener with 
includeCurrentState=true to obtain the state every time it reconnects. If 
Debezium just included all of this state in the event stream, then the stream 
might contain lots of superfluous or unnecessary events, which would impact all 
downstream consumers by forcing them to spend a lot of time processing changes 
that never really happened. So the only way to avoid that would be for Debezium 
to use an external store to track the changes it has seen so far so that it 
doesn’t include unnecessary events in the change event stream. It’d b
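
For reference, a minimal sketch of the kind of Hot Rod client listener being discussed (cache name and key type are assumptions); includeCurrentState = true is what replays the existing entries as events on (re)connect:

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.annotation.ClientCacheEntryCreated;
import org.infinispan.client.hotrod.annotation.ClientCacheEntryModified;
import org.infinispan.client.hotrod.annotation.ClientCacheEntryRemoved;
import org.infinispan.client.hotrod.annotation.ClientListener;
import org.infinispan.client.hotrod.event.ClientCacheEntryCreatedEvent;
import org.infinispan.client.hotrod.event.ClientCacheEntryModifiedEvent;
import org.infinispan.client.hotrod.event.ClientCacheEntryRemovedEvent;

@ClientListener(includeCurrentState = true)
public class ChangeCapturingListener {

    @ClientCacheEntryCreated
    public void created(ClientCacheEntryCreatedEvent<String> e) {
        // by default only the key (and version) is delivered, not the value
        System.out.println("created: " + e.getKey());
    }

    @ClientCacheEntryModified
    public void modified(ClientCacheEntryModifiedEvent<String> e) {
        System.out.println("modified: " + e.getKey());
    }

    @ClientCacheEntryRemoved
    public void removed(ClientCacheEntryRemovedEvent<String> e) {
        System.out.println("removed: " + e.getKey());
    }

    public static void main(String[] args) {
        RemoteCacheManager rcm = new RemoteCacheManager();           // default hotrod-client config
        RemoteCache<String, String> cache = rcm.getCache("mycache"); // assumed cache name
        cache.addClientListener(new ChangeCapturingListener());
    }
}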

Re: [infinispan-dev] Infinispan and change data capture

2016-07-11 Thread Randall Hauch

> On Jul 11, 2016, at 3:42 AM, Adrian Nistor  wrote:
> 
> Hi Randall,
> 
> Infinispan supports both push and pull access models. The push model is 
> supported by events (and listeners), which are cluster wide and are available 
> in both library and remote mode (hotrod). The notification system is pretty 
> advanced as there is a filtering mechanism available that can use a hand 
> coded filter / converter or one specified in jpql (experimental atm). Getting 
> a snapshot of the initial data is also possible. But infinispan does not 
> produce a transaction log to be used for determining all changes that 
> happened since a previous connection time, so you'll always have to get a new 
> full snapshot when re-connecting. 
> 
> So if Infinispan is the data store I would base the Debezium connector 
> implementation on Infinispan's event notification system. Not sure about the 
> other use case though.
> 

Thanks, Adrian, for the feedback. A couple of questions.

You mentioned Infinispan has a pull model — is this just using the normal API 
to read the entries?

With event listeners, a single connection will receive all of the events that 
occur in the cluster, correct? Is it possible (e.g., a very unfortunately timed 
crash) for a change to be made to the cache without an event being produced and 
sent to listeners? What happens if the network fails or partitions? How does 
cross site replication address this?

Has there been any thought about adding to Infinispan a write ahead log or 
transaction log to each node or, better yet, for the whole cluster?

Thanks again!

> Adrian
> 
> On 07/09/2016 04:38 PM, Randall Hauch wrote:
>> The Debezium project [1] is working on building change data capture 
>> connectors for a variety of databases. MySQL is available now, MongoDB will 
>> be soon, and PostgreSQL and Oracle are next on our roadmap. 
>> 
>> One way in which Debezium and Infinispan can be used together is when 
>> Infinispan is being used as a cache for data stored in a database. In this 
>> case, Debezium can capture the changes to the database and produce a stream 
>> of events; a separate process can consume these changes and evict entries 
>> from an Infinispan cache.
>> 
>> If Infinispan is to be used as a data store, then it would be useful for 
>> Debezium to be able to capture those changes so other apps/services can 
>> consume the changes. First of all, does this make sense? Secondly, if it 
>> does, then Debezium would need an Infinispan connector, and it’s not clear 
>> to me how that connector might capture the changes from Infinispan.
>> 
>> Debezium typically monitors the log of transactions/changes that are 
>> committed to a database. Of course how this works varies for each type of 
>> database. For example, MySQL internally produces a transaction log that 
>> contains information about every committed row change, and MySQL ensures 
>> that every committed change is included and that non-committed changes are 
>> excluded. The MySQL mechanism is actually part of the replication mechanism, 
>> so slaves update their internal state by reading the master’s log. The 
>> Debezium MySQL connector [2] simply reads the same log.
>> 
>> Infinispan has several mechanisms that may be useful:
>> 
>> Interceptors - See [3]. This seems pretty straightforward and IIUC provides 
>> access to all internal operations. However, it’s not clear to me whether a 
>> single interceptor will see all the changes in a cluster (perhaps in local 
>> and replicated modes) or only those changes that happen on that particular 
>> node (in distributed mode). It’s also not clear whether this interceptor is 
>> called within the context of the cache’s transaction, so if a failure 
>> happens just at the wrong time whether a change might be made to the cache 
>> but is not seen by the interceptor (or vice versa).
>> Cross-site replication - See [4][5]. A potential advantage of this mechanism 
>> appears to be that it is defined (more) globally, and it appears to function 
>> if the remote backup comes back online after being offline for a period of 
>> time.
>> State transfer - is it possible to participate as a non-active member of the 
>> cluster, and to effectively read all state transfer activities that occur 
>> within the cluster?
>> Cache store - tie into the cache store mechanism, perhaps by wrapping an 
>> existing cache store and sitting between the cache and the cache store
>> Monitor the cache store - don’t monitor Infinispan at all, and instead 
>> monitor the store in which Infinispan is storing entries. (This is probably 
>> the least attractive, sinc

[infinispan-dev] Infinispan and change data capture

2016-07-09 Thread Randall Hauch
The Debezium project [1] is working on building change data capture connectors 
for a variety of databases. MySQL is available now, MongoDB will be soon, and 
PostgreSQL and Oracle are next on our roadmap. 

One way in which Debezium and Infinispan can be used together is when 
Infinispan is being used as a cache for data stored in a database. In this 
case, Debezium can capture the changes to the database and produce a stream of 
events; a separate process can consume these changes and evict entries from an 
Infinispan cache.

If Infinispan is to be used as a data store, then it would be useful for 
Debezium to be able to capture those changes so other apps/services can consume 
the changes. First of all, does this make sense? Secondly, if it does, then 
Debezium would need an Infinispan connector, and it’s not clear to me how that 
connector might capture the changes from Infinispan.

Debezium typically monitors the log of transactions/changes that are committed 
to a database. Of course how this works varies for each type of database. For 
example, MySQL internally produces a transaction log that contains information 
about every committed row change, and MySQL ensures that every committed change 
is included and that non-committed changes are excluded. The MySQL mechanism is 
actually part of the replication mechanism, so slaves update their internal 
state by reading the master’s log. The Debezium MySQL connector [2] simply 
reads the same log.

Infinispan has several mechanisms that may be useful:

Interceptors - See [3]. This seems pretty straightforward and IIUC provides 
access to all internal operations. However, it’s not clear to me whether a 
single interceptor will see all the changes in a cluster (perhaps in local and 
replicated modes) or only those changes that happen on that particular node (in 
distributed mode). It’s also not clear whether this interceptor is called 
within the context of the cache’s transaction, so if a failure happens just at 
the wrong time whether a change might be made to the cache but is not seen by 
the interceptor (or vice versa).
Cross-site replication - See [4][5]. A potential advantage of this mechanism 
appears to be that it is defined (more) globally, and it appears to function if 
the remote backup comes back online after being offline for a period of time.
State transfer - is it possible to participate as a non-active member of the 
cluster, and to effectively read all state transfer activities that occur 
within the cluster?
Cache store - tie into the cache store mechanism, perhaps by wrapping an 
existing cache store and sitting between the cache and the cache store.
Monitor the cache store - don’t monitor Infinispan at all, and instead monitor 
the store in which Infinispan is storing entries. (This is probably the least 
attractive, since some stores can’t be monitored, or because the store is 
persisting an opaque binary value.)

Are there other mechanisms that might be used?

There are a couple of important requirements for change data capture to be able 
to work correctly:

Upon initial connection, the CDC connector must be able to obtain a snapshot of 
all existing data, followed by seeing all changes to data that may have 
occurred since the snapshot was started. If the connector is stopped/fails, 
upon restart it needs to be able to reconnect and either see all changes that 
occurred since it last was capturing changes, or perform a snapshot. 
(Performing a snapshot upon restart is very inefficient and undesirable.) This 
works as follows: the CDC connector only records the “offset” in the source’s 
sequence of events; what this “offset” entails depends on the source. Upon 
restart, the connector can use this offset information to coordinate with the 
source where it wants to start reading. (In MySQL and PostgreSQL, every event 
includes the filename of the log and position in that file. MongoDB includes in 
each event the monotonically increasing timestamp of the transaction.)
No change can be missed, even when things go wrong and components crash.
When a new entry is added, the “after” state of the entity will be included. 
When an entry is updated, the “after” state will be included in the event; if 
possible, the event should also include the “before” state. When an entry is 
removed, the “before” state should be included in the event.
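
As a sketch of how a hypothetical Infinispan connector could carry such an “offset”, using the Kafka Connect types that Debezium connectors build on (the partition/offset field names here are invented):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.source.SourceRecord;

// Kafka Connect persists sourcePartition/sourceOffset alongside each record; on
// restart the connector reads the last stored offset back and resumes from there.
// What goes into the offset map is connector-specific -- these fields are made up.
public class OffsetSketch {
    public static SourceRecord toRecord(String cacheName, String nodeUuid,
                                        long sequence, String key, byte[] payload) {
        Map<String, String> sourcePartition = Collections.singletonMap("cache", cacheName);
        Map<String, Object> sourceOffset = new HashMap<>();
        sourceOffset.put("node", nodeUuid);
        sourceOffset.put("seq", sequence);
        return new SourceRecord(sourcePartition, sourceOffset,
                "infinispan." + cacheName,     // target topic
                Schema.STRING_SCHEMA, key,     // record key
                Schema.BYTES_SCHEMA, payload); // serialized change event
    }
}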

Any thoughts or advice would be greatly appreciated.

Best regards,

Randall


[1] http://debezium.io
[2] http://debezium.io/docs/connectors/mysql/
[3] 
http://infinispan.org/docs/stable/user_guide/user_guide.html#_custom_interceptors_chapter
[4] 
http://infinispan.org/docs/stable/user_guide/user_guide.html#CrossSiteReplication
[5] 
https://github.com/infinispan/infinispan/wiki/Design-For-Cross-Site-Replication

Re: [infinispan-dev] Distributed Counter Discussion

2016-03-15 Thread Randall Hauch

> On Mar 15, 2016, at 2:12 AM, Bela Ban  wrote:
> 
> 
> 
> On 14/03/16 23:17, Randall Hauch wrote:
>> What are the requirements? What are the distributed counters for? Is the
>> counter to be monotonically increasing? Can there be any missed values?
>> Does the counter need to increment and decrement? What is the *smallest*
>> API you need initially?
>> 
>> There are two choices when implementing a distributed counter: use
>> central coordination (like JGroups counters), or use independent
>> counters on separate machines that will eventually converge to the
>> correct value (CRDTs). Coordinated counters are expensive and therefore
>> slow, and can suffer from problems during network or cluster problems.
> 
> The question is what do you get for this? If your app can't afford 
> duplicate counter values during a network partition, then - yes - there 
> is some overhead. CRDTs won't be able to guarantee this property. OTOH 
> CRDTs are fast when you only care about some sort of eventual 
> consistency, and don't need 'hard' consistency.

To be clear, I’m not saying they are interchangeable. They have very different 
properties, which is why the requirements will help determine which of them (if 
any) are applicable.

> 
> 
>> For example, what happens during a split brain? OTOH, CRDTs are
>> decentralized so therefore are very fast, easily merged, and fault
>> tolerant;
> 
> Yes, CRDTs are AP whereas jgroups-raft counters are CP. JGroups 
> counters, otoh, are CRAP (consistent, reliable, available and 
> partition-aware).
> 
> Take the last sentence with a grain of salt :-)
> 
>> they’re excellent when counting things that are occurring
>> independently and therefore may be more suited for
>> monitoring/metrics/accumulators/etc.  Both have very different behaviors
>> under ideal and failure scenarios, have different performance and
>> consistency guarantees, and are useful in different scenarios. Make sure
>> you choose accordingly.
>> 
>> For information about CRDTs, make sure you’ve read the CRDT paper by
>> Shapiro: http://hal.upmc.fr/inria-0088/document
>> 
>> Randall
>> 
>>> On Mar 14, 2016, at 2:14 PM, Pedro Ruivo  wrote:
>>> 
>>> Hi everybody,
>>> 
>>> Discussion about distributed counters.
>>> 
>>> == Public API ==
>>> 
>>> interface Counter
>>> 
>>> String getName() //counter name
>>> long get()//current value. may return stale value due to concurrent
>>> operations to other nodes.
>>> void increment() //async or sync increment. default add(1)
>>> void decrement() //async or sync decrement. default add(-1)
>>> void add(long)   //async or sync add.
>>> void reset() //resets to initial value
>>> 
>>> Note: Tried to make the interface as simple as possible with support for
>>> sync and async operations. To avoid any confusion, I consider an async
>>> operation as happening somewhat in the future, i.e. eventually
>>> increments/decrements.
>>> The sync operation happens somewhat during the method execution.
>>> 
>>> interface AtomicCounter extends Counter
>>> 
>>> long addAndGet()   //adds and returns the new value. sync operation
>>> long incrementAndGet() //increments and returns the new value. sync
>>> operation. default addAndGet(1)
>>> long decrementAndGet() //decrements and returns the new value. sync
>>> operation. default addAndGet(-1)
>>> 
>>> interface AdvancedCounter extends Counter
>>> 
>>> long getMin/MaxThreshold() //returns the min and max threshold value
>>> void add/removeListener()  //adds a listener that is invoked when the
>>> value change. Can be extended to notify when it is "reseted" and when
>>> the threshold is reached.
>>> 
>>> Note: should this interface be split?
>>> 
>>> == Details ==
>>> 
>>> This is what I have in mind. Two counter managers: one based on JGroups
>>> counter and another one based on Infinispan cache.
>>> The first one creates AtomicCounters and it fits perfectly. All
>>> counters are created with an initial value (zero by default).
>>> The second generates counters with all the options available. It can mix
>>> sync/async operation and all counters will be in the same cache. The

Re: [infinispan-dev] Distributed Counter Discussion

2016-03-15 Thread Randall Hauch

> On Mar 15, 2016, at 3:26 AM, Radim Vansa  wrote:
> 
> I second Sanne's opinion about the sync/async API: make the API express 
> the synchronicity directly. I would even propose that for synchronous 
> methods, there's only the CompletableFuture<Void> or 
> CompletableFuture<Long> variant; we are already reaching the concurrency 
> limits of applications that let threads block, so let's give the user a hint 
> that they should use such a notification API. If anyone prefers the sync 
> variant, they can always use get().
> 
> Let's settle on some nomenclature, too, because these JGroups-counters 
> and RAFT-counters don't have commonly known properties. It almost seems 
> that the term *counter* is so overloaded that we shouldn't use it at all.
> 
> There is a widely known j.u.c.AtomicLong, so if we want to implement 
> that (strict JMM-like properties), let's call it AtomicLong. Does not 
> have to follow j.u.c.AtomicLong API, but it's definitely CP.
> 
> Something providing unique values should be called Sequence. This will 
> be probably the one batching ranges (therefore possibly with gaps). By 
> default non-monotonic, but monotonicity could be a ctor arg (in a 
> similar way as fairness is set for j.u.concurrent.* classes).

+1 for differentiating in the API between a counter that a client increments 
(or resets) and a sequence generator that a client can use to obtain 
(non)monotonic values.
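
One possible shape of that split, purely as a sketch (not a proposed final API):

// Sketch only: a value that is mutated and read (AtomicLong-like, CP), versus a
// generator that only ever hands out the next unique value (possibly batched,
// possibly with gaps, monotonic only if configured so).
interface ClusteredLong {
    long get();
    long addAndGet(long delta);
    void reset();
}

interface Sequence {
    long next(); // unique across the cluster
}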

> 
> As for CRDTs, I can't imagine how this could be easily built on top of 
> current 'passive' cache (without any syncing). But as for names *CRDT 
> counter* is explanatory enough.
> 
> If someone needs quotas, let's create Quota according to their needs 
> (soft and hard threshold, fluent operation below soft, some jitter when 
> above that). It seems that this will be closest to Sequence by reserving 
> some ranges. Don't let them shoot themselves in the foot with some liger 
> counter.

+1 for creating a specific interface to enable quotas. Perhaps it is 
implemented with some more general purpose functionality that might be exposed 
in the future, but for now why not keep it narrowly focused.

> 
> And I hope you'll build these on top of the functional API, with at most 
> one RPC per operation.
> 
> Radim
> 
> On 03/14/2016 10:27 PM, Sanne Grinovero wrote:
>> Great starting point!
>> Some comments inline:
>> 
>> On 14 March 2016 at 19:14, Pedro Ruivo  wrote:
>>> Hi everybody,
>>> 
>>> Discussion about distributed counters.
>>> 
>>> == Public API ==
>>> 
>>> interface Counter
>> As a user, how do I get a Counter instance? From the CacheContainer 
>> interface?
>> 
>> Will they have their own configuration section in the configuration file?
>> 
>>> String getName() //counter name
>>> long get()   //current value. may return stale value due to concurrent
>>> operations to other nodes.
>> This is what puzzles me the most. I'm not sure if the feature is
>> actually useful, unless we can clearly state how far outdated the
>> value could be.
>> 
>> I think a slightly more formal definition would be in order. For
>> example I think it would be acceptable to say that this will return a
>> value from the range of values the primary owner of this counter was
>> holding in the timeframe between the time the method is invoked and the
>> time the value is returned.
>> 
>> Could it optionally be integrated with Total Order ? Transactions?
>> 
>>> void increment() //async or sync increment. default add(1)
>>> void decrement() //async or sync decrement. default add(-1)
>>> void add(long)   //async or sync add.
>>> void reset() //resets to initial value
>>> 
>>> Note: Tried to make the interface as simple as possible with support for
>>> sync and async operations. To avoid any confusion, I consider an async
>>> operation as happening somewhat in the future, i.e. eventually
>>> increments/decrements.
>>> The sync operation happens somewhat during the method execution.
>>> 
>>> interface AtomicCounter extends Counter
>>> 
>>> long addAndGet()   //adds and returns the new value. sync operation
>>> long incrementAndGet() //increments and returns the new value. sync
>>> operation. default addAndGet(1)
>>> long decrementAndGet() //decrements and returns the new value. sync
>>> operation. default addAndGet(-1)
>>> 
>>> interface AdvancedCounter extends Counter
>>> 
>>> long getMin/MaxThreshold() //returns the min and max threshold value
>> "threshold" ??
>> 
>>> void add/removeListener()  //adds a listener that is invoked when the
>>> value change. Can be extended to notify when it is "reseted" and when
>>> the threshold is reached.
>>> 
>>> Note: should this interface be split?
>> I'd prefer a single interface, with reduced redundancy.
>> For example, is there really a benefit in having a "void increment()"
>> and also a "long addAndGet()" ? [Besides The fact that only the first
>> one can benefit of an async option]
>> 
>> Besides, I am no longer sure that it's a good thing that methods in
>> Infinispan can be async vs sync depending on conf

Re: [infinispan-dev] Distributed Counter Discussion

2016-03-14 Thread Randall Hauch
What are the requirements? What are the distributed counters for? Is the 
counter to be monotonically increasing? Can there be any missed values? Does 
the counter need to increment and decrement? What is the *smallest* API you 
need initially?

There are two choices when implementing a distributed counter: use central 
coordination (like JGroups counters), or use independent counters on separate 
machines that will eventually converge to the correct value (CRDTs). 
Coordinated counters are expensive and therefore slow, and can run into trouble 
during network or cluster problems. For example, what happens during a 
split brain? OTOH, CRDTs are decentralized and therefore very fast, easily 
merged, and fault tolerant; they’re excellent when counting things that are 
occurring independently and therefore may be more suited for 
monitoring/metrics/accumulators/etc.  Both have very different behaviors under 
ideal and failure scenarios, have different performance and consistency 
guarantees, and are useful in different scenarios. Make sure you choose 
accordingly.

For information about CRDTs, make sure you’ve read the CRDT paper by Shapiro: 
http://hal.upmc.fr/inria-0088/document 
 

Randall

> On Mar 14, 2016, at 2:14 PM, Pedro Ruivo  wrote:
> 
> Hi everybody,
> 
> Discussion about distributed counters.
> 
> == Public API ==
> 
> interface Counter
> 
> String getName() //counter name
> long get() //current value. may return stale value due to concurrent 
> operations to other nodes.
> void increment() //async or sync increment. default add(1)
> void decrement() //async or sync decrement. default add(-1)
> void add(long)   //async or sync add.
> void reset() //resets to initial value
> 
> Note: Tried to make the interface as simple as possible with support for 
> sync and async operations. To avoid any confusion, I consider an async 
> operation as happening somewhat in the future, i.e. eventually 
> increments/decrements.
> The sync operation happens somewhat during the method execution.
> 
> interface AtomicCounter extends Counter
> 
> long addAndGet()   //adds and returns the new value. sync operation
> long incrementAndGet() //increments and returns the new value. sync 
> operation. default addAndGet(1)
> long decrementAndGet() //decrements and returns the new value. sync 
> operation. default addAndGet(-1)
> 
> interface AdvancedCounter extends Counter
> 
> long getMin/MaxThreshold() //returns the min and max threshold value
> void add/removeListener()  //adds a listener that is invoked when the 
> value changes. Can be extended to notify when it is "reset" and when 
> the threshold is reached.
> 
> Note: should this interface be split?
> 
> == Details ==
> 
> This is what I have in mind. Two counter managers: one based on JGroups 
> counter and another one based on Infinispan cache.
> The first one creates AtomicCounters and it fits perfectly. All 
> counters are created with an initial value (zero by default).
> The second generates counters with all the options available. It can mix 
> sync/async operations and all counters will be in the same cache. The 
> cache will be configured by us and it would be an internal cache. This 
> will use all the features available in the cache.
> 
> Configuration-wise, I'm thinking about 2 parameters: number of backups 
> and timeout (for sync operations).
> 
> So, comment below and let me know alternatives, improvements, or if I 
> missed something.
> 
> ps. I also considered implementing a counter based on JGroups-raft but I 
> believe it is overkill.
> ps2. sorry for the long email :( I tried to be as short as possible.
> 
> Cheers,
> Pedro
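
For reference, the JGroups-backed manager described above would presumably sit on top of the existing JGroups CounterService; a minimal standalone sketch (the cluster name is an assumption, and the channel's protocol stack must include the COUNTER protocol):

import org.jgroups.JChannel;
import org.jgroups.blocks.atomic.Counter;
import org.jgroups.blocks.atomic.CounterService;

public class JGroupsCounterExample {
    public static void main(String[] args) throws Exception {
        JChannel channel = new JChannel();                 // stack must contain COUNTER
        CounterService counters = new CounterService(channel);
        channel.connect("counter-demo");                   // assumed cluster name
        Counter hits = counters.getOrCreateCounter("hits", 0);
        System.out.println("hits = " + hits.incrementAndGet()); // cluster-wide increment
        channel.close();
    }
}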


Re: [infinispan-dev] Atomic counters / sequences over Hot Rod

2015-12-18 Thread Randall Hauch
CRDTs, especially PNCounters, could be very valuable here to ensure eventual 
consistency of the counters. They don’t require total ordering of operations to 
be maintained, so this reduces the need for coordination and works better when 
stuff goes wrong. Sending requests more than once is still a problem, but no 
more so than with normal atomic counters.
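
For anyone unfamiliar with PN-counters, a toy sketch of the idea: each replica only ever grows its own slots, and replicas converge by taking per-node maxima, so no total order of operations is needed:

import java.util.HashMap;
import java.util.Map;

// Toy PN-counter: one grow-only count of increments and one of decrements per node.
// value() = sum(increments) - sum(decrements); merge() takes the per-node maximum,
// which is commutative, associative and idempotent, so replicas converge.
public class PNCounter {
    private final String nodeId;
    private final Map<String, Long> increments = new HashMap<>();
    private final Map<String, Long> decrements = new HashMap<>();

    public PNCounter(String nodeId) {
        this.nodeId = nodeId;
    }

    public void increment() { increments.merge(nodeId, 1L, Long::sum); }
    public void decrement() { decrements.merge(nodeId, 1L, Long::sum); }

    public long value() {
        return increments.values().stream().mapToLong(Long::longValue).sum()
             - decrements.values().stream().mapToLong(Long::longValue).sum();
    }

    public void merge(PNCounter other) {
        other.increments.forEach((n, v) -> increments.merge(n, v, Math::max));
        other.decrements.forEach((n, v) -> decrements.merge(n, v, Math::max));
    }
}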

> On Dec 18, 2015, at 4:24 AM, Sanne Grinovero  wrote:
> 
> Hi all,
> I'm well aware that we don't have support for counters generally
> speaking, but for Infinispan in embedded mode I could so far use some
> clumsy workarounds: inefficient solutions but at least I could get it
> to work.
> 
> I'm not as expert in using the remote client though. Could someone
> volunteer to draft me some possible solution?
> 
> I would still hope the team will attack the issue of having efficient
> atomic counters too, but at this very moment I'd be happy with a
> temporary, low performance workaround.
> Server side scripting maybe?
> 
> Thanks,
> Sanne



Re: [infinispan-dev] stabilize test suite

2014-06-13 Thread Randall Hauch
This may not be related at all, but it might save some time. One problem 
ModeShape has run into is that Arquillian seems to only detect a Wildfly server 
when the test is run with IPv4.

On Jun 13, 2014, at 10:15 AM, Tristan Tarrant  wrote:

> I'm on https://issues.jboss.org/browse/ISPN-4403
> 
> And then I'll look into why arquillian doesn't detect the hotrod server
> 
> Tristan
> 
> On 13/06/14 16:45, William Burns wrote:
>> I am currently looking at https://issues.jboss.org/browse/ISPN-4389
>> which involves the StateTransferSuppressFor* tests.
>> 
>> Also after that I was planning on getting to
>> https://issues.jboss.org/browse/ISPN-4384 which is a test that fails
>> spuriously CacheNotifierInitialTransferDistTest
>> 
>> On Fri, Jun 13, 2014 at 9:52 AM, Mircea Markus  wrote:
>>> Hi,
>>> 
>>> The test suite is all over the place, so please STOP integrating into master till the 
>>> suite gets green.
>>> Also please start looking into the failures: in order not to overlap, reply 
>>> to this email with what you're looking at.
>>> 
>>> Cheers,
>>> --
>>> Mircea Markus
>>> Infinispan lead (www.infinispan.org)
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 




Re: [infinispan-dev] Query.getResultSize() to be available on the simplified DSL?

2014-03-11 Thread Randall Hauch
Maybe a Long rather than an Integer? Ints are so last year. :-)

And, what about using a primitive that returns -1 when the method cannot 
determine the size (if allowed by the parameter)? It is just as easy to check for -1 as 
it is to check for null, IMO.
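
The sentinel-based contract being suggested would look something like this (hypothetical sketch, not the actual Query API):

interface SizedQuery {
    /**
     * @return the total number of results ignoring pagination, or -1 when the
     *         implementation cannot (or will not) compute the size cheaply
     */
    long getResultSize();
}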

On Mar 11, 2014, at 2:21 PM, Emmanuel Bernard  wrote:

> It does not work, I think, because if you implement your query via some map 
> reduce and you do pagination, it will be costly to compute the size and you 
> might not want to return it.
> Hence my Accuracy idea to clarify the intend to the API user.
> 
> On 11 Mar 2014, at 19:18, Sanne Grinovero  wrote:
> 
>> what about we call it
>> 
>> int getEstimatedResultSize() ?
>> 
>> Having such a method occasionally return null looks very bad to me,
>> I'd rather remove the functionality.
>> 
>> -- Sanne
>> 
>> On 11 March 2014 19:08, Emmanuel Bernard  wrote:
>>> I agree with Randall.
>>> 
>>> I tend to be very conservative about my public APIs. And offering an API 
>>> that I think will block me in the future is something I tend to avoid.
>>> 
>>> Something like .guessNbrOfMatchingElements() / .guessResultSize() would 
>>> provide a better clue about the gamble the user takes. Note that the size 
>>> is irrespective of the pagination applied which renders this result quite 
>>> cool even if approximate.
>>> 
>>> I’d be tempted not to put getResultSize() with an exact value in the public 
>>> contract as iterating is probably going to be as “fast”.
>>> 
>>> An alternative is something like that (needs to be refined though)
>>> 
>>> /**
>>> * Get the result size.
>>> * Approximate results are to be preferred as it is usually very cheap to 
>>> compute.
>>> * If the computation is too expensive, the approximate accuracy returns 
>>> null.
>>> *
>>> * Exact results are likely to be costly and require two queries.
>>> */
>>> Integer getResultSize(Accuracy);
>>> enum Accuracy { EXACT, APPROXIMATE_OR_NULL }
>>> 
>>> Emmanuel
>>> 
>>> On 11 Mar 2014, at 18:23, Randall Hauch  wrote:
>>> 
>>>> I disagree. Most developers have access to the JavaDoc, and if even 
>>>> moderately well-written, they will find out what the method returns and 
>>>> when. It’s no different than a method sometimes returning null rather than 
>>>> an object reference.
>>>> 
>>>> On Mar 11, 2014, at 12:16 PM, Dennis Reed  wrote:
>>>> 
>>>>> Providing methods that work sometimes and don't work other times is
>>>>> generally a bad idea.
>>>>> 
>>>>> No matter how much you document it, users *will* try to use it and
>>>>> expect it to always work
>>>>> (either because they didn't read the docs that say otherwise, or because they
>>>>> think they'll stick to a configuration where it does work, etc.)
>>>>> 
>>>>> And then when it doesn't work (because they pushed something to
>>>>> production which has a different configuration than dev, etc)
>>>>> it's a frustrating experience.
>>>>> 
>>>>> -Dennis
>>>>> 
>>>>> On 03/11/2014 09:37 AM, Randall Hauch wrote:
>>>>>> I’m struggling with this same question in ModeShape. The JCR API exposes 
>>>>>> a method that returns the number of results, but at least the spec 
>>>>>> allows the implementation to return -1 if the size is not known (or very 
>>>>>> expensive to compute). Yet this still does not satisfy all cases.
>>>>>> 
>>>>>> Depending upon the technology, computing the **exact size** ranges from 
>>>>>> very cheap to extremely expensive to calculate. For example, consider a 
>>>>>> system that has to take into account access control limitations of the 
>>>>>> user. My current opinion is that few applications actually need an exact 
>>>>>> size, and if they do there may be alternatives (like counting as they 
>>>>>> iterate over the results).
>>>>>> 
>>>>>> An alternative is to expose an **approximate size**, which is likely to 
>>>>>> be sufficient for generating display or other pre-computed information 
>>>>>> such as links or paging details. I think that this is sufficient for 
>>>>>> most needs, and that even an order of magnitude is sufficient

Re: [infinispan-dev] Query.getResultSize() to be available on the simplified DSL?

2014-03-11 Thread Randall Hauch
I disagree. Most developers have access to the JavaDoc, and if even moderately 
well-written, they will find out what the method returns and when. It’s no 
different than a method sometimes returning null rather than an object 
reference.

On Mar 11, 2014, at 12:16 PM, Dennis Reed  wrote:

> Providing methods that work sometimes and don't work other times is 
> generally a bad idea.
> 
> No matter how much you document it, users *will* try to use it and 
> expect it to always work
> (either because they didn't read the docs that say otherwise, or because they 
> think they'll stick to a configuration where it does work, etc.)
> 
> And then when it doesn't work (because they pushed something to 
> production which has a different configuration than dev, etc)
> it's a frustrating experience.
> 
> -Dennis
> 
> On 03/11/2014 09:37 AM, Randall Hauch wrote:
>> I’m struggling with this same question in ModeShape. The JCR API exposes a 
>> method that returns the number of results, but at least the spec allows the 
>> implementation to return -1 if the size is not known (or very expensive to 
>> compute). Yet this still does not satisfy all cases.
>> 
>> Depending upon the technology, computing the **exact size** ranges from very 
>> cheap to extremely expensive to calculate. For example, consider a system 
>> that has to take into account access control limitations of the user. My 
>> current opinion is that few applications actually need an exact size, and if 
>> they do there may be alternatives (like counting as they iterate over the 
>> results).
>> 
>> An alternative is to expose an **approximate size**, which is likely to be 
>> sufficient for generating display or other pre-computed information such as 
>> links or paging details. I think that this is sufficient for most needs, and 
>> that even an order of magnitude is sufficient. When the results are known to 
>> be small, the system might want to determine the exact size (e.g., by 
>> iterating).
>> 
>> So one option is to expose both methods, but allow the exact size method to 
>> return -1 if the system can’t determine the size or if doing so is very 
>> expensive. This allows the system a way out for large/complex queries and 
>> flexibility in the implementation technology. The approximate size method 
>> probably always needs to return at least some usable value.
>> 
>> BTW, computing an exact size by iterating can be expensive unless you can 
>> keep all the results in memory. That’s not ideal - a query with large 
>> results could fill up available memory. If you don’t keep all results in 
>> memory, then if you’re going to allow clients to access the results more 
>> than once you have to provide a way to buffer the results.
>> 
>> 
>> On Mar 10, 2014, at 7:23 AM, Sanne Grinovero  wrote:
>> 
>>> Hi all,
>>> we are exposing a nice feature inherited from the Search engine via
>>> the "simple" DSL version, the one which is also available via Hot Rod:
>>> 
>>> org.infinispan.query.dsl.Query.getResultSize()
>>> 
>>> To be fair I hadn't noticed we do expose this, I just noticed after a
>>> recent PR review and I found it surprising.
>>> 
>>> This method returns the size of the full resultset, disregarding
>>> pagination options; you can imagine it fit for situations like:
>>> 
>>>   "found 6 million matches, these are the top 20: "
>>> 
>>> A peculiarity of Hibernate Search is that the total number of matches
>>> is extremely cheap to figure out as it's generally a side effect of
>>> finding the 20 results. Essentially we're just exposing an int value
>>> which was already computed: very cheap, and happens to be useful in
>>> practice.
>>> 
>>> This is not the case with a SQL statement, in this case you'd have to
>>> craft 2 different SQL statements, often incurring the cost of 2 round
>>> trips to the database. So this getResultSize() is not available on the
>>> Hibernate ORM Query, only on our FullTextQuery extension.
>>> 
>>> Now my doubt is if it is indeed a wise move to expose this method on
>>> the simplified DSL. Of course some people might find it useful, still
>>> I'm wondering how much we'll be swearing at needing to maintain this
>>> feature vs its usefulness when we'll implement alternative execution
>>> engines to run queries, not least on Map/Reduce based filtering, and
>>> ultimately hybrid strategies.
>>> 
>>> In case of Map/

Re: [infinispan-dev] Query.getResultSize() to be available on the simplified DSL?

2014-03-11 Thread Randall Hauch
I’m struggling with this same question in ModeShape. The JCR API exposes a 
method that returns the number of results, but at least the spec allows the 
implementation to return -1 if the size is not known (or very expensive to 
compute). Yet this still does not satisfy all cases.

Depending upon the technology, computing the **exact size** ranges from very 
cheap to extremely expensive to calculate. For example, consider a system that 
has to take into account access control limitations of the user. My current 
opinion is that few applications actually need an exact size, and if they do 
there may be alternatives (like counting as they iterate over the results).

An alternative is to expose an **approximate size**, which is likely to be 
sufficient for generating display or other pre-computed information such as 
links or paging details. I think that this is sufficient for most needs, and 
that even an order of magnitude is sufficient. When the results are known to be 
small, the system might want to determine the exact size (e.g., by iterating).

So one option is to expose both methods, but allow the exact size method to 
return -1 if the system can’t determine the size or if doing so is very 
expensive. This allows the system a way out for large/complex queries and 
flexibility in the implementation technology. The approximate size method 
probably always needs to return at least some usable value.
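
In interface form, the pair of methods I keep coming back to looks roughly like 
this (method names made up purely for illustration):

// Sketch only -- names are illustrative, not an API proposal.
public interface ResultSizing {

   /** Always returns a usable estimate; it may only be accurate to an order of magnitude. */
   long getApproximateResultSize();

   /** The exact count, or -1 if it is unknown or too expensive to compute. */
   long getExactResultSize();
}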

BTW, computing an exact size by iterating can be expensive unless you can keep 
all the results in memory. That’s not ideal - a query with large results could 
fill up available memory. If you don’t keep all results in memory, then if 
you’re going to allow clients to access the results more than once you have to 
provide a way to buffer the results.


On Mar 10, 2014, at 7:23 AM, Sanne Grinovero  wrote:

> Hi all,
> we are exposing a nice feature inherited from the Search engine via
> the "simple" DSL version, the one which is also available via Hot Rod:
> 
> org.infinispan.query.dsl.Query.getResultSize()
> 
> To be fair I hadn't noticed we do expose this, I just noticed after a
> recent PR review and I found it surprising.
> 
> This method returns the size of the full resultset, disregarding
> pagination options; you can imagine it fit for situations like:
> 
>   "found 6 million matches, these are the top 20: "
> 
> A peculiarity of Hibernate Search is that the total number of matches
> is extremely cheap to figure out as it's generally a side effect of
> finding the 20 results. Essentially we're just exposing an int value
> which was already computed: very cheap, and happens to be useful in
> practice.
> 
> This is not the case with a SQL statement, in this case you'd have to
> craft 2 different SQL statements, often incurring the cost of 2 round
> trips to the database. So this getResultSize() is not available on the
> Hibernate ORM Query, only on our FullTextQuery extension.
> 
> Now my doubt is if it is indeed a wise move to expose this method on
> the simplified DSL. Of course some people might find it useful, still
> I'm wondering how much we'll be swearing at needing to maintain this
> feature vs its usefulness when we'll implement alternative execution
> engines to run queries, not least on Map/Reduce based filtering, and
> ultimately hybrid strategies.
> 
> In case of Map/Reduce I think we'll need to keep track of possible
> de-duplication of results, in case of a Teiid integration it might
> need a second expensive query; so in this case I'd expect this method
> to be lazily evaluated.
> 
> Should we rather remove this functionality?
> 
> Sanne
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev




Re: [infinispan-dev] Design change in Infinispan Query

2014-02-05 Thread Randall Hauch

On Feb 5, 2014, at 1:34 PM, Emmanuel Bernard  wrote:

> What I am saying is that the idea of search across caches, as
> appealing as it is, is not the whole story.
> 
> People search, read, navigate and M/R their data in interleaved ways.
> You need to project and think about a 100-200 lines of code that would
> use that feature in combination with other related features to see if
> that will be useful in the end (or gimmicky) and if the user experience
> (API mostly in our case) will be good or make people kill themselves.
> 

What is the plan for supporting joins across entity types?

Re: [infinispan-dev] help with Infinispan OSGi

2013-12-06 Thread Randall Hauch
I’d love to see this work proceed for Infinispan, since we want to do the same 
thing for ModeShape, which uses (but does not hide or encapsulate) Infinispan.


On Dec 6, 2013, at 10:56 AM, Brett Meyer  wrote:

> Sorry, forgot the link:
> 
> [1] https://hibernate.atlassian.net/browse/HHH-8214
> 
> Brett Meyer
> Software Engineer
> Red Hat, Hibernate ORM
> 
> - Original Message -
> From: "Brett Meyer" 
> To: "Randall Hauch" , "infinispan -Dev List" 
> 
> Cc: "Pete Muir" , "Steve Jacobs" 
> Sent: Friday, December 6, 2013 11:51:33 AM
> Subject: Re: [infinispan-dev] help with Infinispan OSGi
> 
> Randall, that is *definitely* the case and is certainly true for Hibernate.  
> The work involved:
> 
> * correctly resolving ClassLoaders based on the activated bundles
> * supporting multiple containers and contexts (container-managed JPA, 
> un-managed JPA/native, etc.)
> * fully supporting OSGi/Blueprint services (both for internal services as 
> well as externally-registered)
> * bundle scanning
> * generally working towards supporting the dynamic nature
> * full unit-tests with Arquillian and an OSGi container
> 
> It's a matter of holistically supporting the "OSGi way" (for better or 
> worse), as opposed to simply ensuring the library's manifest is correct.
> 
> There were a bloody ton of gotchas and caveats I hit along the way.  That's 
> more along the lines of where I might be able to help.
> 
> I'm even more interested in this effort so that we can support 
> hibernate-infinispan 2nd level caching within ORM.  On the first attempt, I 
> hit  ClassLoader issues [1].  Some of that may already be resolved.
> 
> The next step may simply be giving hibernate-infinispan another shot and 
> correcting things as I find them.  In parallel, feel free to let me know if 
> there's anything else!  ORM supports lots of OSGi-enabled extension points, 
> etc. that are powerful for users, but obviously I don't have the Infinispan 
> knowledge to know what would be necessary.
> 
> Thanks!
> 
> Brett Meyer
> Software Engineer
> Red Hat, Hibernate ORM
> 
> - Original Message -
> From: "Randall Hauch" 
> To: "infinispan -Dev List" 
> Cc: "Pete Muir" , "Brett Meyer" 
> Sent: Friday, December 6, 2013 10:57:23 AM
> Subject: Re: [infinispan-dev] help with Infinispan OSGi
> 
> Brett, correct me if I’m wrong, but isn’t there a difference in making some 
> library *work* in an OSGi environment and making that library *naturally fit 
> well* in an OSGi-enabled application? For example, making the JARs be OSGi 
> bundles is easy and technically makes it possible to deploy a JAR into an 
> OSGi env, but that’s not where the payoff is. IIUC what you really want is a 
> BundleActivator or Declarative Services [1] so that the library’s components 
> are readily available in a naturally-OSGi way.
> 
> [1] 
> http://blog.knowhowlab.org/2010/10/osgi-tutorial-4-ways-to-activate-code.html
> 
> On Dec 6, 2013, at 7:30 AM, Mircea Markus  wrote:
> 
>> + infinispan-dev
>> 
>> Thanks for offering to look into this Brett!
>> We're already producing OSGi bundles for our modules, but these are not 
>> tested extensively so if you'd review them and test them a bit would be 
>> great!
>> Tristan can get you up to speed with this.
>> 
>> 
>>>> Sanne/Galder/Pete,
>>>> 
>>>> Random question: what's the current state of making Infinispan OSGi 
>>>> friendly?  I'm definitely interested in helping, if it's still a need.  
>>>> This past year, I went through the exercise of making Hibernate work well 
>>>> in OSGi, so all of the challenges (read: *many* of them) are still fairly 
>>>> fresh in my mind.  Plus, I'd love for hibernate-infinispan to work in OSGi.
>>>> 
>>>> If you're up for it, fill me in?  I'm happy to pull everything down and 
>>>> start working with it.
>>>> 
>>>> Brett Meyer
>>>> Software Engineer
>>>> Red Hat, Hibernate ORM
>>>> 
>>> 
>> 
>> Cheers,
>> -- 
>> Mircea Markus
>> Infinispan lead (www.infinispan.org)
>> 
>> 
>> 
>> 
>> 
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 




Re: [infinispan-dev] help with Infinispan OSGi

2013-12-06 Thread Randall Hauch
Brett, correct me if I’m wrong, but isn’t there a difference in making some 
library *work* in an OSGi environment and making that library *naturally fit 
well* in an OSGi-enabled application? For example, making the JARs be OSGi 
bundles is easy and technically makes it possible to deploy a JAR into an OSGi 
env, but that’s not where the payoff is. IIUC what you really want is a 
BundleActivator or Declarative Services [1] so that the library’s components 
are readily available in a naturally-OSGi way.

[1] 
http://blog.knowhowlab.org/2010/10/osgi-tutorial-4-ways-to-activate-code.html
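
To make that concrete, the kind of thing I mean by "naturally-OSGi" is roughly this 
(an untested sketch; the configuration and service properties would obviously need 
more thought):

import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.osgi.framework.ServiceRegistration;

// Sketch of a BundleActivator that exposes a cache manager as an OSGi service,
// so other bundles can simply look it up instead of instantiating Infinispan themselves.
public class InfinispanActivator implements BundleActivator {

   private EmbeddedCacheManager cacheManager;
   private ServiceRegistration<EmbeddedCacheManager> registration;

   public void start(BundleContext context) throws Exception {
      cacheManager = new DefaultCacheManager();  // default configuration, just for illustration
      registration = context.registerService(EmbeddedCacheManager.class, cacheManager, null);
   }

   public void stop(BundleContext context) throws Exception {
      registration.unregister();
      cacheManager.stop();
   }
}

Declarative Services would accomplish much the same thing without the activator 
boilerplate, which is presumably what the "4 ways to activate code" link above 
gets into.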

On Dec 6, 2013, at 7:30 AM, Mircea Markus  wrote:

> + infinispan-dev
> 
> Thanks for offering to look into this Brett!
> We're already producing OSGi bundles for our modules, but these are not 
> tested extensively so if you'd review them and test them a bit would be great!
> Tristan can get you up to speed with this.
> 
> 
>>> Sanne/Galder/Pete,
>>> 
>>> Random question: what's the current state of making Infinispan OSGi 
>>> friendly?  I'm definitely interested in helping, if it's still a need.  
>>> This past year, I went through the exercise of making Hibernate work well 
>>> in OSGi, so all of the challenges (read: *many* of them) are still fairly fresh 
>>> in my mind.  Plus, I'd love for hibernate-infinispan to work in OSGi.
>>> 
>>> If you're up for it, fill me in?  I'm happy to pull everything down and 
>>> start working with it.
>>> 
>>> Brett Meyer
>>> Software Engineer
>>> Red Hat, Hibernate ORM
>>> 
>> 
> 
> Cheers,
> -- 
> Mircea Markus
> Infinispan lead (www.infinispan.org)
> 
> 
> 
> 
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev




Re: [infinispan-dev] [Cloudtm-discussion] Transactional Distributed B+Tree over ISPN

2013-10-10 Thread Randall Hauch
Also, it looks like https://github.com/cloudtm/sti-bt is licensed under 
LGPL2.1. Would you mind asking if they'd consider relicensing under ASL2.0 so 
that it's more compatible with the relicensed ISPN?

On Oct 10, 2013, at 5:25 PM, Randall Hauch  wrote:

> "scalable, distributed transactional index (B+tree) over ISPN"
> 
> This is **REALLY** interesting. Mark, can you send me the paper if it can't 
> be attached to the mailing list?
> 
> On Oct 3, 2013, at 12:38 PM, Mark Little  wrote:
> 
>> FYI I presented on the current state of cloud-TM at HPTS a week or so ago 
>> and there was much interest. I pointed people at our website.
>> 
>> Sent from my iPad
>> 
>> On 3 Oct 2013, at 18:16, Paolo Romano  wrote:
>> 
>>> Hi all,
>>> 
>>> even though the Cloud-TM project is officially over, we thought we would share 
>>> with you one of our last efforts, which unfortunately was a tad too late to make 
>>> it into the submitted version of the platform (and deliverables etc). 
>>> 
>>> This is a scalable, distributed transactional index (B+tree) over ISPN, 
>>> which combines a number of optimizations (in areas like data locality, 
>>> concurrency, load balancing/elastic scaling) and builds over previous work 
>>> (in particular, GMU [1] and Bumper [2]) that made it possible to achieve 
>>> linear scalability up to 100 VMs even in update intensive workloads.
>>> 
>>> Hot features:
>>> - at most 1 remote data access for each index operation, thanks to:
>>>  i) transaction migration, 
>>> ii) combined use of full and partial replication (transparent and 
>>> self-tuning depending on cluster size), 
>>> iii) optimized data placement via custom hash functions
>>> - almost total avoidance of data contention thanks to the exploitation of 
>>> the commutativity of operations on the index (via dirty reads and delayed actions)
>>> - it's built directly on top of ISPN (it does not depend on Fenix, unlike 
>>> the collections' implementations that were used, e.g., by GeoGraph).
>>> 
>>> Details in the attached paper!
>>> 
>>> We believe that this index implementation could be something generally 
>>> useful for the ISPN community, especially given all the recent efforts in 
>>> the areas of query. On the other hand, we should point out that the current 
>>> implementation [3]:
>>> i) depends on transactional features (transaction migration, dirty reads, 
>>> delayed actions) that have not been integrated in the official version of 
>>> ISPN;
>>> ii) has been for the moment implemented as a Radargun extension, i.e. no 
>>> effort was spent to modularize it/polish its API.
>>> 
>>> ...so it would take some effort to have it fully integrated in the master 
>>> version of ISPN  but you know the saying: no pain no gain ;-)
>>> 
>>> We'd love to hear your feedback of course!
>>> 
>>> Nuno & Paolo
>>> 
>>> [1] Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, and 
>>> Luis Rodrigues,When Scalability Meets Consistency: Genuine Multiversion 
>>> Update Serializable Partial Data Replication, 32nd International Conference 
>>> on Distributed Computing Systems (ICDCS 2012)
>>> 
>>> [2] Nuno Diegues and Paolo Romano,Bumper: Sheltering Transactions from 
>>> Conflicts,The 32th IEEE Symposium on Reliable Distributed Systems (SRDS 
>>> 2013), Braga, Portugal, Oct. 2013
>>> 
>>> [3] https://github.com/cloudtm/sti-bt
>>> 
>>> --
>>> October Webinars: Code for Performance
>>> Free Intel webinars can help you accelerate application performance.
>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most 
>>> from 
>>> the latest Intel processors and coprocessors. See abstracts and register >
>>> http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
>>> ___
>>> Cloudtm-discussion mailing list
>>> cloudtm-discuss...@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/cloudtm-discussion
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] [Cloudtm-discussion] Transactional Distributed B+Tree over ISPN

2013-10-10 Thread Randall Hauch
"scalable, distributed transactional index (B+tree) over ISPN"

This is **REALLY** interesting. Mark, can you send me the paper if it can't be 
attached to the mailing list?

On Oct 3, 2013, at 12:38 PM, Mark Little  wrote:

> FYI I presented on the current state of cloud-TM at HPTS a week or so ago and 
> there was much interest. I pointed people at our website.
> 
> Sent from my iPad
> 
> On 3 Oct 2013, at 18:16, Paolo Romano  wrote:
> 
>> Hi all,
>> 
>> even though the Cloud-TM project is officially over, we thought we would share 
>> with you one of our last efforts, which unfortunately was a tad too late to make 
>> it into the submitted version of the platform (and deliverables etc). 
>> 
>> This is a scalable, distributed transactional index (B+tree) over ISPN, 
>> which combines a number of optimizations (in areas like data locality, 
>> concurrency, load balancing/elastic scaling) and builds over previous work 
>> (in particular, GMU [1] and Bumper [2]) that made it possible to achieve 
>> linear scalability up to 100 VMs even in update intensive workloads.
>> 
>> Hot features:
>> - at most 1 remote data access for each index operation, thanks to:
>>  i) transaction migration, 
>> ii) combined use of full and partial replication (transparent and 
>> self-tuning depending on cluster size), 
>> iii) optimized data placement via custom hash functions
>> - almost total avoidance of data contention thanks to the exploitation of 
>> the commutativity of operations on the index (via dirty reads and delayed actions)
>> - it's built directly on top of ISPN (it does not depend on Fenix, unlike 
>> the collections' implementations that were used, e.g., by GeoGraph).
>> 
>> Details in the attached paper!
>> 
>> We believe that this index implementation could be something generally 
>> useful for the ISPN community, especially given all the recent efforts in 
>> the areas of query. On the other hand, we should point out that the current 
>> implementation [3]:
>> i) depends on transactional features (transaction migration, dirty reads, 
>> delayed actions) that have not been integrated in the official version of 
>> ISPN;
>> ii) has been for the moment implemented as a Radargun extension, i.e. no 
>> effort was spent to modularize it/polish its API.
>> 
>> ...so it would take some effort to have it fully integrated in the master 
>> version of ISPN  but you know the saying: no pain no gain ;-)
>> 
>> We'd love to hear your feedback of course!
>> 
>> Nuno & Paolo
>> 
>> [1] Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, and 
>> Luis Rodrigues,When Scalability Meets Consistency: Genuine Multiversion 
>> Update Serializable Partial Data Replication, 32nd International Conference 
>> on Distributed Computing Systems (ICDCS 2012)
>> 
>> [2] Nuno Diegues and Paolo Romano,Bumper: Sheltering Transactions from 
>> Conflicts,The 32th IEEE Symposium on Reliable Distributed Systems (SRDS 
>> 2013), Braga, Portugal, Oct. 2013
>> 
>> [3] https://github.com/cloudtm/sti-bt
>> 
>> --
>> October Webinars: Code for Performance
>> Free Intel webinars can help you accelerate application performance.
>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most 
>> from 
>> the latest Intel processors and coprocessors. See abstracts and register >
>> http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
>> ___
>> Cloudtm-discussion mailing list
>> cloudtm-discuss...@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/cloudtm-discussion
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] replacing the (FineGrained)AtomicMap with grouping

2013-09-20 Thread Randall Hauch
IMO, the primary benefit of the FGAM is that you can aggregate your entries 
into a single entry that is a real aggregate: read the map and you get all the 
FGAM's entries in one fell swoop. IIUC, your proposal loses this capability for 
a single read of all aggregate parts. Is that right?
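
To make the comparison concrete, this is the single-read behavior I'm talking about 
(sketch using the AtomicMapLookup helper behind the current API, if I have the 
signature right):

import java.util.Map;
import org.infinispan.Cache;
import org.infinispan.atomic.AtomicMapLookup;

// One call returns every entry stored under the aggregate's key; this is the
// behavior the proposed Cache.getGroup(...) would need to match.
public class FgamReadSketch {
   static Map<String, Object> readAggregate(Cache<String, Object> cache, String aggregateKey) {
      Map<String, Object> parts = AtomicMapLookup.getFineGrainedAtomicMap(cache, aggregateKey);
      return parts;
   }
}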

On Sep 20, 2013, at 11:49 AM, Mircea Markus  wrote:

> Hi,
> 
> Most of the FGAM functionality can be achieved with grouping, by using the 
> FGAM key as a grouping key.
> The single bit that seems to be missing from grouping to match the 
> functionality of FGAM is obtaining all the entries under a single group. IOW 
> a method like:
> 
> Map groupedKeys = Cache.getGroup(groupingKey, KeyFilter);
> 
> This can be relatively easily implemented with the same performance as an 
> AtomicMap lookup.
> 
> Some other differences worth mentioning:
> - the cache would contain more entries in the grouping API approach. Not sure 
> if this is really a problem though.
> - in order to assure REPEATABLE_READ, the AM (including values) is brought onto 
> the node that reads it (doesn't apply to FGAM). Not nice.
> - people won't be able to lock an entire group (the equivalent of locking a 
> AM key). I don't think this is a critical requirement, and also can be worked 
> around. Or added as a built in function if needed.
> 
> I find the idea of dropping FGAM and only using grouping very tempting:
> - there is logic duplication between Grouping and (FG)AM (the locality, fine 
> grained locking) that would be removed
> - FGAM and AM semantic is a bit ambiguous in corner cases
> - having a Cache.getGroup does make sense in a general case
> - reduce the code base
> 
> What do people think? 
> 
> Cheers,
> -- 
> Mircea Markus
> Infinispan lead (www.infinispan.org)
> 
> 
> 
> 
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev




Re: [infinispan-dev] stack trace in my logs

2013-09-18 Thread Randall Hauch

On Sep 18, 2013, at 7:52 AM, Galder Zamarreño  wrote:

> 
> On Sep 12, 2013, at 7:21 PM, Kurt T Stam  wrote:
> 
>> Hi guys,
>> 
>> I'm running on OSX with ModeShape 3.2.0.Final which depends on ISPN 
>> version 5.1.6.FINAL
>> and sometime I see this stack in my log. Everything seems to be working 
>> ok anyway though.
>> 
>> http://pastebin.test.redhat.com/163600
>> 
>> Is this something you guys have seen before?
> 
> This is something coming from Infinispan's backported classes from JDK8. This 
> is code that comes from [1].
> 
> Since Infinispan 5.2.6 we've moved to more recent versions of these JDK8 
> backported classes. I'd suggest trying with a newer Infinsipan release 
> (5.3.0.Final) or pre-release (6.0.0.Alpha4).


Kurt, ModeShape 3.5.0.Final has moved to Infinispan 5.2.6.Final (what is in EAP 
6.1.0), and ModeShape 3.6 will use Infinispan 5.2.7.Final (what is in EAP 
6.1.1). If you can upgrade, please do.


Re: [infinispan-dev] where is the user forum?

2013-09-12 Thread Randall Hauch
https://community.jboss.org/en/infinispan

This is the new URL pattern for project forums on JBoss.org.

On Sep 12, 2013, at 12:18 PM, Kurt T Stam  wrote:

> Your link on http://www.jboss.org/infinispan/forums 
> (http://www.jboss.org/index.html?module=bb&op=viewforum&f=309)
>  seems to go nowhere..
> 
> --Kurt
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev



Re: [infinispan-dev] Cache store for MapDB (what used to be JDBM)

2013-09-12 Thread Randall Hauch
Cool! I hope to try it out soon.

On Sep 12, 2013, at 3:10 PM, Ray Tsang  wrote:

> I readapted the offheap store to mapdb store.  
> https://github.com/saturnism/infinispan-cachestore-mapdb
> 
> check it out!
> 
> 
> On Wed, Sep 4, 2013 at 7:56 AM, Randall Hauch  wrote:
> Yes, but the API is completely different. Ray's is actually pretty close, 
> except that it assumes that MapDB should use only the direct memory option 
> (there are quite a few others). A combination of both would probably work 
> really well.
> 
> On Sep 4, 2013, at 4:49 AM, Manik Surtani  wrote:
> 
>> MapDB is a progression on JDBM, right?  
>> 
>> https://github.com/infinispan/infinispan-cachestore-jdbm
>> 
>> 
>> On 3 Sep 2013, at 16:55, Ray Tsang  wrote:
>> 
>>> Good timing - I just started an off-heap store for fun that uses mapdb 
>>> direct memory db.
>>> https://github.com/saturnism/infinispan-cachestore-offheap
>>> 
>>> Please take a look!
>>> 
>>> 
>>> On Tue, Sep 3, 2013 at 6:13 AM, Randall Hauch  wrote:
>>> Has anyone looked at writing a cache store that uses MapDB? It provides 
>>> Maps, Sets and Queues backed by disk storage or off-heap memory, with MVCC 
>>> and (non-JTA) transactions. The author previously wrote JDBM (multiple 
>>> versions), and has recently ventured out on his own to focus on MapDB 
>>> full-time. It's only at 0.9.5, but progressing quite nicely. I've been 
>>> looking at it for other uses, and quite enjoy it.
>>> 
>>> http://mapdb.org
>>> ___
>>> infinispan-dev mailing list
>>> infinispan-dev@lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> 
>>> ___
>>> infinispan-dev mailing list
>>> infinispan-dev@lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> --
>> Manik Surtani
>> 
>> 
>> 
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Cache store for MapDB (what used to be JDBM)

2013-09-04 Thread Randall Hauch
Yes, but the API is completely different. Ray's is actually pretty close, 
except that it assumes that MapDB should use only the direct memory option 
(there are quite a few others). A combination of both would probably work 
really well.

On Sep 4, 2013, at 4:49 AM, Manik Surtani  wrote:

> MapDB is a progression on JDBM, right?  
> 
> https://github.com/infinispan/infinispan-cachestore-jdbm
> 
> 
> On 3 Sep 2013, at 16:55, Ray Tsang  wrote:
> 
>> Good timing - I just started an off-heap store for fun that uses mapdb 
>> direct memory db.
>> https://github.com/saturnism/infinispan-cachestore-offheap
>> 
>> Please take a look!
>> 
>> 
>> On Tue, Sep 3, 2013 at 6:13 AM, Randall Hauch  wrote:
>> Has anyone looked at writing a cache store that uses MapDB? It provides 
>> Maps, Sets and Queues backed by disk storage or off-heap memory, with MVCC 
>> and (non-JTA) transactions. The author previously wrote JDBM (multiple 
>> versions), and has recently ventured out on his own to focus on MapDB 
>> full-time. It's only at 0.9.5, but progressing quite nicely. I've been 
>> looking at it for other uses, and quite enjoy it.
>> 
>> http://mapdb.org
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> --
> Manik Surtani
> 
> 
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Cache store for MapDB (what used to be JDBM)

2013-09-03 Thread Randall Hauch
Looks good. Perhaps I'm missing something, but I don't see a way to persist to 
local disk with your cache store. Is that right? I see the advantage of having 
an off-heap memory-based cache store, but your code is actually not too far 
away from a cache store that persists to a (memory-mapped) file. I'd love to 
see how that compares to a cache store based upon LevelDB. 
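
For reference, if I'm reading the MapDB 0.9 API right, switching the underlying 
database from direct memory to a file is only a few lines (rough sketch; the file 
name is made up):

import java.io.File;
import java.util.concurrent.ConcurrentNavigableMap;
import org.mapdb.DB;
import org.mapdb.DBMaker;

// Sketch: a file-backed MapDB database. The store's existing load/store/remove
// logic could stay the same and simply use this map instead of the direct-memory one.
public class MapDbFileSketch {
   public static void main(String[] args) {
      DB db = DBMaker.newFileDB(new File("mapdb-cache-store.db"))
                     .closeOnJvmShutdown()
                     .make();
      ConcurrentNavigableMap<String, byte[]> entries = db.getTreeMap("entries");
      entries.put("someKey", new byte[] { 1, 2, 3 });
      db.commit();   // writes go through MapDB's transaction log by default
      db.close();
   }
}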

On Sep 3, 2013, at 10:55 AM, Ray Tsang  wrote:

> Good timing - I just started an off-heap store for fun that uses mapdb direct 
> memory db.
> https://github.com/saturnism/infinispan-cachestore-offheap
> 
> Please take a look!
> 
> 
> On Tue, Sep 3, 2013 at 6:13 AM, Randall Hauch  wrote:
> Has anyone looked at writing a cache store that uses MapDB? It provides Maps, 
> Sets and Queues backed by disk storage or off-heap memory, with MVCC and 
> (non-JTA) transactions. The author previously wrote JDBM (multiple versions), 
> and has recently ventured out on his own to focus on MapDB full-time. It's 
> only at 0.9.5, but progressing quite nicely. I've been looking at it for 
> other uses, and quite enjoy it.
> 
> http://mapdb.org
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


[infinispan-dev] Cache store for MapDB (what used to be JDBM)

2013-09-03 Thread Randall Hauch
Has anyone looked at writing a cache store that uses MapDB? It provides Maps, 
Sets and Queues backed by disk storage or off-heap memory, with MVCC and 
(non-JTA) transactions. The author previously wrote JDBM (multiple versions), 
and has recently ventured out on his own to focus on MapDB full-time. It's only 
at 0.9.5, but progressing quite nicely. I've been looking at it for other uses, 
and quite enjoy it.

http://mapdb.org


Re: [infinispan-dev] Integrating Karsten's FCS

2013-07-11 Thread Randall Hauch
On Jul 11, 2013, at 4:53 AM, Mircea Markus  wrote:
> On 11 Jul 2013, at 10:22, Galder Zamarreño  wrote:
>>> I'd remove it in Final, it cannot be seriously used either. Users can 
>>> migrate the data using rolling upgrades.
>> 
>> ^ I don't think you can remove it in 6.0 Final. If you wanna do any kind of 
>> migration or rolling upgrade from the curren FCS to the new one, you need to 
>> have the current FCS around, to be able to read data in that format/setup 
>> and then write it to the new cache store. The moment you can remove it is 
>> the moment you've decided that no more migrations are gonna happen.
> 
> You can have the logic around but don't expose it to the users as a shipped 
> cache store. E.g. move it to a different package and rename it to whatever 
> you want.

Let's look at this from a user's perspective. Is it possible that the new file 
store **replaces** the old one, so that the user doesn't (really) need to know 
the difference? 

How similar are the configuration properties of each? Might it be possible to:
- rename and deprecate the existing one (e.g., 
"org.infinispan.loaders.file.LegacyFileCacheStore"), changing it to 'package' 
or 'protected' (iff the new one is stable enough that the old one should 
definitely not be used anymore), 
- rename the new one to "org.infinispan.loaders.file.FileCacheStore" (so that 
existing configurations work without change), and
- have some code in the new one detect the old file structure and automatically 
run a migration to convert from the old file structure to the new format (and 
possibly move the old files into a subdirectory)?

If this is possible, this is by far the best option for users since it provides 
a simple (perhaps even transparent) migration path with few if any changes 
required to configurations, but at the same time it allows the developers to 
get what they want: a new FCS that replaces the old one (which goes away).
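
To be clear about what I mean by "detect and migrate", the skeleton could be as 
simple as this (entirely hypothetical; none of these names exist today):

import java.io.File;

// Hypothetical sketch of the startup hook in the new FileCacheStore.
public class FileCacheStoreMigrationSketch {

   /** Called once when the new store starts against an existing directory. */
   public static void migrateIfNeeded(File location) {
      if (!looksLikeLegacyLayout(location)) {
         return;  // empty directory, or already in the new format
      }
      // 1. read every entry using the (now package-private) legacy store code
      // 2. write each entry through the new store's own write path
      // 3. move the old files into a subdirectory such as "pre-migration-backup"
   }

   private static boolean looksLikeLegacyLayout(File location) {
      // Placeholder check: a non-empty directory that does not yet contain the new
      // store's data file ("new-format.dat" is a made-up name).
      File[] files = location.listFiles();
      boolean nonEmpty = files != null && files.length > 0;
      return nonEmpty && !new File(location, "new-format.dat").exists();
   }
}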


Re: [infinispan-dev] Integrating Karsten's FCS

2013-07-09 Thread Randall Hauch
ModeShape does have a backup/restore feature that dumps the whole contents of a 
repository to a file system, which can later be read back in to populate an 
empty installation. I'm just surprised that Infinispan doesn't provide, or want 
to provide, something like this out-of-the-box.

Also, is it just me, or does KFCS remind me of fried chicken? (Get it, KFCS … 
KFC Store … fried chicken. :-)


On Jul 9, 2013, at 8:44 AM, Sanne Grinovero  wrote:

> Hi Randall,
> I would agree with you that this should be a priority, but keep in
> mind that just migrating data from a CacheStore to another won't be
> enough: as I pointed out in my previous mail, binary encoding also
> changed, making it impossible to deserialize the values.
> 
> I'm not sure if the encoding change was meant to happen, but
> apparently there is currently no effort in place to test for this kind
> of backward compatibility.
> 
> If you need such a thing for ModeShape it would likely be easier for
> you to provide such a tool in ModeShape to extract all data and dump
> it to some external file, than to provide the hooks in Infinispan as
> the transcoding tool would need to depend on multiple Infinispan
> versions.
> 
> Sanne
> 
> 
> On 9 July 2013 14:37, Randall Hauch  wrote:
>> 
>> On Jul 9, 2013, at 4:07 AM, Radim Vansa  wrote:
>> 
>>> - Original Message -
>>> | From: "Galder Zamarreño" 
>>> | Shall we keep the current FCS implementation, deprecate it, and get rid 
>>> of it
>>> | in the next minor/major version? Some users might have data stored in the
>>> | current FCS and would be quite abrupt to just get rid of it right now.
>>> 
>>> I'd remove it in Final, it cannot be seriously used either. Users can 
>>> migrate the data using rolling upgrades.
>> 
>> Can you describe this process, especially for how it can be accomplished 
>> with a single (local) cache?
>> 
>> A migration mechanism is absolutely a must. There are ModeShape users that 
>> have used the FCS simply because it suits their needs - primarily because 
>> there are no extra dependencies and no additional "system" underneath. There 
>> HAS to be a migration path, especially if the old FCS is going to be removed.
>> 
>> Why does Infinispan not have a general-purpose mechanism for converting from 
>> one (offline) cache store to another (offline) cache store?
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev




Re: [infinispan-dev] Cachestores performance

2013-07-09 Thread Randall Hauch

On Jul 8, 2013, at 3:03 PM, Mircea Markus  wrote:

> 
> On 2 Jul 2013, at 19:39, Erik Salter  wrote:
> 
>> I concur with part of the below, but with a few changes:
>> 
>> - The cache is the primary storage, similar to Sanne's case. (DIST mode)
>> - My customers are not interested in extra components to the system, like 
>> databases or Cassandra nodes.  They wonder why they can't simply use the 
>> existing file system on the nodes they have.
> +1 for a no-dep cache store impl :-)

+1000





Re: [infinispan-dev] Integrating Karsten's FCS

2013-07-09 Thread Randall Hauch

On Jul 9, 2013, at 4:07 AM, Radim Vansa  wrote:

> - Original Message -
> | From: "Galder Zamarreño" 
> | Shall we keep the current FCS implementation, deprecate it, and get rid of 
> it
> | in the next minor/major version? Some users might have data stored in the
> | current FCS and would be quite abrupt to just get rid of it right now.
> 
> I'd remove it in Final, it cannot be seriously used either. Users can migrate 
> the data using rolling upgrades.

Can you describe this process, especially for how it can be accomplished with a 
single (local) cache?

A migration mechanism is absolutely a must. There are ModeShape users that have 
used the FCS simply because it suits their needs - primarily because there are 
no extra dependencies and no additional "system" underneath. There HAS to be a 
migration path, especially if the old FCS is going to be removed.

Why does Infinispan not have a general-purpose mechanism for converting from 
one (offline) cache store to another (offline) cache store?
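
Conceptually it seems like it would boil down to something like this (a sketch 
against the 5.x CacheStore SPI, if I remember it correctly; it naively loads 
everything at once, which is exactly the kind of detail a real tool would need 
to handle, and lifecycle/start/stop is omitted):

import org.infinispan.container.entries.InternalCacheEntry;
import org.infinispan.loaders.CacheLoaderException;
import org.infinispan.loaders.CacheStore;

// Sketch of an offline store-to-store copy: read every entry from one configured
// store and write it to another.
public class CacheStoreConverterSketch {
   public static void copy(CacheStore source, CacheStore target) throws CacheLoaderException {
      for (InternalCacheEntry entry : source.loadAll()) {
         target.store(entry);
      }
   }
}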


Re: [infinispan-dev] mongodb cache store added in Infinispan 5.3 - curtesy of Guillaume Scheibel

2013-06-03 Thread Randall Hauch
Did you see my earlier email on the use of Flapdoodle's embedded MongoDB 
library? It's a great way for a test case to start up (and then shut down) an 
embedded MongoDB instance.

http://lists.jboss.org/pipermail/infinispan-dev/2013-May/012920.html

On Jun 3, 2013, at 1:53 PM, Guillaume SCHEIBEL  
wrote:

> I'm investigating, but quick question: is there a MongoDB instance running 
> somewhere to which the CI test runner sends all the MongoDB queries (and so 
> on)?
> 
> Guillaume
> 
> 
> 2013/6/3 Dan Berindei 
> There's a link 'Login as guest' at the bottom of the login form at 
> ci.infinispan.org, you can use that and you'll see all the builds. You can 
> also create a user for yourself.
> 
> 
> On Mon, Jun 3, 2013 at 6:01 PM, Guillaume SCHEIBEL 
>  wrote:
> I don't think I have access to the CI platform. If I do, what kind of 
> credentials do I have to use?
> 
> Guillaume
> 
> 
> 2013/6/3 Galder Zamarreño 
> Btw, seems like since the introduction of this, 21 failures have gone in 
> coming from the MongoDBCacheStoreTest [1].
> 
> Guillaume, can you please look into this?
> 
> Cheers,
> 
> [1] 
> http://ci.infinispan.org/viewLog.html?buildId=1650&tab=buildResultsDiv&buildTypeId=bt2
> 
> On Jun 3, 2013, at 4:02 PM, Guillaume SCHEIBEL  
> wrote:
> 
> > Thanks guys, I'm glad you like it :)
> > BTW, if you would like to have other implementation, ... (I like doing this 
> > translation work)
> >
> > Guillaume
> >
> >
> > 2013/6/3 Galder Zamarreño 
> > Indeed great stuff Guillaume!! :)
> >
> > On May 31, 2013, at 2:18 PM, Sanne Grinovero  wrote:
> >
> > > Looks great, thanks Guillaume!
> > >
> > > On 31 May 2013 10:27, Guillaume SCHEIBEL  
> > > wrote:
> > >> I have updated the main page but I haven't found how to delete the child
> > >> page.
> > >> The blog post has been updated you can publish whenever you want to.
> > >>
> > >> Guillaume
> > >>
> > >>
> > >> 2013/5/30 Mircea Markus 
> > >>>
> > >>> Also better place the content in the parent document directly instead of
> > >>> making it a separate child document (see the rest of cache stores):
> > >>> https://docs.jboss.org/author/display/ISPN/Cache+Loaders+and+Stores
> > >>>
> > >>> Thanks!
> > >>>
> > >>> On 30 May 2013, at 21:49, Mircea Markus  wrote:
> > >>>
> > >>>> Thanks, looks good!
> > >>>> I think it would be nice to add a code snippet with configuring the
> > >>>> store using the fluent API as well.
> > >>>> And release the blog :-)
> > >>>>
> > >>>> On 30 May 2013, at 17:05, Guillaume SCHEIBEL
> > >>>>  wrote:
> > >>>>
> > >>>>> Hello,
> > >>>>>
> > >>>>> I have wrote a small documentation page here:
> > >>>>> https://docs.jboss.org/author/display/ISPN/MongoDB+CacheStore
> > >>>>>
> > >>>>> WDYT ?
> > >>>>>
> > >>>>> about the blog post a gist (script tag) is already there but it's not
> > >>>>> visible in the editor view.
> > >>>>>
> > >>>>>
> > >>>>> Guillaume
> > >>>>>
> > >>>>>
> > >>>>> 2013/5/27 Mircea Markus 
> > >>>>>
> > >>>>> On 22 May 2013, at 11:01, Guillaume SCHEIBEL
> > >>>>>  wrote:
> > >>>>>
> > >>>>>> Mircea,
> > >>>>>>
> > >>>>>> I have just created a quick blog post titled "Using MongoDB as a 
> > >>>>>> cache
> > >>>>>> store". May you have a look at it ? If it suits you, the 
> > >>>>>> documentation will
> > >>>>>> follow soon.
> > >>>>>
> > >>>>> Thank you Guillaume. Looks good to me once the link to documentation
> > >>>>> and the sample configuration snippet is in.
> > >>>>>
> > >>>>>>
> > >>>>>> Cheers
> > >>>>>> Guillaume
> > >>>>>>
> > >>>>>>
> > >>>>>> 2013/5/22 Randall Hauch 
&g

Re: [infinispan-dev] mongodb cache store added in Infinispan 5.3 - curtesy of Guillaume Scheibel

2013-05-21 Thread Randall Hauch
There is a way to download (via Maven) and run MongoDB locally from within 
Java, via Flapdoodle's Embedded MongoDB:

https://github.com/flapdoodle-oss/embedmongo.flapdoodle.de

ModeShape uses this in our builds in support of our storage of binary values 
inside MongoDB. The relevant Maven POM parts and JUnit test case are:


https://github.com/ModeShape/modeshape/blob/master/modeshape-jcr/pom.xml#L147

https://github.com/ModeShape/modeshape/blob/master/modeshape-jcr/src/test/java/org/modeshape/jcr/value/binary/MongodbBinaryStoreTest.java
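
Roughly, using that library looks like this (written from memory of the 1.x 
Flapdoodle API, so the exact class and constructor names may be slightly off):

import com.mongodb.MongoClient;
import de.flapdoodle.embed.mongo.MongodExecutable;
import de.flapdoodle.embed.mongo.MongodProcess;
import de.flapdoodle.embed.mongo.MongodStarter;
import de.flapdoodle.embed.mongo.config.MongodConfig;
import de.flapdoodle.embed.mongo.distribution.Version;
import de.flapdoodle.embed.process.runtime.Network;

// Downloads (once, then caches) and starts a throwaway mongod for the duration of the test.
public class EmbeddedMongoSketch {
   public static void main(String[] args) throws Exception {
      MongodStarter starter = MongodStarter.getDefaultInstance();
      int port = Network.getFreeServerPort();
      MongodExecutable executable =
            starter.prepare(new MongodConfig(Version.Main.PRODUCTION, port, Network.localhostIsIPv6()));
      MongodProcess mongod = executable.start();
      try {
         MongoClient client = new MongoClient("localhost", port);
         // ... point the MongoDB cache store tests at 'client' / 'port' ...
         client.close();
      } finally {
         mongod.stop();
         executable.stop();
      }
   }
}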



On May 21, 2013, at 1:04 PM, Mircea Markus  wrote:

> Thanks to Guillaume Scheibel, Infinispan now has a mongodb cache store that 
> will be shipped as part of 5.3.0.CR1.
> 
> The tests for the mongodb cache store are not run by default. In order to be 
> able to run them you need to:
> - install mongodb locally
> - run the "mongodb" profile
> The cache store was added to the CI build on all 5.3 configs (together with a 
> running instance of mongodb).
> 
> Guillaume, would you mind adding a blog entry describing this new 
> functionality? (I've invited you to be a member of the 
> infinispan.blogspot.com team.)
> Also can you please update the user doc: 
> https://docs.jboss.org/author/display/ISPN/Cache+Loaders+and+Stores
> 
> 
> Cheers,
> -- 
> Mircea Markus
> Infinispan lead (www.infinispan.org)
> 
> 
> 
> 
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev




Re: [infinispan-dev] data interoperability and remote querying

2013-04-10 Thread Randall Hauch
Although I think generally the indexing functionality should be transparent to 
clients, ModeShape does need more control over how the indexable information is 
extracted from the cached values.

Therefore, it would be great if there were a way for clients to specify the 
actual "metadata" representation (perhaps another POJO) that could be processed 
as discussed earlier.

The simple reason why ModeShape needs something like this is that the value 
objects that ModeShape puts into the Infinispan cache are DeltaAware objects 
that each wrap a single JSON/BSON document, and there's no POJO with 
annotations that Hibernate Search can directly understand. Also, the fields 
within the JSON/BSON documents contain namespaced values, and ModeShape's 
namespace registry can change at any time, so any "bridge" object created by 
Infinispan would need a reference to the ModeShape repository instance.
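
Concretely, what I'd end up writing is something along the lines of a Hibernate 
Search FieldBridge. A very rough sketch (pretending the cached value exposes its 
fields as a plain Map, which it does not; the real bridge would unwrap the 
DeltaAware document and resolve names against the live namespace registry):

import java.util.Map;
import org.apache.lucene.document.Document;
import org.hibernate.search.bridge.FieldBridge;
import org.hibernate.search.bridge.LuceneOptions;

// Rough sketch only: index each field of a JSON/BSON-like value under a prefixed name.
public class JsonDocumentBridge implements FieldBridge {
   @Override
   @SuppressWarnings("unchecked")
   public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
      Map<String, String> fields = (Map<String, String>) value;
      for (Map.Entry<String, String> field : fields.entrySet()) {
         // In the real case the key is a namespaced name that must be resolved
         // against ModeShape's (mutable) namespace registry at index time.
         luceneOptions.addFieldToDocument(name + "." + field.getKey(), field.getValue(), document);
      }
   }
}

That last point is why a static, annotation-style mapping isn't enough for us: the 
bridge needs a reference back to the running repository.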



On Apr 10, 2013, at 2:57 PM, Sanne Grinovero  wrote:

> Weird, when I wrote my previous reply there were no other answers, and
> the rest of the thread appeared to me just now.
> 
> Good to see that Emmanuel had replied highlighting the same problems..
> we can continue from there on this topic,
> just read mine to understand that there are a lot of options that need
> to be defined for each field: specifying it's a "varchar" is not
> enough.
> 
> some more thoughts inline:
> 
> On 10 April 2013 19:46, Emmanuel Bernard  wrote:
>> On Wed 2013-04-10 18:55, Manik Surtani wrote:
>>> 
>>> On 10 Apr 2013, at 18:18, Emmanuel Bernard  wrote:
>>> 
 I favor the first options for a few reasons:
 
 - much easier client side implementations
 Frankly rewriting the analyzer logic of Lucene in every language is
 not a piece of cake and you are out of luck for custom analysers
>>> 
>>> I'm not suggesting all the analyser logic.  Just the extraction of indexed 
>>> fields into name/value pairs, to be sent alongside the blob value.
>> 
>> Which means you make a selection already and possibly already reduce
>> your precision for a given field. Which makes reindexing impossible.
> 
> +1
> It also adds larger payloads, and complexity and overhead to the
> clients, while the user might not be able to scale the client compute
> capability as it can with the data grid.
> 
>> 
>>> 
 - more robust client implementation: if we change how indexing is done
 clients don't have to change
 - reindexing: if there is a need to rebuild the index, or if the user
 decides to reindex data differently, you must be able to read the data
 on the server side
 - validation: if you want to implement (cross entry) validation, the
 server needs to be able to read the data.
 - async, validation and indexing can be done in an async way on the
 server and avoid perceived latency from a client request to the
 result
>>> 
>>> Valid points above though.
>>> 
 I'm not sure JSON should be the format though. As you said it's quite
 verbose and string is not exactly the most efficient way to process
 data.
>>> 
>>> What would that format be, then?
>> 
>> Good question :) BSON is not necessarily smaller than JSON; it is meant
>> to be more parseable, AFAIR. I did use Avro in Hibernate Search as I find
>> Protobuf and the others too rigid for my needs to pass arbitrary
>> datasets. But if we have a schema and expect a given object type, then
>> we can start saving a lot of space.
>> In other words, no idea; that needs to be investigated.
> 
> Right, let's keep this to collecting requirements:
> - being able to upgrade the server without losing data
> - being able to change the (soft) schema on the server
> - read/write fields from different languages
> - deal with multi-version control of values (i.e. being able to read
> an older value through an evolved schema, doing comparisons of the same
> value even if it was stored using different schema generations)
> 
> Sanne
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev




Re: [infinispan-dev] data interoperability and remote querying

2013-04-10 Thread Randall Hauch

On Apr 10, 2013, at 1:46 PM, Emmanuel Bernard  wrote:
>>> I'm not sure JSON should be the format though. As you said it's quite
>>> verbose and string is not exactly the most efficient way to process
>>> data.
>> 
>> What would that format be, then?
> 
> Good question :) BSON is not necessarily smaller than JSON; it is meant
> to be more parseable, AFAIR. I did use Avro in Hibernate Search as I find
> Protobuf and the others too rigid for my needs to pass arbitrary
> datasets. But if we have a schema and expect a given object type, then
> we can start saving a lot of space.

Actually, I would suspect that the JSON compresses to a much smaller size than 
the BSON. The advantage of BSON, however, is the additional types that are 
supported, including binary, timestamps, etc.


[infinispan-dev] Infinispan and JPA entities

2013-02-14 Thread Randall Hauch
I've been wondering about this particular use case for a while:

A client application simply uses get, put, and query for objects stored in 
Infinispan, where the objects really are mapped to a real schema in a 
relational database. If the objects were JPA-like entities, the database 
mapping could be defined via a subset of the JPA annotations. Essentially, 
Infinispan becomes a key-value store on top of a traditional database with a 
domain-specific schema. Add some JAXB annotations, and it quickly becomes 
possible to expose these entities via a simple RESTful service. A new cache 
store implementation could persist the entities to JDBC using the annotations.
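
For example, a value class might look like this (made-up class and columns, just 
plain JPA plus JAXB annotations):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
import javax.xml.bind.annotation.XmlRootElement;

// Made-up example: cached in Infinispan by key, persisted to a "CUSTOMERS" table by a
// JPA-aware cache store, and exposable over REST via JAXB -- all from one set of annotations.
@Entity
@Table(name = "CUSTOMERS")
@XmlRootElement(name = "customer")
public class Customer {

   @Id
   @Column(name = "CUSTOMER_ID")
   private Long id;

   @Column(name = "FULL_NAME")
   private String name;

   public Long getId() { return id; }
   public void setId(Long id) { this.id = id; }

   public String getName() { return name; }
   public void setName(String name) { this.name = name; }
}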

This may seem a bit odd at first. Why not just use JPA directly? IMO, for a 
certain class of applications, this scenario is architecturally easier to 
understand. Plus, if you put this on top of Teiid's ability to create a virtual 
database (with a virtual schema that matches what you want the objects to be), 
then you could put these new entities on top of an existing database with a 
schema that doesn't necessarily mirror the entity structure. 

Is this crazy? Is there a better way of achieving this?

Randall

Re: [infinispan-dev] Missing javadoc for indexing annotations

2013-01-29 Thread Randall Hauch
IIRC, aggregating JavaDocs like that makes a mess of the build system, because 
everything will be built and the tests run twice. See 
https://issues.jboss.org/browse/MODE-1104 for background (including the 
pull-request). It's probably not easy to figure out how it works, but I could 
explain it fairly quickly if you cared.

Isn't it much easier to simply have the Infinispan JavaDoc be able to link to 
the Hibernate Search JavaDoc site? It's not ideal, but it's far simpler. 
Perhaps the Infinispan Query JavaDocs could link back to the Hibernate Search 
JavaDocs? (Yes, there would be a chicken-and-egg problem, but the links in the 
first would simply not be available until the second were built.)

This is how ModeShape JavaDocs are built to link to several other JavaDoc sites:

https://github.com/ModeShape/modeshape/blob/master/modeshape-distribution/pom.xml#L427

On Jan 29, 2013, at 5:12 PM, Sanne Grinovero  wrote:

> The javadoc of Infinispan does not have references to the Hibernate
> Search API; this is quite uncomfortable for Infinispan Query users: I
> think we should at least import the annotations and the DSL docs.
> 
> http://docs.jboss.org/infinispan/5.2/apidocs/
> http://docs.jboss.org/hibernate/search/4.2/api/
> 
> Would you agree in merging the docs from hibernate-search-engine ?
> 
> Also, I was looking at how this could be done [1]. I have been playing
> with the maven-javadoc-plugin configuration section under the
> distribution profile, but it seems to be ignored.
> 
> From the Infinispan source code root I'm running
>     mvn javadoc:aggregate -Pdistribution
> 
> This does produce javadoc in /target/site/apidocs but seems to 
> ignore any configuration options. 
> What am I missing?
> 
> Cheers,
> Sanne
> 
> 1 - 
> http://maven.apache.org/plugins/maven-javadoc-plugin/examples/aggregate-dependency-sources.html
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] ApplyDeltaCommand: question about affected keys

2013-01-24 Thread Randall Hauch
Actually, it is both, depending upon the situation.

When a new process starts and joins the cluster, it receives its entries via 
state transfer as externalized whole values, not deltas. However, from that 
point forward, I believe that all changes to a DeltaAware entry are sent to the 
other processes in the cluster as deltas.

This is fairly obvious if you think about it.
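
For concreteness, here is a minimal sketch of a value that implements this 
contract. It assumes the 5.x-era org.infinispan.atomic.DeltaAware / Delta 
interfaces (delta() and commit() on the value, merge() on the delta); the 
CounterMap type is purely illustrative, not code from Infinispan:

    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.Map;

    import org.infinispan.atomic.Delta;
    import org.infinispan.atomic.DeltaAware;

    // Illustrative value type that tracks its own changes so that, after the
    // initial state transfer, only the modified entries (the delta) need to
    // be replicated to the other owners.
    public class CounterMap implements DeltaAware, Serializable {

        private final Map<String, Long> counters = new HashMap<String, Long>();
        private final Map<String, Long> dirty = new HashMap<String, Long>(); // changes since the last commit()

        public void increment(String name) {
            Long current = counters.get(name);
            long next = (current == null ? 0L : current) + 1L;
            counters.put(name, next);
            dirty.put(name, next); // remember only what changed
        }

        public Delta delta() {
            return new CounterDelta(new HashMap<String, Long>(dirty));
        }

        public void commit() {
            dirty.clear(); // the delta has been shipped; start tracking fresh changes
        }

        // The delta carries just the changed entries and knows how to merge
        // them into the copy of the value held by another node.
        private static class CounterDelta implements Delta, Serializable {
            private final Map<String, Long> changes;

            CounterDelta(Map<String, Long> changes) {
                this.changes = changes;
            }

            public DeltaAware merge(DeltaAware other) {
                CounterMap target = (other instanceof CounterMap) ? (CounterMap) other : new CounterMap();
                target.counters.putAll(changes);
                return target;
            }
        }
    }

Only the CounterDelta travels on subsequent writes; the receiving node merges 
it into its local copy, which is exactly why a joining node first needs the 
whole value via state transfer.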

On Jan 24, 2013, at 11:42 AM, Vladimir Blagojevic  wrote:

> Should be just the deltas. That was the whole point after all :-)
> 
> Regards,
> Vladimir
> On 13-01-24 12:19 PM, Ray Tsang wrote:
>> Ah that explains it!
>> 
>> Btw - after the delta, in replicated/distributed cache, does it send over 
>> the whole object or just the deltas?  I'm assuming the former.
>> 
>> Thanks!
>> 
>> On Thu, Jan 24, 2013 at 4:04 AM, Manik Surtani  wrote:
>> Sorry, I misread the question - the link below shows how Deltas are applied 
>> when a delta is shipped around as a part of a put().  If you are explicitly 
>> using the ApplyDeltaCommand, this is Vlad's code and he should know.
>> 
>> I could trace through usage in my IDE but then so could anybody - Vlad, do 
>> you have any specific insight to add here?
>> 
>> Cheers
>> Manik
>> 
>> On 24 Jan 2013, at 12:01, Manik Surtani  wrote:
>> 
>> > I believe this is what you guys are looking for.
>> >
>> > https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/commands/write/PutKeyValueCommand.java#L100
>> >
>> >
>> > On 23 Jan 2013, at 21:02, Vladimir Blagojevic  wrote:
>> >
>> >> On 13-01-23 3:36 PM, Ray Tsang wrote:
>> >>> speaking of ApplyDeltaCommand - where does it actually perform the
>> >>> delta operation?  perform() simply returns null.  Any pointers?
>> >>>
>> >>> Thanks,
>> >> Look for use of ApplyDeltaCommand class in IDE (in Eclipse highlight
>> >> class name + right click + References->Workspace)
>> >> Current uses I found are in EntryWrappingInterceptor,
>> >> OptimisticLockingInterceptor and so on
>> >>
>> >> Regards,
>> >> Vladimir
>> >> ___
>> >> infinispan-dev mailing list
>> >> infinispan-dev@lists.jboss.org
>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >
>> > --
>> > Manik Surtani
>> > ma...@jboss.org
>> > twitter.com/maniksurtani
>> >
>> > Platform Architect, JBoss Data Grid
>> > http://red.ht/data-grid
>> >
>> >
>> > ___
>> > infinispan-dev mailing list
>> > infinispan-dev@lists.jboss.org
>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> --
>> Manik Surtani
>> ma...@jboss.org
>> twitter.com/maniksurtani
>> 
>> Platform Architect, JBoss Data Grid
>> http://red.ht/data-grid
>> 
>> 
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> 
>> 
>> ___
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> ___
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Question about replication with 5.2.0.CR1

2013-01-15 Thread Randall Hauch
I logged this as https://issues.jboss.org/browse/ISPN-2712

On Jan 15, 2013, at 7:44 AM, Randall Hauch  wrote:

> I would tend to say that it does not behave like it is passivating some of 
> the data, because some of the entries are simply missing in the second 
> process after startup completes. In other words, we call "get" with a key 
> that was clearly transferred, but we get null back.
> 
> My understanding of passivation is that the entry will be *either* in-memory 
> or persisted in the cache store (but not both). It appears in our tests that 
> while most entries were added to the cache store (of the second process), 
> some do not exist in the cache store *or* in-memory.
> 
> On Jan 14, 2013, at 4:19 PM, Ray Tsang  wrote:
> 
>> Does it behave like what's described here? 
>> https://docs.jboss.org/author/display/ISPN/Cache+Loaders+and+Stores#CacheLoadersandStores-CachePassivation
>> 
>> Thanks,
>> 
>> On Mon, Jan 14, 2013 at 6:53 AM, Randall Hauch  wrote:
>> We're using MODE-1745 to track this problem (from ModeShape's perspective): 
>> https://issues.jboss.org/browse/MODE-1745
>> 
>> Here's an update.
>> 
>> After some more extensive debugging sessions, we found the following in the 
>> scenario where the 2nd cluster node is started only after the first 
>> clustered node has completed initialization:
>> 
>> • state transfer seems to be sending all the data across to node 2 
>> from node 1 - in total 293 PutKeyValueCommand
>> • when the systemNode == null check is performed on node 2, the 
>> cache only has 222 entries and there is nothing persisted in the file cache 
>> store
>> 
>> This seems to be a case of entries being evicted from node 2 without being 
>> persisted to the underlying cache store. Disabling eviction via the 
>> configuration file makes this scenario pass. With the eviction strategy set 
>> to NONE, all 293 entries received by node 2 are placed in the data 
>> container. With the eviction set to
>> 
>> 
>> 
>> only 222 entries (out of 293 total) are placed in the cache. This indicates 
>> to Horia and me that there's a possible bug around 
>> org.infinispan.container.DefaultDataContainer#put / 
>> org.infinispan.util.concurrent.BoundedConcurrentHashMap (the actual runtime 
>> instance) and eviction.
>> 
>> Running with:
>> 
>> 
>> 
>> produces the same problem, so ATM we're able to run successfully **ONLY** 
>> with eviction disabled.
>> 
>> Can someone familiar with Infinispan's internals please take a look at this 
>> to see if we're correct?
>> 
>> 
>> On Jan 11, 2013, at 12:03 PM, Randall Hauch  wrote:
>> 
>> > I'm trying to debug some problems that ModeShape is having in clustered 
>> > situations when using 5.2.0.CR1. I don't have a standalone test case, but 
>> > hopefully I can explain what I'm doing.
>> >
>> > I'm working with a replicated cache and two processes. The cache 
>> > configuration (see attached) uses a cache store (with 
>> > fetchPersistentState=true) and eviction (though the value of 'maxEntries' 
>> > is high enough that neither process hits it).
>> >
>> > The first process starts up fine and both ModeShape and Infinispan work 
>> > fine. After some period of time (~20 seconds), I start up the second 
>> > process. It joins the cluster, and receives the initial state transfer 
>> > from the first process. I can see from the logs that an entry with a 
>> > particular key has been transferred and is complete. The second process 
>> > (as part of the ModeShape initialization code) attempts to look up the 
>> > entry with this particular key, but it doesn't find it. At this point, 
>> > ModeShape starts mis-behaving because this particular entry is critical in 
>> > knowing if ModeShape needs to initialize the repository content (by 
>> > creating several hundred entries). Upon finding no such node, it attempts 
>> > to recreate it and a hundred other entries. Some succeed, but others fail 
>> > because existing entries are found when they weren't expected to be found. 
>> > I've replicated this problem on two different machines with different 
>> > operating systems.
>> >
>> > We're using explicit locks for writes, and we're using a cache with 
>> > SKIP_REMOTE_LOOKUP and DELTA_WRITE flags when writing, but no particular 
>> > flags when reading. (See below for why we

Re: [infinispan-dev] Question about replication with 5.2.0.CR1

2013-01-15 Thread Randall Hauch
I would tend to say that it does not behave like it is passivating some of the 
data, because some of the entries are simply missing in the second process 
after startup completes. In other words, we call "get" with a key that was 
clearly transferred, but we get null back.

My understanding of passivation is that the entry will be *either* in-memory or 
persisted in the cache store (but not both). It appears in our tests that while 
most entries were added to the cache store (of the second process), some do not 
exist in the cache store *or* in-memory.
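
Put differently, the contract I expect looks roughly like this toy model (not 
Infinispan's actual code, just the semantics as I understand them):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Toy model of the passivation contract: an entry lives either in the
    // in-memory container or in the store, never both, so a miss in memory
    // must be followed by a store lookup ("activation").
    class PassivatingCache<K, V> {
        private final Map<K, V> memory = new ConcurrentHashMap<K, V>();
        private final Map<K, V> store = new ConcurrentHashMap<K, V>(); // stand-in for the cache store

        V get(K key) {
            V value = memory.get(key);
            if (value == null) {
                value = store.remove(key);   // activate: move back into memory
                if (value != null) {
                    memory.put(key, value);
                }
            }
            return value;
        }

        void evict(K key) {
            V value = memory.remove(key);
            if (value != null) {
                store.put(key, value);       // passivate: persist only on eviction
            }
        }
    }

Under that contract, an entry that was transferred should be reachable from 
one of the two places; in our runs some keys are in neither, which is what 
looks like a bug.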

On Jan 14, 2013, at 4:19 PM, Ray Tsang  wrote:

> Does it behave like what's described here? 
> https://docs.jboss.org/author/display/ISPN/Cache+Loaders+and+Stores#CacheLoadersandStores-CachePassivation
> 
> Thanks,
> 
> On Mon, Jan 14, 2013 at 6:53 AM, Randall Hauch  wrote:
> We're using MODE-1745 to track this problem (from ModeShape's perspective): 
> https://issues.jboss.org/browse/MODE-1745
> 
> Here's an update.
> 
> After some more extensive debugging sessions, we found the following in the 
> scenario where the 2nd cluster node is started only after the first clustered 
> node has completed initialization:
> 
> • state transfer seems to be sending all the data across to node 2 
> from node 1 - in total 293 PutKeyValueCommand
> • when the systemNode == null check is performed on node 2, the cache 
> only has 222 entries and there is nothing persisted in the file cache store
> 
> This seems to be a case of entries being evicted from node 2 without being 
> persisted to the underlying cache store. Disabling eviction via the 
> configuration file makes this scenario pass. With the eviction strategy set to 
> NONE, all 293 entries received by node 2 are placed in the data container. 
> With the eviction set to
> 
> 
> 
> only 222 entries (out of 293 total) are placed in the cache. This indicates 
> to Horia and me that there's a possible bug around 
> org.infinispan.container.DefaultDataContainer#put / 
> org.infinispan.util.concurrent.BoundedConcurrentHashMap (the actual runtime 
> instance) and eviction.
> 
> Running with:
> 
> 
> 
> produces the same problem, so ATM we're able to run successfully **ONLY** 
> with eviction disabled.
> 
> Can someone familiar with Infinispan's internals please take a look at this 
> to see if we're correct?
> 
> 
> On Jan 11, 2013, at 12:03 PM, Randall Hauch  wrote:
> 
> > I'm trying to debug some problems that ModeShape is having in clustered 
> > situations when using 5.2.0.CR1. I don't have a standalone test case, but 
> > hopefully I can explain what I'm doing.
> >
> > I'm working with a replicated cache and two processes. The cache 
> > configuration (see attached) uses a cache store (with 
> > fetchPersistentState=true) and eviction (though the value of 'maxEntries' 
> > is high enough that neither process hits it).
> >
> > The first process starts up fine and both ModeShape and Infinispan work 
> > fine. After some period of time (~20 seconds), I start up the second 
> > process. It joins the cluster, and receives the initial state transfer from 
> > the first process. I can see from the logs that an entry with a particular 
> > key has been transferred and is complete. The second process (as part of 
> > the ModeShape initialization code) attempts to look up the entry with this 
> > particular key, but it doesn't find it. At this point, ModeShape starts 
> > mis-behaving because this particular entry is critical in knowing if 
> > ModeShape needs to initialize the repository content (by creating several 
> > hundred entries). Upon finding no such node, it attempts to recreate it and 
> > a hundred other entries. Some succeed, but others fail because existing 
> > entries are found when they weren't expected to be found. I've replicated 
> > this problem on two different machines with different operating systems.
> >
> > We're using explicit locks for writes, and we're using a cache with 
> > SKIP_REMOTE_LOOKUP and DELTA_WRITE flags when writing, but no particular 
> > flags when reading. (See below for why we're using these flags.)
> >
> > My understanding is that, once the initial state transfer completes, the 
> > second process' cache store should contain all of the transferred entries, 
> > and any attempt to look up an entry by key will obviously check local 
> > memory and, if not found, will consult the cache store.
> >
> > I've attached the log file for this second process. Here are some of the 
> > key points in the file:
> &

Re: [infinispan-dev] Question about replication with 5.2.0.CR1

2013-01-14 Thread Randall Hauch
We're using MODE-1745 to track this problem (from ModeShape's perspective): 
https://issues.jboss.org/browse/MODE-1745

Here's an update.

After some more extensive debugging sessions, we found the following in the 
scenario where the 2nd cluster node is started only after the first clustered 
node has completed initialization:

• state transfer seems to be sending all the data across to node 2 from 
node 1 - in total 293 PutKeyValueCommand
• when the systemNode == null check is performed on node 2, the cache 
only has 222 entries and there is nothing persisted in the file cache store

This seems to be a case of entries being evicted from node 2 without being 
persisted to the underlying cache store. Disabling eviction via the 
configuration file makes this scenario pass. With the eviction strategy set to 
NONE, all 293 entries received by node 2 are placed in the data container. With 
the eviction set to



only 222 entries (out of 293 total) are placed in the cache. This indicates to 
Horia and me that there's a possible bug around 
org.infinispan.container.DefaultDataContainer#put / 
org.infinispan.util.concurrent.BoundedConcurrentHashMap (the actual runtime 
instance) and eviction.

Running with:



produces the same problem, so ATM we're able to run successfully **ONLY** with 
eviction disabled.

Can someone familiar with Infinispan's internals please take a look at this to 
see if we're correct?
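
For reference, our setup is roughly equivalent to the following programmatic 
configuration (a sketch from memory of the 5.2-era fluent builder API -- 
loaders().addFileCacheStore(), fetchPersistentState(...) -- we actually use 
the XML file, and the exact builder method names and store location are only 
illustrative):

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.eviction.EvictionStrategy;

    public class ReplicatedCacheConfig {
        public static Configuration build() {
            ConfigurationBuilder builder = new ConfigurationBuilder();
            builder.clustering().cacheMode(CacheMode.REPL_SYNC)            // replicated cache
                   .eviction().strategy(EvictionStrategy.LRU).maxEntries(1000) // passes only with NONE
                   .loaders()
                       .addFileCacheStore()
                           .location("/path/to/store")                     // illustrative location
                           .fetchPersistentState(true);                    // pull persisted state on join
            return builder.build();
        }
    }

With the strategy set to NONE all 293 transferred entries end up in the data 
container; with a bounded strategy only 222 do, as described above.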


On Jan 11, 2013, at 12:03 PM, Randall Hauch  wrote:

> I'm trying to debug some problems that ModeShape is having in clustered 
> situations when using 5.2.0.CR1. I don't have a standalone test case, but 
> hopefully I can explain what I'm doing.
> 
> I'm working with a replicated cache and two processes. The cache 
> configuration (see attached) uses a cache store (with 
> fetchPersistentState=true) and eviction (though the value of 'maxEntries' is 
> high enough that neither process hits it). 
> 
> The first process starts up fine and both ModeShape and Infinispan work fine. 
> After some period of time (~20 seconds), I start up the second process. It 
> joins the cluster, and receives the initial state transfer from the first 
> process. I can see from the logs that an entry with a particular key has been 
> transferred and is complete. The second process (as part of the ModeShape 
> initialization code) attempts to look up the entry with this particular key, 
> but it doesn't find it. At this point, ModeShape starts mis-behaving because 
> this particular entry is critical in knowing if ModeShape needs to initialize 
> the repository content (by creating several hundred entries). Upon finding no 
> such node, it attempts to recreate it and a hundred other entries. Some 
> succeed, but others fail because existing entries are found when they weren't 
> expected to be found. I've replicated this problem on two different machines 
> with different operating systems.
> 
> We're using explicit locks for writes, and we're using a cache with 
> SKIP_REMOTE_LOOKUP and DELTA_WRITE flags when writing, but no particular 
> flags when reading. (See below for why we're using these flags.)
> 
> My understanding is that, once the initial state transfer completes, the 
> second process' cache store should contain all of the transferred entries, 
> and any attempt to look up an entry by key will obviously check local memory 
> and, if not found, will consult the cache store.
> 
> I've attached the log file for this second process. Here are some of the key 
> points in the file:
> 
> 1) Starting with line 39, the log shows that the cache is started, joins the 
> existing cluster, and waits for the initial state transfer. This is followed 
> by lots of lines showing the details of the state transfer.
> 
> 2) On line 159, one of the state transfer DEBUG lines shows a 
> PutKeyValueCommand for the entry of interest, whose key is 
> "cb80206317f1e7jcr:system" and whose value looks as expected:
> 
> OOB-1,Machine1-27258 2013-01-11 11:31:26,819 TRACE 
> statetransfer.StateTransferInterceptor - handleTopologyAffectedCommand for 
> command PutKeyValueCommand{key=cb80206317f1e7jcr:system, 
> value=SchematicEntryLiteral{ "metadata" : { "id" : "cb80206317f1e7jcr:system" 
> , "contentType" : "application/json" } , "content" : { "key" : 
> "cb80206317f1e7jcr:system" , "parent" : [ "cb80206317f1e7/" , 
> "cb80206cd556c0/" ] , "properties" : { "http://www.jcp.org/jcr/1.0"; : { 
> "primaryType" : { "$name" : "mode:system" } } } , "children" : [ { "key" : 
> "cb80206317f1