Following up on my setup for the GridFilesystem + custom FileCacheStore... There is a single node (master) that is responsible for reading from and writing to the file system. There will be many more nodes (slaves) that fetch file content when needed. So my setup has two configuration files:

* cache-master.xml, and
* cache-slave.xml
The cache-master.xml defines (using the std jgroups-udp.xml conf):
----------------------------------------------------------------------
<namedCache name="type-metadata">
   <clustering mode="replication">
      <stateRetrieval timeout="20000" fetchInMemoryState="true"
                      alwaysProvideInMemoryState="true" />
      <sync replTimeout="20000" />
   </clustering>
   <loaders passivation="false" shared="true" preload="true">
      <loader class="com.my.cache.loaders.FileMetadataCacheStore"
              fetchPersistentState="false" purgeOnStartup="false">
         <properties>
            <property name="location" value="/data" />
         </properties>
      </loader>
   </loaders>
</namedCache>

<namedCache name="type-data">
   <clustering mode="invalidation">
      <sync replTimeout="20000" />
   </clustering>
   <loaders passivation="false" shared="true" preload="false">
      <loader class="com.my.cache.loaders.FileDataCacheStore"
              fetchPersistentState="false" purgeOnStartup="false">
         <properties>
            <property name="location" value="/data" />
         </properties>
      </loader>
   </loaders>
</namedCache>
----------------------------------------------------------------------

And here is the cache-slave.xml (also using the std jgroups-udp.xml conf):
----------------------------------------------------------------------
<namedCache name="type-metadata">
   <clustering mode="replication">
      <stateRetrieval timeout="20000" fetchInMemoryState="true"
                      alwaysProvideInMemoryState="true" />
      <sync replTimeout="20000" />
   </clustering>
   <loaders preload="true">
      <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader">
         <properties>
            <property name="remoteCallTimeout" value="20000" />
         </properties>
      </loader>
   </loaders>
</namedCache>

<namedCache name="type-data">
   <clustering mode="invalidation">
      <sync replTimeout="20000" />
   </clustering>
   <loaders preload="false">
      <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader">
         <properties>
            <property name="remoteCallTimeout" value="20000" />
         </properties>
      </loader>
   </loaders>
</namedCache>
----------------------------------------------------------------------

The master starts up fine, but when starting the 1st slave I get:

java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:432)
        at java.net.Socket.connect(Socket.java:529)
        at org.jgroups.util.Util.connect(Util.java:276)
        at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:510)
        at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:462)
        at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:223)
        at org.jgroups.protocols.FRAG2.up(FRAG2.java:189)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:891)
        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)
        at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:613)
        at org.jgroups.protocols.UNICAST.up(UNICAST.java:294)

Any ideas?

thanks,
-- yuri

On Mon, Jul 11, 2011 at 8:58 PM, Yuri de Wit <yde...@gmail.com> wrote:
> Hi Galder,
>
> Thanks for your reply. Let me continue this discussion here first to
> validate my thinking before I create any issues in JIRA (forgive me
> for the lengthy follow-up).
>
> First of all, thanks for this wonderful project! I started looking
> into Ehcache as the default caching implementation, but found it
> lacking on some key features when using JGroups. My guess is that all
> the development there is going towards the Terracotta distribution
> instead of JGroups.
> Terracotta does seem like a wonderful product, but I was hoping to
> stick to a JGroups-based caching impl. So I was happy to have found
> Infinispan.
>
> I need to create a distributed cache that loads data from the file
> system. It's a tree of folders/files containing mostly metadata info
> that changes seldom, but does change. Our mid-term goal is to move
> the metadata away from the file system and into a database, but that
> is not feasible now due to a tight deadline and the risks of
> refactoring too much of the code base.
>
> So I was happy to see the GridFilesystem implementation in Infinispan
> and the fact that clustered caches can be lazily populated (the
> metadata tree in the FS can be large, and having all nodes in the
> cluster preloaded with all the data would not work for us). However,
> it defines its own persistence scheme with specific file names and
> serialized buckets, which would require us to use a cache-aside
> strategy to read our metadata tree and populate the GridFilesystem
> with it.
>
> What I am looking for is to be able to plug into the GridFilesystem a
> new FileCacheStore that can load directly from an existing directory
> tree, transparently. This would automatically lazy-load FS content
> across the cluster without having to pre-populate the GridFilesystem
> programmatically.
>
> At first I was hoping to extend the existing FileCacheStore to
> support this (hence my earlier request for a
> GridInputStream.skip()/available() implementation and for making the
> constructors protected instead of package-level access), but I later
> realized that what I needed was an entirely new implementation, since
> the bucket abstraction there is not really appropriate.
>
> The good news is that I am about 75% complete with the impl here. It
> is working beautifully, with a few caveats, for a single node, but I
> am facing some issues trying to launch a second node in the cluster
> (most of it my ignorance, I am sure).
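[Editor's note: the heart of the store described above — serving a GridFilesystem data-cache miss straight from the real file on disk — could be sketched as below. This is a hypothetical sketch, not the FileDataCacheStore from this thread: it assumes data-cache chunk keys of the form "<path>.#<chunkNumber>" (an assumption about GridFilesystem's key format) and omits all of the Infinispan CacheStore SPI plumbing.]

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

/**
 * Hypothetical core of a loader that answers GridFilesystem "data"
 * cache misses directly from an existing file on disk. A chunk key
 * like "/data/foo.txt.#2" (assumed format) maps to the byte range
 * [2*chunkSize, 3*chunkSize) of the real file /data/foo.txt.
 */
final class ChunkReader {

    static byte[] loadChunk(String key, int chunkSize) throws IOException {
        int sep = key.lastIndexOf(".#");
        if (sep < 0) {
            throw new IllegalArgumentException("not a chunk key: " + key);
        }
        String path = key.substring(0, sep);
        int chunkNo = Integer.parseInt(key.substring(sep + 2));
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long offset = (long) chunkNo * chunkSize;
            if (offset >= f.length()) {
                return null;               // past EOF: no such chunk
            }
            byte[] buf = new byte[chunkSize];
            f.seek(offset);
            int read = f.read(buf);
            // The last chunk is usually short; trim it so the cached
            // bytes add up to the real file length.
            return read == chunkSize ? buf : Arrays.copyOf(buf, read);
        }
    }
}
```

[A real store would invoke something like this from its load() callback and resolve the path against the configured "location" property; the method and property names here are assumptions for illustration.]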
>
> ** Do you see any issues with this approach that I am not aware of?
>
> In addition, I am having a couple of issues launching the second node
> in the cluster: a couple of NPEs and a
> "java.net.NoRouteToHostException: No route to host" exception. I will
> send the details of these exceptions in a follow-up email.
>
> This is where I am stuck at the moment. In my setup I have two
> configuration files:
> * cache-master.xml
> * cache-slave.xml
> Both define the data and metadata caches required by GridFilesystem,
> but -master.xml configures the custom FileCacheStore I implemented
> and -slave.xml uses the ClusterCacheLoader.
>
> These are some of the items/todos for this custom FileCacheStore impl:
> ** Implement chunked writes, with a special chunking protocol to
> trigger when the last chunk has been delivered
> ** Custom configuration to simplify it for GridFilesystem.
>
> regards,
> -- yuri
>
> One caveat is the lack of support for safe chunked writes: for now I
> am sending the whole file content when writing to the cache, since a
> chunked write would require additional changes to GridFS, such as a
> protocol to let the loader know that the current chunk is the last
> one, so it can finally update the underlying file as a whole.
>
> Any chance to implement the skip() and available() methods in
> GridInputStream, or to make the constructors in the GridFileSystem
> package public so I can easily extend them?
>
> I am trying to plug a custom FileMetadataCacheStore and
> FileDataCacheStore implementation under the metadata/data caches used
> by the GridFS, so that loading from an existing FS is completely
> transparent and lazy (I'll be happy to contribute if it makes sense).
> The problem is that any BufferedInputStream wrapped around the
> GridInputStream calls available()/skip(), but they are not
> implemented in gridfs.
>
> Do you also see any issues with the above approach?
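[Editor's note: the skip()/available() methods requested above are cheap to support in a chunk-based stream, because skipping only moves a logical position and never has to fetch a chunk. Below is a minimal, hypothetical stand-in — not Infinispan's actual GridInputStream — where a Map plays the role of the data cache and the ".#<n>" key format is an assumption, showing one way the two methods could behave.]

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

/**
 * Hypothetical chunk-backed InputStream with the skip()/available()
 * support that GridInputStream lacks. The Map stands in for the
 * "data" cache; the total length would come from the metadata cache.
 */
class ChunkedInputStream extends InputStream {
    private final Map<String, byte[]> dataCache;
    private final String name;
    private final int chunkSize;
    private final int length;   // total file length (metadata cache)
    private int position;       // absolute read position

    ChunkedInputStream(Map<String, byte[]> dataCache, String name,
                       int chunkSize, int length) {
        this.dataCache = dataCache;
        this.name = name;
        this.chunkSize = chunkSize;
        this.length = length;
    }

    @Override
    public int read() throws IOException {
        if (position >= length) return -1;
        // Fetch the chunk holding the current position (assumed key format).
        byte[] chunk = dataCache.get(name + ".#" + (position / chunkSize));
        if (chunk == null) throw new IOException("missing chunk at " + position);
        return chunk[position++ % chunkSize] & 0xFF;
    }

    @Override
    public long skip(long n) {
        // No chunk fetch needed: just advance the logical position,
        // clamped to the remaining bytes as the InputStream contract asks.
        long k = Math.max(0, Math.min(n, (long) length - position));
        position += k;
        return k;
    }

    @Override
    public int available() {
        // Report everything up to EOF; a stricter variant could count
        // only bytes in chunks already present locally.
        return length - position;
    }
}
```

[This is enough to keep a BufferedInputStream wrapper happy, since its skip()/available() calls delegate to the underlying stream.]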
>
> regards,
>
>
> On Mon, Jul 11, 2011 at 12:06 PM, galderz
> <reply+m-9778632-9f501737dc4435143baf6908afcd349935f88...@reply.github.com>
> wrote:
>> Hey Yuri,
>>
>> Why do you need two new file based stores? Can't you plug Infinispan with a
>> file based cache store to give you FS persistence?
>>
>> Anyway, I'd suggest you discuss it in the Infinispan dev list
>> (http://lists.jboss.org/pipermail/infinispan-dev/) and in parallel, create
>> an issue in https://issues.jboss.org/browse/ISPN
>>
>> Cheers,
>> Galder
>>
>> --
>> Reply to this email directly or view it on GitHub:
>> http://github.com/inbox/9770320#reply
>>
_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev