Re: Question about JCLOUDS-658 Jira issue
Hi Adrian/Andrew,

Thank you very much for your replies. This information will be really
helpful to continue my evaluation of jclouds.

Cheers,
Gabriel

2014-11-12 0:34 GMT-08:00 Andrew Gaul:
> On Mon, Nov 10, 2014 at 05:45:10PM -0800, Gabriel Lavoie wrote:
> > I'm currently reviewing the jclouds filesystem blobstore API. I
> > noticed that user metadata doesn't get saved with the 1.8.1 version,
> > but that a comment exists in the code regarding Java 7 and NIO
> > filestore attributes. I also found out with the JCLOUDS-658 that
> > jclouds 2.0 (unreleased) fixes the issue by using the NIO filestore
> > attributes.
> >
> > Regarding this issue and resolution:
> > - Is the filesystem API considered "production" safe or only suggested
> >   for testing/debugging?
> > - I'm not sure about the solution of using the filesystem metadata store
> >   to store metadata. Many (most?) filesystem archival/backup/explorer are
> >   not aware of this metadata and it may get lost without the user knowing
> >   about it. I would consider using this for testing, but never in a
> >   production environment.
> > - Could an alternate way of storing metadata be implemented, for
> >   example in a properties file named .metadata stored alongside
> >   with the object file?
> > - I could implement a file metadata storage by wrapping
> >   FilesystemStorageStrategyImpl, but I don't think this is a good idea to
> >   wrap it as I have to re-implement a few classes to have the dependency
> >   injection work correctly.
>
> The filesystem blobstore has traditionally been used for testing
> purposes and has a few caveats for production use. However, it should
> perform well enough for applications with a small number of objects
> (tens of thousands) and small object sizes (gigabytes).
>
> My primary concern for production use is that a filesystem is not a
> blobstore; a single node lacks the performance and reliability that a
> multi-node Swift installation provides. While Swift uses a filesystem
> to store its blob data, it uses XFS with several tuning parameters and
> hashes object names into directory shards to avoid scalability limits.
>
> The filesystem blobstore has a few known issues; we will gladly accept
> patches for these:
>
> * scalability: returning a paginated list of objects creates the
>   entire list in-memory
> * compatibility: lacks multi-part upload, some listing options not
>   supported
> * error paths: writing partial objects or concurrently overwriting
>   objects has different semantics than real object stores, especially
>   on Windows
> * performance: excessive system calls on various code paths
> * metadata: 1.8.1 does not support user metadata and other attributes.
>   The approach for future releases only works on Linux and Windows.
>   Current JDK on Mac OS X does not support xattr (JDK-8030048).
>
> Using file extended attributes is the "right" way to store object
> metadata and allows other tools to interact with these objects instead
> of using a special format that only jclouds understands. However, I
> appreciate the practical reasons to prefer another approach; could you
> explore this issue further? Every solution has trade-offs; for
> example, storing in a .metadata file requires extra work when listing
> objects. You may want to consider using a second directory for metadata
> instead. Finally, you might want to look at Jimfs, which creates an
> in-memory filesystem provider; perhaps you could create an on-disk
> provider which stores extended attributes in some other way?
>
> --
> Andrew Gaul
> http://gaul.org/

--
Gabriel Lavoie
glav...@gmail.com
Re: Question about JCLOUDS-658 Jira issue
On Mon, Nov 10, 2014 at 05:45:10PM -0800, Gabriel Lavoie wrote:
> I'm currently reviewing the jclouds filesystem blobstore API. I
> noticed that user metadata doesn't get saved with the 1.8.1 version,
> but that a comment exists in the code regarding Java 7 and NIO
> filestore attributes. I also found out with the JCLOUDS-658 that
> jclouds 2.0 (unreleased) fixes the issue by using the NIO filestore
> attributes.
>
> Regarding this issue and resolution:
> - Is the filesystem API considered "production" safe or only suggested for
>   testing/debugging?
> - I'm not sure about the solution of using the filesystem metadata store
>   to store metadata. Many (most?) filesystem archival/backup/explorer are
>   not aware of this metadata and it may get lost without the user knowing
>   about it. I would consider using this for testing, but never in a
>   production environment.
> - Could an alternate way of storing metadata be implemented, for
>   example in a properties file named .metadata stored alongside
>   with the object file?
> - I could implement a file metadata storage by wrapping
>   FilesystemStorageStrategyImpl, but I don't think this is a good idea to
>   wrap it as I have to re-implement a few classes to have the dependency
>   injection work correctly.

The filesystem blobstore has traditionally been used for testing
purposes and has a few caveats for production use. However, it should
perform well enough for applications with a small number of objects
(tens of thousands) and small object sizes (gigabytes).

My primary concern for production use is that a filesystem is not a
blobstore; a single node lacks the performance and reliability that a
multi-node Swift installation provides. While Swift uses a filesystem
to store its blob data, it uses XFS with several tuning parameters and
hashes object names into directory shards to avoid scalability limits.
The filesystem blobstore has a few known issues; we will gladly accept
patches for these:

* scalability: returning a paginated list of objects creates the
  entire list in-memory
* compatibility: lacks multi-part upload, some listing options not
  supported
* error paths: writing partial objects or concurrently overwriting
  objects has different semantics than real object stores, especially
  on Windows
* performance: excessive system calls on various code paths
* metadata: 1.8.1 does not support user metadata and other attributes.
  The approach for future releases only works on Linux and Windows.
  Current JDK on Mac OS X does not support xattr (JDK-8030048).

Using file extended attributes is the "right" way to store object
metadata and allows other tools to interact with these objects instead
of using a special format that only jclouds understands. However, I
appreciate the practical reasons to prefer another approach; could you
explore this issue further? Every solution has trade-offs; for
example, storing in a .metadata file requires extra work when listing
objects. You may want to consider using a second directory for metadata
instead. Finally, you might want to look at Jimfs, which creates an
in-memory filesystem provider; perhaps you could create an on-disk
provider which stores extended attributes in some other way?

--
Andrew Gaul
http://gaul.org/
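The "second directory for metadata" idea mentioned above could look roughly like the following. This is only a sketch under stated assumptions: the class name SidecarMetadataStore, the .properties suffix, and the directory layout are all hypothetical and are not part of jclouds. It uses only java.util.Properties and java.nio.file, and keeps metadata out of the blob directory so that listing blobs does not have to skip metadata files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper (not a jclouds class): user metadata in a parallel directory tree. */
final class SidecarMetadataStore {
    private final Path metadataRoot;

    SidecarMetadataStore(Path metadataRoot) {
        this.metadataRoot = metadataRoot;
    }

    // Assumed layout: <metadataRoot>/<container>/<blobName>.properties
    // A real implementation would also need to validate names against path traversal.
    private Path sidecarFor(String container, String blobName) {
        return metadataRoot.resolve(container).resolve(blobName + ".properties");
    }

    /** Writes the user metadata for one blob, replacing any previous sidecar file. */
    void put(String container, String blobName, Map<String, String> userMetadata)
            throws IOException {
        Path sidecar = sidecarFor(container, blobName);
        Files.createDirectories(sidecar.getParent());
        Properties props = new Properties();
        props.putAll(userMetadata);
        try (OutputStream out = Files.newOutputStream(sidecar)) {
            props.store(out, null);
        }
    }

    /** Reads the user metadata for one blob; returns an empty map if none was stored. */
    Map<String, String> get(String container, String blobName) throws IOException {
        Path sidecar = sidecarFor(container, blobName);
        Map<String, String> result = new TreeMap<>();
        if (!Files.exists(sidecar)) {
            return result;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(sidecar)) {
            props.load(in);
        }
        for (String key : props.stringPropertyNames()) {
            result.put(key, props.getProperty(key));
        }
        return result;
    }
}
```

Note that this sketch does not make the blob write and the metadata write atomic with respect to each other, so it shares the "error paths" caveat from the list above.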
Re: Question about JCLOUDS-658 Jira issue
Hi, Gabriel.

I'll give you one set of answers. Maybe you'll get another :)

> - Is the filesystem API considered "production" safe or only suggested for
>   testing/debugging?

Put a gun to my head, and I'd say no. For example, it is unsafe to run
multithreaded as there are race conditions in the impl [1]. Also, it
only runs a subset of our existing integration tests [2]. Production
means something different for different people, but I wouldn't suggest
it for production.

> - Could an alternate way of storing metadata be implemented, for
>   example in a properties file named .metadata stored alongside
>   with the object file?

Yeah, I've seen this approach in the past, and that would certainly be
more portable.

> - I could implement a file metadata storage by wrapping
>   FilesystemStorageStrategyImpl, but I don't think this is a good idea to
>   wrap it as I have to re-implement a few classes to have the dependency
>   injection work correctly.

Perhaps there's a way to isolate the metadata aspect. Personally, I
would raise a Jira [3] asking to remove the requirement of extended
attributes.

-A

[1] https://github.com/jclouds/jclouds/blob/master/apis/filesystem/src/main/java/org/jclouds/filesystem/strategy/internal/FilesystemStorageStrategyImpl.java
[2] partially implemented
    https://github.com/jclouds/jclouds/tree/master/apis/filesystem/src/test/java/org/jclouds/filesystem/integration
    vs
    https://github.com/jclouds/jclouds/tree/master/blobstore/src/test/java/org/jclouds/blobstore/integration/internal
    (conceding blob signing or public acl are not relevant)
[3] https://issues.apache.org/jira/browse/JCLOUDS
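For context on the extended-attribute requirement discussed in this thread, here is a minimal sketch of how Java 7 NIO exposes user-defined attributes and how code can probe for support before relying on them. The class name XattrMetadata and the attribute keys are hypothetical; this is not the jclouds implementation, only an illustration of the standard UserDefinedFileAttributeView API, which a caller could use to decide whether to fall back to another metadata store.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;

/** Hypothetical helper (not a jclouds class) around NIO user-defined file attributes. */
final class XattrMetadata {
    private XattrMetadata() {}

    /** Reports whether the file store holding this file advertises user-defined attributes. */
    static boolean isSupported(Path file) throws IOException {
        return Files.getFileStore(file)
                .supportsFileAttributeView(UserDefinedFileAttributeView.class);
    }

    private static UserDefinedFileAttributeView viewOf(Path file) {
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        if (view == null) {
            // getFileAttributeView returns null when the view is unavailable.
            throw new UnsupportedOperationException("user-defined attributes unavailable");
        }
        return view;
    }

    /** Stores one key/value pair as a UTF-8 encoded user-defined attribute. */
    static void write(Path file, String key, String value) throws IOException {
        viewOf(file).write(key, StandardCharsets.UTF_8.encode(value));
    }

    /** Reads one attribute back and decodes it as UTF-8. */
    static String read(Path file, String key) throws IOException {
        UserDefinedFileAttributeView view = viewOf(file);
        ByteBuffer buffer = ByteBuffer.allocate(view.size(key));
        view.read(key, buffer);
        buffer.flip();
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }
}
```

A caller would typically check isSupported first and only then call write or read; on platforms or filesystems where the check returns false, a portable store such as a properties file would be needed instead.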