Re: Question about JCLOUDS-658 Jira issue

2014-11-12 Thread Gabriel Lavoie
Hi Adrian/Andrew,
thank you very much for your replies. This information will be really
helpful to continue my evaluation of jclouds.

Cheers,


Gabriel

2014-11-12 0:34 GMT-08:00 Andrew Gaul:

> On Mon, Nov 10, 2014 at 05:45:10PM -0800, Gabriel Lavoie wrote:
> > I'm currently reviewing the jclouds filesystem blobstore API. I
> > noticed that user metadata doesn't get saved with the 1.8.1 version,
> > but that a comment exists in the code regarding Java 7 and NIO
> > filestore attributes. I also found out from JCLOUDS-658 that
> > jclouds 2.0 (unreleased) fixes the issue by using the NIO filestore
> > attributes.
> >
> > Regarding this issue and resolution:
> > - Is the filesystem API considered "production" safe or only suggested
> > for testing/debugging?
> > - I'm not sure about the solution of using the filesystem metadata store
> > to store metadata. Many (most?) filesystem archival/backup/explorer are
> > not aware of this metadata and it may get lost without the user knowing
> > about it. I would consider using this for testing, but never in a
> > production environment.
> >  - Could an alternate way of storing metadata be implemented, for
> > example in a properties file named .metadata stored alongside
> > the object file?
> > - I could implement a file metadata storage by wrapping
> > FilesystemStorageStrategyImpl, but I don't think this is a good idea to
> > wrap it as I have to re-implement a few classes to have the dependency
> > injection work correctly.
>
> The filesystem blobstore has traditionally been used for testing
> purposes and has a few caveats for production use.  However, it should
> perform well enough for applications with a small number of objects
> (tens of thousands) and small object sizes (gigabytes).
>
> My primary concern for production use is that a filesystem is not a
> blobstore; a single node lacks the performance and reliability that a
> multi-node Swift installation provides.  While Swift uses a filesystem
> to store its blob data, it uses XFS with several tuning parameters and
> hashes object names into directory shards to avoid scalability limits.
>
> The filesystem blobstore has a few known issues; we will gladly accept
> patches for these:
>
>   * scalability: returning a paginated list of objects creates the
> entire list in-memory
>   * compatibility: lacks multi-part upload, some listing options not
> supported
>   * error paths: writing partial objects or concurrently overwriting
> objects has different semantics than real object stores, especially
> on Windows
>   * performance: excessive system calls on various code paths
>   * metadata: 1.8.1 does not support user metadata and other attributes.
> The approach for future releases only works on Linux and Windows.
> Current JDK on Mac OS X does not support xattr (JDK-8030048).
>
> Using file extended attributes is the "right" way to store object
> metadata and allows other tools to interact with these objects instead
> of using a special format that only jclouds understands.  However, I
> appreciate the practical reasons to prefer another approach; could you
> explore this issue further?  Every solution has trade-offs; for
> example, storing in a .metadata file requires extra work when listing
> objects.  You may want to consider using a second directory for metadata
> instead.  Finally you might want to look at Jimfs which creates an
> in-memory filesystem provider; perhaps you could create an on-disk
> provider which stores extended attributes in some other way?
>
> --
> Andrew Gaul
> http://gaul.org/
>



-- 
Gabriel Lavoie
glav...@gmail.com


Re: Question about JCLOUDS-658 Jira issue

2014-11-12 Thread Andrew Gaul
On Mon, Nov 10, 2014 at 05:45:10PM -0800, Gabriel Lavoie wrote:
> I'm currently reviewing the jclouds filesystem blobstore API. I
> noticed that user metadata doesn't get saved with the 1.8.1 version,
> but that a comment exists in the code regarding Java 7 and NIO
> filestore attributes. I also found out from JCLOUDS-658 that
> jclouds 2.0 (unreleased) fixes the issue by using the NIO filestore
> attributes.
> 
> Regarding this issue and resolution:
> - Is the filesystem API considered "production" safe or only suggested for
> testing/debugging?
> - I'm not sure about the solution of using the filesystem metadata store
> to store metadata. Many (most?) filesystem archival/backup/explorer are
> not aware of this metadata and it may get lost without the user knowing
> about it. I would consider using this for testing, but never in a
> production environment.
>  - Could an alternate way of storing metadata be implemented, for
> example in a properties file named .metadata stored alongside
> the object file?
> - I could implement a file metadata storage by wrapping
> FilesystemStorageStrategyImpl, but I don't think this is a good idea to
> wrap it as I have to re-implement a few classes to have the dependency
> injection work correctly.

The filesystem blobstore has traditionally been used for testing
purposes and has a few caveats for production use.  However, it should
perform well enough for applications with a small number of objects
(tens of thousands) and small object sizes (gigabytes).

My primary concern for production use is that a filesystem is not a
blobstore; a single node lacks the performance and reliability that a
multi-node Swift installation provides.  While Swift uses a filesystem
to store its blob data, it uses XFS with several tuning parameters and
hashes object names into directory shards to avoid scalability limits.
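
To illustrate the directory-shard idea, here is a minimal sketch in Java.
It is not Swift's actual layout; the class name ShardedLayout, the choice
of SHA-256, and the two-level fan-out are all assumptions made for the
example.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of jclouds or Swift.
class ShardedLayout {
    /**
     * Maps a blob name to baseDir/ab/cd/<hex digest>, where "ab" and "cd"
     * are the first two bytes of the digest written as hex. Spreading
     * files over 256 * 256 directories keeps each directory small.
     */
    static Path shardedPath(Path baseDir, String blobName) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(blobName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return baseDir.resolve(hex.substring(0, 2))
                          .resolve(hex.substring(2, 4))
                          .resolve(hex.toString());
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }
}
```

The cost of a layout like this is that listing objects by name prefix no
longer maps onto a directory listing, so a separate index would be needed.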

The filesystem blobstore has a few known issues; we will gladly accept
patches for these:

  * scalability: returning a paginated list of objects creates the
entire list in-memory
  * compatibility: lacks multi-part upload, some listing options not
supported
  * error paths: writing partial objects or concurrently overwriting
objects has different semantics than real object stores, especially
on Windows
  * performance: excessive system calls on various code paths
  * metadata: 1.8.1 does not support user metadata and other attributes.
The approach for future releases only works on Linux and Windows.
Current JDK on Mac OS X does not support xattr (JDK-8030048).
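
For reference, the Java 7 NIO API for extended attributes can be sketched
as below. This is only an illustration of the JDK calls, not the jclouds
implementation; the class name XattrMetadata and its method names are
hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;

// Hypothetical helper, not part of jclouds.
class XattrMetadata {
    /** Reports whether the file store holding this file supports user-defined attributes. */
    static boolean supported(Path file) throws IOException {
        return Files.getFileStore(file)
                .supportsFileAttributeView(UserDefinedFileAttributeView.class);
    }

    /** Stores one key/value pair as an extended attribute on the file. */
    static void write(Path file, String key, String value) throws IOException {
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        view.write(key, StandardCharsets.UTF_8.encode(value));
    }

    /** Reads one extended attribute back as a UTF-8 string. */
    static String read(Path file, String key) throws IOException {
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        ByteBuffer buf = ByteBuffer.allocate(view.size(key));
        view.read(key, buf);
        buf.flip();
        return StandardCharsets.UTF_8.decode(buf).toString();
    }
}
```

Callers should check supported() first, because getFileAttributeView
returns null when the view is unavailable. On Linux the JDK stores these
attributes in the "user." namespace.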

Using file extended attributes is the "right" way to store object
metadata and allows other tools to interact with these objects instead
of using a special format that only jclouds understands.  However, I
appreciate the practical reasons to prefer another approach; could you
explore this issue further?  Every solution has trade-offs; for
example, storing in a .metadata file requires extra work when listing
objects.  You may want to consider using a second directory for metadata
instead.  Finally you might want to look at Jimfs which creates an
in-memory filesystem provider; perhaps you could create an on-disk
provider which stores extended attributes in some other way?
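
As a rough sketch of the second-directory idea, user metadata could be
kept as one properties file per blob under a separate root. Everything
here is an assumption for illustration: the class name SidecarMetadata,
the ".properties" suffix, and the use of java.util.Properties are not
taken from jclouds.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of jclouds.
class SidecarMetadata {
    private final Path metadataDir;

    SidecarMetadata(Path metadataDir) {
        this.metadataDir = metadataDir;
    }

    // Mirrors the blob name under the metadata root. Names containing ".."
    // are not sanitized in this sketch.
    private Path sidecarFor(String blobName) {
        return metadataDir.resolve(blobName + ".properties");
    }

    /** Writes the user metadata for a blob, replacing any previous file. */
    void put(String blobName, Map<String, String> userMetadata) throws IOException {
        Path sidecar = sidecarFor(blobName);
        Files.createDirectories(sidecar.getParent());
        Properties props = new Properties();
        props.putAll(userMetadata);
        try (OutputStream out = Files.newOutputStream(sidecar)) {
            props.store(out, null);
        }
    }

    /** Reads the user metadata for a blob; returns an empty map if none was stored. */
    Map<String, String> get(String blobName) throws IOException {
        Map<String, String> result = new TreeMap<>();
        Path sidecar = sidecarFor(blobName);
        if (!Files.exists(sidecar)) {
            return result;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(sidecar)) {
            props.load(in);
        }
        for (String key : props.stringPropertyNames()) {
            result.put(key, props.getProperty(key));
        }
        return result;
    }
}
```

Because the metadata lives outside the blob directory, listing objects
does not have to filter out sidecar files, but the blob and its metadata
are no longer written atomically and backup tools must copy both trees.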

-- 
Andrew Gaul
http://gaul.org/


Re: Question about JCLOUDS-658 Jira issue

2014-11-10 Thread Adrian Cole
Hi, Gabriel. I'll give you one set of answers. Maybe you'll get another :)

> - Is the filesystem API considered "production" safe or only suggested for
> testing/debugging?
Put a gun to my head, and I'd say no. For example, it is unsafe to run
multithreaded as there are race conditions in the impl [1]. Also, it
only runs a subset of our existing integration tests. [2]  Production
means something different for different people, but I wouldn't suggest
it for production.
>  - Could an alternate way of storing metadata be implemented, for
> example in a properties file named .metadata stored alongside
> the object file?
Yeah I've seen this approach in the past, and that would certainly be
more portable.
> - I could implement a file metadata storage by wrapping
> FilesystemStorageStrategyImpl, but I don't think this is a good idea to
> wrap it as I have to re-implement a few classes to have the dependency
> injection work correctly.
Perhaps there's a way to isolate the metadata aspect. Personally, I
would raise a Jira [3] asking to remove the requirement of extended
attributes.

-A

[1] 
https://github.com/jclouds/jclouds/blob/master/apis/filesystem/src/main/java/org/jclouds/filesystem/strategy/internal/FilesystemStorageStrategyImpl.java
[2] partially implemented
https://github.com/jclouds/jclouds/tree/master/apis/filesystem/src/test/java/org/jclouds/filesystem/integration
vs 
https://github.com/jclouds/jclouds/tree/master/blobstore/src/test/java/org/jclouds/blobstore/integration/internal
(conceding blob signing or public acl are not relevant)
[3] https://issues.apache.org/jira/browse/JCLOUDS