1) By 'storage' I presume you mean the main database storage?  To do that 
would require using a network filesystem of some sort (HDFS, NFS, GFS etc.) and 
a remote server cluster hosting the FS.
ML does support S3 for native storage, but transaction journaling is not 
supported on S3 (due to limitations of S3).

Depending on your network connectivity and the 'devices' between you and AWS, 
the performance will vary, but you cannot beat the speed of light wrt latency.
Without a local caching component I can't imagine a performant configuration.

Depending on the rationale and requirements for this, and your budget -- this 
is such a component:
https://aws.amazon.com/storagegateway/

It’s an 'appliance', like a SAN, that caches to S3.
I don’t know of any tests with this device, but it has the necessary features.

2) Best left to the networking experts (wrt SAN).  My non-expert thought on 
this is that a SAN is designed to be accessible by multiple nodes; 
if you only have 1 node per SAN, why not just directly connect the disks?

3) "Same DataCenter" is less important than latency and bandwidth.
Nodes in a cluster perform better if they are 'close' to each other wrt 
network latency.

An example where this distinction matters is AWS: in a given 'Region' (say 
us-east-1) there are 5 "Zones" comprising > 10 'datacenters'.
Network latency between nodes in different zones is less than the latency from 
the CPU to a local hard disk (approx. 2ms).
(general reference)
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
http://www.experts-exchange.com/articles/18666/AWS-Regions-Availability-Zones-and-Edge-Locations-explained.html
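
If you want to check what latency you actually get between two candidate 
nodes, here is a rough Python sketch. It times TCP connects, which cost 
roughly one network round trip each. The function name and defaults are mine 
(not from any ML tooling), and the host and port in the usage comment are 
placeholders for whatever your peer node is listening on.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, samples: int = 5,
                   timeout: float = 2.0) -> float:
    """Return the median TCP connect time to host:port, in milliseconds.

    A TCP connect completes after about one round trip, so this is a
    rough stand-in for node-to-node network latency (an approximation,
    not a benchmark).
    """
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Open the connection and close it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Usage (hypothetical host name and port):
#   print(tcp_connect_ms("node2.example.internal", 8001))
```

Run it from each node towards the others and compare against the figures 
above; it says nothing about bandwidth, only latency.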

Our 'white paper' recommended architecture for AWS for a 3 node cluster is to 
have each node in a different zone in the same region.
This gives you good fault tolerance with minimal if any performance degradation.
Region-to-region latency, however, is much higher (10ms-100ms or more), limited 
by the speed of light as well as the # of hops etc.

Here's one example (just google for 'aws region latency'):
https://shaun.net/posts/speed-and-latency-between-aws-sydney-and-other-regions
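
To put a number on the speed-of-light limit, here is a back-of-envelope 
sketch in Python. It assumes light in fiber travels at about 2/3 of its 
vacuum speed and that the cable runs in a straight line, so it gives a lower 
bound only; real routes are longer and add hops. The 4000 km distance is an 
illustrative figure, not a measurement.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s
FIBER_FACTOR = 2 / 3               # assumption: typical slowdown in optical fiber


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber for a given distance."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000.0


# Illustrative distance only: two sites about 4000 km apart.
print(f"{min_rtt_ms(4000):.1f} ms minimum round trip")
```

That comes out at roughly 40 ms for 4000 km before any switching or routing 
overhead, which is why no amount of bandwidth gets region-to-region latency 
down to what you see between zones.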

Between geographic regions I would recommend separate ML clusters using foreign 
replication.  The protocol is designed for higher-latency connectivity.


4) -- ( not on list )
"On Premise" and "In Cloud" aren't always mutually exclusive. You can provision 
your datacenters with high speed 'Direct Connect' connections to AWS and 
logically join the networks.  This allows you to view the whole system as one 
network and migrate workloads and services as you see fit.
https://aws.amazon.com/directconnect/



-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Dennis Garlick
Sent: Thursday, January 07, 2016 4:58 PM
To: [email protected]
Subject: [MarkLogic Dev General] Marklogic hosting options?

Hi,

Without going through all of the time and expense to test various options, I’m 
wondering what are the possible drawbacks (or even
feasibility) of using the following to host a Marklogic environment:

•       Is it feasible to use Amazon Web Services just for storage,
while the server is on premises (as opposed to having the server in the cloud 
as well)? I’m guessing this is possible, but would it really hurt performance?
•       If you have a 3-server Marklogic cluster, does it make sense for
them to connect to a single SAN storage, or should they each have their own SAN 
storage?
•       Is it feasible to have a cluster where nodes in the cluster are
located in different locations such as different states (assuming that data on 
one node will not be replicated on the other nodes)? Or would performance 
demands mean that the servers of a cluster should ideally (or preferably) 
reside in the same data center?

Thanks,

Dennis
_______________________________________________
General mailing list
[email protected]
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general