Re: [MarkLogic Dev General] Marklogic Cluster Setup (Khan, Kashif)

2013-02-12 Thread Khan, Kashif
Thanks Aaron and Mike for the detailed email. Here is what we are trying to do.


  1.  We are trying to set up a MarkLogic cluster of 3 servers for failover.
  2.  We do not have GFS or another clustered file system.
  3.  We are trying to find out what our best options are.

From the email chain, this is what I understand our options to be.

Scenario 1:

  1.  Use a dedicated NAS for each server, so 3 dedicated NAS units in total.
  2.  Configure a file replication service to replicate forests among all 3 
NAS instances.

Question: Is there any documentation on how to configure the replication 
service for MarkLogic forest replication?

Scenario 2:

  1.  Use local storage on all three servers.
  2.  Configure a file replication service to replicate forests among all 3 
instances of local storage.

Question: In this scenario, if the local storage reaches its capacity, we 
cannot increase it. What are the options if local storage gets maxed out?


Suggestions are most welcome.

__
Kashif Khan




On 2/9/13 6:41 PM, Aaron Rosenbaum aaron.rosenb...@marklogic.com wrote:

Yes, you can use NAS. Like SAN, the key is adequate performance. This is the 
tricky part because getting that performance is very difficult and very 
expensive. When internal policies and infrastructure dictate SAN or NAS, 
dedicated high-quality NAS can often be preferable to shared, under-provisioned 
SAN (while being cheaper).

As Mike pointed out, can you maintain HA with your NAS setup? This is 
particular to the unit.

Without a clustered file system, you won't have multiple nodes pointing at the 
same volume. Each node should receive dedicated pools and bandwidth.

You should not stripe across all volumes then thin provision out of a single 
pool.

No CIFS, Windows shares, or SMB. NFS has performance limitations even with 
10GbE under Linux. Test, test, test.

It is often the overlay services of fancy NAS that kill performance: dedup, 
compression, site-to-site replication, etc.

Is this a shared resource? If so, how do you ensure enough bandwidth for the 
MarkLogic nodes? How do you ensure you don't destroy the performance of other 
nodes? You should have explicit visibility and control of each volume.

An example of successful SLAs can be found in Amazon's Provisioned IOPS 
storage. While neither SAN nor NAS, it sets a standard for what you should 
expect/demand from shared storage:
- explicit bandwidth guarantee to the storage pool (110 MB/sec for most 
high-end instances, coincidentally the practical throughput limit for many NFS 
implementations).
- guaranteed IOPS at large block sizes for each volume. You need 20 MB/sec per 
forest. 16 forests a node, not unreasonable for a nice system with local 
storage, works out to 320 MB/sec, or 80,000 IOPS at 4k blocks, per node. A 
3-node cluster would need 240,000 IOPS at 4k blocks from your NAS. I think 
you'll find local storage much more cost effective.
- sustained SLA compliance even if maxing out all guarantees. A typical pattern 
is that a MarkLogic user will ask for that much bandwidth (80K 4k IOPS per 
node) then get laughed at by the storage admins. It's out of line with 
everything they have experience with. MarkLogic can end up looking more like a 
video streaming load than like Oracle. It really uses that much bandwidth, and 
if the total provided is less, performance can drop off a cliff.
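
The sizing arithmetic above can be checked with a short sketch. The per-forest bandwidth, forests per node, and block size are the figures quoted in this message. The node count of 3 comes from the original question, and treating 1 MB as 1000 KB is an assumption made here to match the round numbers in the thread. The function names are hypothetical.

```python
# Back-of-envelope storage sizing from the figures quoted in the thread.
MB_PER_SEC_PER_FOREST = 20   # from the thread
FORESTS_PER_NODE = 16        # from the thread
BLOCK_KB = 4                 # "4k blocks", from the thread
NODES = 3                    # cluster size in the original question
KB_PER_MB = 1000             # assumption: decimal units, to match the round figures


def node_bandwidth_mb(forests=FORESTS_PER_NODE, mb_per_forest=MB_PER_SEC_PER_FOREST):
    """Sustained bandwidth one node needs, in MB/sec."""
    return forests * mb_per_forest


def iops_at_block_size(mb_per_sec, block_kb=BLOCK_KB):
    """IOPS needed to deliver mb_per_sec using blocks of block_kb kilobytes."""
    return mb_per_sec * KB_PER_MB // block_kb


per_node_mb = node_bandwidth_mb()
per_node_iops = iops_at_block_size(per_node_mb)
cluster_iops = per_node_iops * NODES

print(f"per node: {per_node_mb} MB/sec, {per_node_iops} IOPS at {BLOCK_KB}k")
print(f"{NODES} nodes on one NAS: {cluster_iops} IOPS at {BLOCK_KB}k")
```

Under these assumptions, the 80K-per-node and 240,000 figures in the message appear to be the per-node and three-node totals of the same calculation.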

We are developing guidelines now for AWS storage, but one rule of thumb is 
probably useful for NAS also. If you can, provision one volume per forest so 
you can track and allocate performance by volume/forest with less effort. It 
will also make reallocation of load easier.

Local disk replication will move the copies of forests around for HA. Don't try 
to do that with the disk subsystem.

If you pass along more details as to planned configurations, I may be of more 
help.

Aaron Rosenbaum
Director, Product Management
aaron.rosenb...@marklogic.com


--
Message: 2
Date: Fri, 8 Feb 2013 14:51:30 -0800
From: Michael Blakeley m...@blakeley.com
Subject: Re: [MarkLogic Dev General] Marklogic Cluster Setup
To: MarkLogic Developer Discussion general@developer.marklogic.com
Message-ID: ef106553-25fd-4ee4-9615-4cf50b0e3...@blakeley.com
Content-Type: text/plain; charset=windows-1252
The question "which is faster?" is impossible to answer generically. It's 
possible to design local storage so that it is slower or faster than a given 
NAS. It's possible to design NAS so that it is slower or faster than given 
local storage. But in most cases it is cheaper to build out similar levels of 
performance from local disk than from NAS (or SAN).

Re: [MarkLogic Dev General] Marklogic Cluster Setup (Khan, Kashif)

2013-02-12 Thread Aaron Rosenbaum
Whether NAS or Local Storage, MarkLogic handles the node-to-node replication. 
Take that out of the scenarios.

For NAS:
Use a single NAS server for all three servers.
Configure dedicated volumes with adequate bandwidth for each server/volume
Use local disk failover (in the docs) to manage node-to-node replication.

For Local Storage:
Add forests on, or migrate forests to, other hosts before running out of 
storage. If you've run out of storage on all hosts, then MarkLogic won't 
function properly (or at all).
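
One way to act before storage runs out is to watch how full each forest volume is and flag it early. The sketch below is only an illustration. The 50% threshold is a placeholder assumption, not a MarkLogic recommendation, and the function names are hypothetical. It uses only Python's standard `shutil.disk_usage`.

```python
import shutil


def needs_more_capacity(used_bytes, total_bytes, max_used_fraction=0.5):
    """Return True when the used fraction of a volume exceeds the threshold.

    max_used_fraction is a placeholder; choose a value from your own sizing model.
    """
    return used_bytes / total_bytes > max_used_fraction


def check_path(path, max_used_fraction=0.5):
    """Check the volume that holds `path` (for example, a forest data directory)."""
    usage = shutil.disk_usage(path)
    return needs_more_capacity(usage.used, usage.total, max_used_fraction)


if __name__ == "__main__":
    # "." stands in for a forest data directory in this sketch.
    if check_path("."):
        print("Volume is past the threshold: add or migrate forests on other hosts.")
    else:
        print("Volume is within the threshold.")
```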


To be clear, by NAS I mean a unit with multiple I/O ports on the host, enough 
spindles to keep up, management, etc., not 4 drives with an Ethernet port on 
the back.

-Aaron
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Marklogic Cluster Setup (Khan, Kashif)

2013-02-12 Thread Michael Blakeley
As Aaron pointed out, MarkLogic forest replication is the right way to do this. 
A filesystem-based replication service won't know how to integrate with the 
MarkLogic cluster failover mechanism, and could leave you with a corrupt 
replica just when you need to fail over to it.
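
With MarkLogic's own forest replication (local-disk failover), each master forest has a replica forest on a different host. The sketch below only plans such a layout for a small cluster. It does not call any MarkLogic API, and the host names, forest names, and round-robin placement are hypothetical choices made for illustration.

```python
def plan_replicas(hosts, forests_per_host):
    """Return (master_forest, master_host, replica_host) tuples.

    Each master's replica lands on a different host, spread round-robin
    over the remaining hosts so no single host holds all the replicas.
    """
    if len(hosts) < 2:
        raise ValueError("failover needs at least two hosts")
    plan = []
    for i, host in enumerate(hosts):
        # Every host except this one, starting with the next host in the list.
        others = hosts[i + 1:] + hosts[:i]
        for n in range(1, forests_per_host + 1):
            replica_host = others[(n - 1) % len(others)]
            plan.append((f"forest-{host}-{n}", host, replica_host))
    return plan


# Hypothetical 3-host cluster with 2 master forests per host.
for forest, master, replica in plan_replicas(["ml1", "ml2", "ml3"], 2):
    print(f"{forest}: master on {master}, replica on {replica}")
```

With three hosts and an even number of forests per host, this placement gives each surviving host an equal share of a failed host's forests.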

What do you do when your local storage is full? Managing local storage isn't 
fundamentally different than adding NAS storage. Add more storage. When the 
chassis fills up, add another chassis. Be sure to pay attention to the 
controller too. I often see systems where the disks themselves are not very 
busy, but the controller is overloaded.

All this underlines the need to do sizing up-front. You should be able to 
create a reasonable sizing model that balances your CPU, memory, disk, and 
network needs in a way that will last for the useful lifetime of the CPUs. Once 
the CPUs are obsolete, it will probably be time to rebuild the application 
anyway. You can migrate to newer hardware at the same time.

-- Mike
