I am looking for input on Sedna configuration and deployment.
I apologize for bringing up the hosting issue again, and for the length of the
post. I have updated details on the problem space, and wonder if anybody has
input on an ideal solution.
Problem space:
* 2000 individual companies.
* Each company contains 500-2000 MB spread over 60 files that are inserted
on initialization of the company.
* The files are plain-text XML files. The largest XML files can reach about
250 MB.
* No more than 200-400 users are using the system at any time.
* Inserts: about 1-5 MB per XML file per day, per company.
* Updates: about 1-5 MB per XML file per day, per company.
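To put rough totals on the figures above, here is a back-of-envelope sketch. It only multiplies out the numbers in the list. The function names and the `active_fraction` parameter are my own, and it assumes all 60 files of every company receive the daily inserts, which is probably pessimistic.

```python
# Back-of-envelope sizing using only the figures from the problem space.
COMPANIES = 2000
FILES_PER_COMPANY = 60
MB_PER_COMPANY = (500, 2000)   # initial load per company, low and high
DAILY_MB_PER_FILE = (1, 5)     # inserts per XML file per day, low and high


def total_initial_tb(mb_per_company):
    """Initial data across all companies, in TB (1 TB = 1,000,000 MB)."""
    return COMPANIES * mb_per_company / 1_000_000


def daily_insert_gb(mb_per_file, active_fraction=1.0):
    """Daily insert volume in GB (1 GB = 1,000 MB).

    active_fraction is a hypothetical knob: the share of files that
    actually receive inserts on a given day (1.0 = every file).
    """
    return COMPANIES * FILES_PER_COMPANY * mb_per_file * active_fraction / 1000


if __name__ == "__main__":
    low, high = MB_PER_COMPANY
    print("initial data, TB:", total_initial_tb(low), "to", total_initial_tb(high))
    low, high = DAILY_MB_PER_FILE
    print("daily inserts, GB:", daily_insert_gb(low), "to", daily_insert_gb(high))
```

Under those assumptions the initial load comes to roughly 1-4 TB in total, and the same daily arithmetic applies to updates.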
Options (that I can think of):
1. Single system:
2000 catalogs == 2000 individual companies.
Pros: Easy to manage.
Cons:
Can the Sedna data file become fragmented? eXist-db has data file issues
with the catalog approach.
Are there any boundary conditions surrounding a 1-2 TB data file and 2000
catalogs?
2. Multiple databases:
2000 databases == 2000 individual companies.
Pros:
* Don't have to worry about fragmentation.
* If a company drops off, just delete the individual database.
* Can tell clients that they have their own database, no sharing.
Cons:
* More difficult to manage than a single system.
* Had to modify and recompile Sedna
* Memory limitations. 100 MB of RAM per company, 50 companies running = 5+
GB.
(Can't easily meet 200-400 users without modifications and a bigger box.)
* Memory fragments over time, though this is easily resolved by rebooting the
server.
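The memory concern for option 2 can be worked through as below. This is only a sketch: the 100 MB per running database is the figure quoted above, the function name is made up, and the worst case assumes every concurrent user belongs to a different company, so each one needs its own running database.

```python
# RAM estimate for option 2 (one database per company).
MB_PER_RUNNING_DB = 100  # RAM per running company database, as quoted above


def ram_needed_gb(running_databases, mb_per_db=MB_PER_RUNNING_DB):
    """RAM in GB (1 GB = 1,000 MB) for the databases running at once."""
    return running_databases * mb_per_db / 1000


if __name__ == "__main__":
    # The case from the list: 50 companies running at the same time.
    print("50 databases:", ram_needed_gb(50), "GB")
    # Assumed worst case: 200-400 users, each in a different company.
    print("200 databases:", ram_needed_gb(200), "GB")
    print("400 databases:", ram_needed_gb(400), "GB")
```

If users cluster within a few companies the real number would be lower, but the worst case (20-40 GB) is what the box would have to be sized for.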
3. Amazon micro instances:
2000 micro EC2 instances == 2000 individual companies.
Roughly 500 MB of usable memory per-instance.
Pros:
* Only have the instance running when someone is using the system.
* Turn the instance on and off for the specific company.
* Company drops off, just delete the individual instance.
Cons:
* Very difficult to manage.
* Amazon has account limits on instances per-region.
FYI: The Amazon micro instance is not a good approach if the client is trying
to access the database directly. Amazon micro instances have very low
bandwidth. Micro instances work if there is a proxy between the client and
the database, and the proxy is in the same region as the micro instance.
4. Hybrid:
* A hybrid approach that combines several techniques. Possibly 10 databases
with 200 catalogs each.
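One possible way to lay out the hybrid option is a fixed mapping from a numeric company id to a database and a catalog. This is purely illustrative: the naming scheme (`companies_db_NN`, `company_NNNN`), the `placement` function, and the assumption that companies carry sequential ids 0-1999 are all hypothetical, and nothing here is Sedna-specific.

```python
# Hypothetical placement of 2000 companies across 10 databases.
from collections import Counter

DATABASES = 10
COMPANIES = 2000


def placement(company_id):
    """Return (database name, catalog name) for a numeric company id."""
    db_index = company_id % DATABASES  # round-robin across the databases
    return "companies_db_%02d" % db_index, "company_%04d" % company_id


if __name__ == "__main__":
    # With sequential ids the split is even: 200 catalogs per database.
    per_db = Counter(placement(cid)[0] for cid in range(COMPANIES))
    for name in sorted(per_db):
        print(name, per_db[name])
```

A fixed mapping like this keeps lookups trivial, but it does not rebalance when a company drops off; a lookup table would allow moving a company between databases at the cost of one more thing to manage.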
Any input is appreciated.
Thanks,
Malcolm
_______________________________________________
Sedna-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sedna-discussion