The cluster is now online in US East2 and US West. I did the following:
- Changed the default-distributed-db-config.json to:
{
"replication": true,
"autoDeploy": true,
"hotAlignment": false,
"resyncEvery": 15,
"clusters": {
"internal": {
"replication": false
},
"index": {
"replication": false
},
"*": {
"replication": true,
"readQuorum": 1,
"writeQuorum": 1,
"failureAvailableNodesLessQuorum": false,
"readYourWrites": true,
"partitioning": {
"strategy": "round-robin",
"default": 0,
"partitions": [
[ "<NEW_NODE>" ]
]
}
}
}
}
- Deleted the distributed-config.json file from each database folder and
restarted each node in the cluster.
Now, when I connect to one of the nodes and try to delete a vertex, I
receive the following error:
com.orientechnologies.orient.server.distributed.ODistributedException:
Error on executing distributed request (id=141
from=odb02uw task=command_sql(delete vertex #42:2) userName=) against
database 'vis.[]' to nodes [odb02ue2, odb02uw,
odb01uw, odb01ue2] -->
com.orientechnologies.orient.server.distributed.ODistributedException:
Quorum 4 not reached for
request (id=141 from=odb02uw task=command_sql(delete vertex #42:2)
userName=). Timeout=407ms Servers in timeout/
conflict are: - odb02ue2:
com.orientechnologies.orient.core.exception.OCommandExecutionException:
Error on execution
of command: sql.delete vertex #42:2 - odb01ue2:
com.orientechnologies.orient.core.exception.
OCommandExecutionException: Error on execution of command: sql.delete
vertex #42:2 - odb01uw: com.orientechnologies.
orient.core.exception.OCommandExecutionException: Error on execution of
command: sql.delete vertex #42:2 Received:
{odb02uw=com.orientechnologies.orient.core.exception.OCommandExecutionException:
Error on execution of command: sql.
delete vertex #42:2,
odb01uw=com.orientechnologies.orient.core.exception.OCommandExecutionException:
Error on
execution of command: sql.delete vertex #42:2,
odb02ue2=com.orientechnologies.orient.core.exception.
OCommandExecutionException: Error on execution of command: sql.delete
vertex #42:2, odb01ue2=com.orientechnologies.
orient.core.exception.OCommandExecutionException: Error on execution of
command: sql.delete vertex #42:2}
Why am I not able to delete a vertex?
Amir.
On Tuesday, March 24, 2015 at 12:20:37 PM UTC-5, Colin wrote:
>
> That latency should be fine so long as it's consistent.
>
> -Colin
>
> On Tuesday, March 24, 2015 at 11:52:58 AM UTC-5, Amir Khawaja wrote:
>>
>> Hi Colin,
>>
>> I checked the latency prior to posting and between regions it is about
>> 65ms on average. What should I set the latency to for Hazelcast?
>>
>> Amir.
>>
>> On Tuesday, March 24, 2015 at 11:49:25 AM UTC-5, Colin wrote:
>>>
>>> Hi Amir,
>>>
>>> You might also do a ping and a traceroute between the machines and see
>>> what kind of latency you're getting, just in case it's a timeout issue with
>>> Hazelcast.
>>>
>>> -Colin
>>>
>>> On Tuesday, March 24, 2015 at 11:32:21 AM UTC-5, Amir Khawaja wrote:
>>>>
>>>> Hi Colin,
>>>>
>>>> Thank you for the prompt response.
>>>>
>>>> I'm a little confused as you say "the US West node will not come online
>>>>> telling me that the database is not yet online. At that point, I kill
>>>>> the
>>>>> process and then eventually the database comes online."
>>>>
>>>> Do you mean you kill the database process and then restart it and then
>>>>> it starts communicating?
>>>>
>>>>
>>>> Yes. I kill the database process on the cluster node where the OrientDB
>>>> is not coming online.
>>>>
>>>> Can you see on each machine when Hazelcast 'sees' all the members? Are
>>>>> all the members showing up?
>>>>
>>>>
>>>> Yes. I see the databases are talking to each other as the IP address of
>>>> the nodes show up in the log of each database server.
>>>>
>>>> I will try setting hotAlignment to false and report my results on this
>>>> thread.
>>>>
>>>> Amir.
>>>>
>>>>
>>>> On Tuesday, March 24, 2015 at 11:25:16 AM UTC-5, Colin wrote:
>>>>>
>>>>> Hi Amir,
>>>>>
>>>>> Is it consistently a problem between the same machines not seeing each
>>>>> other?
>>>>>
>>>>> I'm a little confused as you say "the US West node will not come
>>>>> online telling me that the database is not yet online. At that point, I
>>>>> kill the process and then eventually the database comes online."
>>>>>
>>>>> Do you mean you kill the database process and then restart it and then
>>>>> it starts communicating?
>>>>>
>>>>> In your distributed json file, try setting "hotAlignment" to false.
>>>>>
>>>>> Can you see on each machine when Hazelcast 'sees' all the members?
>>>>> Are all the members showing up?
>>>>>
>>>>> -Colin
>>>>>
>>>>> Orient Technologies
>>>>>
>>>>> The Company behind OrientDB
>>>>>
>>>>> On Tuesday, March 24, 2015 at 11:19:05 AM UTC-5, Amir Khawaja wrote:
>>>>>>
>>>>>> Greetings, everyone. Has anyone had much success running an OrientDB
>>>>>> 2.0.5 cluster in Azure? I created a cluster in Windows Azure with 4
>>>>>> nodes
>>>>>> using CentOS 7 and OrientDB Community 2.0.4 -- 2 nodes in US East2 and 2
>>>>>> nodes in US West. There is a Site-to-Site VPN connection between the two
>>>>>> regions in Azure and data is flowing between machines across the
>>>>>> network. I
>>>>>> have three databases that I have currently deployed and testing. I find
>>>>>> that many times the synchronization between databases does not occur.
>>>>>> For
>>>>>> instance, if I startup the first node in US East2 and once that comes
>>>>>> online, fire up the second node in US West, the US West node will not
>>>>>> come
>>>>>> online telling me that the database is not yet online. At that point, I
>>>>>> kill the process and then eventually the database comes online. I even
>>>>>> have
>>>>>> to go so far as to delete the databases in the database path folder. I
>>>>>> do
>>>>>> this a few times and eventually the server may startup. Sometimes, I
>>>>>> will
>>>>>> have three of the four nodes working and the fourth just refuses to come
>>>>>> online.
>>>>>>
>>>>>> The VM size selected for each node in the cluster is a D4 (4 cores,
>>>>>> 28GB RAM). This should be more than sufficient to handle most loads.
>>>>>> Surely, I must be missing something as this is not acceptable production
>>>>>> behavior. For reference, I am pasting the hazelcast.xml and
>>>>>> default-distributed-db-config.json files here in hopes that someone has
>>>>>> some pointers for me.
>>>>>>
>>>>>> *** hazelcast.xml ***
>>>>>>
>>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>> <!-- ~ Copyright (c) 2008-2012, Hazel Bilisim Ltd. All Rights
>>>>>> Reserved. ~
>>>>>> ~ Licensed under the Apache License, Version 2.0 (the "License"); ~
>>>>>> you may
>>>>>> not use this file except in compliance with the License. ~ You may
>>>>>> obtain
>>>>>> a copy of the License at ~ ~
>>>>>> http://www.apache.org/licenses/LICENSE-2.0 ~
>>>>>> ~ Unless required by applicable law or agreed to in writing, software
>>>>>> ~ distributed
>>>>>> under the License is distributed on an "AS IS" BASIS, ~ WITHOUT
>>>>>> WARRANTIES
>>>>>> OR CONDITIONS OF ANY KIND, either express or implied. ~ See the
>>>>>> License for
>>>>>> the specific language governing permissions and ~ limitations under
>>>>>> the License. -->
>>>>>>
>>>>>> <hazelcast
>>>>>> xsi:schemaLocation="http://www.hazelcast.com/schema/config
>>>>>> hazelcast-config-3.0.xsd"
>>>>>> xmlns="http://www.hazelcast.com/schema/config" xmlns:xsi="
>>>>>> http://www.w3.org/2001/XMLSchema-instance">
>>>>>> <group>
>>>>>> <name>[name]</name>
>>>>>> <password>[password]</password>
>>>>>> </group>
>>>>>> <network>
>>>>>> <port auto-increment="true">2434</port>
>>>>>> <join>
>>>>>> <multicast enabled="false">
>>>>>> <multicast-group>235.1.1.1</multicast-group>
>>>>>> <multicast-port>2434</multicast-port>
>>>>>> </multicast>
>>>>>> <tcp-ip enabled="true">
>>>>>> <member>10.0.0.4</member>
>>>>>> <member>10.0.0.5</member>
>>>>>> <member>10.1.0.4</member>
>>>>>> <member>10.1.0.5</member>
>>>>>> </tcp-ip>
>>>>>> </join>
>>>>>> </network>
>>>>>> <executor-service>
>>>>>> <pool-size>16</pool-size>
>>>>>> </executor-service>
>>>>>> </hazelcast>
>>>>>>
>>>>>>
>>>>>> *** default-distributed-db-config.json ***
>>>>>>
>>>>>> {
>>>>>> "autoDeploy": true,
>>>>>> "hotAlignment": true,
>>>>>> "executionMode": "synchronous",
>>>>>> "readQuorum": 1,
>>>>>> "writeQuorum": 3,
>>>>>> "failureAvailableNodesLessQuorum": false,
>>>>>> "readYourWrites": true,
>>>>>> "clusters": {
>>>>>> "internal": {
>>>>>> },
>>>>>> "index": {
>>>>>> },
>>>>>> "*": {
>>>>>> "servers" : [ "<NEW_NODE>" ]
>>>>>> }
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> Thank you for any assistance you can offer.
>>>>>>
>>>>>> Amir.
>>>>>>
>>>>>
--
---
You received this message because you are subscribed to the Google Groups
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.