Hi Naveen,
I think just stopping updates is not enough to make a consistent snapshot
of the partition stores.
You must ensure that all updates are also checkpointed to disk. Otherwise,
to restore a valid snapshot you would have to copy the WAL as well as the
partition stores.
You can try to deactivate the source
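For illustration, a rough Java sketch of the deactivate-then-copy idea (the
config path is hypothetical, and this is a sketch, not a tested recipe;
deactivation checkpoints dirty pages before the node goes idle):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

public class SnapshotPrep {
    public static void main(String[] args) {
        // Hypothetical config path - replace with your own.
        Ignite ignite = Ignition.start("config/persistence.xml");

        // Deactivation checkpoints all in-memory updates to the partition
        // stores, leaving a consistent on-disk state to copy.
        ignite.cluster().active(false);

        // ... copy the partition store files here, then reactivate:
        ignite.cluster().active(true);
    }
}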
> Now it seems that the relevant explanations are confusing?
> On 2020/12/11 8:21 PM, Pavel Kovalenko wrote:
>
> Hi,
>
> I think the information on the wiki that PME is not triggered in some
> cases is wrong. It should be fixed.
> Actually, PME is triggered in all ca
Your thoughts are right.
If the cache exists, no PME will be started.
If it doesn't exist, the getOrCreate() method will create it and start PME,
while the cache() method will throw an exception or return null (I don't
remember which exactly).
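A minimal sketch of the difference (the cache name is hypothetical; if I
recall the javadoc correctly, cache() returns null for a missing cache):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class PmeOnCacheCreate {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Triggers PME only if "myCache" does not exist yet and must be created.
            IgniteCache<Integer, String> created = ignite.getOrCreateCache("myCache");

            // Never creates a cache, hence never starts PME; returns the
            // existing instance (or null if the cache is absent).
            IgniteCache<Integer, String> existing = ignite.cache("myCache");

            created.put(1, "value");
            System.out.println(existing.get(1));
        }
    }
}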
On Fri, 11 Dec 2020 at 17:48, VeenaMithare :
> HI Pavel,
>
> Thank you for
Hi,
I think the information on the wiki that PME is not triggered in some
cases is wrong. It should be fixed.
Actually, PME is triggered in all cases, but for some of them it doesn't
block cache operations, or the blocking time is minimized.
Most optimizations for minimizing the blocking time of
Hello,
It's not fully clear from your message: did the exchange finally finish,
or were you getting this WARN message the whole time?
On Fri, 1 May 2020 at 12:32, Ilya Kasnacheev :
> Hello!
>
> This description sounds like a typical hanging Partition Map Exchange, but
> you should be
Hi Ibrahim,
I see you have 317 cache groups in your cluster: `Full map updating for 317
groups performed in 105 ms.`
Each cache group has its own partition map and affinity map, which require
memory that resides in the old generation.
During cache creation, a distributed PME happens and all partition and
affinity
Ibrahim,
I've checked the logs and found the following issue:
[2019-09-27T15:00:06,164][ERROR][sys-stripe-32-#33][atomic] Received
message without registered handler (will ignore)
[msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1,
arr=[6389728]]],
Ibrahim,
Could you please also share the cache configuration that is used for
dynamic creation?
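For reference, a hedged sketch of the kind of dynamic cache creation I mean
(all names and settings below are made-up examples):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class DynamicCacheExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Integer, String> ccfg =
                new CacheConfiguration<>("dynamicCache"); // hypothetical name
            ccfg.setCacheMode(CacheMode.PARTITIONED);
            ccfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
            ccfg.setBackups(1);

            // Dynamic cache creation triggers a distributed PME.
            ignite.getOrCreateCache(ccfg);
        }
    }
}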
On Thu, 10 Oct 2019 at 19:09, Pavel Kovalenko :
> Hi Ibrahim,
>
> I see that one node didn't send acknowledgment during cache creation:
> [2019-09-27T15:00:17,727][WARN
> ][excha
Hi Ibrahim,
I see that one node didn't send acknowledgment during cache creation:
[2019-09-27T15:00:17,727][WARN
][exchange-worker-#219][GridDhtPartitionsExchangeFuture] Unable to await
partitions release latch within timeout: ServerLatch [permits=1,
Mahesh,
The assertion error occurs if you run the node with assertions enabled (JVM
flag -ea). If assertions are disabled, it leads to a NullPointerException
instead, as you have in your logs.
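A generic Java illustration of that behaviour (not Ignite code): run it
with `java -ea AssertDemo` to get the AssertionError; without -ea the null
slips through and surfaces later as an NPE.

public class AssertDemo {
    static void use(Object o) {
        assert o != null : "o must not be null"; // checked only with -ea
        System.out.println(o.toString());        // NPE here if assertions are off
    }

    public static void main(String[] args) {
        use(null);
    }
}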
On Sat, 5 Oct 2019 at 16:47, Mahesh Renduchintala <
mahesh.renduchint...@aline-consulting.com>:
> Pavel, I don't have the logs
Mahesh,
Do you have logs from the following thick client?
TcpDiscoveryNode [id=5204d16d-e6fc-4cc3-a1d9-17edf59f961e,
addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.1.171],
sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.1.171:0],
discPort=0, order=1146, intOrder=579,
Mahesh,
According to your logs and the exception I see, the issue you mentioned is
not related to your problem.
The problem similar to IGNITE-10010 is
https://issues.apache.org/jira/browse/IGNITE-9562
You have a thick client joining the server topology:
Hi Mahesh,
Your problem is described here:
https://issues.apache.org/jira/browse/IGNITE-12255
The section starts with "This solution showed the existing race between
client node join and concurrent cache destroy."
According to your logs, I see a concurrent client node join and cache stop
Denis,
You can't set a page size greater than 16KB due to our page memory
limitations.
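A minimal configuration sketch at the limit (assuming a recent 2.x
DataStorageConfiguration):

import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PageSizeConfig {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.setPageSize(16 * 1024); // 16KB, the upper limit

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);
        // As far as I know, larger values are rejected during
        // configuration validation.
    }
}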
On Thu, 22 Aug 2019 at 22:34, Denis Magda :
> How about setting page size to more KBs or MBs based on the average value?
> That should work perfectly fine.
>
> -
> Denis
>
>
> On Thu, Aug 22, 2019 at 8:11 AM
This sounds strange. There should definitely be a cause of such behaviour.
Rebalancing happens only after a topology change (node join/leave,
deactivation/activation).
Could you please share logs from the node with the exception you mentioned
in your message, the node with id
Hello,
It means that the node with id "5423e6b5-c9be-4eb8-8f68-e643357ec2b3" has
outdated data (possibly due to a restart) and started to rebalance missed
updates from a node with up-to-date data (where you got the exception) using
the WAL.
WAL rebalance is used when the number of entries in some partition
Hello,
Could you please attach additional logs from the coordinator node? You can
find the id of that node in the "Unable to await partitions release latch"
message.
Also, it would be good to have logs from the client machine and from any
other server node in the cluster.
On Mon, 24 Dec 2018 at 09:13,
Hello Wangsan,
It seems to be a known issue: https://issues.apache.org/jira/browse/IGNITE-9493 .
On Mon, 12 Nov 2018 at 18:06, wangsan :
> I have a server node in zone A, then I start a client from zone B. Now
> access between A and B is controlled by a firewall. The ACL is that B can
> access A, but A can
> not
Hi Naveen and Andrey,
We've recently done a major optimization,
https://issues.apache.org/jira/browse/IGNITE-9420, that will speed up
activation time in your case.
Iteration over the WAL now happens only on node start-up, so it will not
affect activation anymore.
Partitions state restoring (which is
Hello Eugene,
1) The split-brain resolver takes into account only server nodes (not
clients). There is no difference between in-memory only and persistent
clusters.
2) It's not necessary to immediately remove a node from the baseline
topology after a split-brain. If you lost the backup factor for some
partitions (All
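If you later decide to reset the baseline manually, a minimal sketch
(assuming the 2.4+ baseline API; purely illustrative):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

public class BaselineAdjust {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Once you are sure the lost nodes will not return, reset the
            // baseline to the current topology so rebalancing can restore
            // the backup factor for the affected partitions.
            ignite.cluster().setBaselineTopology(ignite.cluster().topologyVersion());
        }
    }
}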
> on one of the nodes. I would also expect the dead node to be
> removed from the cluster, and no longer take part in PME.
>
>
>
> On Wed, Sep 12, 2018 at 11:25 AM Pavel Kovalenko
> wrote:
>
>> Hi Eugene,
>>
>> Sorry, but I didn't catch the meaning of your quest
Hi Eugene,
I've reproduced your problem and filed a ticket for that:
https://issues.apache.org/jira/browse/IGNITE-9562
As a temporary workaround, I can suggest that you delete the persistence
data (cache.dat and partition files) related to that cache in the starting
node's work directory, or don't destroy
Hi Eugene,
Sorry, but I didn't catch the meaning of your question about Zookeeper
Discovery. Could you please rephrase it?
On Wed, 12 Sep 2018 at 17:54, Ilya Lantukh :
> Pavel K., can you please answer about Zookeeper discovery?
>
> On Wed, Sep 12, 2018 at 5:49 PM, eugene miretsky <
>
Hello Evgeny,
Could you please attach full logs from both nodes in your case #2? Make
sure that quiet mode is disabled (-DIGNITE_QUIET=false) so that the logs
contain full INFO output.
On Fri, 7 Sep 2018 at 17:41, es70 :
> I have a cluster of 2 ignite (version 2.6) nodes with enabled persistence
> (at
> the time
Hello Engrdean,
You should enable persistence on your DataRegionConfiguration to make it
possible to evict file metadata pages from memory to disk.
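A minimal configuration sketch (assuming a 2.x DataStorageConfiguration;
the pattern below follows the standard persistence setup):

import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PersistentRegionConfig {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        // With persistence on, cold pages (file metadata included) can be
        // evicted from memory and re-read from disk on demand.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);
    }
}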
2018-08-09 19:49 GMT+03:00 engrdean :
> I've been struggling to find a configuration that works successfully for
> IGFS
> with hadoop filesystem
Hello Jose,
Did you consider MongoDB for your use case?
2018-08-08 10:13 GMT+03:00 joseheitor :
> Hi Ignite Team,
>
> Any tips and recommendations...?
>
> Jose
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
Hello Ray,
I'm glad that your problem was resolved. I just want to add that in the
beginning phase of PME we wait for all current client operations to finish;
new operations are frozen until PME ends. After a node finishes all ongoing
client operations, it counts down the latch that you see in the logs, which
Hello Ray,
Without explicit errors in the log, it's not so easy to guess what that
was.
Since I don't see any errors, it should be a recoverable failure (even if
it takes a long time).
If you have such an option, could you please enable DEBUG log level
for
Hello Ray,
It's hard to say whether the issue you mentioned is the cause of your
problem.
To determine that, it would be very good if you could get thread dumps on
the next such network glitch, both from server and client nodes (using
jstack, for example).
I'm not aware of the Ignite Spark DataFrames implementation features,
Hello Ray,
According to your attached log, it seems that you have some network
problems. Could you please also share logs from the nodes with temporary ids =
[429edc2b-eb14-414f-a978-9bfe35443c8c, 6783732c-9a13-466f-800a-ad4c8d9be3bf].
The root cause should be on those nodes.
2018-07-25 13:03
David,
No, this problem exists in older versions as well.
On Fri, 15 Jun 2018 at 17:54, David Harvey :
> Is https://issues.apache.org/jira/browse/IGNITE-8780 a regression in 2.5 ?
>
> On Thu, Jun 14, 2018 at 7:03 AM, Pavel Kovalenko
> wrote:
>
>> DocDVZ,
>>
DocDVZ,
Most probably you faced the following issue:
https://issues.apache.org/jira/browse/IGNITE-8780.
You can try to remove the END file marker; in this case the node will be
recovered using the WAL.
On Thu, 14 Jun 2018 at 12:00, DocDVZ :
> As I see, the last checkpoint-end file, which invoked the problem,
Hello DocDVZ,
What is your hardware environment? Do you use an external or network
storage device?
2018-06-09 15:14 GMT+03:00 DocDVZ :
> Raw text blocks were discarded from message:
> Service parameters:
> ignite.sh -J-Xmx6g -J-Xms6g -J-XX:+AlwaysPreTouch -J-XX:+UseG1GC
>
Hello,
Most probably no actual rebalancing has started, and we fire the
REBALANCE_STARTED event ahead of time. Could you please turn on INFO log
level for Ignite classes and check whether a "Skipping rebalancing" message
appears in the logs after node shutdown?
2018-04-25 7:55 GMT+03:00 moon-duck