Re: Node permanently in maintenance mode

2021-07-30 Thread Piotr Jagielski
OK, I managed to start the node by deleting the maintenance_tasks.mntc file 
and the corrupted cache from the node's storage directory.
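
For anyone who lands on this thread later, the file-removal part of the workaround above could be sketched roughly as follows. This is only a sketch under assumptions: the node is stopped first, both directory paths are hypothetical and supplied by the caller, and the file is moved to a backup location rather than deleted so it can be restored. Only the file name maintenance_tasks.mntc comes from this thread; removing the corrupted cache directory is not covered here.

```python
import shutil
from pathlib import Path

# File name as given in the message above.
MAINTENANCE_FILE = "maintenance_tasks.mntc"


def clear_maintenance_record(storage_dir, backup_dir):
    """Move the maintenance task file out of a stopped node's storage directory.

    storage_dir and backup_dir are hypothetical paths chosen by the operator.
    Returns the backup path, or None if no maintenance file was found.
    """
    src = Path(storage_dir) / MAINTENANCE_FILE
    if not src.is_file():
        return None
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    dst = backup / MAINTENANCE_FILE
    # Move instead of delete, so the file can be put back if needed.
    shutil.move(str(src), str(dst))
    return dst
```

Moving the file instead of deleting it is a deliberate choice in this sketch: the task was registered because Ignite detected corrupted cache data files, so keeping a copy leaves a way back.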

On 2021/07/30 11:09:01, Piotr Jagielski  wrote: 
> Hi,
> We recently switched our cluster to 2.10.
> We had problems with one of the caches before the upgrade, and deleted some 
> data.
> After the upgrade Ignite ran for 1-2 days, and today one of the nodes 
> started in "Maintenance mode" after a restart:
> 
> 2021-07-30 12:59:41 WARN  Maintenance task found, stop restoring memory
> 2021-07-30 12:59:41 INFO  Node requires maintenance, non-empty set of 
> maintenance tasks is found: [corrupted-cache-data-files-task]
> 
> The problem is that this task never ends and the node cannot join the rest of 
> the cluster. 
> Is there any way to switch the node back to normal mode or cancel this task?
> 
> Regards,
> Piotr
> 
> 


Re: Ignite High memory usage though very less disk size

2021-07-30 Thread Zhenya Stanilovsky

Hi Devakumar J,
There is not enough information for analysis.
Do you have any monitoring? If not, please enable it and try to understand how 
the high CPU consumption and possible GC pauses correlate with your tasks.
Do you have enough heap (-Xmx parameter)? What kinds of processes consume the 
most heap?
Without this information we can't move forward with the analysis.
 
Thanks!
 
>Hi,
>
>We have a 3 server + 2 client cluster setup. We also have 2 completely 
>separate clusters for different regions.
>
>Both have a similar set of integrations in terms of SQL queries / CQ 
>listeners / client connections.
>
>The VM hardware/OS settings are also the same.
>
>In cluster 1, though we have 20GB on disk, the cluster performance is 
>really good and heap usage/CPU usage is optimal.
>
>In cluster 2 we have less data on disk, but there are heavy fluctuations 
>in heap usage and a lot of full GCs, pausing the JVM for 7 to 8 seconds every 
>minute. Only a restart helps in this case.
>
>
>The only difference noticed between the machines is memory page cache 
>utilization. We did a page cache cleanup and restarted the cluster, and page 
>cache utilization reached 105 GB out of 126 GB RAM within a day.
>
>Please find the metrics below and suggest any debugging steps to carry 
>out or documentation to refer to.
>
>
>Cluster 1:
>
>Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>    ^-- Node [id=27529ecd, name=server-node-3, uptime=2 days, 17:48:09.803]
>    ^-- H/N/C [hosts=3, nodes=5, CPUs=24]
>    ^-- CPU [cur=1.33%, avg=3.05%, GC=0%]
>    ^-- PageMemory [pages=3375870]
>    ^-- Heap [used=4372MB, free=73.31%, comm=5600MB]
>    ^-- Off-heap [used=13341MB, free=20.03%, comm=16584MB]
>    ^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
>    ^--   metastoreMemPlc region [used=0MB, free=99.85%, comm=0MB]
>    ^--   TxLog region [used=0MB, free=100%, comm=100MB]
>    ^--   DefaultRegion region [used=13341MB, free=18.57%, comm=16384MB]
>    ^-- Ignite persistence [used=20052MB]
>    ^--   sysMemPlc region [used=0MB]
>    ^--   metastoreMemPlc region [used=0MB]
>    ^--   TxLog region [used=0MB]
>    ^--   DefaultRegion region [used=20052MB]
>    ^-- Outbound messages queue [size=0]
>    ^-- Public thread pool [active=0, idle=0, qSize=0]
>    ^-- System thread pool [active=0, idle=7, qSize=0]
>
>Cluster 2:
>Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>    ^-- Node [id=5905afb7, name=server-node-1, uptime=2 days, 05:49:04.925]
>    ^-- H/N/C [hosts=3, nodes=5, CPUs=24]
>    ^-- CPU [cur=1.23%, avg=6.4%, GC=0%]
>    ^-- PageMemory [pages=1173731]
>    ^-- Heap [used=13043MB, free=20.39%, comm=16384MB]
>    ^-- Off-heap [used=4638MB, free=72.2%, comm=16584MB]
>    ^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
>    ^--   metastoreMemPlc region [used=0MB, free=99.91%, comm=0MB]
>    ^--   TxLog region [used=0MB, free=100%, comm=100MB]
>    ^--   DefaultRegion region [used=4638MB, free=71.69%, comm=16384MB]
>    ^-- Ignite persistence [used=5423MB]
>    ^--   sysMemPlc region [used=0MB]
>    ^--   metastoreMemPlc region [used=0MB]
>    ^--   TxLog region [used=0MB]
>    ^--   DefaultRegion region [used=5422MB]
>    ^-- Outbound messages queue [size=0]
>    ^-- Public thread pool [active=0, idle=0, qSize=0]
>    ^-- System thread pool [active=0, idle=5, qSize=0]
> 
>  Thanks & Regards ,
>Devakumar J
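
To track how heap usage fluctuates over time, as suggested in the reply at the top of this message, the Heap and Off-heap lines of the periodic metrics output quoted above could be pulled out with a short script. The sketch below assumes the line format is exactly as it appears in this thread; the function name and the sample constants are made up for illustration.

```python
import re

# Matches lines such as:
#   ^-- Heap [used=4372MB, free=73.31%, comm=5600MB]
#   ^-- Off-heap [used=13341MB, free=20.03%, comm=16584MB]
METRIC_RE = re.compile(
    r"\^--\s+(Heap|Off-heap)\s+"
    r"\[used=(\d+)MB,\s*free=([\d.]+)%,\s*comm=(\d+)MB\]"
)


def parse_memory_metrics(log_text):
    """Collect Heap and Off-heap figures from one metrics log block."""
    result = {}
    for match in METRIC_RE.finditer(log_text):
        name, used, free, comm = match.groups()
        result[name] = {
            "used_mb": int(used),
            "free_pct": float(free),
            "comm_mb": int(comm),
        }
    return result


# Sample lines copied from the two metrics blocks quoted above.
CLUSTER_1 = """
    ^-- Heap [used=4372MB, free=73.31%, comm=5600MB]
    ^-- Off-heap [used=13341MB, free=20.03%, comm=16584MB]
"""
CLUSTER_2 = """
    ^-- Heap [used=13043MB, free=20.39%, comm=16384MB]
    ^-- Off-heap [used=4638MB, free=72.2%, comm=16584MB]
"""

if __name__ == "__main__":
    for label, text in (("cluster 1", CLUSTER_1), ("cluster 2", CLUSTER_2)):
        print(label, parse_memory_metrics(text))
```

Running this over a full day of logs, one block per metrics interval, would give a heap-used series that can be lined up against the GC pauses.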

Node permanently in maintenance mode

2021-07-30 Thread Piotr Jagielski
Hi,
We recently switched our cluster to 2.10.
We had problems with one of the caches before the upgrade, and deleted some 
data.
After the upgrade Ignite ran for 1-2 days, and today one of the nodes started 
in "Maintenance mode" after a restart:

2021-07-30 12:59:41 WARN  Maintenance task found, stop restoring memory
2021-07-30 12:59:41 INFO  Node requires maintenance, non-empty set of 
maintenance tasks is found: [corrupted-cache-data-files-task]

The problem is that this task never ends and the node cannot join the rest of 
the cluster. 
Is there any way to switch the node back to normal mode or cancel this task?

Regards,
Piotr



RE: System.Net.Sockets.SocketException: Only one usage of each socket address

2021-07-30 Thread satyajit.mandal
Hi Pavel,

There are multiple threads running and trying to access the endpoint. As 
such, the logs show an intermittent exception. I tried to increase the port 
range, but it still throws the exception when messages are flowing in.

System.AggregateException: Failed to establish Ignite thin client connection, 
examine inner exceptions for details. ---> System.Net.Sockets.SocketException: 
Only one usage of each socket address (protocol/network address/port) is 
normally permitted xx.xx.xx.xx:10800

Could you suggest any optimization we can do at the cache configuration 
level to avoid this issue?
Thanks
Satyajit
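
One thing that may be worth checking on the client side, independent of cache configuration: if each thread opens its own connection for every message, the client machine can run out of local ports, which is one possible cause of this SocketException. Whether that is what happens here is an assumption; nothing in this thread confirms it. The sketch below is in Python rather than C#, and `connect` is a hypothetical callable standing in for whatever creates the real thin client. It only illustrates sharing one lazily created connection across threads, and it presumes the client object is safe to share between threads.

```python
import threading


class SharedConnection:
    """Create a connection once, on first use, and hand it to every thread."""

    def __init__(self, connect):
        # connect is a hypothetical zero-argument factory for the real client.
        self._connect = connect
        self._lock = threading.Lock()
        self._conn = None

    def get(self):
        # The lock ensures only one thread ever calls the factory.
        with self._lock:
            if self._conn is None:
                self._conn = self._connect()
            return self._conn


if __name__ == "__main__":
    opened = []

    def fake_connect():
        # Stand-in for opening a real connection; records each call.
        opened.append(object())
        return opened[-1]

    shared = SharedConnection(fake_connect)
    workers = [threading.Thread(target=shared.get) for _ in range(20)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("connections opened:", len(opened))
```

The point of the design is that the number of sockets opened no longer grows with the number of threads or messages.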



From: Pavel Tupitsyn 
Sent: 27 July 2021 13:04
To: user 
Subject: Re: System.Net.Sockets.SocketException: Only one usage of each socket 
address


Looks like another program occupies this port.
Try "netstat -a" to find out.

On Tue, Jul 27, 2021 at 10:28 AM <satyajit.man...@barclays.com> wrote:
Hi Team,

Does anyone know about this error and a possible fix?

System.AggregateException: Failed to establish Ignite thin client connection, 
examine inner exceptions for details. ---> System.Net.Sockets.SocketException: 
Only one usage of each socket address (protocol/network address/port) is 
normally permitted xx.xx.xx.xx:10800


Thanks
Satyajit



_
“This message is for information purposes only, it is not a recommendation, 
advice, offer or solicitation to buy or sell a product or service nor an 
official confirmation of any transaction. It is directed at persons who are 
professionals and is not intended for retail customer use. Intended for 
recipient only. This message is subject to the terms at: 
www.barclays.com/emaildisclaimer.

For important disclosures, please see: 
www.barclays.com/salesandtradingdisclaimer
 regarding market commentary from Barclays Sales and/or Trading, who are active 
market participants; 
https://www.investmentbank.barclays.com/disclosures/barclays-global-markets-disclosures.html
 regarding our standard terms for the Investment Bank of Barclays where we 
trade with you in principal-to-principal wholesale markets transactions; and in 
respect of Barclays Research, including disclosures relating to specific 
issuers, please see 
http://publicresearch.barclays.com.”
_
If you are incorporated or operating in Australia, please see 
https://www.home.barclays/disclosures/importantapacdisclosures.html
 for important disclosure.
_
How we use personal information  see our privacy notice 
https://www.investmentbank.barclays.com/disclosures/personalinformationuse.html
_
