Re: WAL and WAL Archive volume size recommendation

2020-11-12 Thread facundo.maldonado
Ok, will do that.

It's still not clear to me why, though.




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: WAL and WAL Archive volume size recommendation

2020-11-06 Thread Mahesh Renduchintala
Dennis

"The WAL archive is used to store WAL segments that may be needed to recover 
the node after a crash. The number of segments kept in the archive is such that 
the total size of all segments does not exceed the specified size of the WAL 
archive"

Given the above in the documentation, if we disable the WAL archive as mentioned
in the docs, will we have trouble recovering the data in the work folder when
the node reboots?

regards
mahesh


Re: WAL and WAL Archive volume size recommendation

2020-11-05 Thread Denis Magda
Hello Facundo,

Just go ahead and disable the WAL archives. You only need the archives for the
point-in-time-recovery feature, which is supported by GridGain. I'll check
with the community why we have the archives enabled by default in a
separate discussion.
https://ignite.apache.org/docs/latest/persistence/native-persistence#disabling-wal-archive

-
Denis


On Thu, Nov 5, 2020 at 11:37 AM facundo.maldonado <
maldonadofacu...@gmail.com> wrote:

> Well, I found some useful numbers between two pages in the documentation.
>
> "By default, there are 10 active segments."
> wal ref:
> https://ignite.apache.org/docs/latest/persistence/native-persistence#write-ahead-log
>
> "The number of segments kept in the archive is such that the total size of
> all segments does not exceed the specified size of the WAL archive.
> By default, the maximum size of the WAL archive (total space it occupies on
> disk) is defined as 4 times the size of the checkpointing buffer."
> wal-archive ref:
> https://ignite.apache.org/docs/latest/persistence/native-persistence#wal-archive
>
> "The default buffer size is calculated as a function of the data region
> size:
>
> Data Region Size        Default Checkpointing Buffer Size
> < 1 GB                  MIN (256 MB, Data_Region_Size)
> between 1 GB and 8 GB   Data_Region_Size / 4
> > 8 GB                  2 GB"
> checkpoint buffer size ref:
> https://ignite.apache.org/docs/latest/persistence/persistence-tuning#adjusting-checkpointing-buffer-size
>
> So, if I have:
> data region max size: 5 GB
> storage vol size: 10 Gi
> I can set:
> WAL vol size: 1 GB        # WAL size is 10 segments * 64 MB = 640 MB
> WAL archive vol size: 5 Gi
> # 4 times the checkpoint buffer size
> # region < 8 GB: checkpoint buffer is region/4 --> WAL archive size equals
> # the region size
> # region > 8 GB: checkpoint buffer is 2 GB --> WAL archive is at least 4 * 2 GB == 8 GB
>
> With those settings, I can keep the test running some more time but the pod
> keeps crashing.
> At least, it seems that I'm not getting the same error as before.
>


Re: WAL and WAL Archive volume size recommendation

2020-11-05 Thread facundo.maldonado
Well, I found some useful numbers between two pages in the documentation.

"By default, there are 10 active segments."
wal ref:
https://ignite.apache.org/docs/latest/persistence/native-persistence#write-ahead-log

"The number of segments kept in the archive is such that the total size of
all segments does not exceed the specified size of the WAL archive.
By default, the maximum size of the WAL archive (total space it occupies on
disk) is defined as 4 times the size of the checkpointing buffer."
wal-archive ref:
https://ignite.apache.org/docs/latest/persistence/native-persistence#wal-archive

"The default buffer size is calculated as a function of the data region
size:

Data Region Size        Default Checkpointing Buffer Size
< 1 GB                  MIN (256 MB, Data_Region_Size)
between 1 GB and 8 GB   Data_Region_Size / 4
> 8 GB                  2 GB"
checkpoint buffer size ref:
https://ignite.apache.org/docs/latest/persistence/persistence-tuning#adjusting-checkpointing-buffer-size

So, if I have:
data region max size: 5 GB
storage vol size: 10 Gi
I can set:
WAL vol size: 1 GB        # WAL size is 10 segments * 64 MB = 640 MB
WAL archive vol size: 5 Gi
# 4 times the checkpoint buffer size
# region < 8 GB: checkpoint buffer is region/4 --> WAL archive size equals
# the region size
# region > 8 GB: checkpoint buffer is 2 GB --> WAL archive is at least 4 * 2 GB == 8 GB

With those settings, I can keep the test running some more time but the pod
keeps crashing.
At least, it seems that I'm not getting the same error as before.
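
The sizing arithmetic above can be sketched as a quick script (a rough sketch
only; the function names are mine, and the formulas are the documented
defaults quoted earlier in this message):

```python
# Default WAL-archive sizing per the Ignite docs quoted above:
# archive = 4 * checkpoint buffer, where the buffer default depends
# on the data region size.

def default_checkpoint_buffer_gb(region_gb):
    """Default checkpoint buffer size (GB) as a function of region size."""
    if region_gb < 1:
        return min(0.25, region_gb)   # MIN(256 MB, region size)
    if region_gb <= 8:
        return region_gb / 4          # region / 4
    return 2.0                        # capped at 2 GB above 8 GB

def default_wal_archive_gb(region_gb):
    """Default WAL archive size: 4x the checkpoint buffer."""
    return 4 * default_checkpoint_buffer_gb(region_gb)

print(default_wal_archive_gb(5))   # 5.0  -> region < 8 GB: archive == region size
print(default_wal_archive_gb(16))  # 8.0  -> region > 8 GB: archive == 4 * 2 GB
print(10 * 64 / 1024)              # 0.625 -> 10 segments * 64 MB fits a 1 GB WAL volume
```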







WAL and WAL Archive volume size recommendation

2020-11-05 Thread Pelado
Hi everyone, I'm running a POC on a small deployment in a Kubernetes
environment, and after a few minutes of load testing the data node fails
with this message:

org.apache.ignite.internal.processors.cache.persistence.StorageException:
Failed to archive WAL segment
[srcFile=/opt/work/wal/node00-ef1e49d3-1c67-4527-9a24-bae580a5ed91/0005.wal,
dstFile=/opt/work/walarchive/node00-ef1e49d3-1c67-4527-9a24-bae580a5ed91/0065.wal.tmp]
Caused by: java.nio.file.FileSystemException:
/opt/work/wal/node00-ef1e49d3-1c67-4527-9a24-bae580a5ed91/0005.wal
->
/opt/work/walarchive/node00-ef1e49d3-1c67-4527-9a24-bae580a5ed91/0065.wal.tmp:
No space left on device

I have one data node with a cache and persistence enabled, and 3 PVCs, one
each for storage, WAL, and WAL archive.
I load data from a Kafka topic using a Kafka Streamer running in a
different pod.
Incoming load (at the topic) is about 5K records per second.
Average record size is 1.8 KB.

The data region is configured with a maxSize of 5 GB,
the storage volume with 10 GB,
the WAL volume with 2 GB,
and the WAL archive with 2 GB (I also tried 3 and 4).

The rest of the settings (page size, WAL segment size, etc.) are at their
default values.
Ignite version is 2.9.0.
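
Plugging these numbers into the documented defaults hints at why the archive
volume overflows; a back-of-the-envelope check (my own sketch, assuming the
documented default of archive = 4 x checkpoint buffer, with buffer =
region / 4 for regions between 1 GB and 8 GB):

```python
region_gb = 5.0                        # data region maxSize
archive_volume_gb = 2.0                # provisioned WAL archive PVC

checkpoint_buffer_gb = region_gb / 4   # documented default for 1-8 GB regions
default_archive_gb = 4 * checkpoint_buffer_gb

print(default_archive_gb)                      # 5.0
print(default_archive_gb > archive_volume_gb)  # True -> Ignite may try to keep
                                               # more archived segments than the
                                               # 2 GB volume can hold
```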

My question is: is there some recommendation on the size these volumes
should have relative to the storage size, record size, or some other
factor?
Maybe the WAL segment size? If I increase the WAL segment from 64 MB (the
default) to, say, 512 MB, how much should I increase the WAL and WAL
archive volumes?
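
For the WAL-volume side, the space the active segments occupy scales roughly
as segments x segment size (a sketch under the default of 10 active segments;
the function name is mine, and the archive default is tied to the checkpoint
buffer rather than the segment size, so this covers only the WAL volume):

```python
def min_wal_gb(segments=10, segment_mb=64):
    """Minimum space (GB) the active WAL segments can occupy."""
    return segments * segment_mb / 1024

print(min_wal_gb())                # 0.625 -> default 10 x 64 MB segments
print(min_wal_gb(segment_mb=512))  # 5.0   -> 10 x 512 MB needs a ~5 GB WAL volume
```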

Thanks,
-- 
Facundo Maldonado